When Healthcare Meets Data Science Anastasiia Kornilova

NLP approach for medical translation task

Page 1: NLP approach for medical translation task

When Healthcare Meets Data Science

Anastasiia Kornilova

Page 2: NLP approach for medical translation task

http://www.slideshare.net/WebCongress/mars-one-bas-lansdorp

Page 4: NLP approach for medical translation task

The Medicine of the Future

Page 5: NLP approach for medical translation task

http://www.healthbizdecoded.com/2013/05/hies-meeting-the-sustainability-challenge/

Page 6: NLP approach for medical translation task

http://www.wellcentive.com/blog/the-relationship-between-hie-and-population-health-management/

Page 7: NLP approach for medical translation task

http://graphics.wsj.com/infectious-diseases-and-vaccines/

Page 9: NLP approach for medical translation task

«One or two patients died per week in a certain smallish town because of the lack of information flow between the hospital’s emergency room and the nearby mental health clinic»

[«Doing Data Science», O’Neil]

Page 10: NLP approach for medical translation task

60% of US doctors still use paper medical records

Page 11: NLP approach for medical translation task

Let’s create our own EHR standard

Page 12: NLP approach for medical translation task

Let’s code gender

Standard A
Patient gender   Code
Male             0
Female           1

Standard B
Patient gender   Code
Male             1
Female           0

Standard C
Patient gender   Code
Male             M
Female           F
Unknown          U
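The conflict between the three gender codings can be made concrete with a small normalization sketch. It assumes the three tables on the slide correspond to Standards A, B and C in the order shown; the names `Gender`, `CODEBOOKS` and `normalize_gender` are hypothetical and introduced only for illustration.

```python
from enum import Enum


class Gender(Enum):
    MALE = "male"
    FEMALE = "female"
    UNKNOWN = "unknown"


# One codebook per standard. The pairing of tables to standards is an
# assumption based on the order in which they appear on the slide.
CODEBOOKS = {
    "A": {"0": Gender.MALE, "1": Gender.FEMALE},
    "B": {"1": Gender.MALE, "0": Gender.FEMALE},
    "C": {"M": Gender.MALE, "F": Gender.FEMALE, "U": Gender.UNKNOWN},
}


def normalize_gender(standard, code):
    """Map a source-specific gender code to one shared representation."""
    codebook = CODEBOOKS[standard]
    # Codes missing from the codebook fall back to UNKNOWN instead of failing.
    return codebook.get(str(code).strip().upper(), Gender.UNKNOWN)
```

The same raw value `0` means Male under Standard A and Female under Standard B, so the code alone is meaningless unless the source standard is known.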

Page 14: NLP approach for medical translation task

There are 5 important data standards

ICD - diagnoses, billing, world-wide

CPT - procedures, billing, US-specific, classification

LOINC - lab tests and observations, world-wide

NDC - medication, US-specific, classification

SNOMED - medicine

Page 15: NLP approach for medical translation task

… and a lot of custom dictionaries

Page 16: NLP approach for medical translation task

Even within one data standard:

ICD-9

174 malignant neoplasm of female breast

174.1 malignant neoplasm of central portion of female breast

ICD-10

C50 malignant neoplasm of breast

C50.1 malignant neoplasm of central portion of breast

C50.111 malignant neoplasm of central portion of right female breast

C50.112 malignant neoplasm of central portion of left female breast
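The ICD-10 codes above get more specific as characters are added. A minimal sketch of that hierarchy, assuming that dropping trailing characters yields the broader code (true for the examples on this slide, but not a general validity check); `icd10_ancestors` is a hypothetical helper name.

```python
def icd10_ancestors(code):
    """List the broader codes obtained by dropping trailing characters.

    Assumption: every prefix of at least three characters is treated as
    a parent code. No lookup against the official code list is done.
    """
    compact = code.replace(".", "").upper()

    def with_dot(chars):
        # ICD-10 writes a dot after the three-character category.
        return chars if len(chars) <= 3 else chars[:3] + "." + chars[3:]

    return [with_dot(compact[:n]) for n in range(len(compact) - 1, 2, -1)]


ancestors = icd10_ancestors("C50.111")
```

For `C50.111` this walks up through `C50.11` and `C50.1` to the category `C50`, while ICD-9 `174.1` has only one level above it.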

Page 17: NLP approach for medical translation task

You have to be a doctor to handle them

Page 18: NLP approach for medical translation task

Problem summary

Source 1

Source 2

Source N

medical expertise

a lot of (expensive) hours

Knowledge

Page 19: NLP approach for medical translation task

Standards are changing

Page 20: NLP approach for medical translation task

Artificial Intelligence Way

Feed a lot of medical texts to «medical doctor»

Use NLP power

Make it unsupervised

Page 21: NLP approach for medical translation task

Key idea:

«Semantically similar words occur in similar contexts» Harris, 1954

«You shall know a word by the company it keeps» Firth, 1957

Page 22: NLP approach for medical translation task

«It was the year when Udacity, Coursera and edX, the three leading MOOC companies, took the education world by storm and promised a lot» [Huffington Post]

«Many places offer MOOCs, and many more will. But Coursera, Udacity and edX are the leading providers.» [NYTimes]
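The two quotes can be used to check the key idea by hand. The sketch below counts the words around each term and compares the counts with cosine similarity; the window size of 3 and all function names are assumptions made for illustration, and raw co-occurrence counts stand in for learned vectors.

```python
import math
import re
from collections import Counter, defaultdict

TEXTS = [
    "It was the year when Udacity, Coursera and edX, the three leading "
    "MOOC companies, took the education world by storm and promised a lot",
    "Many places offer MOOCs, and many more will. But Coursera, Udacity "
    "and edX are the leading providers.",
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def context_counts(texts, window=3):
    """For every word, count the words seen within `window` positions."""
    counts = defaultdict(Counter)
    for text in texts:
        tokens = tokenize(text)
        for pos, word in enumerate(tokens):
            neighbours = tokens[max(0, pos - window):pos] + tokens[pos + 1:pos + 1 + window]
            counts[word].update(neighbours)
    return counts


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


counts = context_counts(TEXTS)
sim_companies = cosine(counts["coursera"], counts["udacity"])
sim_unrelated = cosine(counts["coursera"], counts["storm"])
```

On these two sentences `sim_companies` comes out well above `sim_unrelated`, because Coursera and Udacity share most of their neighbours while Coursera and "storm" share almost none.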

Page 23: NLP approach for medical translation task

Distributed Vector Representation

Two layer neural network

Input: text corpus

Output: set of vectors

Group the vectors of similar words together in vector space (detects similarities mathematically)
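A two-layer network of this kind can be sketched in plain Python. This is a toy, CBOW-style illustration and not the talk's implementation: it uses a full softmax instead of the speed-ups found in real word2vec, and the dimension, window, learning rate and epoch count are arbitrary assumptions.

```python
import math
import random


def train_cbow(sentences, dim=8, window=2, epochs=200, lr=0.1, seed=0):
    """Train toy word vectors: context words in, centre word out."""
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    # Layer 1: one input vector per word. Layer 2: one output vector per word.
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    losses = []
    for _ in range(epochs):
        total = 0.0
        for sentence in sentences:
            ids = [index[w] for w in sentence]
            for pos, target in enumerate(ids):
                ctx = ids[max(0, pos - window):pos] + ids[pos + 1:pos + 1 + window]
                if not ctx:
                    continue
                # Hidden layer: average of the context word vectors.
                h = [sum(w_in[c][k] for c in ctx) / len(ctx) for k in range(dim)]
                scores = [sum(w_out[j][k] * h[k] for k in range(dim)) for j in range(size)]
                top = max(scores)
                exps = [math.exp(s - top) for s in scores]
                z = sum(exps)
                probs = [e / z for e in exps]
                total -= math.log(probs[target])
                # Cross-entropy gradient with respect to the scores.
                err = [p - (1.0 if j == target else 0.0) for j, p in enumerate(probs)]
                grad_h = [sum(err[j] * w_out[j][k] for j in range(size)) for k in range(dim)]
                for j in range(size):
                    for k in range(dim):
                        w_out[j][k] -= lr * err[j] * h[k]
                for c in ctx:
                    for k in range(dim):
                        w_in[c][k] -= lr * grad_h[k] / len(ctx)
        losses.append(total)
    return {w: w_in[index[w]] for w in vocab}, losses


vectors, losses = train_cbow([["all", "you", "need", "is", "love"]])
```

The input is a tokenized corpus and the output is one vector per vocabulary word, which matches the input and output listed above. A single five-word sentence is far too little data for meaningful vectors; it only shows the mechanics.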

Page 24: NLP approach for medical translation task

Predict a word using context

All you need is love
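To make the prediction task concrete, the sketch below turns the phrase "All you need is love" into (context, target) training pairs. The window of two words on each side and the name `context_target_pairs` are assumptions for illustration.

```python
def context_target_pairs(tokens, window=2):
    """Pair each word with the words up to `window` positions around it."""
    pairs = []
    for pos, target in enumerate(tokens):
        context = tokens[max(0, pos - window):pos] + tokens[pos + 1:pos + 1 + window]
        pairs.append((context, target))
    return pairs


pairs = context_target_pairs("All you need is love".split())
```

For the middle word the model sees the context `["All", "you", "is", "love"]` and has to predict `"need"`.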

Page 25: NLP approach for medical translation task

Resulting vectors

Page 26: NLP approach for medical translation task

All you need is love

[0.2, 0.11, 0.87, 0.9, …, 0.2] [0.1, 0.98, 0.1, 0.26, …, 0.82] [0.7, 0.22, 0.3, 0.1, …, 0.45]

[0.5, 0.21, 0.67, 0.82, …, 0.49] [0.6, 0.34, 0.21, 0.45, …, 0.2]

Page 27: NLP approach for medical translation task

Vector Relationships

Page 29: NLP approach for medical translation task

http://nlp.stanford.edu/projects/glove/images/company_ceo.jpg

Page 30: NLP approach for medical translation task

http://nlp.stanford.edu/projects/glove/images/comparative_superlative.jpg
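The relationships in these figures are usually queried with vector arithmetic. The sketch below uses hand-made three-dimensional toy vectors, which are hypothetical and not taken from GloVe or word2vec, to show how an analogy of the comparative and superlative kind can be answered.

```python
import math

# Hypothetical toy vectors: the first two axes identify the adjective,
# the third axis encodes the degree (base, comparative, superlative).
TOY_VECTORS = {
    "strong": [1.0, 0.0, 0.0],
    "stronger": [1.0, 0.0, 1.0],
    "strongest": [1.0, 0.0, 2.0],
    "clear": [0.0, 1.0, 0.0],
    "clearer": [0.0, 1.0, 1.0],
    "clearest": [0.0, 1.0, 2.0],
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' with the nearest remaining word."""
    query = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


answer = analogy("strong", "stronger", "clear", TOY_VECTORS)
```

With real embeddings the offsets are only approximately parallel, so the nearest neighbour is a best guess and not an exact match as in this toy setup.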

Page 33: NLP approach for medical translation task

ICD-9

174 malignant neoplasm of female breast

174.1 malignant neoplasm of central portion of female breast

ICD-10

C50 malignant neoplasm of breast

C50.1 malignant neoplasm of central portion of breast

C50.111 malignant neoplasm of central portion of right female breast

C50.112 malignant neoplasm of central portion of left female breast
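One way to picture translation between the two code sets is to rank ICD-10 descriptions by their similarity to an ICD-9 description. The sketch below is an illustration only: it compares plain word counts with cosine similarity as a stand-in for learned word vectors, and `rank_candidates` is a hypothetical helper, not the method used in the talk.

```python
import math
import re
from collections import Counter

ICD9 = {
    "174": "malignant neoplasm of female breast",
    "174.1": "malignant neoplasm of central portion of female breast",
}
ICD10 = {
    "C50": "malignant neoplasm of breast",
    "C50.1": "malignant neoplasm of central portion of breast",
    "C50.111": "malignant neoplasm of central portion of right female breast",
    "C50.112": "malignant neoplasm of central portion of left female breast",
}


def bag_of_words(text):
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(source_text, targets):
    """Return target codes ordered from most to least similar description."""
    source = bag_of_words(source_text)
    scored = [(cosine(source, bag_of_words(text)), code) for code, text in targets.items()]
    return [code for _, code in sorted(scored, key=lambda pair: -pair[0])]


ranking_174 = rank_candidates(ICD9["174"], ICD10)
ranking_174_1 = rank_candidates(ICD9["174.1"], ICD10)
```

For `174.1` the two top candidates are `C50.111` and `C50.112` with identical scores: ICD-9 does not record left or right, so the description alone cannot choose between them.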

Page 34: NLP approach for medical translation task

Summary

Page 35: NLP approach for medical translation task

Links

Efficient Estimation of Word Representations in Vector Space (Mikolov et al.)

Distributed Representations of Words and Phrases and their Compositionality (Mikolov et al.)

word2vec Parameter Learning Explained (Rong)

Page 36: NLP approach for medical translation task

Questions?