Mining biomedical texts

Lars Juhl Jensen

>10 km

exponential growth

some things are constant

~45 seconds per paper

information retrieval

find the relevant texts

still too much to read

computer

as smart as a dog

teach it specific tricks

named entity recognition

identify the concepts

comprehensive lexicon

small molecules

proteins

cellular components

organisms

diseases

orthographic variation

“black list”
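The slides above outline dictionary-based named entity recognition: a comprehensive lexicon, tolerance for orthographic variation, and a "black list" of names that should not be tagged. A minimal sketch of that idea follows. The lexicon entries, identifiers, blacklist contents, and the function names `normalize` and `tag` are all hypothetical and chosen only for illustration; this is not the tagger used in the talk.

```python
import re

# Hypothetical toy lexicon: name -> (entity type, identifier).
# The identifiers are made up for this example.
LEXICON = {
    "cdc2": ("protein", "P1"),
    "aspirin": ("small molecule", "C1"),
    "sds": ("small molecule", "C2"),
}

# Hypothetical black list: names judged too ambiguous to tag.
BLACKLIST = {"sds"}


def normalize(name):
    """Collapse orthographic variation: case, hyphens, spaces, slashes."""
    return re.sub(r"[\s\-_/]+", "", name.lower())


# Index the lexicon by normalized name, skipping blacklisted names.
INDEX = {
    normalize(name): entry
    for name, entry in LEXICON.items()
    if normalize(name) not in BLACKLIST
}


def tag(text, max_tokens=3):
    """Return (matched text, start, end, type, identifier) tuples.

    Tries the longest span of up to max_tokens tokens first, so that
    "CDC 2" is matched as one name rather than as two tokens.
    """
    tokens = [(m.start(), m.end()) for m in re.finditer(r"[A-Za-z0-9]+", text)]
    hits = []
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            entry = INDEX.get(normalize(text[start:end]))
            if entry is not None:
                hits.append((text[start:end], start, end) + entry)
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return hits


hits = tag("Cdc-2 and CDC 2 were run on SDS gels with aspirin.")
```

Here "Cdc-2" and "CDC 2" both resolve to the same lexicon entry, while "SDS" is skipped because it is blacklisted.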

Reflect.ws

augmented browsing

browser add-on

Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009

O’Donoghue et al., Journal of Web Semantics, 2010

Firefox

Internet Explorer

Google Chrome

Safari

Utopia Documents

web services

~150 years of publishing

dead wood

dead e-wood

added value

collaboration

SciVerse application

STITCH

Kuhn et al., Nucleic Acids Research, 2010

curated knowledge

drug targets

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

experimental data

physical interactions

Jensen & Bork, Science, 2008

text mining

co-mentioning
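Co-mentioning can be illustrated by counting how often two tagged entities appear in the same document. The sketch below does only the raw counting; the function name `comention_counts` and the example entity names are assumptions for illustration, and the slides do not say how such counts are turned into a score in STITCH.

```python
from collections import Counter
from itertools import combinations


def comention_counts(docs):
    """Count, for each unordered entity pair, the documents mentioning both.

    docs is an iterable of entity-name collections, one per document
    (for example, the output of a named entity recognition step).
    """
    counts = Counter()
    for entities in docs:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


# Illustrative input: entities found in three abstracts.
docs = [
    ["aspirin", "PTGS2"],
    ["PTGS2", "aspirin", "PTGS1"],
    ["PTGS1"],
]
counts = comention_counts(docs)
```

Pairs that are co-mentioned far more often than their individual frequencies would suggest are the candidates for a real association.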

NLP (Natural Language Processing)

abstracts

full text

restricted access

collaboration

electronic patient journals

a hard problem

in Danish

no lexicon

by busy doctors

acronyms

typos

about psychiatric patients

delusions

domain specific system

F20

F200

Negation

Family
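Negation and family mentions matter because a diagnosis term in a note does not always describe the patient: "no signs of schizophrenia" and "mother had schizophrenia" should not count as a diagnosis. A minimal rule-based sketch follows. The cue lists are hypothetical and in English for readability (the records in the talk are in Danish), and `classify_mention` is an illustrative name, not the system described in the talk.

```python
# Hypothetical cue words; a real system would need a curated list
# in the language of the records.
NEGATION_CUES = ("no", "not", "denies", "without")
FAMILY_CUES = ("mother", "father", "sister", "brother", "family")


def classify_mention(sentence, term):
    """Classify a term mention as 'patient', 'negated', or 'family'.

    Looks only at the text preceding the first occurrence of the term
    within the sentence. Returns None if the term is not present.
    """
    lowered = sentence.lower()
    pos = lowered.find(term.lower())
    if pos < 0:
        return None
    # Whole words before the term, with trailing punctuation stripped.
    words_before = [w.strip(".,;:!?") for w in lowered[:pos].split()]
    if any(cue in words_before for cue in FAMILY_CUES):
        return "family"
    if any(cue in words_before for cue in NEGATION_CUES):
        return "negated"
    return "patient"


examples = [
    "Patient has schizophrenia.",
    "No signs of schizophrenia.",
    "Mother had schizophrenia.",
]
labels = [classify_mention(s, "schizophrenia") for s in examples]
```

Only mentions labeled "patient" would then be mapped to a diagnosis code such as F20.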

diagnoses

patient stratification

Roque et al., PLoS Computational Biology, 2011

disease comorbidity

Roque et al., PLoS Computational Biology, 2011
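One simple way to look for disease comorbidity is to compare how often two diagnosis codes occur in the same patient with how often that would be expected if the codes were independent. The sketch below computes that observed-to-expected ratio. It is an illustrative measure under that independence assumption, not necessarily the statistic used by Roque et al.; the function name `comorbidity_ratios` and the toy patient data are made up.

```python
from collections import Counter
from itertools import combinations


def comorbidity_ratios(patients):
    """Observed/expected co-occurrence ratio for each pair of codes.

    patients is a list of diagnosis-code collections, one per patient.
    Expected co-occurrence assumes the two codes are independent:
    expected = count(a) * count(b) / number of patients.
    """
    n = len(patients)
    single = Counter()
    pair = Counter()
    for codes in patients:
        codes = sorted(set(codes))
        single.update(codes)
        pair.update(combinations(codes, 2))
    return {
        (a, b): observed * n / (single[a] * single[b])
        for (a, b), observed in pair.items()
    }


# Toy data: four patients with made-up code assignments.
patients = [["F20", "E66"], ["F20", "E66"], ["F20"], ["I10"]]
ratios = comorbidity_ratios(patients)
```

A ratio above 1 means the pair occurs together more often than expected by chance; with real data, small counts would also call for a significance test.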

medication

adverse drug events

pharmacovigilance

phenotype

genotype

Reflect.ws: Sune Frankild, Heiko Horn, Evangelos Pafilis, Michael Kuhn, Reinhardt Schneider, Sean O’Donoghue

SciVerse app: Juan-Carlos Silla-Castro, Sean O’Donoghue

EPJ-mining: Francisco S Roque, Peter B Jensen, Robert Eriksson, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Thomas Werge, Søren Brunak

Thank you!

larsjuhljensen
