Making gene networks through data integration
Lars Juhl Jensen
association networks
guilt by association
molecular networks
proteins
string-db.org
small molecules
stitch-db.org
non-coding RNAs
compartments
compartments.jensenlab.org
tissues
tissues.jensenlab.org
diseases
data integration
computational predictions
gene neighborhood
Korbel et al., Nature Biotechnology, 2004
TargetScan
experimental data
gene expression
protein interactions
Jensen & Bork, Science, 2008
miRTarBase
curated knowledge
metabolic pathways
Letunic & Bork, Trends in Biochemical Sciences, 2008
signaling pathways
many databases
different formats
different identifiers
variable quality
not comparable
hard work
(Ph.D. students)
common identifiers
quality scores
von Mering et al., Nucleic Acids Research, 2005
score calibration
von Mering et al., Nucleic Acids Research, 2005
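One simple way to picture score calibration, assuming a gold standard of trusted associations is available: bin the raw scores from one evidence source and record, for each bin, the fraction of pairs that the gold standard confirms. The function name, the toy pairs, and the fixed-width binning are assumptions for illustration, not the procedure from the cited paper.

```python
from collections import defaultdict

def calibrate(raw_scores, gold_standard, bin_width=1.0):
    """Map each raw-score bin to the fraction of its pairs found in the gold standard."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for pair, score in raw_scores.items():
        bin_index = int(score // bin_width)
        totals[bin_index] += 1
        if pair in gold_standard:
            hits[bin_index] += 1
    return {b: hits[b] / totals[b] for b in totals}

# Hypothetical raw scores for gene pairs from a single evidence source.
raw = {
    ("A", "B"): 0.2, ("A", "C"): 0.8, ("A", "D"): 0.5,
    ("B", "C"): 1.5, ("B", "D"): 1.9, ("C", "D"): 1.1,
}
# Hypothetical gold standard of trusted associations.
gold = {("A", "C"), ("B", "C"), ("B", "D")}

calibrated = calibrate(raw, gold)
for bin_index in sorted(calibrated):
    print(bin_index, round(calibrated[bin_index], 2))
```

Once every source is expressed as a fraction confirmed by the same gold standard, scores from different sources become comparable.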
homology-based transfer
Franceschini et al., Nucleic Acids Research, 2013
missing most of the data
text mining
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
let-7a-3p
let-7a*
flexible matching
let-7a
let7a
name expansions
let-7a
miR-let-7a
“black list”
SDS
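The four ingredients above (lexicon, flexible matching, name expansions, black list) can be sketched as a small dictionary-based tagger. Everything here is a toy assumption: the lexicon entries, the normalization rule, the "miR-" expansion rule, and the tokenizer are illustrative and not the actual tagger behind the resources named in this talk.

```python
import re

# Hypothetical toy lexicon: identifier -> known names (synonyms).
LEXICON = {
    "let-7a": ["let-7a"],
    "let-7a-3p": ["let-7a-3p", "let-7a*"],
    "SDS": ["SDS"],  # ambiguous: in text it usually means the detergent
}
# Normalized names that should never be tagged.
BLACK_LIST = {"sds"}

def normalize(name):
    """Flexible matching: ignore case, hyphens, and spaces."""
    return re.sub(r"[-\s]", "", name.lower())

def expand(name):
    """Name expansion: also accept a 'miR-' prefixed form of let- names."""
    variants = {name}
    if name.lower().startswith("let-"):
        variants.add("miR-" + name)
    return variants

def build_index(lexicon, black_list):
    """Build a lookup from normalized name variants to identifiers."""
    index = {}
    for identifier, names in lexicon.items():
        for name in names:
            for variant in expand(name):
                key = normalize(variant)
                if key not in black_list:
                    index[key] = identifier
    return index

def tag(text, index):
    """Return (token, identifier) pairs for tokens found in the index."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*\*?", text)
    return [(t, index[normalize(t)]) for t in tokens if normalize(t) in index]

index = build_index(LEXICON, BLACK_LIST)
matches = tag("Levels of let7a, miR-let-7a and let-7a* were checked after SDS treatment.", index)
print(matches)
```

In this sketch the black list is applied when the index is built, so a blacklisted name can never match regardless of how it is written in the text.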
co-mentioning
counting
within documents
within paragraphs
within sentences
high recall
high precision
fuzzy associations
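Co-mention counting at the three levels named above can be sketched as follows. This is a minimal illustration that assumes entity names are already known, paragraphs are separated by blank lines, and sentences end in ".", "!" or "?"; the function name and the example texts are made up, and no scoring or weighting scheme is implied.

```python
import re
from collections import Counter
from itertools import combinations

def comention_counts(documents, names):
    """Count how often each pair of names co-occurs per document, paragraph, and sentence."""
    counts = {"document": Counter(), "paragraph": Counter(), "sentence": Counter()}

    def pairs_in(text):
        # Names that occur as whole words in the text, as sorted pairs.
        found = sorted(n for n in names if re.search(r"\b" + re.escape(n) + r"\b", text))
        return combinations(found, 2)

    for doc in documents:
        counts["document"].update(pairs_in(doc))
        for paragraph in doc.split("\n\n"):
            counts["paragraph"].update(pairs_in(paragraph))
            for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
                counts["sentence"].update(pairs_in(sentence))
    return counts

# Hypothetical example texts.
docs = [
    "CYC1 is controlled by HAP1. CYC7 was also studied.\n\nHAP1 binds DNA.",
    "CYC7 is a cytochrome gene.\n\nHAP1 and CYC7 are co-expressed.",
]
counts = comention_counts(docs, {"CYC1", "CYC7", "HAP1"})
for level in ("document", "paragraph", "sentence"):
    print(level, dict(counts[level]))
```

In the toy data, CYC1 and CYC7 share a document and a paragraph but never a sentence, which shows how the narrower levels filter out looser co-mentions.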
NLP
Natural Language Processing
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
extract stated facts
high precision
poor recall
Jensen et al., Nature Reviews Genetics, 2006
questions?