Adjudication of Variants/Genomes for Clinical Relevance
NCDS Leadership Summit, Chapel Hill, NC, April 23rd, 2013
Martin G. Reese
Co-Founder, President and Chief Scientific Officer
Genomes from Sequence Pioneers
Individuals: Venter, Watson, Quake
Ancestry: Caucasian, Caucasian, Korean, Chinese, African, African*, African*
Platforms: Helicos, Roche/454, Life SOLiD, Complete Genomics, Sanger, Illumina, Illumina, Life SOLiD, Illumina, Life SOLiD
Counts: 3,161,108; 2,794,412; 3,251,526; 3,075,865; 3,274,361; 3,439,112; 3,074,102; 4,082,273; 4,241,046; 4,192,785
PG1011 Life SOLiD 2,972,853
Church Complete Genomics 3,003,498
Tutu Life SOLiD 3,624,334
Bushman Roche/454 4,053,781
d = (U Ns – (Ns NL)) / Ns
PG1012 Illumina 3,540,984
Reese et al., Genet Med. 2011 Mar;13(3):210-7
Distribution of Non-Synonymous Variants of Genomic Sequence Pioneers within disease categories
Figure: bar chart of non-synonymous variants per disease category for each genome; vertical axis from 0 to 600.
Categories: Aging; Cardiovascular; Dental; Drugs, Clinical Pharmacology and Environment; Endocrinological and Metabolic; Gastrointestinal; Hemic and Lymphatic; Immunological, Connective Tissue and Joints; Infectious Disease; Kidney and Urinary Tract; Neonatal; Neurological; Nutrition; Oncological; Other; Psychiatric; Respiratory; Sight; Sound.
Genomes plotted:
Yoruba (NA18507/Illumina)
Yoruba (NA18507/SOLiD)
Yoruba (NA19240)
Han Chinese (18987735)
Chinese (with INDELS)
Korean (19470904)
Korean (19470904_INDELS)
Caucasian (NA07022)
CEPH/Utah (NA12878)
J. Craig Venter (17803354)
James D. Watson (18421352)
Stephen Quake (19668243)
PG14
Reese et al., Genet Med. 2011 Mar;13(3):210-7
Providing Clinical Context for Genomic Data
• Compare sequence vs. references, disease genes & variants
• Integrate research studies, clinical records, contextual info
• Apply rules, assess, prioritize and report
• Customized rules, views, reporting for research and clinical applications
• VAAST reports
• Genome database comparisons
Four Challenges of Clinical Genome Interpretation
• Accurate variant calling
• High quality clinical mutation databases
• Interpretation of whole genomes to identify few relevant variants for patients
• Routine whole genome diagnosis and reporting: green, yellow and red
1. Accurate Variant Calling
• Issues
  – SNVs called with >99% accuracy, indels with >95%
  – CNVs still poorly called from NGS data
  – Exome coverage < 95% of true exome
  – Low complexity regions and gene families called poorly
    • Clinically relevant examples:
      – 2D6 gene
      – HDL region
  – Genetic heterozygosity in tumor sample sequencing
• Suggested Solutions
  – Family-based sequencing and variant calling
  – Phased genome sequencing
  – Population-based variant calling
  – Technology development driven through 1000Genomes project
  – Quality standards to measure accuracy
  – Big Data: Better, but more expensive variant calling
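One reason family-based sequencing can improve calling accuracy is that a child's genotype can be checked against the parents' genotypes. The following is a minimal sketch of such a Mendelian-consistency check, not a method taken from the talk; the function name, the genotype encoding (a pair of alleles per person) and the toy calls are all assumptions made for illustration.

```python
from itertools import product


def mendelian_consistent(child, mother, father):
    """Return True if the child's unordered genotype can be formed by
    taking one allele from the mother and one from the father."""
    possible = {tuple(sorted(pair)) for pair in product(mother, father)}
    return tuple(sorted(child)) in possible


# Toy trio calls (illustrative only): (site, child, mother, father)
trio_calls = [
    ("site_1", ("A", "G"), ("A", "A"), ("G", "G")),
    ("site_2", ("T", "T"), ("C", "C"), ("C", "T")),
]

# Sites where the child's call cannot be explained by the parents' calls
flagged = [
    site
    for site, child, mother, father in trio_calls
    if not mendelian_consistent(child, mother, father)
]
print(flagged)
```

A flagged site is not necessarily a calling error (a true de novo mutation would be flagged as well), so a check like this only marks calls that deserve a second look.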
Clinical Grade
2. High quality clinical mutation databases
• Issues
  – OMIM limited to 20,000 variants – still the best!
  – HGMD ~100,000 public, 137,000 private variants (at least 10% of variants are not pathogenic)
  – LSDBs not centralized and of varying quality
  – High quality, trustworthy source missing
  – dbGaP: difficult to get access to full patient data due to access restrictions
• Suggested Solutions
  – ClinVar database and Genetic Testing Registry (GTR)
  – Community annotation -> Learn from Gene Ontology (GO)?
    • Tightly controlled by small leadership group (M. Ashburner, S. Lewis, R. Apweiler, J. Blake)
  – U41 grant “A unified clinical genomics database” (Ledbetter, Martin, Mitchel, Nussbaum, Rehm)
  – Functional annotations and algorithms needed that do not rely on known disease mutations
A Test Case: Miller Syndrome
[Pedigree figure for family members M, F, B and S. Allele labels: G->R, G->A; R->Q, R->. Loci: CHR 16: DHODH and CHR 5: DNAH5.]
Ng et al., Nature Genetics 42, 30–35, 2010; Roach et al., Science 328, 636, 2010
Alleles Responsible for MILLER SYNDROME in Utah Kindred*
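The pedigree pattern on this slide can be read as a filter over genes. The sketch below assumes that M, F, B and S stand for mother, father, brother and sister (the slide only gives the letters) and that both siblings are affected. It keeps genes in which every affected child carries two different variants, one seen in each parent. The function name, the gene labels GENE_A and GENE_B and the variant ids are hypothetical toy data, not the actual Miller syndrome calls, and this is not presented as the VAAST algorithm.

```python
def compound_het_candidates(carried, affected, mother, father):
    """carried maps person -> gene -> set of variant ids that person carries.

    Return genes in which every affected child has at least one variant
    also carried by the mother and a different one carried by the father.
    """
    shared_genes = set.intersection(*(set(carried[child]) for child in affected))
    candidates = []
    for gene in sorted(shared_genes):
        fits = True
        for child in affected:
            child_variants = carried[child][gene]
            maternal = child_variants & carried[mother].get(gene, set())
            paternal = child_variants & carried[father].get(gene, set())
            # Need two distinct variants, one traceable to each parent
            if not any(m != p for m in maternal for p in paternal):
                fits = False
                break
        if fits:
            candidates.append(gene)
    return candidates


# Hypothetical toy input (not real patient data)
carried = {
    "M": {"GENE_A": {"a1"}, "GENE_B": {"b1"}},
    "F": {"GENE_A": {"a2"}, "GENE_B": set()},
    "B": {"GENE_A": {"a1", "a2"}, "GENE_B": {"b1"}},
    "S": {"GENE_A": {"a1", "a2"}, "GENE_B": {"b1"}},
}

print(compound_het_candidates(carried, ["B", "S"], "M", "F"))
```

The sketch only covers the compound-heterozygous case; a homozygous recessive variant inherited from two carrier parents would need an additional rule.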
VAAST results on Miller Syndrome
Interpretation of whole genomes to identify few relevant variants for patients
• Issues
  – ~4 million variants to interpret: where to look first?
  – Disease panel interpretation not sufficient: misses variants outside panels
  – Shall we interpret only known variants?
    • What about a stop codon in a significant gene such as APOE, or a frameshift in BRCA1/2?
  – When to stop looking
    • Howard Green: “If a Bioinformatician keeps looking, he will always find something”
  – Very time consuming (1 hour to 200+ hours per genome)
    • ~20,000 genomes in 2012 => 1M genomes in 2017
  – Pathogenic in one patient but not in another
    • Interpretation only possible within genomic background
• Suggested Solutions
  – Integrated analysis solutions needed
    • Better algorithms trained on larger datasets
    • Genomic load or burden tests for entire phenotypes and diseases
  – Interactive, expandable, exploratory systems needed
  – Disease-specific genomic expert groups
    • Developing standards
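The scale figures on this slide can be combined in a short back-of-the-envelope calculation. The sketch below uses only the numbers quoted on the slide (20,000 genomes in 2012, 1M genomes in 2017, 1 to 200 hours per genome); the assumption of a constant yearly growth factor and the variable names are added here for illustration.

```python
genomes_2012 = 20_000
genomes_2017 = 1_000_000
years = 2017 - 2012

# Constant yearly growth factor that would connect the two figures
annual_growth = (genomes_2017 / genomes_2012) ** (1 / years)

# Analyst time if every 2017 genome needed manual interpretation
hours_low, hours_high = 1, 200
total_hours_low = genomes_2017 * hours_low
total_hours_high = genomes_2017 * hours_high

print(f"implied growth: about {annual_growth:.2f}x per year")
print(f"interpretation effort: {total_hours_low:,} to {total_hours_high:,} hours")
```

Under these assumptions the number of genomes slightly more than doubles each year, and manual interpretation of one million genomes would take between one million and two hundred million hours.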
Highest scoring variants: TruSight Inherited Diseases
Highest scoring variants: Evidence only
Highest scoring variants: SIFT/Polyphen
Highest scoring variants: Omicia Score
Routine whole genome diagnosis and reports: green, yellow and red
• Issues
  – What is the right candidate gene set for a disease? Have we looked everywhere?
  – Interactive reporting systems needed that deal with varying interpretation workflows for whole genomes
  – Storage of whole-genome information
  – Do we need clinical trials for every panel, every indication?
• Suggested Solutions
  – Family-based sequencing and analysis for early childhood diseases
    • Intramural ClinSEQ project (Biesecker): >25% of patients with unknown disease receive a diagnosis
  – Cancer panel sequencing
    • Toronto cancer hospital (McPhearson/Stein): >25% of cancer patients get a genomic-informed treatment recommendation
  – CLARITY-like community experiments needed
  – Storage:
    • EMR: Genome Interpretation reports
    • Biobanks/personal banking systems: FASTQ files (150Gb), VCF files (500Mb)
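To give a feel for the storage question, the per-genome file sizes above can be scaled to the one million genomes projected earlier in the talk. This is a rough sketch: it assumes that "150Gb" and "500Mb" mean gigabytes and megabytes, uses decimal units, and ignores compression and replication. The variable names are illustrative.

```python
fastq_gb_per_genome = 150   # assumed to mean gigabytes
vcf_gb_per_genome = 0.5     # 500 MB expressed in GB
genomes = 1_000_000         # projection quoted earlier in the talk

# Decimal units: 1 PB = 1,000,000 GB and 1 TB = 1,000 GB
fastq_total_pb = fastq_gb_per_genome * genomes / 1_000_000
vcf_total_tb = vcf_gb_per_genome * genomes / 1_000
size_ratio = fastq_gb_per_genome / vcf_gb_per_genome

print(f"FASTQ: {fastq_total_pb:.0f} PB, VCF: {vcf_total_tb:.0f} TB")
print(f"FASTQ is {size_ratio:.0f}x larger than VCF per genome")
```

The 300-fold size difference is one way to see why the slide separates compact interpretation reports in the EMR from raw data in biobanks or personal banking systems.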
Cardiomyopathy panel (content by H. Rehm)
Omicia, Inc.
2200 Powell Street, Suite 525, Emeryville, CA 94608
Phone: 510-595-0800 Fax: 510-588-4523
https://app.omicia.com
Omicia Lab Accession: 201424
Patient: Affected Daughter NA12879
Sex: Not Specified
Ethnicity: Not Specified
Omicia Panel Report
Panel: Cardiomyopathy
Rep. Date: 2013/03/18
RESULT SUMMARY
Enter summary here...
PATHOGENIC VARIANTS
Gene | Variant | Variant Class | Effect | Variant Freq | Omicia Score | Condition | Description | References
TTN | c.568_568delG, p.Glu190fs, Heterozygous | Likely Pathogenic | frameshift deletion | - | 0.869 | Dilated Cardiomyopathy | TTN encodes the sarcomeric protein Titin. Two mutations were found in the Z-line region of Titin in patients with Dilated Cardiomyopathy. The mutations led to a decreased affinity of Titin to Z-line proteins. Titin mutations cause Autosomal Dominant Dilated Cardiomyopathy. | OMIM:188840#0009
VARIANTS OF UNKNOWN SIGNIFICANCE
Gene | Variant | Variant Class | Effect | Variant Freq | Omicia Score | Condition | Description | References
CASQ2 | c.926A>G, p.Asp309Gly, Heterozygous | Unknown Significance A | non-synon | - | 0.867 | Catecholamine-Induced Polymorphic Ventricular Tachycardia | CASQ2 encodes the protein Calsequestrin 2. Catecholamine-induced Polymorphic Ventricular Tachycardia can be found in both Autosomal Recessive and Dominant forms. The Recessive form is associated with CASQ2, in which missense mutations have segregated with the disease. A deletion has also been associated with the development of the disease. |
CAV3 | c.311T>C, p.Val104Ala, Heterozygous | Unknown Significance A | non-synon | - | 0.882 | Hypertrophic Cardiomyopathy | CAV3 is the Caveolin-3 gene. One missense mutation in CAV3 was found to cause a case of Hypertrophic Cardiomyopathy. The Caveolin-3 protein is responsible for negatively regulating endothelial Nitric Oxide Synthase. In mouse studies, it is believed that the moderate increase of eNOS due to mutated Caveolin-3 may lead to the development of Hypertrophic Cardiomyopathy. |
DSC2 | c.2687_2688insGA, p.Glu896fs, Heterozygous | Unknown Significance A | frameshift insertion | - | 0.731 | Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy | DSC2 encodes Desmocollin-2, a desmosomal cadherin protein. Heterozygous mutations were discovered in cases of Arrhythmogenic Right Ventricular Dysplasia/Cardiomyopathy. Frameshifts and premature truncations are the common mutation forms for this disease-gene association. |
TTN | c.96872G>A, p.Arg32291Gln, Heterozygous | Unknown Significance A | non-synon | C:99% T:1% | 0.869 | Dilated Cardiomyopathy | TTN encodes the sarcomeric protein Titin. Two mutations were found in the Z-line region of Titin in patients with Dilated Cardiomyopathy. The mutations led to a decreased affinity of Titin to Z-line proteins. Titin mutations cause Autosomal Dominant Dilated Cardiomyopathy. |
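The report rows above can be treated as records and ranked by Omicia Score, which is how the "Highest scoring variants" views appear to order them. The sketch below copies the gene, variant, class and score values from this report. The record type, the function name and in particular the mapping from variant class to a green/yellow/red flag are assumptions made for illustration; the talk does not define how the colours are assigned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportRow:
    gene: str
    variant: str
    variant_class: str
    omicia_score: float


# Values copied from the panel report above
ROWS = [
    ReportRow("TTN", "c.568_568delG", "Likely Pathogenic", 0.869),
    ReportRow("CASQ2", "c.926A>G", "Unknown Significance A", 0.867),
    ReportRow("CAV3", "c.311T>C", "Unknown Significance A", 0.882),
    ReportRow("DSC2", "c.2687_2688insGA", "Unknown Significance A", 0.731),
    ReportRow("TTN", "c.96872G>A", "Unknown Significance A", 0.869),
]

# Hypothetical traffic-light mapping (an assumption, not Omicia's rule)
FLAG_BY_CLASS = {
    "Likely Pathogenic": "red",
    "Unknown Significance A": "yellow",
}


def flag(row):
    """Return the illustrative colour for a row; unlisted classes are green."""
    return FLAG_BY_CLASS.get(row.variant_class, "green")


# Highest score first; ties keep their original report order
ranked = sorted(ROWS, key=lambda row: row.omicia_score, reverse=True)
for row in ranked:
    print(f"{row.omicia_score:.3f}  {flag(row):6}  {row.gene}  {row.variant}")
```

Note that the score alone does not separate the classes here: the likely pathogenic TTN frameshift and a TTN variant of unknown significance share the same score, so the class assignment carries information that the ranking does not.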
PANEL DESCRIPTION
The Cardiomyopathy Panel includes genes that are strongly tied to inherited Cardiomyopathies. Expertly curated by Dr. Heidi Rehm and her team at Harvard's Laboratory of Molecular Medicine, these target genes have been associated with Hypertrophic Cardiomyopathy (HCM), Dilated Cardiomyopathy (DCM), Restrictive Cardiomyopathy (RCM), Arrhythmogenic Right Ventricular Cardiomyopathy (ARVC), Catecholaminergic Polymorphic Ventricular Tachycardia (CPVT), and Left Ventricular Non-Compaction (LVNC). Also covered in this panel are other syndromes that present Cardiomyopathy as a symptom, such as Danon and Fabry diseases, Barth syndrome, and Transthyretin Amyloidosis. The gene selection was based upon findings from scientific literature and opinions from the Harvard-affiliated Laboratory of Molecular Medicine, which, as of 2012, has nine years of genetic testing experience for inherited Cardiomyopathies. Their work has since been consolidated into a publication in the Journal of Molecular Diagnostics (PMID: 23274168). Omicia, Inc. has enhanced this gene list by adding detailed descriptions of these gene-disease relationships.
Gene | Condition | Description
ABCC9 | Idiopathic Dilated Cardiomyopathy | Genetic analysis of individuals with Idiopathic Dilated Cardiomyopathy identified mutations in the conserved regions of the ABCC9 gene. Frameshift and missense mutations were discovered, with neither being present in unrelated control individuals. ABCC9 encodes a regulatory subunit of the cardiac Potassium-ATP channel, and is essential in maintaining cellular homeostasis under stress. The disruption of ABCC9 leads to dysregulated channel gating, and thereby may lead to conduction abnormalities in the heart as well.
ACTC1 | Familial Hypertrophic Cardiomyopathy | ACTC1 encodes a sarcomeric protein that is involved in the development of two forms of Cardiomyopathy: Idiopathic Dilated Cardiomyopathy and Familial Hypertrophic Cardiomyopathy. Mutations in ACTC1 that affect sarcomere contraction result in Familial Hypertrophic Cardiomyopathy. There have been both inherited and de novo cases of ACTC1 mutations. Familial Hypertrophic Cardiomyopathy is inherited in an Autosomal Dominant manner.
ACTC1 | Idiopathic Dilated Cardiomyopathy | ACTC1 encodes a sarcomeric protein that is involved in the development of two forms of Cardiomyopathy: Idiopathic Dilated Cardiomyopathy and Familial Hypertrophic Cardiomyopathy. Mutations in ACTC1 that affect force transmission within the sarcomere result in Idiopathic Dilated Cardiomyopathy. Missense mutations were found in conserved regions of the protein, namely the attachment points of the Z bands and the intercalated discs.
ACTN2 | Dilated Cardiomyopathy | The ACTN2 gene produces the protein alpha-Actinin 2. Mutations in ACTN2 in individuals with Dilated Cardiomyopathy have been discovered. The ACTN2 mutation disrupts its interaction with Muscle LIM protein (MLP).
ACTN2 | Hypertrophic Cardiomyopathy | The ACTN2 gene produces the protein alpha-Actinin 2. Although mutations in ACTN2 have been associated with Hypertrophic Cardiomyopathy, the genetic causes of the disease are still unclear. Genome analysis of families affected by HCM pinpointed a number of ACTN2 mutations involved in the pathogenesis of Hypertrophic Cardiomyopathy.
ANKRD1 | Dilated Cardiomyopathy | ANKRD1 encodes the Cardiac Ankyrin Repeat protein. The gene has been associated with Dilated Cardiomyopathy, accounting for 1.9% of DCM cases. The majority of cases expressed ANKRD1 mutations in a heterozygous manner. The have been found in both sporadic and
Four Challenges of Clinical Genome Interpretation
• Accurate variant calling – 2 years
• High quality disease-oriented clinical mutation databases – 3-5 years
• Interpretation of whole genomes to identify few relevant variants for patients – 3-10 years
  – Early childhood disease and cancer leading
• Routine whole genome diagnosis: green, yellow and red – <20 years
Acknowledgements
This work was supported by NIH SBIR grants R44HG3667, R44HG2992, and R41HL83571 to Omicia and RC2HG5619 to Yandell/Omicia, all administered by the NHGRI and NHLBI.
Mark Yandell Lab
Lynn Jorde Lab
Gabor Marth Lab
Francisco de la Vega
Karen Eilbeck
Carlos Bustamante Lab