Translating exome and whole genome sequencing to the clinic

Winter School in Mathematical and Computational Biology
Institute for Molecular Bioscience, University of Queensland
9 July 2014

Marcel Dinger
Head of Clinical Genomics & Genome Informatics
Garvan Institute of Medical Research, Sydney


Overview

• Kinghorn Centre for Clinical Genomics

• Clinical applications of genomic medicine

• Implementation challenges

• The future

Kinghorn Centre for Clinical Genomics

KCCG was established at the Garvan Institute of Medical Research in October 2012.

The service is delivered in collaboration with the neighboring St Vincent’s Hospital and their pathology service (SydPath).

Multidisciplinary team of 22 (and growing!) comprising laboratory scientists, bioinformaticians, software developers, geneticists and PhD students.

Kinghorn Centre for Clinical Genomics

Two Illumina HiSeq 2500s capable of sequencing ~2 terabases per week (~350 exomes at 150X mean coverage).

Dedicated high-performance computing cluster (1400 CPU cores, 10 TB of memory, 1 PB of storage)

NATA accreditation (ISO 15189, medical testing) for exome sequencing and a cancer enrichment panel is scheduled for late 2014.

In January 2014, KCCG became one of the world’s first sites to order the Illumina HiSeq X Ten (more on that later!)

Why the genome?

The human genome provides the instructions for our development and function.

Understanding the genetic basis for disease is a critical component not only in diagnosing and selecting treatment, but is also crucial for the design of new therapies.

Mutations in the sequence that we inherit from our parents or that accumulate during life are the basis of the majority of human diseases, including cancer.

Genomic sequencing has the potential to impact tremendously on health care treatment and prevention.

Clinical Genomics: what are the opportunities?

Clinical genomics will (initially) have a major impact in three key areas:

1. Accurate diagnosis of inherited diseases, including rare diseases and intellectual impairment

2. Molecular stratification of cancer to direct treatment pathways

3. Optimisation of drug choices and drug dosages based on an individual’s genotype (pharmacogenomics)

Diagnosis of inherited disease

Genetic testing has traditionally been done on a “per gene” basis.

Humans have ~21,000 protein-coding genes; mutations in ~4,500 of these genes have been associated with an inherited disease.

Rare diseases can be especially difficult to diagnose and clinicians are left to make informed guesses as to which gene test to order.

With testing taking weeks to perform (often overseas) and costing up to $2,500 per test, such diagnoses can be extremely time-consuming and expensive.

Whole genome (and exome) sequencing

Mendelian disorders

Rare (but not collectively) monogenic diseases: many thousands solved, >2,000 left.

Increased access to genomics has led to rapid identification of Mendelian disease genes.

[Figure: Mendelian disease genes published per quarter, 3/2009 to 1/2013 (vertical axis 0 to 90)]

[Figure: Impact factor of journals of associated publications, 3/2009 to 1/2013 (vertical axis 0 to 90)]

Mendelian gene discovery: gene discovery is accelerating to 1 new Mendelian disease gene per day.

At current rates, the primary allele for most recognized Mendelian disorders will be identified in 5 years.

Example of simple diagnosis by WES

Thin Basement Membrane Disease (TBMD), kidney

[Figure: normal vs TBMD; family pedigree annotated with genotypes G/A, G/G, G/A, G/A, G/G]

4 typical culprits: COL4A3, COL4A4, COL4A5, AVPR2

COL4A3: p.Gly637Arg, c.1909G>A

Mark Cowley, Tim Furlong

Paediatric inherited disease cohort

• Intellectual disability (23)
• Epileptic encephalopathy (11)
• Skeletal (5)
• Immune (4)
• Syndromic (3)
• Eye (3)
• Haematological (3)
• Neurological (seizures) (1)
• Metabolic (1)

Tony Roscioli and Lisa Ewans, Sydney Children’s Hospital Network

53 families from Sydney Children’s Hospital

Diverse phenotypes representative of a typical case-load for a clinical geneticist in a paediatric hospital

Majority of cases had tested negative for routine genetic tests.

Total of 122 exomes sequenced - mixture of trios, parent/child and individual.

Let’s look at a few illustrative examples…

Example 1: A case for homozygosity

Phenotype
• Consanguineous family
• Siblings: brother and sister
• ID and severe seizure disorder

Exome sequencing of brother and sister

Analyse exome data with HomozygosityMapper

[Figure: homozygosity plots for Sibling 1 and Sibling 2; combined regions on Chr12 flagged]
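The step of combining the siblings' homozygous regions can be pictured as an interval intersection. The following is a minimal sketch under stated assumptions, not HomozygosityMapper's actual algorithm: the `shared_regions` helper is hypothetical and the coordinates are invented for illustration.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def shared_regions(a: List[Region], b: List[Region]) -> List[Region]:
    """Return the overlaps between two lists of homozygous regions."""
    shared = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # keep only non-empty overlaps
                shared.append((chrom_a, start, end))
    return sorted(shared)


# Illustrative coordinates only; not taken from the family described above.
sibling_1 = [("chr12", 50_000_000, 62_000_000), ("chr3", 1_000_000, 4_000_000)]
sibling_2 = [("chr12", 55_000_000, 70_000_000), ("chr7", 2_000_000, 9_000_000)]

print(shared_regions(sibling_1, sibling_2))
```

Regions that are homozygous in only one affected sibling drop out, which is why a single shared block can narrow the search so sharply in a consanguineous family.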

Tony Roscioli and Lisa Ewans

Example 1: A case for homozygosity

Identify genes in the homozygous region on chromosome 12

[Table: genes in homozygous region on Chr 12]

Determine genes with damaging mutations using CADD (Combined Annotation Dependent Depletion)

Conclude that the AGAP2 variant is the likely causative mutation

Literature evaluation of candidates

• AGAP2 (Centaurin family gene) participates in the prevention of neuronal apoptosis by enhancing PI3 kinase activity.

• Highly expressed in brain

CADD is a highly sensitive tool to distinguish between pathogenic and benign variants

Tony Roscioli and Lisa Ewans
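Prioritising variants in a candidate region by CADD score could look roughly like the sketch below. It assumes scaled (PHRED-like) CADD scores are already attached to each variant; the `Variant` class, the gene labels, the coordinates and the cutoff of 20 are all illustrative assumptions rather than values from this case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    gene: str          # placeholder gene labels below, not real findings
    chrom: str
    pos: int
    cadd_phred: float  # scaled CADD score; higher suggests more deleterious


def prioritise(variants: List[Variant], chrom: str, start: int, end: int,
               min_cadd: float = 20.0) -> List[Variant]:
    """Keep variants in [start, end) on chrom with CADD >= min_cadd, highest first."""
    hits = [
        v for v in variants
        if v.chrom == chrom and start <= v.pos < end and v.cadd_phred >= min_cadd
    ]
    return sorted(hits, key=lambda v: v.cadd_phred, reverse=True)


candidates = [
    Variant("GENE_A", "chr12", 56_100_000, 8.2),
    Variant("GENE_B", "chr12", 57_900_000, 27.5),
    Variant("GENE_C", "chr12", 60_300_000, 21.0),
    Variant("GENE_D", "chr5", 10_000_000, 31.0),  # outside the region
]

for v in prioritise(candidates, "chr12", 55_000_000, 62_000_000):
    print(v.gene, v.cadd_phred)
```

A score cutoff only ranks candidates; the literature evaluation described above is still what supports the final call.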

Example 2: A diagnostic odyssey

Phenotype
• 2 year 11 month old boy
• Global developmental delay
• Stopped standing, walking
• Saudi family not known to be consanguineous
• Hypotonia, weakness
• Feeding difficulties
• Weight loss
• Brisk deep tendon reflexes
• Impaired upgaze: Niemann-Pick considered; not confirmed on skin biopsy
• 6 months later: his brother presented with the same features

Michelle Farrar, Tony Roscioli and Lisa Ewans

Investigations
• MRI revealed cerebellar atrophy
• No cardiac or eye features noted
• POLG, SURF1, common and rare mitochondrial mutations not identified
• Urine: massive increase in dopamine metabolites (degenerating leaky neurones)
• Neurophysiology: motor axonal neuropathy, active denervation
• Muscle biopsy revealed:
    Denervation of small fibre groups
    Large re-innervated fibres

Tony Roscioli and Lisa Ewans

Homozygous regions extracted with HomozygosityMapper

Sibling 1

Sibling 2

Combined regions: Peak on Chrom 22

Exome sequencing of brother and sister

GEMINI analysis

Only one homozygous inframe codon deletion within a homozygous region
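The filter described here (homozygous in both affected siblings, inside the shared homozygous region, protein-altering) can be sketched in plain Python. This is not GEMINI's query interface; the record layout, impact labels, gene names and coordinates are hypothetical and chosen only to show the logic.

```python
from typing import Dict, List

# Hypothetical record layout and labels, for illustration only.
variants: List[Dict] = [
    {"gene": "GENE_X", "chrom": "chr22", "pos": 38_500_000,
     "impact": "inframe_deletion",
     "genotypes": {"sib1": "hom_alt", "sib2": "hom_alt"}},
    {"gene": "GENE_Y", "chrom": "chr22", "pos": 39_200_000,
     "impact": "synonymous",
     "genotypes": {"sib1": "hom_alt", "sib2": "hom_alt"}},
    {"gene": "GENE_Z", "chrom": "chr22", "pos": 40_100_000,
     "impact": "missense",
     "genotypes": {"sib1": "hom_alt", "sib2": "het"}},
]

PROTEIN_ALTERING = {"inframe_deletion", "missense", "frameshift", "stop_gained"}


def candidates_in_region(records, chrom, start, end, samples):
    """Variants in [start, end) that are protein-altering and hom-alt in every sample."""
    return [
        r for r in records
        if r["chrom"] == chrom
        and start <= r["pos"] < end
        and r["impact"] in PROTEIN_ALTERING
        and all(r["genotypes"].get(s) == "hom_alt" for s in samples)
    ]


hits = candidates_in_region(variants, "chr22", 38_000_000, 41_000_000, ["sib1", "sib2"])
print([h["gene"] for h in hits])
```

Stacking the three conditions is what reduces a whole exome to a single candidate in a case like this one.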

Expert advice from paediatric neurologist Dr Michelle Farrar: “Make sure you look at PLA2G6!”

Mutations confirmed in IGV

Diagnostic odyssey ended for this family: prenatal / pre-implantation diagnosis is now possible and the family is confident to have a healthy child.

Extra Evidence: Same mutation described in the literature

Family | Phenotype | Inheritance | Consanguinity | Candidate gene? | Likelihood of real result
1a and b | ID, severe seizures | AR | Yes | Centaurin family gene | High/Novel
2 plus trio | ID, polymicrogyria | De novo AD? | No | GRIN2B | Medium-High
3 | Severe developmental delay, trigonocephaly | De novo AD? AR? | None known | (KANSL1) | Low-Medium
4 | Hermansky-Pudlak syndrome, oculocutaneous albinism | AR | Yes | ?NCOA3 ?PLS1 ??VOPP1 | Low-Medium/Novel
5 | Cone-rod dystrophy | X-linked recessive | No | KCND1 | Unknown
6 | Retinitis pigmentosa | AD | No | - | Unknown
7 | Retinitis pigmentosa | AD | No | Possible SNRNP200 | Unknown - patient limited exome after consent
8 plus trio | Dysmorphic, ID, CdL like | De novo AD? AR? | No | SMC1A | High
9 plus trio | Severe DD, microcephaly, chylothoraces | De novo AD? AR? X-linked? | No | paralog of ARHGEF6, known MR gene | Medium/Novel
10a and b | Moderate ID, cerebellar hypoplasia | X-linked | No | unclear | unclear
11a and b | Mild-mod MR, cognitive decline, glove-stocking weakness | X-linked | No | unclear | unclear
12a and b | Mild-mod ID | X-linked | No | unclear | unclear
13a and b | Severe ID, absent speech, hypotonia, microcephaly | X-linked? AR? | No | unclear | unclear
14a and b | Neonatal arthrogryposis | AR comp het | No | RYR1 | High
15a and b | Neuronal axonal dystrophy | AR | Yes | PLA2G6 | High

Diagnostic yield

Cohort of 53 families:
• 20/53 probable (reportable) diagnosis (38%, roughly 40%)
• 16/53 possible novel variants (30%)

Varying degrees of success…
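To put the yield figures in context, the proportions and a rough uncertainty range can be computed directly from the counts above. The Wilson score interval used here is a standard choice for small samples and is an addition for illustration; it was not part of the original analysis.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


families = 53        # cohort size from the slide
probable = 20        # probable (reportable) diagnoses
possible_novel = 16  # possible novel variants

print(f"probable: {100 * probable / families:.1f}%")
print(f"possible novel: {100 * possible_novel / families:.1f}%")

low, high = wilson_interval(probable, families)
print(f"95% interval for probable yield: {100 * low:.0f}% to {100 * high:.0f}%")
```

With only 53 families the interval is wide, so the headline yield is best read as an approximate figure.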

Is it economical?

Cost summary: Case 1

SMN1 molecular testing                               $690
Myotonic dystrophy DNA test on mother                $506
Neurological appt for assessment of mother           $110
Myasthenia gravis DNA testing                      $2,800
2 microarrays (both babies)                        $1,200
3 pathologists' opinions, both babies (incl. UK)     $720
Laminin A molecular testing                        $1,000
Postmortem                                         $3,000
Muscle biopsy                                        $144
Total                                             $10,170

Cost summary: Case 2

MRI brain                                            $441
Muscle biopsy                                        $144
Skin biopsy                                           $50
Surgical session 4 h                               $2,651
Anaesthetist session 4 h                           $1,884
Day stay                                           $1,946
ICU 24 h non-ventilated                              $326
EM Concord                                           $380
Dry ice to Melbourne                                  $80
Mito analysis                                      $1,400
SNP arrays                                         $1,200
Nerve conduction                                     $200
Total                                             $10,703
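As a quick check, the itemised costs in the two case summaries can be re-added. The figures below are transcribed from the slides; the shortened item labels are mine. The Case 1 items reproduce the stated total of $10,170, while the Case 2 items as transcribed add up to $10,702, one dollar below the stated $10,703, which may simply reflect rounding on the original slide.

```python
# Itemised costs in dollars, transcribed from the two cost summaries above.
case_1 = {
    "SMN1 molecular testing": 690,
    "Myotonic dystrophy DNA test (mother)": 506,
    "Neurological appointment (mother)": 110,
    "Myasthenia gravis DNA testing": 2800,
    "2 microarrays": 1200,
    "3 pathologists' opinions": 720,
    "Laminin A molecular testing": 1000,
    "Postmortem": 3000,
    "Muscle biopsy": 144,
}

case_2 = {
    "MRI brain": 441,
    "Muscle biopsy": 144,
    "Skin biopsy": 50,
    "Surgical session 4 h": 2651,
    "Anaesthetist session 4 h": 1884,
    "Day stay": 1946,
    "ICU 24 h non-ventilated": 326,
    "EM Concord": 380,
    "Dry ice to Melbourne": 80,
    "Mito analysis": 1400,
    "SNP arrays": 1200,
    "Nerve conduction": 200,
}

total_1 = sum(case_1.values())
total_2 = sum(case_2.values())
print(f"Case 1 itemised total: ${total_1:,}")
print(f"Case 2 itemised total: ${total_2:,}")
```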

Average per family for two exomes: ~$2,000

Assuming a diagnosis is reached only 40% of the time: still a saving of $6,000 per family

Assuming that half of the costs for medical care are still required: still an average saving of $3,000 per family

1,000 exomes per year would save $3 million per year

Whole genome sequencing will further increase diagnostic yields

Diagnostic yields for ID with WGS estimated >60%

Nature, June 2014

What would your genome tell you about yourself?

Genome sequencing can provide vast information on health risks and disease susceptibilities.

Of immediate benefit is carrier status of:
(i) recessive disease for family planning
(ii) cancer susceptibility genes (e.g. BRCA1/2)
(iii) genes with known drug interactions (pharmacogenomics)

Direct-to-consumer genetic testing (e.g. 23andMe) is growing rapidly. However, test information is relatively limited and offers minimal clinical information.

4. “Wellness” or “fitness” genomics


Clinical interpretation of whole genome or exome sequences is more valuable, but remains very time-consuming.

Many challenges remain before sequencing of newborns or well adults becomes clinically valuable.

What makes clinical genomics interesting to a researcher?

Traditional model of research translation:

Recruit patient cohort → Sequence candidate genes → Research → Genetic test → Patient diagnosis (8-10 years)

Little interaction between research and implementation.

The Opportunity and the Challenge: Bilateral Translational Research

[Diagram: Patient genome sequencing, Diagnosis and Research, linked through a Genotype-Phenotype Database]

Implementation of low-cost clinical whole genome sequencing will test whether this model can become reality.

Challenge 1. Clinical laboratory ≠ Research laboratory

Return of results to patients requires that sequencing and analyses are performed to a clinical standard.

Most countries require a form of accreditation (e.g. CLIA/CAP in USA, NATA in Australia and New Zealand).

Accreditation requires demonstration of clinical utility, precision and accuracy of the test.

Many variables including operator, instrument, reagent batch need to be routinely measured. Essentially all sources of variation and bias need to be accounted for and monitored. Other factors, such as persistent storage of data, independent validation, reporting and qualified expert interpretation also need to be considered.

The combination of these overheads adds considerable cost to the delivery of genomic data to the clinic.
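One way to picture the accuracy and precision requirements is as routine comparisons of variant calls. The sketch below is an illustration under assumptions, not an accreditation procedure: accuracy is taken as sensitivity and positive predictive value against a trusted reference call set, and precision (repeatability) as the concordance between two replicate runs of the same sample. All variants shown are invented.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def accuracy(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Sensitivity and positive predictive value against a reference call set."""
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "ppv": tp / (tp + fp) if tp + fp else 0.0,
    }


def replicate_concordance(run_a: Set[Variant], run_b: Set[Variant]) -> float:
    """Fraction of variants called in either replicate that were called in both."""
    union = run_a | run_b
    return len(run_a & run_b) / len(union) if union else 1.0


# Invented variants for illustration only.
v1, v2, v3 = ("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")
v4, v5 = ("chr4", 400, "T", "C"), ("chr5", 500, "A", "T")

truth = {v1, v2, v3, v4}
run_a = {v1, v2, v3, v5}  # misses v4, adds a false call v5
run_b = {v1, v2, v3}

print(accuracy(run_a, truth))
print(replicate_concordance(run_a, run_b))
```

Tracking such figures per operator, instrument and reagent batch is one way the sources of variation listed above could be monitored over time.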

For clinical delivery, we need to massively streamline the process from end to end…

The journey from consult to report is long and complex…

How do we manage this journey in a clinical environment in the fast-moving field of genomics?

Software Development for the Clinic

Historically upgrades and change management are stressful and scary in a clinical setting

Our process is not complete but strives for continuous improvement while retaining accuracy, documentation and accountability

We build everything around the idea of constant change

Continuous Integration

Tracking and attribution of all commits and failures, with JIRA integration.

Genotype-Phenotype Database

Challenge 3. Clinically robust genotype-phenotype database

Development of a federated database recording genotype-phenotype relationships, allowing “Patients Like Mine” searches.

Requires international collaboration between sequencing centers and careful recording of clinical phenotypes.

The Human Gene Mutation Database (HGMD) is perhaps the gold-standard database for association of genotype with literature-annotated phenotypes (~7,000 entries).

Controlled vocabulary for phenotyping is essential. Absence of characteristics can be just as important as presence of characteristics for identification of disease-causing variants.

However, much of the literature is pre-genomic era: many annotations are incorrect. A mutation in HGMD cannot be assumed clinically relevant - extensive professional scrutiny is still required to make a clinical diagnosis.
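A “Patients Like Mine” search that gives weight to absent as well as present features might be sketched as follows. This is a toy scoring scheme under stated assumptions, not the design of any existing database: the `Patient` class and `similarity` function are hypothetical, and the phenotype labels stand in for terms from a controlled vocabulary.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Patient:
    patient_id: str
    present: FrozenSet[str]  # features recorded as present
    absent: FrozenSet[str]   # features explicitly recorded as absent


def similarity(query: Patient, other: Patient) -> float:
    """Score in [-1, 1]: agreements minus conflicts over all comparable features."""
    agree = len(query.present & other.present) + len(query.absent & other.absent)
    conflict = len(query.present & other.absent) + len(query.absent & other.present)
    compared = agree + conflict
    return (agree - conflict) / compared if compared else 0.0


def patients_like(query: Patient, database: List[Patient]) -> List[str]:
    """Patient IDs ordered from most to least similar to the query."""
    ranked = sorted(database, key=lambda p: similarity(query, p), reverse=True)
    return [p.patient_id for p in ranked]


query = Patient("Q", frozenset({"seizures", "intellectual_disability"}),
                frozenset({"microcephaly"}))
database = [
    Patient("P2", frozenset({"seizures", "microcephaly"}), frozenset()),
    Patient("P1", frozenset({"seizures", "intellectual_disability"}),
            frozenset({"microcephaly"})),
    Patient("P3", frozenset({"retinitis_pigmentosa"}), frozenset()),
]

print(patients_like(query, database))
```

Here P2 shares seizures with the query but has microcephaly, which the query patient is recorded as lacking, so the explicit absence lowers the match.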

Challenge 4. Delineation between clinic and research

Many tools and databases are not suitable for clinical use…

[Screenshots: PROVEAN, HomozygosityMapper]

Clinical delivery will require a clearer delineation between clinic and research

The arrival of the “$1,000” genome

Illumina HiSeq X Ten

Fleet of 10 instruments. First 6 installed, remaining 4 to come by October.

16 whole human genomes (>30X, ~2 Tbases) every 3 days.

Full capacity is 350 genomes per week or 18,000 per year.

Real-world deliverable cost of 1,600 AUD per genome (interpretation costs will vary!)
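The capacity figures can be sanity-checked with a few lines of arithmetic. This sketch assumes that the 16 genomes per 3 days applies to each instrument and that all ten instruments run back to back; both are assumptions on my part, so the result is a theoretical ceiling rather than a planning number.

```python
instruments = 10          # full fleet
genomes_per_run = 16      # per instrument per run (assumed per-instrument figure)
run_days = 3              # run duration in days

runs_per_week = 7 / run_days
theoretical_per_week = instruments * genomes_per_run * runs_per_week

stated_per_week = 350     # full capacity quoted on the slide
stated_per_year = stated_per_week * 52

print(f"theoretical ceiling: {theoretical_per_week:.0f} genomes per week")
print(f"stated capacity over 52 weeks: {stated_per_year:,} genomes")
```

The quoted 350 per week sits a little under the computed ceiling, and 52 weeks at that rate is consistent with the rounded figure of 18,000 per year.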

Possibility for population-scale sequencing and implementation into routine healthcare.

Currently provided as an international service for research purposes - clinical accreditation targeted for 2015.

Slides from Jay Flatley @ Goldman Sachs Healthcare Conference, 18/1/2014

What’s different about HiSeq X?

HiSeq X Ten (Six) @ KCCG

Machine      Yield_G   PCT_Q30
ST-E00141      928.4     80.95
ST-E00141      991.4     79.2
ST-E00118      827.6     81.65
ST-E00118     1035.4     87.8
ST-E00118      986.6     84.6
ST-E00118     1014.6     87.65
ST-E00110      936       86.25
ST-E00110      794.2     82.85
ST-E00106      977.2     88.2
ST-E00106      954       87.8
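Per-instrument averages make run tables like this easier to compare. The sketch below uses the values transcribed from the table; reading `Yield_G` as yield in gigabases and `PCT_Q30` as the percentage of bases at or above Q30 is my interpretation of the column names.

```python
from collections import defaultdict
from statistics import mean

# (machine, Yield_G, PCT_Q30), transcribed from the table above
runs = [
    ("ST-E00141", 928.4, 80.95),
    ("ST-E00141", 991.4, 79.2),
    ("ST-E00118", 827.6, 81.65),
    ("ST-E00118", 1035.4, 87.8),
    ("ST-E00118", 986.6, 84.6),
    ("ST-E00118", 1014.6, 87.65),
    ("ST-E00110", 936.0, 86.25),
    ("ST-E00110", 794.2, 82.85),
    ("ST-E00106", 977.2, 88.2),
    ("ST-E00106", 954.0, 87.8),
]

by_machine = defaultdict(list)
for machine, yield_g, pct_q30 in runs:
    by_machine[machine].append((yield_g, pct_q30))

# Mean yield and mean Q30 percentage per instrument
summary = {
    machine: (mean(y for y, _ in values), mean(q for _, q in values))
    for machine, values in by_machine.items()
}

for machine, (mean_yield, mean_q30) in sorted(summary.items()):
    print(f"{machine}  mean yield {mean_yield:.1f}  mean Q30 {mean_q30:.2f}%")
```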

PERFORMANCE AT KCCG

[Figure: runs post June 2014 firmware upgrade; 30x and 40x levels marked; annotations: 1 failed sample, 2 bad quality samples/preps; dated 30/9/2014]

So what are we going to sequence?

Summary

Many of the practical barriers for implementation of genomic medicine have been solved. Major limitations today are regulatory and societal (e.g. insurance).

WES/WGS for diagnosis of inherited disorders is now practical and valuable with diagnostic yields approaching 50%.

Genomic medicine blurs the lines between clinical and research. Many challenges remain in translating genomics into the clinic, particularly in the repurposing of research-grade software and tools for clinical applications.

Clinical genomics represents a unique opportunity for bilateral translational medicine, mutually benefiting both the clinical and research realms.

Clinical-grade bioinformatics is tricky - but it presents many valuable lessons for researchers. High up-front investment, but many incidental errors can be avoided.

High quality data relating genotype to phenotype is scarce. This vastly limits diagnostic accuracy without a phenotype (i.e. in well individuals) - lots of false positives.

Acknowledgements

Sydney Children’s Hospital and SEALS Pathology
Tony Roscioli, Lisa Ewans, Michael Buckley, Scott Mead, Michelle Farrar, Glenda Mullan, George Elakis, Corrina Walsh

Kinghorn Centre for Clinical Genomics
Warren Kaplan, Mark Cowley, Mark McCabe, Kerith-Rae Dias, Paula Morris, Jiang Tao, Aga Borcz, Dahlia Saroufim, Claire Horvat, Liviu Constantinescu, Peter Budd, Kevin Ying, Derrick Lin, Shanny Dyer, Russell Howard, Bronwyn Terrill, Amber Johns