The State of Whole Genome, Exome and Transcriptome Sequencing (WGS/WES/RNAseq)
September 2011
Information Sources
• dbGAP
• EGA
• GEO
• ArrayExpress
• NIH-RePORTER (all grants 2007-2011)
• BROAD data portal
• Websites of other major sequencing centers
Data Tracking File (Microsoft Office Excel worksheet)
Cancer
Available in dbGAP
• TCGA - Tumor vs. Matched Normal
⁻ http://cancergenome.nih.gov
⁻ 4694 in consent group
⁻ WGS/WES – no statement as to how many sequenced
• National Cancer Institute Cancer Genome Characterization Initiative (CGCI)
⁻ http://cgap.nci.nih.gov/cgci.html
⁻ Medulloblastoma
– 22 pediatric + 1 matched normal
– Targeted sequencing of protein coding genes and miRNA
⁻ Non-Hodgkin Lymphoma
– WGS or WES of 1 Follicular Lymphoma and 13 diffuse large B-cell lymphoma + matched constitutional DNA
– RNAseq of 92 DLBCL, 12 FL and 8 B-cell NHL cases with other histologies and 10 DLBCL-derived cell lines
⁻ Not yet available
– HIV+ Tumor Molecular Characterization Project – Tumor vs. matched normal – 100 cases – WGS
» Diffuse large B-cell lymphoma
» Lung
» HPV-related Cancer
• Towards a Genomic Understanding of Myeloma
⁻ Multiple Myeloma – DNA from tumor cells and normal peripheral blood
⁻ 38 consented (23 WGS, 16 WES, 1 both)
• Discovery of Non-ETS Gene Fusions in Human Prostate Cancer using Next Generation RNA Sequencing
⁻ Prostate Cancer
⁻ RNAseq of 25 samples
Cancer
Available in EGA
• Human Colorectal Cancer Exome Sequencing
⁻ Study ID: EGAS00001000077
⁻ Data Provider: Wellcome Trust
⁻ WES of germline DNA from 70 individuals with colorectal cancer. Targeted sequencing of tumors
• Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma
⁻ Study ID: EGAS00001000006
⁻ Data Provider: Wellcome Trust
⁻ WES of 25 Renal Carcinoma samples
• Complex Landscapes of Somatic Rearrangements in Human Breast Cancer Genomes
⁻ Study ID: EGAS00000000062
⁻ Data Provider: Wellcome Trust
⁻ WGS of 24 breast cancer genomes
• Exome sequencing of hyperplastic polyposis patients
⁻ Study ID: EGAS00001000040
⁻ Data Provider: Wellcome Trust
⁻ WES of germline and tumors from 10? (20 samples available) patients with hyperplastic polyposis
• Mutations in Endometriosis-Associated Ovarian Carcinomas
⁻ Study ID: EGAS00000000075
⁻ Data Provider: Ovarian Cancer Research
⁻ RNAseq of 18 ovarian clear-cell carcinomas and one TOV21G ovarian clear-cell carcinoma cell line
Cancer
Available in EGA
• Massive Genomic Rearrangement Acquired in a Single Catastrophic Event During Cancer Development
⁻ Study ID: EGAS00000000029
⁻ Data Provider: Wellcome Trust
⁻ WGS of 16 Chronic Lymphocytic Leukemia samples
• Familial Melanoma Sequencing
⁻ Study ID: EGAS00001000017
⁻ Data Provider: Wellcome Trust
⁻ WES of 15 individuals from eight families who have familial melanoma
• The patterns and dynamics of genomic instability in metastatic pancreatic cancer
⁻ Study ID: EGAS00000000064
⁻ Data Provider: Wellcome Trust
⁻ WGS of 13 patients with pancreatic cancer
• CLL Genome
⁻ Study ID: EGAS00000000092
⁻ Data Provider: Hospital Clinic Barcelona / The International Cancer Genome Consortium
⁻ WGS (?) of two sets of 10 and 11 Chronic Lymphocytic Leukemia samples
Cancer
Available in GEO
• GSE22260
⁻ Title: Comparative transcriptomic analysis of prostate cancer and matched normal tissue using RNA-seq
⁻ RNAseq of 20 prostate tumors, 10 matched normals
⁻ Kannan K, Wang L, Wang J, Ittmann MM et al. Recurrent chimeric RNAs enriched in human prostate cancer identified by deep sequencing. Proc Natl Acad Sci U S A 2011 May 31;108(22):9172-7. PMID: 21571633
• GSE25183
⁻ Title: Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression
⁻ RNAseq of 21 prostate cell lines. Data also on 102 prostate tissues that will be made available in dbGAP
⁻ Prensner JR, Iyer MK, Balbin OA, Dhanasekaran SM et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotechnol 2011 Jul 31;29(8):742-9. PMID: 21804560
• GSE27003
⁻ Title: Deep Sequence Analysis of the Relationship between Gene Expression, CpG Island Methylation, and Gene Copy Number in Breast Cancer Cells
⁻ RNAseq/methylation/copy number in 8 breast cancer cell lines (ER+/ER-)
⁻ Sun Z, Asmann YW, Kalari KR, Bot B et al. Integrated analysis of gene expression, CpG island methylation, and gene copy number in breast cancer cells by deep sequencing. PLoS One 2011 Feb 25;6(2):e17490. PMID: 21364760
• GSE20156
⁻ Title: RNA-Seq of melanoma short-term cultures and cell lines
⁻ RNAseq of 8 melanoma short-term cultures and 3 cell lines
⁻ Berger MF, Levin JZ, Vijayendran K, Sivachenko A et al. Integrative analysis of the melanoma transcriptome. Genome Res 2010 Apr;20(4):413-27. PMID: 20179022
Cancer
Sequencing Grants (no data yet)
• The BCM Tumor Genome Characterization Center
⁻ Project #: 1U24CA143843-01
⁻ NCI
⁻ PI: Andrew Wheeler – Baylor Genome Center
⁻ WGS + RNAseq of 20-25 tumor types from 500 patients (+ matched controls)
⁻ Project end: July 2014
• Genome-Wide Discovery of Molecular Alterations in Head and Neck Cancer
⁻ Project #: 5RC2DE020957-02
⁻ NIDCR – 2009 ARRA GO grant
⁻ PI: David Sidransky – Johns Hopkins
⁻ WES on 24 oral cancers. Tumor-specific mutations will be analyzed in 48 new tumors + methylation, RNAseq
⁻ Project end: August 2012
• Whole Genome Sequencing of Myelodysplastic Syndromes
⁻ Project #: 5RC2HL102927-02
⁻ NHLBI – 2009 ARRA
⁻ PI: Timothy Graubert – Wash U
⁻ WGS of 10 paired tumor/normal samples from patients with intermediate risk de novo MDS
⁻ Project end: August 2012
• Comprehensive Prostate Cancer Characterization by Genomic and Transcriptomic Prof
⁻ Project #: 1R01CA152057-01A1
⁻ NCI
⁻ PI: Mark Rubin – Cornell
⁻ RNA-seq and CNV on 100 frozen PCA samples and 25 paired benign prostate tissues
⁻ Project end: July 2014
Nervous system disease - Autism
Sequencing Grants (no data yet)
• Elucidating the Genetic Architecture of Autism by Deep Genomic Sequencing
⁻ Project #: 5R01MH089208-02
⁻ NIMH – 2009 ARRA
⁻ Collaboration between these sequencing centers: Broad, Vanderbilt, MSSM, UPenn, Baylor
⁻ Initial step: deep sequencing of 1000 genes, followed by WGS. Does not say how many subjects
⁻ Project end: September 2012
• NIMH – 2009 ARRA (can’t find grant in NIH RePORTER)
⁻ Children's Hospital Boston, the Broad Institute, and Harvard Medical School
⁻ WES of 85 Middle Eastern patients with a recessive form of autism
• Human Autism Genetics and Activity Dependent Gene Activation
⁻ Project #: 5RC2MH089952-02
⁻ NIMH – 2009 ARRA
⁻ PI: Christopher Walsh – Children's Hospital Boston
⁻ WES on 85 and WGS on 35 consanguineous individuals
⁻ Project end: August 2012
• Genomic Identification of Autism Loci
⁻ Project #: 5R01HD065285-02
⁻ NICHD – 2009 ARRA
⁻ PI: Evan Eichler – UW
⁻ WGS on 20 individuals
⁻ Targeted sequencing of 1320 simplex and multiplex families
⁻ Project end: August 2012
Nervous system disease - Schizophrenia
Sequencing Grants (no data yet)
• Whole-Genome Sequencing for Rare Highly Penetrant Gene Variants in Schizophrenia
⁻ Project #: 5RC2MH089915-02
⁻ NIMH – 2009 ARRA
⁻ PI: David Goldstein – Duke
⁻ WGS of 100 individuals with a high genetic burden – Potentially associated genetic markers will be genotyped in affected and unaffected family members
⁻ Project end: August 2011
• Whole Genome Sequencing of Bipolar Disorders and Schizophrenia
⁻ Project #:
⁻ NIMH – 2009 ARRA
⁻ PI: Pamela Sklar – BROAD/MSSM
⁻ WGS of SCZ (n=150), BP (n=150) and controls (n=100)
⁻ Follow-up sequencing results in 3000 individuals and genotyping specific variants in over 18,000 individuals
⁻ Project end: August 2011
Available in GEO
• GSE12297
⁻ Title: mRNA Sequencing Reveals Altered Synaptic Vesicular Transport in Post-Mortem Cerebellum
⁻ RNAseq of post-mortem cerebellar cortices of 14 patients and six matched controls
⁻ Mudge J, Miller NA, Khrebtukova I, Lindquist IE et al. Genomic convergence analysis of schizophrenia: mRNA sequencing reveals altered synaptic vesicular transport in post-mortem cerebellum. PLoS One 2008;3(11):e3625. PMID: 18985160
Nervous system disease - Other
Sequencing Grants (no data yet)
• Gene Discovery in Recessive Structural Brain Disorders Through Whole Exome Sequencing
⁻ Project #: 5RC2NS070477-02
⁻ NINDS – 2009 ARRA
⁻ PI: Murat Gunel – Yale
⁻ WES of 250 independent families from consanguineous unions
⁻ Project end: August 2011
• Full Human Genome Sequencing in ALS
⁻ Project #: 5RC2NS070342-02
⁻ NINDS – 2009 ARRA
⁻ PI: Robert Brown – UMass
⁻ WGS of 40 individuals (20 sporadic, 20 familial) – genotype ranked variants in cohorts of 1,000 cases and 1,000 controls
⁻ Project end: August 2012
• Whole-Genome Sequencing in Multiplex Epilepsy Families
⁻ Project #: 1RC2NS070344-01
⁻ NINDS – 2009 ARRA GO grant
⁻ PI: David Goldstein – Duke
⁻ WGS of individuals with non-acquired epilepsy from families with an average of 3.5 affected (does not say how many)
⁻ Project end: August 2011
• A Haplotype Map for Multiple Sclerosis
⁻ Project #: 2R01NS049477-06A1
⁻ NINDS
⁻ PI: Stephen Hauser – UCSF
⁻ WES of 120 pools of 50 DNA samples (MS subjects and controls) – will be done in the later parts of the project
⁻ Project end: May 2015
Nervous system disease - Other
Available in EGA
• Paroxysmal neurological disorders
⁻ Study ID: EGAS00001000048
⁻ Data Provider: Wellcome Trust
⁻ WES of 96 cases + controls of paroxysmal neurological disorders focusing on migraine and epilepsy
Diabetes
Sequencing Grants (no data yet)
• Low-Pass Sequencing and High-Density SNP Genotyping for Type 2 Diabetes
⁻ Project #: 5RC2DK088389-02
⁻ NIDDK – 2009 ARRA
⁻ PI: Michael Boehnke – U Michigan
⁻ WGS (4x) of 3,000 T2D case-control samples from the DGI, FUSION, and WTCCC GWA sets
⁻ Project end: August 2011
• Multiethnic Study of Type 2 Diabetes Genes
⁻ Project #: 5U01DK085526-04
⁻ NIDDK
⁻ PI: David Altshuler – BROAD
⁻ WGS of multiethnic samples from the Jackson Heart Study, Framingham Heart Study, Multi-Ethnic Cohort Study, and Diabetes Prevention Program (does not say how many)
⁻ Project end: July 2014
NHLBI GO Exome Sequencing Project (ESP)
https://esp.gs.washington.edu/drupal/
2009 ARRA
Groups participating
• Seattle GO - University of Washington, Seattle, WA
• BroadGO - Broad Institute of MIT and Harvard, Cambridge, MA
• WHISP - Ohio State University Medical Center, Columbus, OH
• Lung GO - University of Washington, Seattle, WA
• WashU GO - Washington University, St. Louis, MO
• Heart GO - University of Virginia Health System, Charlottesville, VA
• ChargeS GO - University of Texas Health Sciences Center at Houston
Populations
• Women's Health Initiative (WHI)
• Framingham Heart Study (FHS)
• Jackson Heart Study (JHS)
• Multi-Ethnic Study of Atherosclerosis (MESA)
• Atherosclerosis Risk in Communities (ARIC)
• Coronary Artery Risk Development in Young Adults (CARDIA)
• Cardiovascular Health Study (CHS)
• Genomic Research on Asthma in the African Diaspora (GRAAD)
• Lung Health Study (LHS)
• Pulmonary Arterial Hypertension (PAH) population
• Acute Lung Injury (ALI) cohort
• Cystic Fibrosis (CF) cohort
NHLBI GO Exome Sequencing Project (ESP)
Available in dbGAP
• NHLBI GO-ESP: Early-Onset Myocardial Infarction (Broad EOMI)
⁻ Extremely early MI – Broad + UW
⁻ Only for CVD research
⁻ 219 in consent group
• NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Pulmonary Arterial Hypertension)
⁻ Hypertension – UW + Johns Hopkins
⁻ No use restriction
⁻ 96 in consent group
• NHLBI GO-ESP: Women's Health Initiative Exome Sequencing Project (WHI) – WHISP
⁻ Heart, Lung, Blood disorders
⁻ IRB approval required – General use
⁻ 961 in consent group
• NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Cystic Fibrosis): Genetic modifiers of Pseudomonas aeruginosa (Pa) lung infection acquisition in cystic fibrosis
⁻ CF – UW + Johns Hopkins
⁻ Only CF research
⁻ 91 in consent group
• NHLBI GO-ESP: Lung Cohorts Exome Sequencing Project (Lung Health Study of Chronic Obstructive Pulmonary Disease)
⁻ Mild lung function impairment – Johns Hopkins
⁻ General use
⁻ 337 in consent group
Building on GWAS for NHLBI-diseases: the U.S. CHARGE consortium
http://web.chargeconsortium.com
2009 ARRA
• WGS of 230 African-American ARIC participants selected from the top and bottom 10% of the HDL-cholesterol distribution
• PI: Eric Boerwinkle – University of Texas
• Project end: July 2012
Other large medical sequencing projects
The Wellcome Trust 500 genomes
• http://www.well.ox.ac.uk/aug-11-genomes-of-500-people-to-be-sequenced-in-full-detail
• WGS of 500 people with a range of diseases
• Collaboration with Illumina
The Human Medical Sequencing Program (MSP)
• http://www.genome.gov/15014882#al-1
• Initiatives
⁻ Uncloned, Mapped Autosomal Mendelian Diseases
⁻ Uncloned X-Linked Diseases
⁻ Allelic Spectrum in Common Disease
⁻ Center Initiated Projects
• Approved projects
⁻ http://www.genome.gov/20019648
• It is unclear if any of these are generating WGS or WES data
Other large medical sequencing projects
Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
• http://target.cancer.gov/
• Discovery of valid therapeutic targets in childhood cancers
• Data and collaborators
Additional Studies in dbGAP
SardiNIA Medical Sequencing Discovery Project
• Blood lipid levels/personality in a Sardinian population – WGS/WES
• IRB approval – No use restrictions
• 121 in consent group
Additional Grants (not in dbGAP)
Genomic Approaches for Elucidating Novel Targets for Pain and Symptom Management
• Project #: 1ZIANR000015-05
• NINR – Intramural
• PI: Raymond Dionne
• WGS of individuals with unique phenotypes such as capsaicin non-sensitive patients (does not say how many)
Identification of Disease-Causing Mutations in SCID Using Exome-Wide Sequencing
• Project #: 5RC1HL099617-02
• NHLBI – 2009 ARRA
• PI: Joseph Roberts – Duke
• WGS of causal mutations in 29 (?) SCID patients of unknown etiology
• Project end: August 2012
Additional Studies in EGA
Exome sequencing in patients with cardiac arrhythmias
• Study ID: EGAS00001000063
• Data Provider: Wellcome Trust
• WES of 20 patients with cardiac arrhythmias