DNA copy number variation and cancer risk


John F Pearson

Canterbury Statistics Open Day, University of Canterbury, 2/10/2012

2

Breast Cancer

Foulkes WD. N Engl J Med 2008; 359:2143-2153

3

Missing heritability

TA Manolio et al. Nature 461, 747-753 (2009) doi:10.1038/nature08494

4

Evan E. Eichler.

5

Copy number variation

[Figure: Allele 1 and Allele 2, showing copy number loss and copy number gain]

Whole gene

Partial gene

Contiguous genes

Regulatory effects

6

Copy number variants (CNVs)

16,000 copy number variant loci cover >50% of the human genome

CNVs are associated with cancer risk

• Rare CNVs detected in ~50% of familial cancer genes, e.g. BRCA1, BRCA2

• Genome-wide association studies of cancer
  • prostate cancer, hepatocarcinoma, nasopharyngeal carcinoma, and neuroblastoma

• Increased CNV load
  • Li-Fraumeni syndrome (cancer-related genes?)
  • breast cancer (TP53 pathway, ESR1 pathway)

7

SNP arrays

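$$R = X + Y, \qquad \theta = \frac{2}{\pi}\arctan\left(\frac{Y}{X}\right)$$

$$\mathrm{LRR} = \log_2\left(\frac{R_{\text{observed}}}{R_{\text{expected}}}\right)$$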

The B Allele Frequency (BAF) is a somewhat confusing term that actually refers to a normalized measure of the relative signal intensity ratio of the B and A alleles.

Wang et al. Genome Res. 2007 November; 17(11): 1665–1674.
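The quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming θ = (2/π) arctan(Y/X) with X and Y the normalised A- and B-allele intensities. The function names, the example cluster centres and the piecewise-linear interpolation used for BAF are assumptions for illustration (the interpolation is the commonly described approach, not something stated on the slide).

```python
import math


def polar_transform(x, y):
    """Return (R, theta) for normalised A-allele (x) and B-allele (y) intensities."""
    r = x + y
    # atan2(y, x) equals arctan(y / x) for non-negative intensities and is safe when x == 0
    theta = (2.0 / math.pi) * math.atan2(y, x)
    return r, theta


def log_r_ratio(r_observed, r_expected):
    """LRR = log2(R_observed / R_expected)."""
    return math.log2(r_observed / r_expected)


def b_allele_freq(theta, theta_aa, theta_ab, theta_bb):
    """Interpolate theta between genotype cluster centres (hypothetical inputs).

    Returns 0 at or below the AA cluster, 0.5 at the AB cluster and 1 at or
    above the BB cluster, with linear interpolation in between.
    """
    if theta < theta_aa:
        return 0.0
    if theta < theta_ab:
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    if theta < theta_bb:
        return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)
    return 1.0
```

For example, equal A and B intensities give θ = 0.5, and an observed R that is half the expected R gives LRR = −1, which is the direction expected for a single-copy loss.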

8

Genomic location

9

Copy number

[Figure: genotype clusters AA, AB and BB; panels: Normal, Copy neutral LOH, Copy number loss]

10

Copy number gain

[Figure: genotype clusters AAA, AAB, ABB, BBB]
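The genotype labels in these panels (AA, AB, BB for two copies; AAA through BBB for three) correspond to idealised BAF values of k/n, where n is the total copy number and k the number of B alleles. A minimal sketch, assuming noise-free data; the function name is hypothetical:

```python
from fractions import Fraction


def expected_baf_clusters(copy_number):
    """Idealised BAF for each possible genotype at a given total copy number."""
    if copy_number == 0:
        return {}  # no alleles present, so BAF is undefined
    return {
        "A" * (copy_number - k) + "B" * k: Fraction(k, copy_number)
        for k in range(copy_number + 1)
    }
```

Under this sketch a single-copy loss leaves only clusters at 0 and 1, two copies give 0, 1/2 and 1, and a three-copy gain gives 0, 1/3, 2/3 and 1. Copy neutral LOH is not distinguished by copy number alone: it has two copies but lacks the heterozygous cluster at 1/2.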

11

CNV calling algorithms

Illumina bead arrays:

• CNVision (workflow software)
• Gnosis
• PennCNV
• QuantiSNP
• CNV Partition

12

Hidden Markov Model

Estimate copy number at each SNP from:

• Log R ratio
• B allele frequency
• transition probability from the state at the previous SNP

PennCNV, QuantiSNP

13

PennCNV

14

PennCNV

r_i: LRR at SNP i (1 ≤ i ≤ M)

b_i: BAF at SNP i

z_i: copy number state

The likelihood of the observed data is:
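For a hidden Markov model of this kind, and assuming for illustration (this is not taken from the slide) that LRR and BAF are conditionally independent given the copy number state, the likelihood can be written by summing over all state paths:

$$
L = P(\mathbf{r}, \mathbf{b}) = \sum_{z_1, \dots, z_M} P(z_1) \prod_{i=2}^{M} P(z_i \mid z_{i-1}) \prod_{i=1}^{M} P(r_i \mid z_i)\, P(b_i \mid z_i)
$$

Here $P(r_i \mid z_i)$ and $P(b_i \mid z_i)$ are the LRR and BAF emission probabilities, and $P(z_i \mid z_{i-1})$ is the transition probability between adjacent SNPs.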

15

PennCNV

• LRR emission probability: the model includes a term for chemical fluctuations and misannotation/assembly errors

• BAF emission probability: a complicated mixture model

16

PennCNV

Transition probabilities between two adjacent SNPs i−1 and i, with copy number states z_{i−1} and z_i, at distance d_i.

D = 100 Mb for state 4, 100 kb for other states. The p are unknowns, estimated by the Baum-Welch algorithm.
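One way to picture the role of D and p is a transition matrix whose off-diagonal entries grow with the distance between SNPs. The sketch below assumes the probability of changing state scales with 1 − exp(−d/D), and that D is indexed by the state being left. Both assumptions, along with the function name, the six-state layout and the numerical values of p, are illustrative rather than taken from the slide.

```python
import math


def transition_matrix(p, d, D):
    """Distance-dependent transition matrix between two adjacent SNPs.

    p[j][l]: baseline probability of moving from state j to state l (j != l);
             diagonal entries of p are ignored.
    d:       distance between the two SNPs, in bp.
    D[j]:    characteristic length for state j, in bp.
    """
    n = len(p)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        scale = 1.0 - math.exp(-d / D[j])  # near 0 for close SNPs, near 1 for distant ones
        for l in range(n):
            if l != j:
                A[j][l] = p[j][l] * scale
        A[j][j] = 1.0 - sum(A[j])  # remaining mass stays in state j
    return A


# Toy parameters: six states, with state 4 (index 3) given the longer D.
N_STATES = 6
P_TOY = [[0.0 if j == l else 0.01 for l in range(N_STATES)] for j in range(N_STATES)]
D_TOY = [100e3] * N_STATES
D_TOY[3] = 100e6
```

With these toy values, `transition_matrix(P_TOY, 5000, D_TOY)` gives a higher probability of staying in state 4 than in any other state, so runs of state 4 are favoured to extend over much longer distances.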

17

PennCNV

• Baum-Welch used to train the model

• Viterbi algorithm used to infer the most likely path

• CNV called whenever a stretch of states is different from normal (usually state 3 or 4)
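The decoding and calling steps can be illustrated with a toy example. This sketch is deliberately simplified and is not PennCNV itself: it uses three states instead of six, Gaussian emissions on LRR only (no BAF), a fixed transition probability with no distance dependence, and hand-picked parameters in place of Baum-Welch training. All names and values are assumptions for illustration.

```python
import math

STATES = ["loss", "normal", "gain"]                     # toy states, not PennCNV's six
LRR_MEAN = {"loss": -0.6, "normal": 0.0, "gain": 0.4}   # illustrative emission means
LRR_SD = 0.2                                            # illustrative emission SD
STAY = 0.99                                             # illustrative self-transition probability


def log_gauss(x, mu, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mu) ** 2 / (2 * sd * sd)


def log_trans(a, b):
    """Log transition probability from state a to state b."""
    return math.log(STAY if a == b else (1 - STAY) / (len(STATES) - 1))


def viterbi(lrr):
    """Most likely state path for a sequence of LRR values (uniform start)."""
    score = {s: -math.log(len(STATES)) + log_gauss(lrr[0], LRR_MEAN[s], LRR_SD)
             for s in STATES}
    back = []
    for x in lrr[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda q: prev[q] + log_trans(q, s))
            ptr[s] = best
            score[s] = prev[best] + log_trans(best, s) + log_gauss(x, LRR_MEAN[s], LRR_SD)
        back.append(ptr)
    # Trace back from the best final state
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


def call_cnvs(path, normal="normal"):
    """Return (start, end, state) for each run of consecutive non-normal states."""
    calls, start = [], None
    for i, s in enumerate(path + [normal]):  # sentinel closes a run at the end
        if start is not None and s != path[start]:
            calls.append((start, i - 1, path[start]))
            start = None
        if start is None and s != normal:
            start = i
    return calls
```

With these toy parameters, a run of four SNPs with LRR near −0.6 is called as a loss, while a single outlying SNP is absorbed into the normal state because two state changes cost more than one poorly fitting observation. This is the sense in which the transition term smooths the calls.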

18

Copy number gain

[Figure: genotype clusters AAA, AAB, ABB, BBB]

19

Noisy data

20

Breast cancer

A characteristic of breast tumour cells is genomic instability

BRCA1, BRCA2

21

BRCA1: known large deletions

Detected

Sample ID     BRCA1 mutation
EMB0001242    del exons 2-24
EMB0001532    del exons 3-19
EMB0001222    del exons 1-23
EMB0001425    del exons 1-21
EMB0001439    del exons 1-23
EMB0001458    del exons 1-23
EMB0001477    del exons 1-21
GEM0002463    del exons 16-23
PAD0005718    del exons 9-19
EMB0001770    del exons 1-17
EMB0001057    del exons 1-17
KCO0003228    del exons 1-17
EMB0001082    del exons 8-13
GEM0002430    del exons 8-13

Not detected

Sample ID     BRCA1 mutation
EMB0001530    del exons 3-19
EMB0001689    del exons 1-17

CNV prediction summary:

• cnvPartition - 25% (4/16)
• GNOSIS - 19% (3/16)
• PennCNV - 88% (14/16)
• QuantiSNP - 81% (13/16)

22

CNV calling by 4 algorithms

QC(1) – GWAS criteria

Endometrial cancer: 1343 cases (ANECS, SEARCH)

655 female controls (Hunter Community Study)

Case vs. control analyses

1279 cases 619 controls

1210 cases 612 controls

Want to find:

1. CNVs overlapping known susceptibility genes

2. novel CNVs in the mismatch repair pathway

3. common or rare CNV associations

23

CNV frequency: all

Mean CNV per sample

                Case     Control   Difference   P
n               1,210    612
Total CNVs      26.7     26.5       0.2         NS
Deletions       17.7     18.1      -0.4         NS
Duplications     8.9      8.4       0.5         NS
Exons            7.1      6.9       0.2         NS

24

CNV frequency: rare (< 1%)

Mean rare CNV per sample

                Case     Control   Difference   P
n               1,210    612
Total CNVs      6        3.3       2.7          4.0E-05
Deletions       3.8      1.4       2.4          3.0E-06
Duplications    2.2      1.9       0.3          NS
Exons           6        3.3       2.7          2.0E-04


26

Association study: CNV regions

       Case                     Control
Chr    0     1     3     4      0     1     3     4      P adjusted
X      0     1     0     0      0     57    0     0      0.000
X      0     30    7     0      0     78    0     0      0.000
X      0     2     0     0      0     34    0     0      0.000
X      0     0     0     0      0     24    0     0      0.000
6      9     10    0     0      4     35    0     0      0.000
16     0     125   127   0      0     10    19    0      0.000
X      0     0     0     0      0     14    0     0      0.001
6      812   203   438   20     477   184   276   14     0.003
2      0     2     2     0      0     14    16    0      0.006
7      0     0     0     0      0     12    4     0      0.006
11     0     38    32    0      0     1     3     0      0.010
X      0     1     0     0      0     0     11    0      0.016

27

Association study: CNV overlapping genes

       Case                     Control
Chr    0     1     3     4      0     1     3     4      P adjusted
X      0     2     0     0      0     53    0     0      0.000
1      0     37    2     0      0     0     0     0      0.004
1      0     35    2     0      0     0     0     0      0.004
7      0     0     1     0      0     13    5     0      0.004
1      0     36    2     0      0     0     0     0      0.004
1      0     36    2     0      0     0     0     0      0.004
1      0     34    2     0      0     0     0     0      0.005
1      0     33    2     0      0     0     0     0      0.008
1      0     31    1     0      0     0     0     0      0.011
1      0     31    1     0      0     0     0     0      0.011
7      0     4     32    2      0     0     0     0      0.011
X      0     22    6     0      0     36    0     0      0.021

28

29

Acknowledgements

University of Otago
• Gemma Moir-Meyer
• Logan Walker
• Mackenzie Cancer Research Group

Queensland Institute of Medical Research
• Mandy Spurdle
• Felicity Lose
• Yen Tan
• Alex Metcalf
• Australian National Endometrial Cancer Study
• Bryony Thompson

University of Cambridge
• Deborah Thompson
• Paul Pharoah
• Alison Dunning
• Douglas Easton
• Studies of Epidemiology and Risk Factors in Cancer Heredity (SEARCH)

University of Newcastle
• Rodney Scott
• Mark McEvoy
• John Attia
• Elizabeth Holliday
• The Hunter Community Study

CIMBA consortium

Mayo Clinic
• Fergus Couch
