Genome representation and variant identification Deanna M. Church, NCBI

Page 1

Genome representation and variant identification

Deanna M. Church, NCBI

Page 2
Page 3

The Reference Assembly is NOT Static

NCBI35 (hg17)
NCBI36 (hg18)
GRCh37 (hg19)
GRCh37.p9

Page 4

Image credit: http://www.tohlejokes.com

Page 5

http://genomereference.org

Page 6

Resolved: 716
Open: 697

Page 7

http://www.ncbi.nlm.nih.gov/dbvar

Page 8

Studies

Variant Regions

Variant Calls

Variant Region nsv531833 type: CNV

Variant Calls: nssv577112
type: copy number gain
Method: Oligo aCGH
Analysis: Probe signal intensity
phenotype: Autism; etc.
Clinical: Pathogenic
Copy Number: 3

Variant Calls: nssv580124
type: copy number loss
Method: Oligo aCGH
Analysis: Probe signal intensity
phenotype: Autism
Clinical: Pathogenic
Copy Number: 1

Methods

Analysis

Publications

Samples

Submitted assembly
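
The hierarchy on this slide (a variant region grouping one or more variant calls, each carrying method, analysis, phenotype and clinical fields) can be sketched as a small data structure. This is only an illustration: the class and field names below are hypothetical and are not dbVar's actual schema; the accessions and values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VariantCall:
    # Field names are hypothetical; values come from the slide.
    accession: str
    call_type: str
    method: str
    analysis: str
    phenotype: str
    clinical: str
    copy_number: int


@dataclass
class VariantRegion:
    accession: str
    region_type: str
    calls: List[VariantCall] = field(default_factory=list)


region = VariantRegion("nsv531833", "CNV")
region.calls.append(VariantCall(
    "nssv577112", "copy number gain", "Oligo aCGH",
    "Probe signal intensity", "Autism; etc.", "Pathogenic", 3))
region.calls.append(VariantCall(
    "nssv580124", "copy number loss", "Oligo aCGH",
    "Probe signal intensity", "Autism", "Pathogenic", 1))

# One region can hold calls of opposite types (a gain and a loss).
gains = [c.accession for c in region.calls if c.call_type == "copy number gain"]
print(gains)  # ['nssv577112']
```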

Page 9

Variant Call Ambiguity

start stop

Inner start Inner stop

Outer start Outer stop

Probes with decreased signal intensity

Probes with expected signal intensity

breakpoint breakpoint

Inner start Inner stop
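
The inner and outer coordinates on this slide can be derived from probe positions: the inner interval runs from the first to the last probe with decreased signal, and the outer interval extends to the nearest flanking probes with expected signal, so each true breakpoint lies somewhere between an outer and an inner coordinate. A minimal sketch, assuming a hypothetical list of `(position, is_decreased)` probes sorted by position and a single contiguous run of decreased probes; it is an illustration, not the method used by any particular array pipeline.

```python
from typing import Dict, List, Optional, Tuple

Probe = Tuple[int, bool]  # (genomic position, True if signal is decreased)


def call_bounds(probes: List[Probe]) -> Optional[Dict[str, Optional[int]]]:
    """Return inner/outer start/stop for one run of decreased-signal probes.

    Assumes probes are sorted by position and contain at most one
    contiguous run of decreased probes (a simplifying assumption).
    """
    idx = [i for i, (_, decreased) in enumerate(probes) if decreased]
    if not idx:
        return None
    first, last = idx[0], idx[-1]
    return {
        # Inner bounds: outermost probes that show the signal change.
        "inner_start": probes[first][0],
        "inner_stop": probes[last][0],
        # Outer bounds: nearest flanking probes with expected signal,
        # or None when the run reaches the edge of the probe set.
        "outer_start": probes[first - 1][0] if first > 0 else None,
        "outer_stop": probes[last + 1][0] if last + 1 < len(probes) else None,
    }


example = [(100, False), (200, True), (300, True), (400, False)]
print(call_bounds(example))
# {'inner_start': 200, 'inner_stop': 300, 'outer_start': 100, 'outer_stop': 400}
```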

Page 10

Variant Call Ambiguity

Outer start Outer stop

Fosmid clone (40 Kb +/- 1 Kb)

Clone has an insertion relative to the genome (20 Kb)

Clone has a deletion relative to the genome (60 Kb)
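
The paired-end logic on this slide can be sketched as a comparison between the distance at which a clone's two end reads map on the reference and the expected insert size. The function name and the use of the slide's +/- 1 Kb as a hard cutoff are assumptions made for illustration; real pipelines typically derive thresholds from the observed insert-size distribution.

```python
def classify_clone(mapped_span_kb: float,
                   expected_kb: float = 40.0,
                   tolerance_kb: float = 1.0) -> str:
    """Classify a clone by the distance between its mapped end reads.

    Only the end reads are placed, so the event is known to lie
    somewhere between them: these are outer bounds, not breakpoints.
    """
    if mapped_span_kb < expected_kb - tolerance_kb:
        # Ends map too close together: the clone carries extra sequence.
        return "insertion in clone relative to genome"
    if mapped_span_kb > expected_kb + tolerance_kb:
        # Ends map too far apart: the clone lacks sequence the genome has.
        return "deletion in clone relative to genome"
    return "concordant"


print(classify_clone(20))  # insertion in clone relative to genome
print(classify_clone(60))  # deletion in clone relative to genome
print(classify_clone(40.5))  # concordant
```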

Page 11

Assembly, Mis-assembly, Biology and Variant Interpretation

Page 12

BAC insert

BAC vector

Shotgun sequence

Assemble

GAPS

“finishers” go in to manually fill the gaps, often by PCR

Page 13

NCBI36 (hg18)

GRCh37 (hg19)

Page 14

NCBI35 (hg17)

GRCh37 (hg19)

AL139246.20

AL139246.21

Page 15

Build sequence contigs based on contigs defined in TPF (Tiling Path File).

Check for orientation consistencies
Select switch points
Instantiate sequence for further analysis

Switch point

Consensus sequence
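
The steps above can be sketched in a few lines: orient each component, cut it at the chosen switch points, and concatenate the pieces into one contig sequence. The dictionary layout (`seq`, `orient`, `start`, `stop`) and the toy sequences are hypothetical; in practice the switch points come from alignments between overlapping components, which this sketch takes as given.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_contig(path: list) -> str:
    """Instantiate a contig from an ordered tiling path.

    Each component contributes the slice [start:stop] of its sequence
    after orientation; 'start' and 'stop' are the switch points in
    oriented, 0-based, half-open coordinates.
    """
    parts = []
    for comp in path:
        seq = comp["seq"] if comp["orient"] == "+" else revcomp(comp["seq"])
        parts.append(seq[comp["start"]:comp["stop"]])
    return "".join(parts)


# Toy example: two overlapping components; the second one is stored
# on the reverse strand and must be flipped before joining.
path = [
    {"seq": "AAACCCGG", "orient": "+", "start": 0, "stop": 6},
    {"seq": "AAACCG", "orient": "-", "start": 1, "stop": 6},
]
print(build_contig(path))  # AAACCCGGTTT
```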

Page 16

NCBI36

Page 17

nsv832911 (nstd68) Submitted on NCBI35 (hg17)

Page 18

NCBI35 (hg17) Tiling Path

GRCh37 (hg19) Tiling Path

Gap Inserted

Moved approximately 2 Mb distal on chr15

NC_000015.8 (chr15)

NC_000015.9 (chr15)

Removed from assembly

Added to assembly

HG-24

Page 19

Sequences from haplotype 1

Sequences from haplotype 2

Old Assembly model: compress into a consensus

New Assembly model: represent both haplotypes

Page 20

AC074378.4 AC079749.5

AC134921.2 AC147055.2

AC140484.1 AC019173.4

AC093720.2 AC021146.7

NCBI36 NC_000004.10 (chr4) Tiling Path

Xue Y et al., 2008

TMPRSS11E TMPRSS11E2

GRCh37 NC_000004.11 (chr4) Tiling Path

AC074378.4 AC079749.5

AC134921.1 AC147055.2

AC093720.2 AC021146.7

TMPRSS11E

GRCh37: NT_167250.1 (UGT2B17 alternate locus)

AC074378.4 AC140484.1

AC019173.4 AC226496.2

AC021146.7

TMPRSS11E2

nsv532126 (nstd37)

Page 21

GRCh37

Page 22

81 FIX Patches
71 NOVEL Patches

GRCh37.p9

Page 23

Dennis et al., 2012

1q32 1q21 1p21

1p21 patch alignment to chromosome 1

Page 24

Finding the data

Page 25

How dbVar* manages data

*and most other NCBI databases too

Object    Method       Analysis                Clinical assertion  NCBI36 location  Etc…
nsv1000   Oligo aCGH   Probe signal intensity  None                Location         Etc…
nsv2000   Sequencing   Paired end analysis     None                Location         Etc…
nsv3000   Sequencing   Read Depth              Benign              Location         Etc…
…         …            …                       …                   …                …

Search Term

Page 26
Page 27
Page 28

Variant submitted on NCBI35 (hg17)
Failed to remap to NCBI36 (hg18)
Successful remap to GRCh37 (hg19)

Page 29
Page 30

No results in ‘normal’ dbVar search
Genome Sensor predicts this is a location -> points to dbVar Genome Browser

Page 31
Page 32

Acknowledgements

dbVar

John Lopez, Tim Hefferon, John Garner, Chao Chen, George Zhou, Victor Ananiev

NCBI

Collaborators

DGVa, DGV

GRC

NCBI

Valerie Schneider, Nathan Bouk, Hsiu-Chuan Chen

Collaborators

TGI-WU, WTSI, EBI

ISCA

NCBI Genomes, Viewers and Variation groups