Maize Genetics, Genomics, Bioinformatics workshop Doreen Ware ARS USDA Cold Spring Harbor Laboratory


Page 1: Maize Genetics, Genomics, Bioinformatics workshop

Maize Genetics, Genomics, Bioinformatics workshop

Doreen Ware ARS USDA

Cold Spring Harbor Laboratory

Page 2: Maize Genetics, Genomics, Bioinformatics workshop

The Plan

• Acknowledgements
• Maize Sequences
• Massaging of data sets
  – Maize FPC, MMP unigene alignments
• Comparative map views
  – Pairwise alignments of the maize genetic map to the rice pseudomolecule, Gramene CMap views
• How might this be useful?

Page 3: Maize Genetics, Genomics, Bioinformatics workshop

CSHL

Wei Zhao

Kiran Ratnapu

Lincoln Stein

Lenny Teytelman

Ken Clark

Page 4: Maize Genetics, Genomics, Bioinformatics workshop

Cornell

Noel Yap

Susan McCouch

Page 5: Maize Genetics, Genomics, Bioinformatics workshop

Maize sequences

Red: genomic; Blue: expressed; Green: clustered; Black: genetic map; Pink: FPC

Page 6: Maize Genetics, Genomics, Bioinformatics workshop

ESTs (Expressed Sequence Tags)

• Clusters of ESTs
  – TUG - PlantGDB (www.plantgdb.org)
  – GI - TIGR (www.tigr.org)
  – Unigenes (MaizeGDB): private sequences included, deposited in GenBank
  – NCBI has unigenes

Page 7: Maize Genetics, Genomics, Bioinformatics workshop

Genomic Sequence

• Complete BACs
• Genomic Survey Sequences (GSS)
  – BAC ends
  – 100 skim BACs
  – Methyl filtered
  – Hi-Cot filtered

Page 8: Maize Genetics, Genomics, Bioinformatics workshop

Rice and Maize Synteny Analysis public data sets

• Maize Mapping Project (MMP)
  – AGI Maize FPC map (Cone et al., 2002)
    • www.maizemap.org, www.genome.arizona.edu/FPC/maize/
• IBM2 neighbors map
  – www.maizegdb.org
• International Rice Genome Sequencing Project (IRGSP)
  – Rice genome sequence
    • rgp.dna.affrc.go.jp/IRGSP/
  – TIGR rice assembly
    • www.tigr.org

Page 9: Maize Genetics, Genomics, Bioinformatics workshop

Leverage synteny of cereal genomes in the absence of complete sequence

• Rice sequence-based maps
• Cereal genetic and FPC maps
• Establish common anchor points between the genetic and physical maps
• Extend information available from the genetic maps of each species to the physical maps (leverage work of genetic systems)
  – Quantitative Trait Loci (QTL) and mutants

Page 10: Maize Genetics, Genomics, Bioinformatics workshop

Maize FPC map

• Identify features to use
  – Maize unigene overgos (MMP unigenes)
  – Maize genetic markers

• Identify high confidence features for correspondence to rice.

• Assign a position for the feature on a contig

Page 11: Maize Genetics, Genomics, Bioinformatics workshop

Fingerprint Contig (FPC) Maps

• FPC maps consist of contigs of clustered BACs
• The BACs are represented by imaged bands
• Bands represent restriction enzyme digest fragments of the BAC clones
  • cccG/ATATCcccgga…ccggatcaG/ATATCacc
• Features are anchored to BACs in the contigs
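To make the link between restriction digests and fingerprint bands concrete, here is a minimal sketch. It assumes the `G/ATATC` notation on the slide means a cut after the first base of each `GATATC` site. The function name and the toy clone sequence are hypothetical, and the toy sequence is not the elided sequence from the slide.

```python
def digest_fragment_lengths(sequence, site="GATATC", cut_offset=1):
    """Cut `sequence` at every occurrence of `site` and return fragment lengths.

    cut_offset is how many bases of the site stay on the upstream fragment;
    1 corresponds to the G/ATATC notation (assumed meaning: cut after the G).
    """
    seq = sequence.upper()
    cut_points = []
    hit = seq.find(site)
    while hit != -1:
        cut_points.append(hit + cut_offset)
        hit = seq.find(site, hit + 1)
    # Fragment boundaries: sequence start, every cut point, sequence end.
    boundaries = [0] + cut_points + [len(seq)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical toy clone with two recognition sites.
toy_clone = "cccGATATCcccggaGATATCacc"
fragments = digest_fragment_lengths(toy_clone)
# A fingerprint is the set of band sizes, independent of fragment order.
fingerprint = sorted(fragments)
```

Two clones that share many band sizes are likely to overlap, which is the basis for clustering BACs into contigs.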

Page 12: Maize Genetics, Genomics, Bioinformatics workshop

What is an overgo?

• ~42 bp oligo sequence used for hybridization

• Derived from the MMP unigenes

• Gene specific?

Page 13: Maize Genetics, Genomics, Bioinformatics workshop

MMP overgo positions on the maize FPC map

[Figure: a contig with its BACs and overgos]

Page 14: Maize Genetics, Genomics, Bioinformatics workshop

How many MMP overgo positions are on the maize FPC map?

• All MMP Unigene overgo positions – 15,574

[Figure: 4 overgo positions on two contigs (Contig A, Contig B) using 3 overgo probes]
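The counting rule in the figure (3 probes giving 4 positions on two contigs) can be sketched as counting distinct overgo and contig combinations. The record layout and all identifiers below are hypothetical, chosen only to reproduce the figure's example.

```python
# Each hybridization record: (overgo, BAC, contig). Hypothetical identifiers.
hits = [
    ("ov1", "bacA1", "ctgA"),
    ("ov1", "bacA2", "ctgA"),
    ("ov2", "bacA2", "ctgA"),
    ("ov2", "bacB1", "ctgB"),
    ("ov3", "bacB1", "ctgB"),
    ("ov3", "bacB2", "ctgB"),
]


def overgo_positions(records):
    """An overgo position is one distinct (overgo, contig) combination."""
    return {(overgo, contig) for overgo, _bac, contig in records}


positions = overgo_positions(hits)
n_probes = len({overgo for overgo, _contig in positions})
n_contigs = len({contig for _overgo, contig in positions})
```

Here `ov2` hybridizes to BACs in both contigs, so it contributes two positions while the other probes contribute one each.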

Page 15: Maize Genetics, Genomics, Bioinformatics workshop

How many FPC contigs does an overgo hit?

Page 16: Maize Genetics, Genomics, Bioinformatics workshop

# of contigs an MMP overgo hybridizes to in the maize FPC map

[Bar chart. Horizontal axis label: # of MMP overgos (ticks 1 to 21). Vertical axis label: # of BAC contigs (scale 0 to 4000).]
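A distribution like the one charted here could be tabulated as follows. This is a sketch under the assumption that the input is a list of (overgo, contig) positions; the function name and sample data are hypothetical.

```python
from collections import Counter


def contig_count_histogram(positions):
    """Map number-of-contigs-hit -> number of overgos with that count."""
    # Deduplicate first so repeated records for one position count once.
    per_overgo = Counter(overgo for overgo, _contig in set(positions))
    return dict(sorted(Counter(per_overgo.values()).items()))


# Hypothetical positions: ov2 hits two contigs, the others hit one.
sample_positions = [
    ("ov1", "ctgA"),
    ("ov2", "ctgA"),
    ("ov2", "ctgB"),
    ("ov3", "ctgB"),
    ("ov3", "ctgB"),  # duplicate record, ignored after deduplication
]
histogram = contig_count_histogram(sample_positions)
```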

Page 17: Maize Genetics, Genomics, Bioinformatics workshop

How many BACs does an overgo hit in a contig?

Page 18: Maize Genetics, Genomics, Bioinformatics workshop

# of BACs an MMP overgo hybridized to in an FPC contig

[Bar chart. Horizontal axis label: # of BACs hybridized to in a contig (ticks 1 to 19). Vertical axis label: # of MMP overgos (scale 0 to 15000).]

Page 19: Maize Genetics, Genomics, Bioinformatics workshop

Select overgos with at least two BACs hybridized in an FPC contig.

• To remove potential false positives, require that an overgo be found on more than one BAC in a maize FPC contig.
• Red and pink are accepted; blue is rejected
  – 8,864

[Figure: a contig with its BACs and overgos]
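The false-positive filter described on this slide can be sketched as grouping hybridization records by overgo and contig, then keeping groups with at least two distinct BACs. The record layout and identifiers are hypothetical.

```python
from collections import defaultdict


def filter_overgo_positions(records, min_bacs=2):
    """Keep (overgo, contig) positions supported by at least `min_bacs` BACs."""
    bacs_by_position = defaultdict(set)
    for overgo, bac, contig in records:
        bacs_by_position[(overgo, contig)].add(bac)
    return {
        position: bacs
        for position, bacs in bacs_by_position.items()
        if len(bacs) >= min_bacs
    }


# Hypothetical records: (overgo, BAC, contig).
records = [
    ("ov1", "bacA1", "ctgA"),
    ("ov1", "bacA2", "ctgA"),
    ("ov2", "bacA2", "ctgA"),  # single BAC in ctgA: rejected
    ("ov2", "bacB1", "ctgB"),  # single BAC in ctgB: rejected
    ("ov3", "bacB1", "ctgB"),
    ("ov3", "bacB2", "ctgB"),
]
accepted = filter_overgo_positions(records)
```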

Page 20: Maize Genetics, Genomics, Bioinformatics workshop

Generate a position for the overgo on the FPC contig

• To establish a rough order of the overgos on a contig, the positions of the BACs that each overgo hit in the FPC contig are used. This establishes a relative order of the overgos within the contig.
  – Maximum start position of a BAC
  – Minimum end position of a BAC

[Figure: a contig with its BACs and overgos]
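One way to read the two bullets is that each overgo gets the interval shared by all the BACs it hit: from the largest BAC start to the smallest BAC end. The sketch below follows that reading, which is an assumption. The function names and coordinates are hypothetical, and the slide does not say how the interval is reduced to a single position.

```python
def overgo_interval(bac_coords):
    """Interval on the contig common to all BACs an overgo hit.

    bac_coords: list of (start, end) BAC coordinates in contig units.
    """
    max_start = max(start for start, _end in bac_coords)
    min_end = min(end for _start, end in bac_coords)
    return max_start, min_end


def order_overgos(bacs_by_overgo):
    """Return overgo names sorted by their (max start, min end) interval."""
    intervals = {
        name: overgo_interval(coords) for name, coords in bacs_by_overgo.items()
    }
    return sorted(intervals, key=lambda name: intervals[name])


# Hypothetical BAC coordinates for three overgos on one contig.
bacs_by_overgo = {
    "ov1": [(0, 100), (40, 140)],
    "ov3": [(300, 420), (350, 450)],
    "ov2": [(120, 220), (150, 260)],
}
relative_order = order_overgos(bacs_by_overgo)
```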

Page 21: Maize Genetics, Genomics, Bioinformatics workshop

Align MMP clusters to the rice genome

• Sequence-based alignments of features to the rice genome using BLAT.

• Single best match in the rice genome

• 6,771 maize unigenes (63%)
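Selecting a single best match per unigene could look like the sketch below. Ranking by number of matching bases is an assumption, since the slide does not say how "best" was scored. The dictionary fields are hypothetical and do not represent BLAT's output format.

```python
def single_best_hits(hits):
    """Keep one hit per query: the one with the most matching bases."""
    best = {}
    for hit in hits:
        query = hit["query"]
        if query not in best or hit["matches"] > best[query]["matches"]:
            best[query] = hit
    return best


# Hypothetical alignment records for two unigenes.
hits = [
    {"query": "unigene1", "rice_chr": 1, "start": 1000, "matches": 310},
    {"query": "unigene1", "rice_chr": 5, "start": 8000, "matches": 180},
    {"query": "unigene2", "rice_chr": 4, "start": 2500, "matches": 420},
]
best = single_best_hits(hits)
```

Ties keep the first hit encountered. A real pipeline would need an explicit tie-breaking rule.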

Page 22: Maize Genetics, Genomics, Bioinformatics workshop

Filter based upon match-length

Distribution of maize unigene cluster hit length

[Bar chart. Horizontal axis: hit length (binned; tick labels from 0-100 to 750-800). Vertical axis: # of feature hits (scale 0 to 1200).]

Matches with a match-length of less than 150 bp were removed, leaving 7,770 hits.
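The match-length cutoff amounts to a simple filter. The sketch below assumes each hit carries a `match_length` field in base pairs; that field name and the sample hits are hypothetical.

```python
MIN_MATCH_LENGTH = 150  # bp, the cutoff stated on the slide


def filter_by_match_length(hits, min_length=MIN_MATCH_LENGTH):
    """Drop hits whose match length is below `min_length`."""
    return [hit for hit in hits if hit["match_length"] >= min_length]


# Hypothetical hits.
hits = [
    {"query": "unigene1", "match_length": 149},
    {"query": "unigene2", "match_length": 150},
    {"query": "unigene3", "match_length": 612},
]
kept = filter_by_match_length(hits)
```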

Page 23: Maize Genetics, Genomics, Bioinformatics workshop

Distribution of percent identity of Maize Cornsensus unigenes mapped to rice

[Bar chart. Horizontal axis: Percent Identity (ticks 43 to 98). Vertical axis: Mapped Cornsensus Unigenes Count (scale 0 to 1000).]

The hits represent 6,692 unique Clusters (62% of the total 10,678).

Page 24: Maize Genetics, Genomics, Bioinformatics workshop

What is the distribution of the maize unigenes across the rice genome?

[Bar chart titled "MAPPED TO RICE ON MAIZE FPC UNIGENES PER 10MB". Horizontal axis: Rice Chromosome (1 to 12). Vertical axis: Maize Cornsensus Unigenes (scale 0 to 1400).]

Page 25: Maize Genetics, Genomics, Bioinformatics workshop

[Figure labels: Rice, Maize]

Calculate distances between adjacent pairs of unigenes on maize contigs and their distances on the rice genome
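The pairwise calculation can be sketched as follows: order the unigenes on each maize contig, take neighbouring pairs, and measure how far apart the two members map on rice. Skipping pairs that map to different rice chromosomes is an assumption, and all field names and values are hypothetical.

```python
def adjacent_pair_rice_distances(features):
    """Return (contig, left, right, rice_distance_bp) for adjacent unigene pairs.

    features: dicts with name, contig, contig_pos, rice_chr, rice_pos.
    Pairs on different rice chromosomes are skipped (assumed handling).
    """
    by_contig = {}
    for feature in features:
        by_contig.setdefault(feature["contig"], []).append(feature)

    results = []
    for contig, members in by_contig.items():
        ordered = sorted(members, key=lambda f: f["contig_pos"])
        for left, right in zip(ordered, ordered[1:]):
            if left["rice_chr"] == right["rice_chr"]:
                distance = abs(left["rice_pos"] - right["rice_pos"])
                results.append((contig, left["name"], right["name"], distance))
    return results


# Hypothetical unigenes with contig and rice coordinates.
features = [
    {"name": "u2", "contig": "ctg1", "contig_pos": 50, "rice_chr": 1, "rice_pos": 1_120_000},
    {"name": "u1", "contig": "ctg1", "contig_pos": 10, "rice_chr": 1, "rice_pos": 1_000_000},
    {"name": "u3", "contig": "ctg1", "contig_pos": 90, "rice_chr": 4, "rice_pos": 500_000},
    {"name": "u4", "contig": "ctg2", "contig_pos": 5, "rice_chr": 10, "rice_pos": 2_000_000},
    {"name": "u5", "contig": "ctg2", "contig_pos": 30, "rice_chr": 10, "rice_pos": 2_900_000},
]
pairs = adjacent_pair_rice_distances(features)
```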

Page 26: Maize Genetics, Genomics, Bioinformatics workshop

Unigene cluster pairs found on maize contigs and their distances, from 50 kb to 1 Mb, on rice chromosomes 1, 4 and 10

[Histogram "50Kb_1MB". Horizontal axis: distance of pairs in bps on rice (ticks 50,000 to 950,000). Vertical axis: # of maize contig pairs (scale 0 to 450).]

Colinear gene pairs: 400,000 bps or lower

Page 27: Maize Genetics, Genomics, Bioinformatics workshop

Unigene cluster pairs found on maize contigs that are within 200 kb on rice chromosomes 1, 4 and 10

[Histogram. Horizontal axis: distance of pairs in bps on rice (ticks 10,000 to 190,000). Vertical axis: # of maize contig pairs (scale 0 to 200).]

90% of the colinear maize overgo contig-pairs that fall under 400,000 bps are found within 165,000 bps on rice

Page 28: Maize Genetics, Genomics, Bioinformatics workshop

[Figure labels: Rice, Maize]

A maize contig span will be defined as syntenic if it contains a unigene pair that is less than 400 kb apart on the rice genome
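The definition above can be applied as a threshold over pair distances. This is a minimal sketch assuming the input is a list of (contig, rice distance in bp) tuples for unigene pairs on the same contig. The strict "less than" follows the wording of this slide, and all names are hypothetical.

```python
SYNTENY_THRESHOLD_BP = 400_000  # 400 kb, from the definition on the slide


def syntenic_contigs(pair_distances, threshold=SYNTENY_THRESHOLD_BP):
    """Contigs with at least one unigene pair closer than `threshold` on rice."""
    return {contig for contig, distance in pair_distances if distance < threshold}


# Hypothetical (contig, rice distance in bp) pairs.
pair_distances = [
    ("ctg1", 120_000),
    ("ctg2", 900_000),
    ("ctg3", 400_000),  # exactly at the threshold: not "less than 400 kb"
]
syntenic = syntenic_contigs(pair_distances)
```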

Page 29: Maize Genetics, Genomics, Bioinformatics workshop

Maize Contig 417

Ctg 417 anchored to maize chromosome 8 by marker UMC1905

Page 30: Maize Genetics, Genomics, Bioinformatics workshop

Plot the colinear rice and maize spans that contain genetic marker information

• What regions of the rice genome are syntenic with maize?
• What regions of the rice genome have no synteny with maize?
• What regions of the maize genome have no synteny to rice?

Page 31: Maize Genetics, Genomics, Bioinformatics workshop

Maize-Rice Colinear intervals with anchored maize genetic position

Page 32: Maize Genetics, Genomics, Bioinformatics workshop

Genetic loci with sequence mapped to rice genome

Page 33: Maize Genetics, Genomics, Bioinformatics workshop

Blue - Genetic markers; Red - Common; Green - Colinear FPC contigs

Page 34: Maize Genetics, Genomics, Bioinformatics workshop

What can rice do for maize?

• Where colinear regions exist to rice
  – Provide potential genetic neighborhood of unanchored maize contigs
  – Provide link to other cereal genetic and physical maps
  – Provisional order of cereal features where no orientation is known (wheat deletion bin map, maize filtered reads?)
  – Candidate sequence for marker screening development
• Where limited or no colinear regions exist to rice
  – Rethink candidate gene approach?

Page 35: Maize Genetics, Genomics, Bioinformatics workshop

Provide potential genetic neighborhood of unanchored maize contigs: chromosome 8 (red), 3 (blue) and unknown (black)
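One possible way to suggest a neighborhood for an unanchored contig is to look for genetically anchored contigs whose colinear rice spans overlap its own. The sketch below is only an illustration of that idea, not the method behind the figure. The data layout, the overlap rule, and the `max_gap` parameter are all assumptions.

```python
def candidate_neighborhoods(unanchored_span, anchored_spans, max_gap=0):
    """Maize chromosomes of anchored contigs overlapping a rice span.

    unanchored_span: (rice_chr, start, end) for the unanchored contig.
    anchored_spans: dicts with maize_chr, rice_chr, start, end.
    max_gap: extra bp allowed between spans (hypothetical tolerance).
    """
    rice_chr, start, end = unanchored_span
    found = set()
    for span in anchored_spans:
        same_chr = span["rice_chr"] == rice_chr
        overlaps = span["start"] <= end + max_gap and span["end"] >= start - max_gap
        if same_chr and overlaps:
            found.add(span["maize_chr"])
    return sorted(found)


# Hypothetical spans on the rice genome.
anchored_spans = [
    {"maize_chr": 8, "rice_chr": 1, "start": 900_000, "end": 1_100_000},
    {"maize_chr": 3, "rice_chr": 1, "start": 1_250_000, "end": 1_600_000},
    {"maize_chr": 5, "rice_chr": 4, "start": 1_000_000, "end": 1_200_000},
    {"maize_chr": 8, "rice_chr": 1, "start": 5_000_000, "end": 5_400_000},
]
candidates = candidate_neighborhoods((1, 1_000_000, 1_300_000), anchored_spans)
```

More than one candidate chromosome can come back, which matches the figure showing contigs from chromosomes 8 and 3 in the same region.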

Page 36: Maize Genetics, Genomics, Bioinformatics workshop

Comparative maps with other cereals

• Wheat EST deletion map (new)

• Sorghum RFLP map from Paterson (new)

• Rice TIGR assembly (new)

• Maize curated FPC

Page 37: Maize Genetics, Genomics, Bioinformatics workshop

Rice chromosome 1 to Wheat deletion map, Sorghum genetic map, and Maize FPC map

Page 38: Maize Genetics, Genomics, Bioinformatics workshop

Provisional order of cereal features where no orientation is known.

Page 39: Maize Genetics, Genomics, Bioinformatics workshop

Summary

The maize physical map can provide a provisional order for the maize sequences that have been anchored.

In syntenic regions, the rice sequence can serve as an anchor to define contact points between cereal genomes.

In syntenic regions, the rice sequence can provide a provisional order to cereal sequences.