On Computing the Breakpoint Reuse Rate in Rearrangement Scenarios


Anne Bergeron, Julia Mixtacki, Jens Stoye

Paris, October 14, 2008

1. An inspiring figure: Mammalian chromosome 17

2. Classical view of breakpoint reuse rate: Operations that make 2 cuts

3. New view of breakpoint reuse rate: Operations that make 0 or 1 cut

4. Understanding reuse rate behavior: The adjacency graph; Looking for extreme scenarios

5. Conclusions and open problems

Mammalian chromosome 17

[Figure from: Guillaume Bourque, Pavel A. Pevzner and Glenn Tesler (2004), Reconstructing the Genomic Architecture of Ancestral Mammals: Lessons From Human, Mouse, and Rat Genomes]


Classical view of breakpoint reuse rate

Assuming that each operation makes 2 cuts, the breakpoint reuse rate is defined as:

r = 2D / b

Let D be the minimum number of rearrangement operations that transform Genome A into Genome B, and b the number of adjacencies in Genome B that are not adjacencies in Genome A.

Note: the value of r lies between 1 and 2.

• r = 1: no reuse, each cut repairs one adjacency.
• r = 2: maximum reuse, two cuts repair one adjacency.


What are the consequences of the 2 cuts assumption?


Operations that make 2 cuts and do not change the number of chromosomes:

• Translocation
• Inversion

Operations that make 2 cuts and change the number of chromosomes always involve circular chromosomes:

• Fusion
• Fission


What are the consequences of this assumption?

• If the genomes are linear, then they must have the same number of chromosomes.

• The chromosomes must also be co-tailed.


What is done in practice?

• Empty chromosomes are added as necessary.
• Caps are added to make the chromosomes co-tailed (such as in GRIMM).
• Chromosomes are ‘circularized’ (Alekseyev and Pevzner).

[Figure: a scenario transforming Genome A into Genome B, with caps added]

D = 5, Cuts = 10, b = 6. Breakpoint reuse rate = 10/6 ≈ 1.67

Human and mouse whole genome comparison data (Pevzner and Tesler, 2003):

[Figure: from 24 chromosomes to 21 chromosomes. Source: Linda Ashworth, LLNL, DOE Human Genome Program Report]

D = 246, Cuts = 492, b = 300. Breakpoint reuse rate = 492/300 = 1.64
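As a quick check of the classical definition r = 2D / b, here is a minimal Python sketch. The function name `classical_reuse_rate` is ours, chosen for illustration; the input values are the two examples quoted on the slides (the six-marker example with caps, and the human and mouse comparison).

```python
def classical_reuse_rate(d: int, b: int) -> float:
    """Classical breakpoint reuse rate r = 2D / b.

    Assumes, as in the classical view, that every operation makes 2 cuts.
    d: minimum number of rearrangement operations (D on the slides)
    b: number of adjacencies of Genome B that are not adjacencies of Genome A
    """
    if b <= 0:
        raise ValueError("b must be positive")
    return 2 * d / b


# Values quoted on the slides.
toy = classical_reuse_rate(5, 6)            # 10 cuts for 6 breakpoints
human_mouse = classical_reuse_rate(246, 300)  # 492 cuts for 300 breakpoints

print(f"{toy:.2f} {human_mouse:.2f}")  # prints "1.67 1.64"
```

Both values fall in the interval from 1 to 2 that the 2 cuts assumption allows.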


Operations that make 0 or 1 cut, on linear chromosomes:

• Semi translocation
• Semi inversion
• Fission
• Fusion

Operations that make 0 or 1 cut, on linear and circular chromosomes:

• Circularization
• Linearization
• Fusion
• Fission

New view of breakpoint reuse rate

Let D be the minimum number of rearrangement operations that transform Genome A into Genome B, and b the number of adjacencies and telomeres in Genome B that are not adjacencies or telomeres in Genome A.

Given a rearrangement scenario that needs to cut Genome A C times, the breakpoint reuse rate is defined as:

r = C / b

Note: the value of r now lies between 0 and 2.

• r = 0: no cuts needed (fusions).
• r = 1: no reuse, each cut repairs one adjacency.
• r = 2: maximum reuse, two cuts repair one adjacency.

[Figure: a scenario transforming Genome A into Genome B, without caps]

D = 5, Cuts = 9, b = 6. Breakpoint reuse rate = 9/6 = 1.50

[Figure: a second scenario transforming Genome A into Genome B, without caps]

D = 5, Cuts = 7, b = 6. Breakpoint reuse rate = 7/6 ≈ 1.17

[Figure: the three scenarios placed on a scale from maximum reuse to no reuse: with caps; no caps, Scenario 1; no caps, Scenario 2]

Is going from high breakpoint reuse to low breakpoint reuse an artifact of the particular example we constructed?

Human and mouse whole genome comparison data (Pevzner and Tesler, 2003):

Scenario 1: D = 246, Cuts = 453, b = 300. Breakpoint reuse rate = 453/300 = 1.51

Scenario 2: D = 246, Cuts = 267, b = 300. Breakpoint reuse rate = 267/300 = 0.89

[Figure: the human and mouse scenarios placed on a scale from maximum reuse to no reuse: with caps; no caps, Scenario 1; no caps, Scenario 2]

The drop is even more pronounced with realistic data!
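The comparison can be reproduced from the cut counts quoted on the slides. The sketch below applies the new definition r = C / b to each scenario; the function name `reuse_rate` and the dictionary labels are ours, and only the (Cuts, b) pairs come from the slides.

```python
def reuse_rate(cuts: int, b: int) -> float:
    """Breakpoint reuse rate r = C / b for a scenario that makes `cuts` cuts.

    b counts the adjacencies and telomeres of Genome B that are not
    adjacencies or telomeres of Genome A.
    """
    if b <= 0:
        raise ValueError("b must be positive")
    return cuts / b


# (Cuts, b) pairs as quoted on the slides; the labels are illustrative.
scenarios = {
    "toy, with caps": (10, 6),
    "toy, no caps, first scenario": (9, 6),
    "toy, no caps, second scenario": (7, 6),
    "human-mouse, with caps": (492, 300),
    "human-mouse, no caps, scenario 1": (453, 300),
    "human-mouse, no caps, scenario 2": (267, 300),
}

rates = {name: reuse_rate(cuts, b) for name, (cuts, b) in scenarios.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}")
```

All six scenarios have the same number of operations within each data set (D = 5 and D = 246); only the number of cuts, and hence the reuse rate, changes.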


The adjacency graph

[Figure: the adjacency graph of Genome A and Genome B, built up component by component]

The components of the graph are:

• A BB-path.
• A cycle.
• An AB-path.
• Another AB-path.
• An AA-path.
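To make the component types concrete, here is a minimal Python sketch that builds the adjacency graph of two genomes and classifies its components. It rests on several assumptions of ours that are not on the slides: a genome is a list of `(genes, is_circular)` chromosomes, genes are signed integers, each gene has a tail `t` and a head `h` extremity, both genomes contain the same genes, and the length of a component is its number of edges. The function names and the small test genomes are hypothetical; they are not the genomes of the figure.

```python
def extremities(gene):
    """Left and right extremities of a signed gene, read left to right."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def vertices(genome):
    """Adjacencies (2-tuples) and telomeres (1-tuples) of a genome."""
    verts = []
    for genes, circular in genome:
        ends = [extremities(g) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            verts.append((right, left))
        if circular:
            verts.append((ends[-1][1], ends[0][0]))
        else:
            verts.append((ends[0][0],))
            verts.append((ends[-1][1],))
    return verts


def components(genome_a, genome_b):
    """Sorted list of (kind, length) for each adjacency graph component."""
    va, vb = vertices(genome_a), vertices(genome_b)
    where = {"A": {e: v for v in va for e in v},
             "B": {e: v for v in vb for e in v}}
    other = {"A": "B", "B": "A"}
    seen, found = set(), []
    # Paths: start at every unvisited telomere and walk to the other end.
    for side, verts in (("A", va), ("B", vb)):
        for v in verts:
            if len(v) == 1 and (side, v) not in seen:
                s, cur, ext, length = side, v, v[0], 0
                seen.add((s, cur))
                while True:
                    s = other[s]
                    cur = where[s][ext]   # cross the edge labelled ext
                    length += 1
                    seen.add((s, cur))
                    if len(cur) == 1:     # reached a telomere: path ends
                        break
                    ext = cur[1] if cur[0] == ext else cur[0]
                found.append(("".join(sorted(side + s)) + "-path", length))
    # Cycles: whatever has not been visited consists of adjacencies only.
    for v in va:
        if ("A", v) not in seen:
            s, cur, ext, length = "A", v, v[0], 0
            while (s, cur) not in seen:
                seen.add((s, cur))
                s = other[s]
                cur = where[s][ext]
                ext = cur[1] if cur[0] == ext else cur[0]
                length += 1
            found.append(("cycle", length))
    return sorted(found)


# Hypothetical genomes: an inversion of gene 2 on one linear chromosome.
inverted = components([([1, 2, 3], False)], [([1, -2, 3], False)])
# Hypothetical genomes: two linear chromosomes versus one circular one.
fused = components([([1], False), ([2], False)], [([1, 2], True)])
print(inverted, fused)
```

In the first example the two shared telomeres give AB-paths of length 1 and the inversion gives a cycle of length 4; in the second, all four telomeres belong to Genome A, so both components are AA-paths.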

Looking for extreme scenarios: cycles

Cycles of length 2L always need L-1 DCJ operations to be sorted, thus have a constant contribution to the reuse rate. Long cycles yield high breakpoint reuse.

Looking for extreme scenarios: AB-paths

The A extremity of an AB-path always corresponds to a DCJ operation that makes 1 cut. The length of the AB-path is shortened by two.

Thus one cut repairs one adjacency, and the path can be sorted without breakpoint reuse.

Looking for extreme scenarios: AA-paths

Both extremities of an AA-path correspond to DCJ operations that make 1 cut. The length of the AA-path is shortened by two.

Thus one cut repairs one adjacency, and the path can be sorted without breakpoint reuse.

Looking for extreme scenarios: BB-paths

A BB-path can always be transformed into two AB-paths with a fission.

Thus the path can be sorted without breakpoint reuse.

Statistics for the human mouse comparison

• 24 cycles of length 4
• 12 AB-paths: lengths 3 to 51
• 12 AA-paths: lengths 2 to 46
• 15 BB-paths: lengths 2 to 22
• 1 cycle of length 6
• 1 cycle of length 8
• 1 cycle of length 10

The cycles of length 6, 8 and 10 imply that 6 breakpoints are reused to repair 12 adjacencies.

The remaining 288 adjacencies and telomeres can be repaired without any breakpoint reuse.
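The figures of 6 reused breakpoints and 12 adjacencies can be rechecked from the cycle lengths. The sketch below uses the slide's statement that a cycle of length 2L needs L-1 DCJ operations, plus one assumption of ours: every operation that sorts a cycle makes 2 cuts, because a cycle contains only adjacencies. Under that assumption a cycle makes 2(L-1) cuts to repair L adjacencies, so it reuses L-2 breakpoints. The function name `cycle_reuse` is hypothetical.

```python
def cycle_reuse(cycle_lengths):
    """Return (reused breakpoints, adjacencies repaired with reuse).

    Each cycle of length 2L needs L - 1 DCJ operations (from the slides).
    Assumption: each of these operations makes 2 cuts.
    """
    reused = 0
    repaired_with_reuse = 0
    for length in cycle_lengths:
        if length < 2 or length % 2 != 0:
            raise ValueError("cycle lengths must be even and at least 2")
        half = length // 2          # L: adjacencies of Genome B in the cycle
        cuts = 2 * (half - 1)       # 2 cuts per operation, L - 1 operations
        extra = cuts - half         # cuts beyond one per repaired adjacency
        if extra > 0:
            reused += extra
            repaired_with_reuse += half
    return reused, repaired_with_reuse


# Cycle lengths quoted for the human and mouse comparison.
human_mouse_cycles = [4] * 24 + [6, 8, 10]
reused, repaired = cycle_reuse(human_mouse_cycles)
print(reused, repaired, 300 - repaired)  # prints "6 12 288"
```

The 24 cycles of length 4 contribute no reuse under this count, since each needs one operation whose 2 cuts repair 2 adjacencies.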


Conclusions and open problems

• The reuse rate depends in a crucial way on how genomes and telomeres are modeled.

• Finding a scenario exhibiting a particular value of reuse rate seems easy, even with complex data.

• However, we do not have an algorithm to do this if we insist that the intermediate genomes remain linear.

• The transformation between mouse and man with minimal breakpoint reuse can be done with linear intermediate genomes...