TOWARDS INTELLIGENT CHROMOSOME ANALYSIS

Aleksandar Jovanović, Miroslav Marić, Momčilo Borovčanin, Aleksandar Perović

GIS - Group for Intelligent Systems, School of Mathematics, University of Belgrade,

Studentski trg 16, 11000 Belgrade, Serbia and Montenegro

www.gisss.com, www.matf.bg.ac.yu, contact: [email protected]

Abstract: The investigation of chromosomes is based on systems of induced banding, procedures which generate characteristic patterns of light and dark areas (bands) along a chromosome. These patterns are used to identify individual chromosomes, to diagnose certain diseases expressed as characteristic pattern changes, to address gene locations precisely (gene mapping), and to determine the origin of extra chromosomal material. Less than ten years ago, even in the best scientific papers, chromosomes were treated manually, with mechanical scissors operating on photographic material. Introducing CCD microscopy, we developed methods for precise representation, photomorphology, normalization and measurement, implemented them in software, and supplied our genetics, oncology and hematology departments with comfortable and efficient tools which replaced glue and scissors.

Introduction

The regular banding of chromosomes is clearly visible on the chromosomes of the mitosis presented in Fig. 1.

Fig. 1 Mitosis

Long ago, researchers in genetics noticed that a regular mitosis consists of chromosomes in pairs, with the sex-determining chromosomes X and Y appearing as an {X, X} or {X, Y} pair. This means that in normal cells each chromosome pattern, except for X and Y, has an identical or very similar matching twin. According to their length and the distribution of bands, chromosomes are designated as 1 through 22 and X or Y. This process is regularly accomplished by visual observation and results in an ordered chromosome arrangement, the karyotype, which includes identification of irregular chromosomes or chromosomal segments in cell divisions; these characterize a variety of syndromes, especially in oncology. Early and precise determination of chromosomal changes and of divergence from the standard would assist in early syndrome classification and diagnosis. Software tools assisting this work, ultimately aiming at completely intelligent systems, are being developed by a number of teams. The necessary elements are a precise chromosome description (representation), a measurement tool set including comparison and similarity measurements, object extraction and normalization, gene addressing, and tools for genetic back tracing of irregular chromosomes.

Method

Photometrically, the banding pattern is represented as a sequence of parallel ridges whose peaks correspond to the places with low (or high) light absorption. Each longitudinal (i.e. meridian) section of the whole photometric chromosome surface is a function of one argument, the photometric polynomial $M(x)$, whose algebraic-combinatorial invariants should be common to similar chromosomes. For meridian sections of two chromosomes of the same type, with photometric polynomials $M_1(x)$ and $M_2(x)$ having the same arrangement and proportions of the positions of their local extrema, we should have

$$(\exists \varepsilon > 0)(\forall x)\; |M_1(x) - a M_2(x) + c| < \varepsilon, \tag{1}$$

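for some $a$ and $c$ and sufficiently small $\varepsilon$. Let

$$\mathcal{M} = \{M_i(x) \mid i \in I\}$$

be a set of chromosomal photometric representations. In $\mathcal{M}$ define an approximation $\rho_\varepsilon$ of the natural equivalence $\rho$ by: $\rho_\varepsilon(M_i, M_j)$ iff (1) holds for $M_i$ and $M_j$. Different values of $\varepsilon$ result in different granulations of the relation $\rho$. The set $\mathcal{M}$ could include etalons, i.e. reference representations obtained statistically. Let, for example,

$$E = \{M_{e_1}, \ldots, M_{e_n}\}$$

be an etalon set, and let $\varepsilon$ be such that for the members of $E$

$$\rho_\varepsilon(M_{e_i}, M_{e_j}) \text{ iff } i = j.$$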

This enables the introduction of the normalized similarity relation

$$\rho^*_\varepsilon(M_{e_i}, M_{e_j}) = \neg\delta_{ij} = \begin{cases} 0, & \text{if } i = j, \\ 1, & \text{otherwise.} \end{cases} \tag{2}$$

Then the equivalence classes are centered on the members of $E$, giving a set of metric balls

$$K^\varepsilon_{e_i} = \{M_j \mid \rho_\varepsilon(M_{e_i}, M_j)\}, \qquad K^\varepsilon = \{K^\varepsilon_{e_i} \mid M_{e_i} \in E\}.$$

Let

$$R_\varepsilon = \mathcal{M} \setminus \bigcup K^\varepsilon, \tag{3}$$

then for a given $\varepsilon$, $R_\varepsilon$ contains the photometric polynomials that fall outside every ball around the etalon set $E$ ("mutants", i.e. significantly changed chromosomes). The above reasoning could be refined if more subtle aspects need to be included. In practice, for polynomials $M_1$ and $M_2$ representing two chromosomes of the same type, even from the same cell division, the relative intensities of local maxima are hardly maintained, owing to a variety of factors acting during chromosome formation. Fortunately, the positions of the local extrema, or more precisely their relative distances, are well preserved, which is a good basis for the definition of chromosome invariants. With a reasonable allocation of dark band edges to the saddle points nearest to the local maxima of absorption, the photometric polynomials are replaced by their simplifications $C_i$, the characteristic functions of bands, which are less sensitive chromosome representations and lead to a comparison (similarity) function for chromosomes. Thus, we can calculate the similarity of chromosomes A and B from the corresponding characteristic-like functions $C_A$, $C_B$ with

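$$d_{A,B}(C_A, C_B) = \min_{a \in A,\, b \in B} \frac{\int_{x \in D} \left| C_A(bx) - C_B\!\left(\frac{x}{b} + a\right) \right| dx}{m(D)} \tag{4}$$

where $a$ is a translation, $b$ a contraction factor, and $m(D)$ the length of $D$, the domain of the longer of the chromosomal representations $C_A$ and $C_B$.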
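Relations (1) to (3) can be illustrated with a minimal Python sketch. It is a sketch under stated assumptions rather than the implemented software: the function names are hypothetical, profiles are assumed to be NumPy arrays sampled on a common grid, and $a$, $c$ are chosen by a least-squares fit. Since least squares does not minimize the maximal deviation, the test below is sufficient but not necessary for (1).

```python
import numpy as np

def fit_affine(m1, m2):
    """Least-squares a, c for the residual m1 - a*m2 + c of relation (1)."""
    # Model m1 ~ a*m2 + k with k = -c, solved as a linear least-squares problem.
    design = np.column_stack([m2, np.ones_like(m2)])
    (a, k), *_ = np.linalg.lstsq(design, m1, rcond=None)
    return a, -k

def rho_eps(m1, m2, eps):
    """Sufficient test of (1): max |M1 - a*M2 + c| < eps for the fitted a, c."""
    a, c = fit_affine(m1, m2)
    return bool(np.max(np.abs(m1 - a * m2 + c)) < eps)

def residual_set(profiles, etalons, eps):
    """Indices of profiles lying in no ball around an etalon, as in (3)."""
    return [j for j, m in enumerate(profiles)
            if not any(rho_eps(e, m, eps) for e in etalons)]

# Synthetic illustration (assumed data): two etalons and three profiles.
x = np.linspace(0.0, 1.0, 200)
e1, e2 = np.sin(6 * np.pi * x), np.sin(10 * np.pi * x)
profiles = [2 * e1 + 0.5, 0.7 * e2 - 1.0, x ** 2]
print(residual_set(profiles, [e1, e2], eps=0.05))
```

In this synthetic example the first two profiles are rescaled and shifted copies of the etalons, so only the third one is left in the residual set.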

Fig. 2 Comparison of chromosomes: best matching of characteristic functions of meridian (longitudinal) sections
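The best matching of characteristic functions corresponds to the minimization in (4). The following minimal Python sketch discretizes it under several assumptions: $C_A$ and $C_B$ are given as 0/1 arrays with one sample per pixel, both are zero outside their support, the second argument in (4) is read as $x/b + a$, and the minimum is taken over finite grids of translations and contraction factors. The names `sample` and `band_distance` are hypothetical.

```python
import numpy as np

def sample(c, t):
    """Value of the step function c at real positions t (0 outside its support)."""
    idx = np.floor(t).astype(int)
    inside = (idx >= 0) & (idx < len(c))
    out = np.zeros(len(t))
    out[inside] = c[idx[inside]]
    return out

def band_distance(ca, cb, shifts, scales):
    """Discretized form of (4): mean absolute mismatch, minimized over (a, b)."""
    n = max(len(ca), len(cb))      # m(D): length of the longer representation
    x = np.arange(n) + 0.5         # pixel centres on D
    best = np.inf
    for b in scales:
        a_part = sample(ca, b * x)
        for a in shifts:
            d = np.abs(a_part - sample(cb, x / b + a)).sum() / n
            best = min(best, d)
    return best

# Assumed toy band patterns: cb is ca moved two pixels along the chromosome.
ca = np.array([0, 0, 1, 1, 0, 0, 1, 0, 0, 0])
cb = np.roll(ca, 2)
print(100 * band_distance(ca, cb, shifts=range(-3, 4), scales=[1.0]), "%")
```

Multiplying the result by 100 gives a mismatch in percent, comparable in form to the similarity figures quoted for Fig. 2.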

Fig. 3 3D - photomorphology view

In the right half of Fig. 2 we have a pair of type 1 chromosomes, extracted together with their photometric polynomial representations $M_i$ and the corresponding simplifications $C_i$, shown as band patterns. Three variants of the similarity measurement of the two $C_i$, in percent, are shown at the lower left; they exhibit a very good match of these two chromosomes (≤ 6%, i.e. very high similarity). Fig. 3 shows the complete photomorphology of the chromosome 1 pair. Such representations are suitable when more detailed insight into the chromosomal structure is needed.

Fig. 4 3D photomorphology with two meridian sections and one latitude section

We developed a set of measurement tools on the 3D photomorphology chromosome representations, some of which are shown in Fig. 4. Before comparison, chromosomes usually need some "rectification", which is also necessary for precise gene location. We first implemented manual chromosome extraction with manual rectification. In order to reduce human interaction and move closer to automation, we then implemented automatic chromosome normalization. Both comparison and localization of genes on chromosomes demand the introduction of a karyotype coordinate system (chromosomal coordinate system), which facilitates identification of a chromosome and of the locus bearing a gene signal or other specific features.

Fig. 5 Automated karyotyping: construction of a chromosome central meridian

Fig. 6 Automated chromosome normalization

Fig. 7 Automated karyotyping: meridian of chromosome 1

This is done following the biological karyotyping standard, which differentiates chromosomes by their development phases and uses standardized banding techniques that introduce chromosome-specific banding patterns. In order to reach any level of automation, we need to "straighten" these objects in the most reasonable way, that is, to reconstruct them as straight (normalized), so that their bands are distributed as they would be if the chromosome had been straight in the first place. In fact, the whole chromosome image, considered as the corresponding absorption function, a two-argument polynomial $F(x, y)$, rather than its single longitudinal section $M(x)$, exhibits a characteristic positioning of local extrema. These constitute the chromosome invariant best suited to its identification and classification, whatever kind of detail or change is investigated. The coordinate transformation is implemented in the following way. The central meridian of the original $F(x, y)$ is deduced from the primary latitudes, the normals to already determined tangents on the contour (Fig. 5). The end points of an individual segment of the central meridian are the midpoints of adjacent primary latitudes (beam sections) in the latitude network. The whole central meridian is then formed from such elementary segments. Next, the network of latitudes normal to their central meridian segments is refined, segment by segment. Finally, by rotating the segments of the central meridian so that they all become collinear, while maintaining normality of the corresponding latitudes, we obtain the "rectified" (normalized) coordinate system. Mapping the original pixels to their target coordinates then yields the transform of the original chromosome, which is normalized, i.e. straight. An alternative construction of the central meridian is based on contour thinning: by repeatedly inscribing contours, we finally reach a well-determined large portion of the central meridian.
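The rectification step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implemented procedure: the central meridian is assumed to be already available as a polyline of distinct (row, column) points, the hypothetical function `straighten` works by inverse mapping (each target pixel looks up its source pixel, rather than original pixels being pushed to target coordinates), and nearest-neighbour sampling is used.

```python
import numpy as np

def straighten(image, meridian, half_width):
    """Resample `image` along a polyline central meridian so that it becomes straight."""
    pts = np.asarray(meridian, dtype=float)            # (k, 2) vertices as (row, col)
    seg = np.diff(pts, axis=0)                         # elementary meridian segments
    seg_len = np.hypot(seg[:, 0], seg[:, 1])
    arc = np.concatenate([[0.0], np.cumsum(seg_len)])  # arc length at each vertex

    # One output row per unit of arc length; the small offset keeps the end point.
    s = np.arange(0.0, arc[-1] + 1e-9, 1.0)
    k = np.clip(np.searchsorted(arc, s, side="right") - 1, 0, len(seg) - 1)
    tangent = seg[k] / seg_len[k, None]
    centre = pts[k] + (s - arc[k])[:, None] * tangent
    # Latitude direction: the tangent rotated by 90 degrees.
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)

    t = np.arange(-half_width, half_width + 1)         # signed offset along a latitude
    src = centre[:, None, :] + t[None, :, None] * normal[:, None, :]
    rc = np.rint(src).astype(int)                      # nearest-neighbour lookup
    r, c = rc[..., 0], rc[..., 1]
    ok = (r >= 0) & (r < image.shape[0]) & (c >= 0) & (c < image.shape[1])
    out = np.zeros(r.shape, dtype=image.dtype)         # pixels outside the image stay 0
    out[ok] = image[r[ok], c[ok]]
    return out

# Assumed toy input: an L-shaped meridian on a 10 x 10 test image.
img = np.arange(100).reshape(10, 10)
print(straighten(img, [(2, 0), (2, 4), (6, 4)], half_width=1).shape)
```

Each output row is one latitude and each output column a fixed signed distance from the meridian. Near sharp bends, neighbouring latitudes overlap on the inner side and spread apart on the outer side, so some source pixels are read twice and others are skipped.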

Fig. 8 Normalized chromosome 1

Fig. 9 Complete automatic karyotyping

After the automated normalization, the chromosomes are sorted by length. The normalization of the object from Fig. 5 is shown in Fig. 6. Fig. 7 and Fig. 8 show the same steps for another chromosome. The result of the complete automated chromosome extraction and normalization is presented in Fig. 9. We have also introduced one step backwards, namely controlled normalization, which allows the operator to redefine the central meridian and to perform the "rectification" step by step. This provides insight into the highly convex chromosome parts, which, when compacted, might lose some fine micro detail (shown as edge holes in Fig. 10), leading to destruction of essential topology.

Fig. 10 Semiautomatic mode: chromosomal step by step normalization

Fig. 11 Trisomy of chromosomes: the chromosome to the right of the extracted pair in the right corner has an extra band, one band more than the corresponding chromosome immediately to its left

Further automation of this process, by comparing the karyotype obtained in this way with chromosome image databases containing details of identified syndromes, would be the next major step. Before that, all identified chromosome-expressed pathologies need to be cataloged. We now briefly illustrate the application of the surveyed functions to real material, problems brought by our customers. A serious genetic syndrome, trisomy, appearing as one extra band in one of the paired chromosomes, is shown in Fig. 11.

The presence of an irregular, normally nonexistent chromosome, called a marker chromosome, is shown in Fig. 12; it is the third chromosome in the first column of big chromosomes, counting from the left.

Fig. 12 Marker chromosome (from the left, the first bigger chromosome below the horizontally positioned one); normally it does not exist

Fig. 13 Detailed photometric similarity comparison of the marker chromosome with the other big chromosomes indicates that it is redundant

Assistance in genetic back tracing of the material in this marker chromosome (Fig. 17) is illustrated in Fig. 13 through Fig. 17, together with the visual equation in Fig. 18, which confirms the congruence of the irregular marker (top) with the concatenation of chromosome Y and the longer arm of chromosome 1. This led to the identification of a rare hematology syndrome (the 8th recorded case).

Fig. 14 Suspected match

Fig. 15 Chromosome 1, long arm: strong similarity

Fig. 16 Chromosome 1, long arm: different angle

Fig. 17 Marker: photomorphology

Fig. 18 Visual equation: congruence of the marker with the fusion of Y and the longer arm of chromosome 1

In Fig. 19 there is evidence of a move of genetic material from the small chromosome in the top row, on the right, to the chromosome below it, as demonstrated in detail in Fig. 20. The top row shows a regular pair of these two chromosomes, while the lower row shows, with photomorphologic details, the result of the translocation process.

The last example shows how an implementation of the described methods assists in gene localization. In Fig. 21 the bright dots on the chromosomes correspond to a gene made visible by the fluorescent in situ hybridization (FISH) method. Our photometric representation of chromosomes, together with the image measurement tools, provides highly precise location of the maximum of the gene signal, the spot closer to the chromosome center.

Fig. 19 Mitosis with a couple of "wrong" chromosomes

Fig. 20 Top: normal pair; bottom: material from the small chromosome moved to the tail of the bigger chromosome

Fig. 21 Gene signals (fluorescent in situ hybridization, FISH)

Fig. 22 Gene-signal addressing in the chromosome address space

The position of the signal maximum can be determined to within 2 to 3 pixels after chromosomal normalization. Thus, when working with megapixel chips, reaching thousands of pixels per chromosome length, we approach sub-promille precision of gene-signal addressing (with respect to the chromosome length), which will become more important with the introduction of multiple and finer gene hybridization techniques. Our implementations are still experimental, with growing functionality. The software is free to download from our web site.
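The precision estimate for gene-signal addressing can be checked with a short Python sketch. The function names are hypothetical, the intensity profile is assumed to be sampled along the central meridian of an already normalized chromosome, and the length of 3000 pixels is an illustrative value, not a measured one.

```python
import numpy as np

def signal_address(profile):
    """Relative position (0 to 1) of the brightest sample along a normalized chromosome."""
    profile = np.asarray(profile, dtype=float)
    peak = int(np.argmax(profile))
    return peak / (len(profile) - 1)

def relative_precision(pixel_error, chromosome_length_px):
    """Positional uncertainty as a fraction of the chromosome length."""
    return pixel_error / chromosome_length_px

# Assumed example: a single bright FISH spot a quarter of the way along the chromosome.
profile = np.zeros(1001)
profile[250] = 1.0
print(signal_address(profile))
print(1000 * relative_precision(3, 3000), "promille")
```

With an error of 3 pixels, a chromosome spanning 3000 pixels gives an addressing uncertainty of one promille of its length; longer chromosome images or smaller errors bring this below one promille.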
