

mzRepeat: Visual Analysis of Lipids in Mass Spectrometry

Shinichi Mukosaka∗
University of Electro-Communications
JEOL Ltd.

Kanae Teramoto†
JEOL Ltd.

Hideki Koike‡
University of Electro-Communications

ABSTRACT

Mass spectrometry is widely used for identifying molecules that have similar structures. A mass spectrum obtained from lipids using MALDI-MS (matrix-assisted laser desorption/ionization mass spectrometer) was observed to have peaks at equal intervals on the m/z axis. If these repeated peaks can be shown collectively, analysis becomes more efficient. In this study, we propose a visualization system for analyzing peaks that are observed repeatedly on the m/z axis. The system, mzRepeat, was developed and evaluated by analyzing butter. A feature of mzRepeat is that ions that differ only in the number of repeating structures are shown collectively. It is especially effective for analyzing the many peaks obtained from samples that contain several molecular species.

Keywords: visual analysis, mass spectrometry, lipid analysis

Index Terms: H.5.2 [Information Systems]: Information Interfaces and Presentation - User Interfaces; I.3.7 [Computing Methodologies]: Computer Graphics - Three-dimensional Graphics and Realism; J.2 [General]: Physical Sciences and Engineering

1 INTRODUCTION

Lipids are hydrophobic or amphiphilic molecules found in biological systems. Their analysis is important not only for investigating life phenomena but also for microorganism classification and food inspection. In general, lipids synthesized in vivo are esters of alcohols and fatty acids that have long alkyl chains. There are many molecular species because the fatty acids vary in carbon number, number of double bonds and side chains. Mass spectrometry is commonly used to measure such mixtures of compounds with varying structures. Mass spectrometry is a method for measuring the mass of molecules: a mass spectrometer ionizes molecules, separates the resulting ions and detects each of them. The measured results are plotted as a graph (mass spectrum) in which the horizontal axis represents the mass-to-charge ratio of the ions, m/z, and the vertical axis represents the amount of ions, or intensity [18]. A mass spectrum obtained from lipids has many peaks at equal intervals of the mass of CH2 on the m/z axis, owing to differences in carbon number. Moreover, the mass spectrum contains peaks derived not only from monoisotopic ions, which consist only of the principal isotopes, but also from ions that contain other isotopes. To analyze such a mass spectrum effectively, the analyst needs not only the commonly used mass spectrum with m/z and intensity axes but also a display that groups the observed peaks occurring at equal intervals.

Statistical analysis is a standard method for analyzing complex information; however, it has difficulty highlighting unexpected results, and very important details are easily overlooked when ranges are chosen inappropriately. Visual Analytics [9, 19] addresses these problems by exploiting the advanced visual information processing of humans. If a suitable visualization is developed for mass spectrum analysis, analysts will be able to discover not only expected but also unexpected results easily. The inverse also applies: an inappropriate visualization forces analysts to mentally reconstruct the relevant information and data relationships from the displayed results, which increases their processing load. Therefore, when developing a visualization system, it is important to understand not only the features of the given field but also the human visual information processing system, in other words, cognitive science.

∗e-mail: [email protected], [email protected]
†e-mail: [email protected]
‡e-mail: [email protected]

2 RELATED WORK

Visualization is already used for data processing in mass spectrometry. Kincaid et al. developed MassVis, which plots data obtained by LC-MS (liquid chromatograph mass spectrometer) as a scatter plot resembling a 2D gel [10]. Corral et al. proposed a GPU-based visualization method for LC-MS [5]. Linsen et al. proposed a visualization system that compares LC-MS spectra to analyze proteins [12]. Similarly, Livengood et al. proposed a visualization method that uses standard deviation to compare data obtained from GCxGC-MS (two-dimensional gas chromatography-mass spectrometer) [13]. These visualization systems show spectra obtained by a mass spectrometer coupled with chromatography. However, they did not discuss how to assign peaks within the mass spectra to the materials in question. Our method works on the mass spectrum itself, so it can be used for materials that have repeating molecular structures regardless of the chromatography involved. Nam et al. developed ClusterSculptor, which shows the results of clustering mass spectra through data mining [14]. Hughey et al. plotted 3,900 peaks obtained from an FTICR-MS (Fourier transform ion cyclotron resonance mass spectrometer) on a two-dimensional plot of nominal Kendrick mass versus Kendrick mass defect [8]. However, they did not discuss the intensity change of each compound. In addition, they plot only the detected peaks of the spectrum and do not show the non-peak data.

Visualization of repeating structures has been studied in the fields of information visualization and visual analytics. This project draws on several of these approaches and improves on existing methods. Carlis et al. and Weber et al. developed spiral visualization systems for periodic time series data [3, 20]. Their research is similar to ours in that it also visualizes periodic data. However, with a spiral visualization it is difficult to analyze data in which the peaks are shifted, because the peaks do not align well. Moreover, our visualization uses a representation similar to one that scientists already know, which makes analysis easier and more intuitive. Lee et al. developed CycleStack to visualize periodic behavior in ultrasound video by aligning the phase of all trace images [11]. In contrast, the m/z shift is important in our research, so the visualization deliberately leaves the shift unaligned.

In terrain visualization, LOD (Level Of Detail) techniques have been proposed [1, 6]. Our system does not apply any LOD technique, because a mass spectrum contains fewer data points than terrain data. Such techniques remain under consideration, however, because the resolution of mass spectrometers is constantly improving [16, 17].


IEEE Symposium on Biological Data Visualization 2012, October 14 - 15, Seattle, Washington, USA. 978-1-4673-4730-3/12/$31.00 ©2012 IEEE


3 OUR APPROACH

As representatives of a mass spectrometer manufacturer, we have taken the opportunity to improve on the existing methods mentioned above. In this study, we propose a visualization system, mzRepeat, for analyzing mass spectra of lipids that have peaks at equal intervals. A feature of mzRepeat is that it groups observed ions that differ only in the number of repeating structures. It is especially effective for analyzing the many peaks obtained from samples that contain several molecular species.

Figure 1: Comparison of mass spectrum and mzRepeat.

Fig. 1 compares a mass spectrum with mzRepeat. Let $M_0$ be the start of the m/z domain and $M_R$ the mass of the repeating unit. The first domain then spans $M_0$ to $M_0 + M_R$, and the second spans $M_0 + M_R$ to $M_0 + 2M_R$. Let $K \ge 0$ be the index of a separated spectrum, so that its left edge lies at m/z $M_0 + K M_R$. The horizontal axis represents $\Delta M$, the difference in m/z from that left edge. Thus an arbitrary value $M$ in mzRepeat, represented by $K$ and $\Delta M$, is given by Eq. (1).

$$M = M_0 + K M_R + \Delta M \qquad (1)$$

Let the total molecular weight $M_N$ be the sum of the molecular weight of the non-repeating part, $M_B$, and $N$ repeating units of mass $M_R$, as in Eq. (2).

$$M_N = M_B + N M_R \qquad (2)$$

If a molecular ion with the mass given by Eq. (2) appears at the position given by Eq. (1), then $M = M_N$, and equating the two expressions yields Eq. (3).

$$\Delta M = M_B + (N - K) M_R - M_0 \qquad (3)$$

When $M_B$ is the mass of a known molecule, $\Delta M$ can be calculated from Eq. (3). Here $K = 0$ for the first separated spectrum, $M_0$ and $M_R$ are values specified by the analyst, and $0 \le \Delta M < M_R$. Because $K$ and $N$ increase together from one separated spectrum to the next, $\Delta M$ is the same across all separated spectra. Molecular ions that have the same $M_B$ but different numbers of repeating units are therefore displayed along a vertical line in mzRepeat.
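The mapping in Eqs. (1)-(3) can be illustrated with a minimal Python sketch. The function name `to_row_and_offset` is hypothetical and not part of mzRepeat; the numerical values are the ones quoted for the butter example ($M_0$ = 455.0000, $M_R$ = 14.0157, and $M_B$ = 156.9749 for TAG(n:0) in Table 1).

```python
import math

def to_row_and_offset(mz, m0, mr):
    """Return (K, delta_m) such that mz = m0 + K * mr + delta_m, as in Eq. (1)."""
    k = math.floor((mz - m0) / mr)   # index of the separated spectrum
    delta_m = mz - m0 - k * mr       # position inside that spectrum
    return k, delta_m

# Values quoted in the text: M0 = 455.0000, MR = 14.0157,
# M_B = 156.9749 (the TAG(n:0) entry of Table 1).
M0, MR, MB = 455.0, 14.0157, 156.9749

for n in (32, 33, 34):               # three members of one series, Eq. (2)
    mz = MB + n * MR
    k, delta_m = to_row_and_offset(mz, M0, MR)
    print(f"N={n}  m/z={mz:.4f}  K={k}  delta_m={delta_m:.4f}")
```

Each additional repeating unit moves the ion to the next separated spectrum (K increases by one) while the offset stays the same, which is why the series lines up vertically. With the rounded $M_R$ = 14.0157 the printed offset differs from the tabulated $\Delta M$ in the third decimal place; presumably the table was computed with an unrounded CH2 mass.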

A straightforward implementation of this method yields a 3D visualization. The 3D view is suitable for general analysis, but it suffers from occlusion [7]. The 2D view, in contrast, is suitable for detailed analysis because exact positions are easy to read. It is therefore important that the system combines the abstract 3D view with the detailed 2D view. In the 2D visualization, intensity is mapped to color, because the m/z value is the most important factor in determining and identifying molecules. There are also situations in which analysts need intensity information, so the system must be able to switch to a 3D representation on demand. If the 2D and 3D views were shown separately, relating the two images would be confusing because of the many peaks that appear in mass spectrometry. The 2D and 3D displays therefore need to be integrated, for example through animation.

4 SYSTEM

We developed a visualization system, mzRepeat, for analyzing mass spectra that have peaks at equal intervals. The system renders quickly by using GPU shaders.

Figure 2: Overview of mzRepeat user interface.

Fig. 2 shows an overview of our visualization system. The horizontal axis represents $\Delta M$, the vertical axis represents intensity and the depth axis represents the m/z offset. Additionally, the depth axis has a scale from which the approximate m/z value can be read easily. Fig. 3 shows the information flow of mzRepeat. mzRepeat separates the mass spectrum by the mass of the repeating unit within a specified m/z domain. The separated mass spectra are aligned at even intervals. The data points of the separated spectra are linked by triangulation and visualized in 3D. A 2D visualization, which uses color to represent intensity, is also implemented.
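The separation and alignment steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the system's actual GPU implementation: the function `to_vertices`, the `row_spacing` parameter and the toy data are hypothetical, and the triangulation step is omitted.

```python
import math

def to_vertices(points, m0, mr, m_end, row_spacing=1.0):
    """Turn (mz, intensity) points into (x, y, z) vertices.

    x = delta_m within the separated spectrum, y = intensity,
    z = separated-spectrum index times an even spacing (the m/z offset axis).
    """
    vertices = []
    for mz, intensity in points:
        if not (m0 <= mz < m_end):
            continue                          # outside the specified m/z domain
        k = math.floor((mz - m0) / mr)        # which separated spectrum
        vertices.append((mz - m0 - k * mr, intensity, k * row_spacing))
    return vertices

# Toy spectrum: three points one repeating unit apart, plus one out of range.
toy = [(455.5, 10.0), (469.5157, 40.0), (483.5314, 20.0), (1200.0, 5.0)]
for vertex in to_vertices(toy, 455.0, 14.0157, 1000.0):
    print(vertex)
```

The three in-range points share nearly the same x coordinate and differ only in z, which is the alignment that the depth axis makes visible.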



[Figure 3 diagram labels: User input (mass spectrum, m/z range, repeating unit); Data processing (read, separation, alignment; mass spectrum, separated mass spectra); Visualization (3D object, coordinate transformation, image); Interaction; The System.]

Figure 3: Information flow of mzRepeat.

4.1 Color

Peaks in a mass spectrum appear as spikes on a flat background. It is therefore difficult to distinguish peaks from noise with a linear color scale. To solve this, the color scale is assigned non-linearly: it is proportional to the square root of the relative intensity, so that changes at low intensities are easy to see. A gray scale and a rainbow color map are provided and can be toggled. The gray scale is straightforward to read but has low resolution, whereas the color map has higher resolution but requires a reference legend. As intensity increases, the gray scale runs from white to black, while the color map runs from black through purple, blue, green, yellow and red to white. There are, however, known issues with rainbow color maps [2]. For researchers in the natural sciences, who are accustomed to such color scales, the higher color resolution makes analysis easier when zooming in.
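A minimal sketch of the square-root gray scale described above follows. The function name `gray_level` and the 8-bit output range are assumptions made for illustration; the paper specifies only that the scale is proportional to the square root of relative intensity and runs from white (low) to black (high).

```python
import math

def gray_level(intensity, max_intensity):
    """Map an intensity to an 8-bit gray level (255 = white, 0 = black).

    The level is proportional to the square root of the relative intensity,
    so weak peaks are darker than they would be on a linear scale.
    """
    if max_intensity <= 0:
        raise ValueError("max_intensity must be positive")
    relative = min(max(intensity / max_intensity, 0.0), 1.0)  # clamp to [0, 1]
    return round(255 * (1.0 - math.sqrt(relative)))

for percent in (0, 1, 10, 100):
    print(percent, gray_level(percent, 100))
```

A peak at 1% of the maximum uses about 10% of the gray range instead of 1%, which is what lets weak peaks stand out from the flat background.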

4.2 Dimension

2D visualization is efficient at showing many data points on a display. However, it is difficult to judge intensity from the color of the points alone. To read intensity accurately, one must use the 3D visualization. Animating the transition between the 2D and 3D views allows the analyst to move smoothly between them without losing the context of particular peaks.

Figure 4: Animation between 2D and 3D visualizations.

In the 3D visualization, the data points are linearly interpolated along the horizontal axis, x (m/z), the depth axis, z (m/z offset), and the vertical axis, y (intensity), so that each peak is easy to see as a single shape. In the 2D visualization, the data points are linearly interpolated along the horizontal and depth axes (x and z), in a manner similar to an electrophoresis gel, so that differences between the separated spectra are easy to see. Because the results are displayed in a way that resembles an electrophoresis gel, a method familiar to scientists and analysts in the field of biology, they are easier and more familiar to read.

4.3 Interaction

To show both the 3D and 2D visualizations from a variable viewpoint, the system supports zooming and rotating the view around all axes. In addition, it provides scaling and framing on the x-y plane (m/z vs. intensity) because of the large number of points. The values of m/z and $\Delta M$ at each peak, which are required to determine the structure of the molecule, and the intensity, which is used to determine the amount of ions, are all important. Therefore, when the user clicks, the system shows the m/z value, $\Delta M$ value and intensity value of the nearest data point. When the user clicks again, the system shows the difference between the values at the current and previous click positions, which allows easy comparison of molecular species.
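The click behavior can be sketched as a nearest-point lookup followed by a difference between two picks. This is a one-dimensional simplification under stated assumptions: the real system picks points in the rendered view, whereas this sketch searches a sorted list of m/z values. The function names and the sample intensities are hypothetical; the m/z values are those quoted for Fig. 5.

```python
from bisect import bisect_left

def nearest_index(mz_values, target):
    """Index of the data point whose m/z is closest to target.

    mz_values must be sorted in ascending order and non-empty.
    """
    if not mz_values:
        raise ValueError("mz_values is empty")
    i = bisect_left(mz_values, target)
    if i == 0:
        return 0
    if i == len(mz_values):
        return len(mz_values) - 1
    # Choose the closer of the two neighbours around the insertion point.
    return i if mz_values[i] - target < target - mz_values[i - 1] else i - 1

def click_difference(mz_values, intensities, first_click, second_click):
    """Differences (m/z, intensity) between the points nearest to two clicks."""
    a = nearest_index(mz_values, first_click)
    b = nearest_index(mz_values, second_click)
    return mz_values[b] - mz_values[a], intensities[b] - intensities[a]

mz = [603.46, 605.48, 617.48, 619.50]     # m/z values quoted for Fig. 5
inten = [120.0, 300.0, 150.0, 340.0]      # hypothetical intensities
d_mz, d_int = click_difference(mz, inten, 603.5, 617.4)
print(f"{d_mz:.2f} {d_int:.1f}")
```

Here the two clicks resolve to the peaks at m/z 603.46 and 617.48, and the reported m/z difference is one CH2 unit.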

5 VISUALIZATION EXAMPLE

For evaluation purposes, mzRepeat was used to visualize lipids obtained from MALDI-MS (matrix-assisted laser desorption/ionization mass spectrometer) analysis of butter. The sample consisted of butter, the cationization agent NaI (sodium iodide) and DHB (2,5-dihydroxybenzoic acid, the matrix for sample ionization), dissolved in THF (tetrahydrofuran). The data were obtained in the positive ion Spiral mode (high resolution mode) of the JMS-S3000 (MALDI SpiralTOF) [15], which has a resolution (FWHM) of over 60,000.

Molecular species of lipids are written using the carbon number and the number of double bonds. A lipid is written as A(B:C), where A is the lipid molecular species, B is the carbon number of the fatty acids and C is the number of double bonds.
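The A(B:C) notation is easy to handle programmatically. The following is a small parsing sketch; the function name `parse_lipid` and the regular expression are hypothetical helpers, not part of mzRepeat.

```python
import re

# Letters for the species, then "(carbon number:double bonds)".
LIPID_RE = re.compile(r"^([A-Za-z]+)\((\d+):(\d+)\)$")

def parse_lipid(label):
    """Split a label such as 'TAG(32:1)' into (species, carbons, double_bonds)."""
    match = LIPID_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not in A(B:C) form: {label!r}")
    species, carbons, double_bonds = match.groups()
    return species, int(carbons), int(double_bonds)

print(parse_lipid("TAG(32:1)"))
print(parse_lipid("DAG(35:0)"))
```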

Fig. 5 shows a mass spectrum of lipids in butter and magnifications of specific m/z ranges. This mass spectrum has 178,427 data points, and about 200 peaks were detected that derive from lipid components of butter, such as sodium adduct ions of TAGs (triacylglycerols) [4] and DAGs (diacylglycerols). This mass spectrum thus has many peaks at equal intervals on the m/z axis.

Figure 5: (i) Mass spectrum of butter between m/z 455–1000; enlarged views of (ii) m/z 600–609 and (iii) m/z 615–624.

With existing methods, the analyst inspects this mass spectrum directly. Fig. 5(i) shows the mass spectrum between m/z 455–1000. Although the intensity distribution of TAGs can be roughly checked from the groups of equally spaced peaks, molecular species or double bonds may be misassigned because the m/z information is not displayed accurately. To prevent this, the analyst must calculate the m/z position of the molecule in question beforehand and then zoom into the mass spectrum to find out whether ions are observed at that position. Fig. 5(ii) shows TAG(32:1) at m/z 603.46 and TAG(32:0) at m/z 605.48. Similarly, Fig. 5(iii) confirms TAG(33:1) at m/z 617.48 and DAG(35:1) at m/z 617.52, as well as TAG(33:0) at m/z 619.50 and DAG(35:0) at m/z 619.53. In addition, peaks derived from isotopic ions can be seen 1 u above each of the observed peaks.
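As a rough check on why closely spaced pairs such as TAG(33:1) and DAG(35:1) call for a high-resolution instrument, the resolving power needed to separate them can be estimated from the m/z values quoted above. The estimate assumes the common definition $R = m / \Delta m$, which the text does not state explicitly.

```latex
R_{\text{required}} \approx \frac{m}{\Delta m}
  = \frac{617.48}{617.52 - 617.48}
  = \frac{617.48}{0.04}
  \approx 1.5 \times 10^{4}
```

This is below the stated instrument resolution (FWHM) of over 60,000, which is consistent with both peaks being distinguishable in Fig. 5(iii).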

As a result, the mass difference between TAG(32:1) and TAG(33:1), and between TAG(32:0) and TAG(33:0), is 14.02, which is the mass of CH2, because these ions belong to the same TAG series. Likewise, the peaks derived from other lipids or from isotopic ions are spaced at intervals of 14.02. To check the abundance ratio of molecular species, it is necessary to assign a lipid to each observed peak by its m/z and to create a histogram from the intensities. Analyzing lipids with this method therefore takes a substantial amount of time, because the necessary information is dispersed throughout the mass spectrum. Furthermore, visual analysis is critical, because automatic peak assignment is very likely to confuse a peak with an unexpected neighbor when the mass difference is minute, as with TAG(33:0) and DAG(35:0).
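The 14.02 spacing can be checked by simple arithmetic on the m/z values quoted for Fig. 5. The atomic masses used for CH2 (12.00000 u for carbon, 1.00783 u for hydrogen) are standard monoisotopic values supplied here as an assumption; the paper quotes only the resulting repeating unit.

```latex
\begin{aligned}
617.48 - 603.46 &= 14.02 && \text{TAG(33:1)} - \text{TAG(32:1)} \\
619.50 - 605.48 &= 14.02 && \text{TAG(33:0)} - \text{TAG(32:0)} \\
m(\mathrm{CH_2}) &= 12.00000 + 2 \times 1.00783 = 14.01566 \approx 14.02
\end{aligned}
```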

Fig. 6(i) shows the 2D visualization from mzRepeat. Its m/z domain is 455–1000 and the repeating unit is 14.0157 u, the mass of CH2, the base increment of an alkyl chain. Table 1 shows $\Delta M$ and $M_B$ calculated for $M_0 = 455.0000$ and $M_R = 14.0157$. The positions at which the sodium adduct ions of TAGs and DAGs appear are marked with tags in Fig. 6(i)–(v). As shown, the observed TAG positions can be classified by the number of double bonds. TAG(n:1), TAG(n:2) and TAG(n:3) (Fig. 6(iii)–(v)) appear at 2 u intervals to the left of TAG(n:0) (Fig. 6(ii)), which is the rightmost, and the intensity decreases as the number of double bonds increases. Furthermore, the relationship between the total carbon number of a series and its intensity can be seen along the offset axis, z. Because the changes in color shading appear as a dotted line, it can be determined that the intensity is higher at every other total carbon number.

Table 1: Molecular species in butter as analyzed by mzRepeat.

Name        Formula                        $M_B$       $\Delta M$†
TAG(n:0)    $C_{3+n}H_{2n+2}O_6$ + Na      156.9749    10.3192
TAG(n:1)    $C_{3+n}H_{2n}O_6$ + Na        154.9593     8.3036
TAG(n:2)    $C_{3+n}H_{2n-2}O_6$ + Na      152.9436     6.2879
TAG(n:3)    $C_{3+n}H_{2n-4}O_6$ + Na      150.9280     4.2723
DAG(n:0)    $C_{2+n}H_{2n+2}O_5$ + Na      128.9800    10.3556
DAG(n:1)    $C_{2+n}H_{2n}O_5$ + Na        126.9643     8.3399
DAG(n:2)    $C_{2+n}H_{2n-2}O_5$ + Na      124.9487     6.3243
DAG(n:3)    $C_{2+n}H_{2n-4}O_5$ + Na      122.9330     4.3086

†$\Delta M$ is calculated from $M_0 = 455.0000$ and $M_R = 14.0157$.
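The tabulated values can be recomputed from the formulas with a short sketch. The atomic masses below are standard monoisotopic values supplied here as an assumption (the paper does not list them), and the helper names are hypothetical. The sketch also assumes that the unrounded CH2 mass, about 14.01565 u, is used as the repeating unit, which appears to reproduce the tabulated $\Delta M$ to four decimals.

```python
# Standard monoisotopic masses in u (an assumption; not listed in the paper).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
CH2 = MASS["C"] + 2 * MASS["H"]          # about 14.01565 u, the repeating unit

# Carbon and oxygen counts of the n = 0 formulas in Table 1.
SKELETON = {"TAG": (3, 6), "DAG": (2, 5)}

def base_mass(lipid, double_bonds):
    """M_B: mass of the Table 1 formula with n = 0, including the sodium adduct."""
    carbons, oxygens = SKELETON[lipid]
    hydrogens = 2 - 2 * double_bonds     # formal count; negative for 2 or 3 double bonds
    return (carbons * MASS["C"] + hydrogens * MASS["H"]
            + oxygens * MASS["O"] + MASS["Na"])

def delta_m(m_b, m0=455.0, m_r=CH2):
    """Offset from Eq. (3), folded into the range 0 <= delta_m < m_r."""
    return (m_b - m0) % m_r

for lipid in ("TAG", "DAG"):
    for d in range(4):
        m_b = base_mass(lipid, d)
        print(f"{lipid}(n:{d})  M_B={m_b:.4f}  delta_m={delta_m(m_b):.4f}")
```

The modulo operation plays the role of choosing $N - K$ in Eq. (3) so that the offset falls inside a single separated spectrum.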

Fig. 6(ii) shows an enlarged view around TAG(n:0) in Fig. 6(i). In this figure, another series of molecular species, slightly heavier than TAG, is also observed. Using the system's interactive capabilities, it can be shown that the $\Delta M$ of this series is 0.03 u higher than that of TAG(n:0); it is therefore confirmed, with reference to Table 1, that DAGs are also present. In addition, the carbon numbers of the TAGs are distributed between 25–59 and those of the DAGs between 27–40. The overall maximum intensities of TAG(n:0) and DAG(n:0) occur at the odd carbon numbers 39 and 35, respectively. Comparing similar molecular species with different carbon numbers in this way is difficult with a standard mass spectrum, but the information is substantially easier to grasp visually in mzRepeat. Fig. 7 shows the 3D visualization of TAG(n:0) from Fig. 6(ii). The 3D visualization is clearer than the 2D visualization for visually confirming the intensity distribution. The intensity difference between TAG(39:0) and DAG(35:0) can also be read numerically: the intensity of the DAG is about one tenth of that of the TAG. Normally, a graph must be drawn from the individual peak intensities to see the intensity distribution. In this system, the analyst can obtain the intensity distribution simply by looking at the 3D visualization. The relationship between carbon number and intensity for TAG(n:1), TAG(n:2) and TAG(n:3) (Fig. 6(iii), (iv), (v)) can be seen just as easily.

Figure 7: 3D visualization of butter using mzRepeat for TAG(n:0) (Fig. 6(ii)).

6 DISCUSSION

In this paper, we evaluated the effectiveness of the proposed system by analyzing the lipid components of butter. The system uses a simple method: it displays mass spectra separated by the mass of the repeating unit. As a result, the distribution of each series can be seen at a glance.

Because the peaks of the compounds in question appear at specific values of $\Delta M$, analysts can easily confirm these peaks collectively. Isotopic ions increase the number of peaks and cause overlaps, which makes the analysis of mass spectra complex. With mzRepeat, this complexity is reduced because the grouped peaks are easy to perceive along the depth axis.

mzRepeat is especially effective for mass spectra that have many peaks. Apart from classifying molecular species by the number of repeating units, it can also classify molecular species with differing end groups and other modifications. Furthermore, it reduces oversights and avoids loss of information, because it does not rely on information aggregation or statistical analysis. In addition, it shows more information than the mass spectrum because it uses space more efficiently.

Figure 6: 2D visualization of butter using mzRepeat between m/z 1068–1503 with a repeating unit of 14.0157 u (mass of CH2). (ii)–(v) enlarged views of observed TAG(n:0–3)/DAG(n:0–3).

In the 3D visualization, the intensity distribution of each molecular species can be seen clearly because of the graph-like representation. The 2D visualization, in turn, clearly displays multiple molecular species, each of which can be identified promptly because molecules with a greater number of repeating units are arranged vertically. Because the two views are integrated, the analyst can also switch between the 2D and 3D views without losing context, despite the large number of relevant peaks.

In this system, color is assigned in proportion to the square root of the intensity. This allows peaks whose intensities differ by a factor of 10 or more to be seen at once. When analyzing mass spectra, there are situations in which both strong and weak peaks must be detected, so a non-linear scale is more appropriate.

7 CONCLUSION

In mass spectrometry, comprehensive information such as modification of end groups, number of repeating structures and intensity distribution is important for analyzing and classifying compounds with repeating structures. Because this type of analysis cannot be done programmatically, this visualization system, which draws on human cognition, is extremely effective.

In this paper, we proposed a visualization system for analyzing molecules with repeating structures and described an example in which lipids measured by MALDI-MS were visualized in mzRepeat. The system can also be applied to mass spectra obtained by other mass spectrometry methods in which ions repeat at equal intervals, for example in microorganism classification, synthetic polymer analysis and quality control.

REFERENCES

[1] A. Bogomjakov and C. Gotsman. GPU-assisted z-field simplification. 3D Data Processing Visualization and Transmission, International Symposium on, 0:673–679, 2004.

[2] D. Borland and R. Taylor. Rainbow color map (still) considered harmful. Computer Graphics and Applications, IEEE, 27(2):14–17, 2007.

[3] J. V. Carlis and J. A. Konstan. Interactive visualization of serial periodic data. In Proceedings of the 11th annual ACM symposium on User interface software and technology, UIST ’98, pages 29–38, New York, NY, USA, 1998. ACM.

[4] B. P. Chapagain and Z. Wiesman. MALDI-TOF/MS fingerprinting of triacylglycerols (TAGs) in olive oils produced in the Israeli Negev desert. Journal of Agricultural and Food Chemistry, 57(4):1135–1142, 2009.

[5] J. D. Corral and H. Pfister. Hardware-accelerated 3D visualization of mass spectrometry data. Visualization Conference, IEEE, 0:56, 2005.

[6] M. Duchaineau, M. Wolinsky, D. Sigeti, M. Miller, C. Aldrich, and M. Mineev-Weinstein. Roaming terrain: Real-time optimally adapting meshes. Visualization Conference, IEEE, 0:81, 1997.

[7] N. Elmqvist and P. Tsigas. A taxonomy of 3D occlusion management for visualization. Visualization and Computer Graphics, IEEE Transactions on, 14(5):1095–1109, Sept.-Oct. 2008.

[8] C. A. Hughey, C. L. Hendrickson, R. P. Rodgers, A. G. Marshall, and K. Qian. Kendrick mass defect spectrum: a compact visual analysis for ultrahigh-resolution broadband mass spectra. Analytical Chemistry, 73(19):4676–4681, 2001. PMID: 11605846.

[9] D. Keim, F. Mansmann, J. Schneidewind, J. Thomas, and H. Ziegler. Visual analytics: Scope and challenges. In S. Simoff, M. Bohlen, and A. Mazeika, editors, Visual Data Mining, volume 4404 of Lecture Notes in Computer Science, pages 76–90. Springer Berlin / Heidelberg, 2008.

[10] R. Kincaid and K. Dejgaard. MassVis: Visual analysis of protein complexes using mass spectrometry. In Visual Analytics Science and Technology, 2009. VAST 2009. IEEE Symposium on, pages 163–170, 2009.

[11] T.-Y. Lee, A. Chaudhuri, F. Porikli, and H.-W. Shen. CycleStack: Inferring periodic behavior via temporal sequence visualization in ultrasound video. In Pacific Visualization Symposium (PacificVis), 2010 IEEE, pages 89–96, 2010.

[12] L. Linsen, J. Locherbach, M. Berth, and J. Bernhardt. Differential protein expression analysis via liquid-chromatography/mass-spectrometry data visualization. Visualization Conference, IEEE, 0:57, 2005.

[13] P. Livengood, R. Maciejewski, W. Chen, and D. S. Ebert. A visual analysis system for metabolomics data. In Biological Data Visualization (BioVis), 2011 IEEE Symposium on, pages 71–78, Oct. 2011.

[14] E. J. Nam, Y. Han, K. Mueller, A. Zelenyuk, and D. Imre. ClusterSculptor: A visual analytics tool for high-dimensional data. Symposium On Visual Analytics Science And Technology, 0:75–82, 2007.

[15] T. Satoh, T. Sato, A. Kubo, and J. Tamura. Tandem time-of-flight mass spectrometer with high precursor ion selectivity employing spiral ion trajectory and improved offset parabolic reflectron. Journal of The American Society for Mass Spectrometry, 22:797–803, 2011. 10.1007/s13361-011-0090-3.

[16] T. Satoh, T. Sato, and J. Tamura. Development of a high-performance MALDI-TOF mass spectrometer utilizing a spiral ion trajectory. Journal of the American Society for Mass Spectrometry, 18(7):1318–1323, 2007.

[17] S. Shimma, H. Nagao, J. Aoki, K. Takahashi, S. Miki, and M. Toyoda. Miniaturized high-resolution time-of-flight mass spectrometer MULTUM-S II with an infinite flight path. Analytical Chemistry, 82(20):8456–8463, 2010.

[18] M. Strohalm, M. Hassman, B. Kosata, and M. Kodíček. mMass data miner: an open source alternative for mass spectrometric data analysis. Rapid Communications in Mass Spectrometry, 22(6):905–908, 2008.

[19] J. Thomas and K. Cook. A visual analytics agenda. Computer Graphics and Applications, IEEE, 26(1):10–13, Jan.-Feb. 2006.

[20] M. Weber, M. Alexa, and W. Muller. Visualizing time-series on spirals. In Information Visualization, 2001. INFOVIS 2001. IEEE Symposium on, pages 7–13, 2001.
