Upload
yintengfei
View
1.884
Download
0
Embed Size (px)
Citation preview
Interface 2012, Rice UniversityStepping
0
5
10
15
0 100 200 300
Coverage
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
12
34
56
78
910
1112
1314
1516
1718
1920
2122
XY
0.0e+00 5.0e+07 1.0e+08 1.5e+08 2.0e+08
gieStainacengneggpos100gpos25gpos50gpos75gvarstalk
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
ll
0M 50M
100M
150M
200M
0M
50M
100M
150M
200M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M100M150M0M50M100M
0M
50M
100M
0M
50M
100M
0M
50M100M
0M
50M
100M
0M
50M
100M
0M
50M
100M
0M
50M
100M
0M
50M
0M
50M
0M
50M
0M
50M 0M
50M 0M
0M 50M 1
2
34
5
6
7
8910
11
12
1314
1516
1718
1920
21 22
rearrangementsinterchromosomalintrachromosomal
tumreadsl
l
l
l
l
4681012
ggbio Extending the
Grammar of Graphics to Genomic Data
Tengfei Yin, Di Cook Iowa State UniversityMichael Lawrence
Genentech
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
MotivationLots of tools exist for displaying genomic dataMany different packages, many standalone, many different data standards
2
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
Figure 2
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
Circos
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
3
CircosNeed construct a central and many other
configuration files from scratch, learning curve is very high
Adding legend not easyCannot map aesthetics to certain
variables
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome BrowserKaryogram view, with associated data plotted
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome BrowserKaryogram view, with associated data plotted
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
Logical zoom, all we know about
this genetic code
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
Logical zoom, all we know about
this genetic code
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome Browser
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
4
UCSC Genome BrowserVery commonly used, very popularGives broadly applicable, generic, but
narrow selection of plot choicesNo operations on genomic ranges views to
facilitate perception of structure
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
5
Chromosome X
65.93 mb
65.94 mb
65.95 mb
65.96 mb
65.97 mb
3040506070
−2024
5 Composite plots for multiple chromosomes
As mentioned in the introduction section, a set of Gviz tracks has to share the same chromosome when
plotted, i.e., only a single chromosome can be active during a given plotting operation. Consequently, we can
not directly create plots for multiple chromosomes in a single call to the plotTracks function. However, since
the underlying graphical infrastructure of the Gviz package uses grid graphics, we can build our own composite
plot using multiple consecutive plotTracks calls. All we need to take care of is an adequate layout structure
to plot into, and we also need to tell plotTracks not to clear the graphics device before plotting, which can
be archieved by setting the function’s add argument to FALSE. For details on how to create a layout structure
in the grid graphics system, please see the help page at ? grid.layout).
We start by creating an AnnotationTrack objects and a DataTrack object which both contain data for
several chromosomes.
> chroms <- c("chr1", "chr2", "chr3", "chr4")
> maTrack <- AnnotationTrack(range = GRanges(seqnames = chroms,
+ ranges = IRanges(start = 1, width = c(100, 400,
+ 200, 1000)), strand = c("+", "+", "-", "+")),
+ genome = "mm9", chromosome = "chr1", name = "foo")
> mdTrack <- DataTrack(range = GRanges(seqnames = rep(chroms,
+ c(10, 40, 20, 100)), ranges = IRanges(start = c(seq(1,
38
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
5
Gviz (Hahne et al)Chromosome X
65.93 mb
65.94 mb
65.95 mb
65.96 mb
65.97 mb
3040506070
−2024
5 Composite plots for multiple chromosomes
As mentioned in the introduction section, a set of Gviz tracks has to share the same chromosome when
plotted, i.e., only a single chromosome can be active during a given plotting operation. Consequently, we can
not directly create plots for multiple chromosomes in a single call to the plotTracks function. However, since
the underlying graphical infrastructure of the Gviz package uses grid graphics, we can build our own composite
plot using multiple consecutive plotTracks calls. All we need to take care of is an adequate layout structure
to plot into, and we also need to tell plotTracks not to clear the graphics device before plotting, which can
be archieved by setting the function’s add argument to FALSE. For details on how to create a layout structure
in the grid graphics system, please see the help page at ? grid.layout).
We start by creating an AnnotationTrack objects and a DataTrack object which both contain data for
several chromosomes.
> chroms <- c("chr1", "chr2", "chr3", "chr4")
> maTrack <- AnnotationTrack(range = GRanges(seqnames = chroms,
+ ranges = IRanges(start = 1, width = c(100, 400,
+ 200, 1000)), strand = c("+", "+", "-", "+")),
+ genome = "mm9", chromosome = "chr1", name = "foo")
> mdTrack <- DataTrack(range = GRanges(seqnames = rep(chroms,
+ c(10, 40, 20, 100)), ranges = IRanges(start = c(seq(1,
38
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
5
Gviz (Hahne et al)
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Motivation
5
Gviz (Hahne et al)Pretty good!Incorporated with R, and R data
structuresUses grid (low level) graphics, very
flexible, but not leveraging tools like ggplot2
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
OutlineWhat is the grammar of graphics?How it is extended for genomic data.ExamplesNext steps: interactive graphics
6
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
GrammarGrammar forms the foundation of a language. It is a set of structural rules that govern composition.For graphics, it provides a way to construct a plot in a common form, and enables clarification of similarities and differences between plots.
7
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Rose plot/Coxcombggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1) + coord_polar()
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)
day
count
0
20
40
60
80
Thu Fri Sat Sun
dayThu
Fri
Sat
Sun
Bar chartggplot(data=tips, aes(x=day, fill=day)) + geom_bar(width=1)
day
count
020406080
Thu
FriSat
Sunday
Thu
Fri
Sat
Sun
8
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)Stacked bar chartggplot(data=tips, aes(x=””, fill=day)) + geom_bar(width=1)
""
count
0
50
100
150
200
dayThu
Fri
Sat
Sun
9
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)Stacked bar chartggplot(data=tips, aes(x=””, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=””, fill=day)) + geom_bar(width=1) + coord_polar(theta="y")
""
count
0
50
100
150
200
dayThu
Fri
Sat
Sun
9
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar (ggplot2)Stacked bar chartggplot(data=tips, aes(x=””, fill=day)) + geom_bar(width=1)
Pie chartggplot(data=tips, aes(x=””, fill=day)) + geom_bar(width=1) + coord_polar(theta="y")
""
count
0
50
100
150
200
dayThu
Fri
Sat
Sun
""
count
0
50
100150
200 dayThu
Fri
Sat
Sun
9
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Grammar ElementsDATA: What is to be plottedSTAT: Statistical operations to make on data, like binning.GEOM: Geometric object, elements to use to displays aspects of the dataSCALE: Map data to aesthetics to geomCOORD: Coordinate system to use, eg Cartesian(FACET): subset and display
10
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
DATA=expression data frame, x=average intensity, y=fold change
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plotGEOM=point
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
SCALE=x is logged
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
SCALE=color is mapped to statistical significance
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
COORD=default, Cartesian
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
FACET=none
ggbio - Genomic Data Vis - Interface 2012, Rice University /3111
Example: MA plot
ggbio - Genomic Data Vis - Interface 2012, Rice University /3112
Example: MA plotqplot(baseMean, log2FoldChange, data = res, geom = "point", xlab = "Normalized mean", ylab = "log2 fold change", xlim = c(0, 10000), color = group) + scale_x_log10() + scale_color_manual( values = c("black", "red"))
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
What’s different?Genomic data has interval contextSeveral common geoms used in standard plots, not in current grammarAdditional transformations commonLining up of multiple data plots, especially against genome
13
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
What’s different?
14
No seqnames ranges strand tx id exon id
1 chrX [48242968, 48243005] + 35775 132624
2 chrX [48243475, 48243563] + 35775 132625
3 chrX [48244003, 48244117] + 35775 132626
4 chrX [48244794, 48244889] + 35775 132627
5 chrX [48246753, 48246802] + 35775 132628
... ... ... ... ... ... ...
26 chrX [48270193, 48270307] - 35778 132637
27 chrX [48269421, 48269516] - 35778 132636
28 chrX [48267508, 48267557] - 35778 132635
29 chrX [48262894, 48262998] - 35778 132633
30 chrX [48261524, 48262111] - 35778 132632
Table 2: Typical biological data coerced into a data frame: A GRanges table representing gene SSX4 and
SSX4B. One row represents one exon, seqnames indicates the chromosome name, ranges indicates the interval
of exons, strand is the direction, tx id and exon id are the internal id’s used for mapping cross database.
31
DATA: Genomic ranges
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Extensions
15
layout
data source(s)
abstract data (formal model) meta data
geom stat scale
coord facet
plots
grammar of graphics with extensions
autoplot
I/O packages in bioconductor
tracks
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Extensions
16
current software. Development of new visualization tools should beindependently factorized into components of the grammar. Table 1describes the extensions developed in this work.
Comp name usage icon
geom geom rect rectangle
geom segment segment
geom chevron chevron
geom arrow arrow
geom arch arches
geom bar bar
geom alignment alignment (gene)
stat stat coverage coverage (of reads)
stat mismatch mismatch pileup foralignments
stat aggregate aggregate in slidingwindow
stat stepping avoid overplotting
stat gene consider gene struc-ture
stat table tabulate ranges
stat identity no change
coord linear ggplot2 linear butfacet by chromo-some
genome put everything ongenominc coordi-nates
truncate gaps compact view byshrinking gaps
layout track stacked tracks
karyogram karyogram display
chr1chr2
chr3
50 100 150 200 250 300start
circle circular
faceting formula facet by formula
ranges facet by rangesscale not extended ggplot2default
Table 1: Components of the basic grammar of graphics, with the
extensions available in ggbio.
In comparison to regular data elements which might be mappedto the ggplot2 geoms of points, lines and polygons, genomic datahas a basic currency of an interval. Intervals underlie exons, in-trons and locations on the genome, which form the reference framefor biological data. We have introduced several new geoms forrepresenting intervals and connections between intervals: rectan-gle, segment, chevron, arch, arrow and arrow rectangle. The geomalignment and its variants are combinations of those geoms, whichfunction as a unit. For example, the alignment geom might drawexons as arrow rectangles and introns as chevrons. Figure 2 showsthe new geoms for interval data.
layout
data source(s)
abstract data (formal model) meta data
geom stat scale
coord facet
plots
grammar of graphics with extensions
autoplot
I/O packages in bioconductor
tracks
Figure 1: Diagram of the framework for biological data visualiza-
tion. It starts with a mapping from empirical data to an abstract data
model, followed by a general and extended grammar of graphics that
map data elements to graphical elements. Orange boxes indicate the
new components provided by ggbio and dashed frame indicates the
body of grammar of graphics, including the parts we extended with
ggbio.
Several new types of statistical transformation (stat) are defined.In ggplot2, some common transformations are possible, for exam-ple binning to create a histogram, or smoothing to add a line rep-resenting a model fit to the data. For genomic data there are somecommonly useful transformations that are incorporated in ggbio:coverage, i.e., feature stack depth, and mismatch summaries fromread alignments.
Additional types of coordinate system (coord), layout andfaceting methods are also available. These additional componentsare listed in Table 1, an they are described in more detail in the plotexamples.
Let’s analyze the anatomy in a minimal example, illustrated inFigure 3, to get an impression of the components of the grammar.In this plot, we are showing a gene structure with four transcriptsby using the alignment geom, stepping transformation and aestheticattributes such as color mapped to strand, in the genomic coordinatesystem. It will become clear that almost all graphics found in exist-ing genome browsers or visualization tools could be described bythe different components introduced here. This is the strength ofthe grammar of graphics. While it may appear simplistic initially,once one gains a deeper understanding of the grammar, one maydiscover that seemingly complex data graphics can be abstracted ascomponents from “pictures” and thus made tangible. This will aidin the design of future graphics.
The following sections of the paper describe the grammar exten-sions, and provide examples of their use.
2.1 Modeling Data
Data are the first component of the grammar, and data may be col-lected in different ways. Wilkinson makes a distinction betweenempirical data, abstract data and metadata [19]. Empirical data arecollected from observations of the real world, while abstract dataare defined by a formal mathematical model. Metadata are dataabout data, which might be empirical, abstract or metadata them-selves.
Genomic data are often communicated in tabular text files, suchas csv and tab-delimited files. Rows always represent observationsand columns always represent a set of variables. Annotation tracksare usually stored according to specific formats, each with a fixed
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
current software. Development of new visualization tools should beindependently factorized into components of the grammar. Table 1describes the extensions developed in this work.
Comp name usage icon
geom geom rect rectangle
geom segment segment
geom chevron chevron
geom arrow arrow
geom arch arches
geom bar bar
geom alignment alignment (gene)
stat stat coverage coverage (of reads)
stat mismatch mismatch pileup foralignments
stat aggregate aggregate in slidingwindow
stat stepping avoid overplotting
stat gene consider gene struc-ture
stat table tabulate ranges
stat identity no change
coord linear ggplot2 linear butfacet by chromo-some
genome put everything ongenominc coordi-nates
truncate gaps compact view byshrinking gaps
layout track stacked tracks
karyogram karyogram display
chr1chr2
chr3
50 100 150 200 250 300start
circle circular
faceting formula facet by formula
ranges facet by rangesscale not extended ggplot2default
Table 1: Components of the basic grammar of graphics, with the
extensions available in ggbio.
In comparison to regular data elements which might be mappedto the ggplot2 geoms of points, lines and polygons, genomic datahas a basic currency of an interval. Intervals underlie exons, in-trons and locations on the genome, which form the reference framefor biological data. We have introduced several new geoms forrepresenting intervals and connections between intervals: rectan-gle, segment, chevron, arch, arrow and arrow rectangle. The geomalignment and its variants are combinations of those geoms, whichfunction as a unit. For example, the alignment geom might drawexons as arrow rectangles and introns as chevrons. Figure 2 showsthe new geoms for interval data.
layout
data source(s)
abstract data (formal model) meta data
geom stat scale
coord facet
plots
grammar of graphics with extensions
autoplot
I/O packages in bioconductor
tracks
Figure 1: Diagram of the framework for biological data visualiza-
tion. It starts with a mapping from empirical data to an abstract data
model, followed by a general and extended grammar of graphics that
map data elements to graphical elements. Orange boxes indicate the
new components provided by ggbio and dashed frame indicates the
body of grammar of graphics, including the parts we extended with
ggbio.
Several new types of statistical transformation (stat) are defined.In ggplot2, some common transformations are possible, for exam-ple binning to create a histogram, or smoothing to add a line rep-resenting a model fit to the data. For genomic data there are somecommonly useful transformations that are incorporated in ggbio:coverage, i.e., feature stack depth, and mismatch summaries fromread alignments.
Additional types of coordinate system (coord), layout andfaceting methods are also available. These additional componentsare listed in Table 1, an they are described in more detail in the plotexamples.
Let’s analyze the anatomy in a minimal example, illustrated inFigure 3, to get an impression of the components of the grammar.In this plot, we are showing a gene structure with four transcriptsby using the alignment geom, stepping transformation and aestheticattributes such as color mapped to strand, in the genomic coordinatesystem. It will become clear that almost all graphics found in exist-ing genome browsers or visualization tools could be described bythe different components introduced here. This is the strength ofthe grammar of graphics. While it may appear simplistic initially,once one gains a deeper understanding of the grammar, one maydiscover that seemingly complex data graphics can be abstracted ascomponents from “pictures” and thus made tangible. This will aidin the design of future graphics.
The following sections of the paper describe the grammar exten-sions, and provide examples of their use.
2.1 Modeling Data
Data are the first component of the grammar, and data may be col-lected in different ways. Wilkinson makes a distinction betweenempirical data, abstract data and metadata [19]. Empirical data arecollected from observations of the real world, while abstract dataare defined by a formal mathematical model. Metadata are dataabout data, which might be empirical, abstract or metadata them-selves.
Genomic data are often communicated in tabular text files, suchas csv and tab-delimited files. Rows always represent observationsand columns always represent a set of variables. Annotation tracksare usually stored according to specific formats, each with a fixed
Extensions
17
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Extensions
18
current software. Development of new visualization tools should beindependently factorized into components of the grammar. Table 1describes the extensions developed in this work.
Comp name usage icon
geom geom rect rectangle
geom segment segment
geom chevron chevron
geom arrow arrow
geom arch arches
geom bar bar
geom alignment alignment (gene)
stat stat coverage coverage (of reads)
stat mismatch mismatch pileup foralignments
stat aggregate aggregate in slidingwindow
stat stepping avoid overplotting
stat gene consider gene struc-ture
stat table tabulate ranges
stat identity no change
coord linear ggplot2 linear butfacet by chromo-some
genome put everything ongenominc coordi-nates
truncate gaps compact view byshrinking gaps
layout track stacked tracks
karyogram karyogram display
chr1chr2
chr3
50 100 150 200 250 300start
circle circular
faceting formula facet by formula
ranges facet by rangesscale not extended ggplot2default
Table 1: Components of the basic grammar of graphics, with the
extensions available in ggbio.
In comparison to regular data elements which might be mappedto the ggplot2 geoms of points, lines and polygons, genomic datahas a basic currency of an interval. Intervals underlie exons, in-trons and locations on the genome, which form the reference framefor biological data. We have introduced several new geoms forrepresenting intervals and connections between intervals: rectan-gle, segment, chevron, arch, arrow and arrow rectangle. The geomalignment and its variants are combinations of those geoms, whichfunction as a unit. For example, the alignment geom might drawexons as arrow rectangles and introns as chevrons. Figure 2 showsthe new geoms for interval data.
layout
data source(s)
abstract data (formal model) meta data
geom stat scale
coord facet
plots
grammar of graphics with extensions
autoplot
I/O packages in bioconductor
tracks
Figure 1: Diagram of the framework for biological data visualiza-
tion. It starts with a mapping from empirical data to an abstract data
model, followed by a general and extended grammar of graphics that
map data elements to graphical elements. Orange boxes indicate the
new components provided by ggbio and dashed frame indicates the
body of grammar of graphics, including the parts we extended with
ggbio.
Several new types of statistical transformation (stat) are defined.In ggplot2, some common transformations are possible, for exam-ple binning to create a histogram, or smoothing to add a line rep-resenting a model fit to the data. For genomic data there are somecommonly useful transformations that are incorporated in ggbio:coverage, i.e., feature stack depth, and mismatch summaries fromread alignments.
Additional types of coordinate system (coord), layout andfaceting methods are also available. These additional componentsare listed in Table 1, an they are described in more detail in the plotexamples.
Let’s analyze the anatomy in a minimal example, illustrated inFigure 3, to get an impression of the components of the grammar.In this plot, we are showing a gene structure with four transcriptsby using the alignment geom, stepping transformation and aestheticattributes such as color mapped to strand, in the genomic coordinatesystem. It will become clear that almost all graphics found in exist-ing genome browsers or visualization tools could be described bythe different components introduced here. This is the strength ofthe grammar of graphics. While it may appear simplistic initially,once one gains a deeper understanding of the grammar, one maydiscover that seemingly complex data graphics can be abstracted ascomponents from “pictures” and thus made tangible. This will aidin the design of future graphics.
The following sections of the paper describe the grammar exten-sions, and provide examples of their use.
2.1 Modeling Data
Data are the first component of the grammar, and data may be col-lected in different ways. Wilkinson makes a distinction betweenempirical data, abstract data and metadata [19]. Empirical data arecollected from observations of the real world, while abstract dataare defined by a formal mathematical model. Metadata are dataabout data, which might be empirical, abstract or metadata them-selves.
Genomic data are often communicated in tabular text files, suchas csv and tab-delimited files. Rows always represent observationsand columns always represent a set of variables. Annotation tracksare usually stored according to specific formats, each with a fixed
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Extensionsautoplot
Tries, and does a jolly good job, of recognizing the data object to be plotted, and how it should be displayed.
19
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
DATA=GRangesList Object
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
GEOM=alignment, chevron
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
SCALE=stepping, color=strand
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
LAYOUT=linear
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Example
20
1
2
48245000 48250000 48255000 48260000 48265000 48270000hg19::chrX
strand+
−
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
Examine short readsStack them (top)Collapse into “density” (bottom)
21
Stepping
0
5
10
15
0 100 200 300
Coverage
p1 <- autoplot(gr)p2 <- autoplot(gr, stat = "coverage")tracks(p1, p2)
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
Compare transcriptsReduce all to one
22
uc010bzo.2(226)
uc002dvx.3(226)
uc002dvw.3(226)
uc002dvz.3(226)
uc002dwa.4(226)
uc010veg.2(226)
uc002dwc.3(226)
30065000 30070000 30075000 30080000
p1 <- autoplot(txdb, which = genesymbol["A"])p2 <- autoplot(txdb, which = genesymbol["A"], stat = "reduce")tracks(p1, p2, heights = c(4, 1))
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
uc010bzo.2(226)
uc002dvx.3(226)
uc002dvw.3(226)
uc002dvz.3(226)
uc002dwa.4(226)
uc010veg.2(226)
uc002dwc.3(226)
30065000 30070000 30075000 30080000hg19::chr16
knownG
ene
uc010bzo.2(226)
uc002dvx.3(226)
uc002dvw.3(226)
uc002dvz.3(226)
uc002dwa.4(226)
uc010veg.2(226)
uc002dwc.3(226)
30064411 30081741hg19::chr16
knownG
ene
Examples
Focus on exons
23
p1 <- autoplot(txdb, which = genesymbol["A"])p2 <- autoplot(txdb, which = genesymbol["A"], truncate.gaps = TRUE)tracks(p1, p2, heights = c(4, 4))
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
24
Manhattan plot: features plotted against genomic position
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
0.5
1.0
1.5
2.0
2.5
3.0
ll
ll
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
l
ll
lll
l
l
l
l
ll
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l l
l
l
l
lll
l
llll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
lll
l
l
l
lll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
lll
lll
ll
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
lll
l
l
l
l
lll
l
l
l
l
ll
l
l
l
ll l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
ll
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l l
l
l
ll
l
l
ll
l
l l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
ll
lll
l
l
l
l
l
l
lll
l
l
ll
l
l
l l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l l
lll l
l
ll
ll
l
l
l
ll
ll
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l l
l
ll
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
lll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
ll
ll l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l l l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
ll
ll
l
l
l l
l
l
l
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
lll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
lll
l
l
l l
l
l
ll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
llll
l
l
l
l
l
ll
ll
l
l
l
l
lll
l
llll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
llll
lll
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
ll
ll
lll
l
l
lll
l
l
l
l
l
l
lll
l
l
l
ll
ll
l l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
ll
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
ll
l
ll
l
lllll ll
l
l
l
l
l
ll
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
ll
l
l
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l l
l
l
l
ll
l
l
l
l ll
l
l
l
l
l
l
l
llll
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
lll
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
ll
l
l
l
lll l l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
lll l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
lll
l
l
l
l
l
5.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+081,10,11,12,13,14,15,16,17,18,19,2,20,21,22,3,4,5,6,7,8,9,X,Y
pvalue
0.5
1.0
1.5
2.0
2.5
3.0
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
ll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
llll
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
ll
lll
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
ll
l
l
lll
ll
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
lll
l
ll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
ll
l
l
l
ll
l
l
l
ll
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
lll
ll
llll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
llll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
lll
l
l
ll
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
lll
l
l
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
ll
l
l
l
lll
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
ll
ll
lll
l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
lll
l l
l
l
l
l
l
l
l
l
l
l
l
lll
l
lll
ll
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
lll
l
ll
l
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
llll
ll
l
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
llll
l
ll
l
l
l
l
l
ll
ll
l
l
l
lll
ll
l
l
ll
l
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
lll
l
l
l
l
l
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Ygenome
y
p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))
p2 <- plotGrandLinear( gr.snp, aes(y = pvalue))
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
24
Manhattan plot: features plotted against genomic position
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
0.5
1.0
1.5
2.0
2.5
3.0
ll
ll
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
l
ll
lll
l
l
l
l
ll
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l l
l
l
l
lll
l
llll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
lll
l
l
l
lll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
lll
lll
ll
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
lll
l
l
l
l
lll
l
l
l
l
ll
l
l
l
ll l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
ll
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l l
l
l
ll
l
l
ll
l
l l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
ll
lll
l
l
l
l
l
l
lll
l
l
ll
l
l
l l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l l
lll l
l
ll
ll
l
l
l
ll
ll
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l l
l
ll
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
lll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
ll
ll l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l l l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
ll
ll
l
l
l l
l
l
l
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
lll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
lll
l
l
l l
l
l
ll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
llll
l
l
l
l
l
ll
ll
l
l
l
l
lll
l
llll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
llll
lll
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
ll
ll
lll
l
l
lll
l
l
l
l
l
l
lll
l
l
l
ll
ll
l l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
ll
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
ll
l
ll
l
lllll ll
l
l
l
l
l
ll
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
ll
l
l
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l l
l
l
l
ll
l
l
l
l ll
l
l
l
l
l
l
l
llll
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
lll
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
ll
l
l
l
lll l l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
lll l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
lll
l
l
l
l
l
5.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+081,10,11,12,13,14,15,16,17,18,19,2,20,21,22,3,4,5,6,7,8,9,X,Y
pvalue
0.5
1.0
1.5
2.0
2.5
3.0
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
ll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
llll
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
ll
lll
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
ll
l
l
lll
ll
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
lll
l
ll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
ll
l
l
l
ll
l
l
l
ll
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
lll
ll
llll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
llll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
lll
l
l
ll
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
lll
l
l
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
ll
l
l
l
lll
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
ll
ll
lll
l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
lll
l l
l
l
l
l
l
l
l
l
l
l
l
lll
l
lll
ll
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
lll
l
ll
l
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
llll
ll
l
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
llll
l
ll
l
l
l
l
l
ll
ll
l
l
l
lll
ll
l
l
ll
l
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
lll
l
l
l
l
l
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Ygenome
y
p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))
p2 <- plotGrandLinear( gr.snp, aes(y = pvalue))
Facets by chromosome #
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
24
Manhattan plot: features plotted against genomic position
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
0.5
1.0
1.5
2.0
2.5
3.0
ll
ll
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
l
ll
lll
l
l
l
l
ll
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l l
l
l
l
lll
l
llll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
lll
l
l
l
lll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
lll
lll
ll
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
lll
l
l
l
l
lll
l
l
l
l
ll
l
l
l
ll l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
ll
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l l
l
l
ll
l
l
ll
l
l l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
ll
lll
l
l
l
l
l
l
lll
l
l
ll
l
l
l l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l l
lll l
l
ll
ll
l
l
l
ll
ll
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l l
l
ll
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
ll
l
l l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
lll
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
ll
ll
ll l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l l l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l l
l
l
l
l
l
l
l
l
ll
ll
l
l
l l
l
l
l
l l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
ll l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
lll
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
lll
l
l
l l
l
l
ll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
llll
l
l
l
l
l
ll
ll
l
l
l
l
lll
l
llll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
llll
lll
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
ll
ll
lll
l
l
lll
l
l
l
l
l
l
lll
l
l
l
ll
ll
l l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
lll
ll
l
ll
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
ll
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
ll
l
ll
l
lllll ll
l
l
l
l
l
ll
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
ll
l
l
l
l
l
llll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
ll
ll
l
l
ll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l l
l
l
l
ll
l
l
l
l ll
l
l
l
l
l
l
l
llll
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
lll
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
ll
l
l
l
lll l l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
ll
ll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
lll l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
lll
l
l
l
l
l
5.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+085.0e+071.0e+081.5e+082.0e+081,10,11,12,13,14,15,16,17,18,19,2,20,21,22,3,4,5,6,7,8,9,X,Y
pvalue
0.5
1.0
1.5
2.0
2.5
3.0
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
ll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
llll
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
ll
lll
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
ll
l
ll
l
ll
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
lll
l
l
l
l
l
ll
l
l
lll
ll
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
lll
l
ll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
ll
l
l
ll
l
l
l
ll
l
l
l
ll
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
ll
ll
l
l
l
l
lll
ll
llll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
lll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
llll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
lll
l
l
l
l
l
l
ll
ll
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
ll
l
l
l
l
l
ll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
ll
l
l
l
l
l
l
ll
l
lll
l
l
ll
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
l
l
l
lll
l
lll
l
l
l
l
l
l
l
l
l
ll
l
ll
ll
l
l
l
l
l
l
lll
l
l
l
l
ll
ll
l
l
l
lll
l
l
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
lll
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
ll
l
l
ll
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
ll
l
l
l
l
ll
l
l
l
lll
lll
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
ll
l
l
ll
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
ll
l
l
l
l
ll
l
l
l
l
l
l
ll
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
ll
l
l
l
l
l
ll
l
l
l
ll
l
l
l
l
l
l
ll
l
l
l
l
l
ll
ll
lll
l
l
ll
l
l
l
l
l
l
l
ll
lll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
lll
l l
l
l
l
l
l
l
l
l
l
l
l
lll
l
lll
ll
l
l
ll
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
ll
l
l
l
l
ll
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
lll
l
l
l
lll
l
ll
l
l
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
l
l
l
ll
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
l
ll
l
lll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
llll
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
ll
l
l
l
llll
ll
l
l
l
l
l
lll
l
l
l
ll
l
ll
l
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
l
l
ll
ll
l
l
l
l
l
l
ll
lll
l
ll
l
l
l
l
l
ll
l
l
l
l
lll
l
l
lll
l
l
l
l
l
ll
l
l
ll
l
l
l
l
ll
l
l
llll
l
l
l
l
l
l
l
l
l
l
l
lll
l
l
lll
ll
l
l
l
l
ll
l
l
llll
l
ll
l
l
l
l
l
ll
ll
l
l
l
lll
ll
l
l
ll
l
l
ll
l
l
ll
ll
ll
l
l
l
l
l
l
l
l
l
l
l
l
l
l
l
ll
l
l
ll
l
l
l
l
lll
l
l
l
l
l
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Ygenome
y
p1 <- autoplot(gr.snp, geom = "point", aes(y = pvalue))
p2 <- plotGrandLinear( gr.snp, aes(y = pvalue))
Facets by chromosome #
Turns chromosome # into numerical scale
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
25
12
34
56
78
910
1112
1314
1516
1718
1920
2122
5.0e+07 1.0e+08 1.5e+08 2.0e+08Human genome
Karyogram, highlight locations corresponding to some data feature
autoplot(gr, layout = "karyogram", color = "blue")
GRanges
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
26
l
l
l
l
l
l
l
l
l
ll
ll
l
l
l
l
l
l
l
l
l
ll
0M 50M
100M
150M
200M
0M
50M
100M
150M
200M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M
100M
150M
0M
50M100M150M0M50M100M
0M
50M
100M
0M
50M
100M
0M
50M100M
0M
50M
100M
0M
50M
100M
0M
50M
100M
0M
50M
100M
0M
50M
0M
50M
0M
50M
0M
50M 0M
50M 0M
0M 50M 1
2
34
5
6
7
8910
11
12
1314
1516
17
1819
2021 22
rearrangementsinterchromosomalintrachromosomal
tumreadsl
l
l
l
l
4681012 Circular layout
of genome with associated data, and connections
ggplot() + layout_circle(gr1, geom = "link", linked.to = "to.gr", aes(color = rearrangement), trackWidth = 1, radius = 10) + layout_circle(gr2, geom = "point", ...) + ...
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
27
Layout genome linearly, stack associated data plots, connections
0
2
4
6
8
10
rearrangementsinterchromosomalintrachromosomal
4
6
8
10
12
l
l
l
l
l
l
l
l l
l
l
l
l
l
l l l l
l
l
l
l
ll
score
tumreadsl
l
l
l
l
4681012
Stepping
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Stepping
p1 <- autoplot(gr1, geom = "arch", aes(color = rearrangement), coord = "genome")p2 <- autoplot(gr2, geom = "point", aes(y = score, size = tumreads), color = "red", coord = "genome")...tracks(p1, p2, p3, p4, heights = c(2, 4, 1, 1))
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Examples
28
Organize multiple circular layouts
library(gridExtra)grid.arrange(square, gg, ncol = 2, widths = c(4/5, 1/5))
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−1
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−2
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−3
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−4
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−5
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−6
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−7
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−8
1
2
34
5
6
78910
11
1213
1415
1617
1819
20 21 22CRC−9
rearrangementsinterchromosomalintrachromosomal
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
BenefitsFlexibility in drawing genomic dataAesthetics are changeable, color schemes for different purposesPlots defined in a way to compare and contrastHuge variety of displays is available in one locationBuilds from a good data model and tools available in bioC.
29
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Future WorkClean up code, autoplot, consistency in usage, make circular layouts as elegant as CircosIdeally integrate new grammar components better with the ggplot2 code (not trivial)Build interactive graphics, using the qtbase, qtpaint primitives
30
ggbio - Genomic Data Vis - Interface 2012, Rice University /31
Availabilityggbio is on www.bioconductor.orgTengfei’s ggbio web page has tutorials and gallery of examples: http://tengfei.github.com/ggbio
Support by Genentech has been vital
31