R package ggplot2
STAT 133
Gaston Sanchez
Department of Statistics, UC–Berkeley
gastonsanchez.com
github.com/gastonstat/stat133
Course web: gastonsanchez.com/stat133
ggplot2
Scatterplot with "ggplot2"
Terminology
- aesthetic mappings
- geometric objects
- statistical transformations
- scales
- non-data elements (themes & elements)
- facets
Considerations
Specifying graphical elements from 3 sources:
- The data values (represented by the geometric objects)
- The scales and coordinate system (axes, legends)
- Plot annotations (background, title, grid lines)
Scatterplot with geom_point
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point()
[Figure: scatterplot of hp versus mpg]
Another geom
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_line()
[Figure: line plot of hp versus mpg]
Mapping Attributes -vs- Setting Attributes
Increase size of points
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3)
[Figure: scatterplot of hp versus mpg with larger points]
How does it work?
To increase the size of points, we set the aesthetic size to a constant value of 3 (inside the geom function):
+ geom_point(size = 3)
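As a quick check of what setting does, the sketch below builds the plot and inspects the computed layer with layer_data() from ggplot2. The object names p_set and built are only illustrative.

```r
library(ggplot2)

# Setting: size is a constant given outside aes(), so it applies to all points
p_set <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 3)

# layer_data() returns the data ggplot2 computed for a layer (here, layer 1)
built <- layer_data(p_set, 1)

# every row (one per car) carries the same point size
unique(built$size)
nrow(built)
```

Nothing is drawn by this code; it only inspects the values that would be used for drawing.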
Adding color
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "tomato")
[Figure: scatterplot of hp versus mpg with tomato-colored points]
Adding color
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "#259ff8")
[Figure: scatterplot of hp versus mpg with points colored "#259ff8"]
Test your knowledge
Identify the valid hex-color
A) "345677"
B) "#1234567"
C) "#AAAAAA"
D) "#GG0033"
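One way to check the candidates is with a regular expression. The helper is_hex_color() below is hypothetical (not part of R or ggplot2) and only tests the six-digit "#RRGGBB" form; R also accepts an eight-digit form with an alpha channel, which this sketch deliberately ignores.

```r
# Hypothetical helper: TRUE for strings of the form "#RRGGBB"
# (a '#' followed by exactly six hexadecimal digits)
is_hex_color <- function(x) {
  grepl("^#[0-9A-Fa-f]{6}$", x)
}

candidates <- c("345677", "#1234567", "#AAAAAA", "#GG0033")
is_hex_color(candidates)

# a valid hex color can be converted to its red, green and blue components
col2rgb("#AAAAAA")
```

The first candidate lacks the leading '#', the second has seven digits, and the last uses 'G', which is not a hexadecimal digit.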
Changing points shape
# 'shape' accepts 'pch' values
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "tomato", shape = 15)
[Figure: scatterplot of hp versus mpg with tomato-colored square symbols]
Setting and Mapping
Aesthetic attributes can be either mapped (via aes()) or set to a constant value
# mapping aesthetic color
ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = cyl))
# setting aesthetic color
ggplot(mtcars, aes(x = mpg, y = hp)) +
geom_point(color = "blue")
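A common slip is to put a constant inside aes(). The sketch below contrasts that with setting, and inspects the resulting colors with layer_data(); the object names are illustrative.

```r
library(ggplot2)

# Pitfall: inside aes(), "blue" is treated as data (a one-level variable),
# so ggplot2 picks a color from its default discrete scale and adds a legend
p_mapped_constant <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(aes(color = "blue"))

# Setting: outside aes(), "blue" is used literally as the point color
p_set <- ggplot(mtcars, aes(x = mpg, y = hp)) +
  geom_point(color = "blue")

unique(layer_data(p_mapped_constant)$colour)  # a default scale color, not "blue"
unique(layer_data(p_set)$colour)              # the literal value that was set
```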
Geom text, and mapping labels
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_text(aes(label = gear))
[Figure: hp versus mpg, with each car drawn as its gear value (3, 4 or 5)]
Changing axis labels and title
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "tomato") +
xlab("miles per gallon") +
ylab("horse power") +
ggtitle("Scatter plot with ggplot2")
[Figure: "Scatter plot with ggplot2"; horse power versus miles per gallon]
Changing background theme
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "tomato") +
xlab("miles per gallon") +
ylab("horse power") +
ggtitle("Scatter plot with ggplot2") +
theme_bw()
[Figure: "Scatter plot with ggplot2"; horse power versus miles per gallon, black-and-white theme]
Your turn: Replicate this figure
[Figure: horse power versus miles per gallon; point size mapped to disp (legend: 100, 200, 300, 400)]
Your turn: Replicate this figure
- Specify a color in hex notation
- Change the shape of the point symbol
- Map disp to attribute size of points
- Add axis labels
Your turn
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(size = disp),
color = "#ff6666", shape = 17) +
xlab("miles per gallon") +
ylab("horse power")
More geoms
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point() +
geom_smooth(method = "lm")
[Figure: scatterplot of hp versus mpg with a linear smoother added]
More geoms
We can map a variable to the color aesthetic. Here we map color to cyl (cylinders)
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = cyl))
[Figure: scatterplot of hp versus mpg colored by cyl; continuous color legend from 4 to 8]
More geoms
If the variable that maps to color is a factor, then the color scale will change
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = as.factor(cyl)))
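The change of color scale follows from the type of the variable. A small base R sketch (the name cyl_factor is illustrative) shows what ggplot2 sees in each case:

```r
# cyl is stored as a number, so mapping it to color gives a continuous gradient
class(mtcars$cyl)
is.numeric(mtcars$cyl)

# converting to a factor yields three discrete levels, one legend key each
cyl_factor <- as.factor(mtcars$cyl)
levels(cyl_factor)

# number of cars in each level
table(cyl_factor)
```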
[Figure: scatterplot of hp versus mpg colored by as.factor(cyl); discrete legend with levels 4, 6, 8]
Your turn: Replicate this figure
[Figure: "Scatter plot with ggplot2"; displacement versus miles per gallon; color legend factor(am): 0, 1; size legend hp: 100 to 300]
Your turn: example 2
- Map hp to attribute size of points
- Map am (as factor) to attribute color of points
- Add an alpha transparency of 0.7
- Change the shape of the point symbol
- Add axis labels
- Add a title
Your turn: example 2
ggplot(data = mtcars, aes(x = mpg, y = disp)) +
geom_point(aes(size = hp, color = factor(am)),
alpha = 0.7) +
xlab("miles per gallon") +
ylab("displacement") +
ggtitle("Scatter plot with ggplot2")
Histogram
ggplot(data = mtcars, aes(x = mpg)) +
geom_histogram(binwidth = 2)
[Figure: histogram of mpg (y axis: count)]
Boxplots
ggplot(data = mtcars, aes(x = factor(cyl), y = mpg)) +
geom_boxplot()
[Figure: boxplots of mpg for each level of factor(cyl): 4, 6, 8]
Density Curves
ggplot(data = mtcars, aes(x = mpg)) +
geom_density()
[Figure: density curve of mpg]
Density Curves
ggplot(data = mtcars, aes(x = mpg)) +
geom_density(fill = "#c6b7f5")
[Figure: density curve of mpg, filled with "#c6b7f5"]
Density Curves
ggplot(data = mtcars, aes(x = mpg)) +
geom_density(fill = "#c6b7f5", alpha = 0.4)
[Figure: density curve of mpg, filled with "#c6b7f5" at alpha 0.4]
Density Curves
ggplot(data = mtcars, aes(x = mpg)) +
  # in ggplot2 >= 3.4.0, use linewidth = 2 instead of size = 2 for lines
  geom_line(stat = 'density', col = "#a868c0", size = 2)
[Figure: density of mpg drawn as a thick line]
Density Curves
ggplot(data = mtcars, aes(x = mpg)) +
geom_density(fill = '#a868c0') +
  # in ggplot2 >= 3.4.0, use linewidth = 2 instead of size = 2 for lines
  geom_line(stat = 'density', col = "#a868c0", size = 2)
[Figure: filled density of mpg with a thick outline of the same color]
ggplot objects
Plot objects
You can assign a plot to a new object (this won't plot anything):
mpg_hp <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(size = 3, color = "tomato")
To show the actual plot associated with the object mpg_hp, use the function print()
print(mpg_hp)
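At the console, typing the object's name prints it automatically; inside functions, loops and sourced scripts it does not, which is where an explicit print() matters. A minimal sketch, where show_plot() is a hypothetical helper used only for illustration:

```r
library(ggplot2)

mpg_hp <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 3, color = "tomato")

# the object is a plot specification; nothing has been drawn yet
inherits(mpg_hp, "ggplot")
length(mpg_hp$layers)  # number of layers added so far

# Hypothetical helper: inside a function the plot is not auto-printed,
# so print() is called explicitly
show_plot <- function(p) {
  print(p)
  invisible(p)
}
```

Calling show_plot(mpg_hp) would then draw the plot on the active graphics device.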
"ggplot2" objects
Working with ggplot objects, we can ...
- define a basic plot, to which we can add or change layers without typing everything again
- render it on screen with print()
- describe its structure with summary()
- render it to disk with ggsave()
- save a cached copy to disk with save()
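The operations listed above can be sketched as follows. The file names and the use of tempdir() are assumptions made for illustration; any writable path would do.

```r
library(ggplot2)

mpg_hp <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point(size = 3, color = "tomato")

# describe the structure of the plot (data, mappings, layers)
summary(mpg_hp)

# render to disk; the graphics device is chosen from the file extension
png_file <- file.path(tempdir(), "mpg_hp.png")          # illustrative path
ggsave(png_file, plot = mpg_hp, width = 5, height = 4)  # size in inches

# save a cached copy of the object itself, to be restored later with load()
rda_file <- file.path(tempdir(), "mpg_hp.RData")        # illustrative path
save(mpg_hp, file = rda_file)
```

ggsave() writes an image of the plot, whereas save() stores the R object, so the latter can still be modified with further layers after loading.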
Adding a title and axis labels to a ggplot2 object:
mpg_hp + ggtitle("Scatter plot with ggplot2") +
xlab("miles per gallon") + ylab("horse power")
[Figure: "Scatter plot with ggplot2"; horse power versus miles per gallon]
Your turn: example 3
Create the following ggplot object:
# ggplot object
obj <- ggplot(data = mtcars,
aes(x = mpg, y = hp, label = rownames(mtcars)))
Add more layers to the object "obj" in order to replicate the figure in the following slide:
Your turn: example 3
[Figure: "Scatter plot"; horse power versus miles per gallon; each car drawn as its name, colored by factor(am): 0, 1]
Your turn: example 3
obj +
geom_text(aes(color = factor(am))) +
ggtitle("Scatter plot") +
xlab("miles per gallon") +
ylab("horse power")
Scales
Scales
- The scales component encompasses the ideas of both axes and legends on plots, e.g.:
  - Axes can be continuous or discrete
  - Legends involve colors, symbol shapes, size, etc
  - scale_x_continuous
  - scale_y_continuous
  - scale_color_manual
- ggplot2 will often automatically generate appropriate scales for plots
- Explicitly adding a scale component overrides the default scale
Continuous axis scales
Use scale_x_continuous() to modify the default values in the x axis
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = factor(am))) +
scale_x_continuous(name = "miles per gallon",
limits = c(10, 40),
breaks = c(10, 20, 30, 40))
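In the example above the limits c(10, 40) cover every mpg value, so no data are lost. With narrower limits the behavior is worth knowing: scale limits discard out-of-range points, while coord_cartesian() only zooms the view. The sketch below is an illustration with a hypothetical upper limit of 30, not part of the original slides.

```r
library(ggplot2)

base_plot <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point()

# number of cars with mpg above 30, which fall outside the limits used below
n_outside <- sum(mtcars$mpg > 30)

# scale limits: out-of-range points are removed (with a warning when drawn)
p_limits <- base_plot + scale_x_continuous(limits = c(10, 30))

# coordinate limits: the view is zoomed, the data are left untouched
p_zoom <- base_plot + coord_cartesian(xlim = c(10, 30))
```

The difference matters most when a layer computes a statistic, such as a smoother, because removed points no longer contribute to it.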
Continuous axis scales
[Figure: hp versus miles per gallon (x axis from 10 to 40), colored by factor(am): 0, 1]
Continuous axis scales
Use scale_y_continuous() to modify the default values in the y axis
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = factor(am))) +
scale_x_continuous(name = "miles per gallon",
limits = c(10, 40),
breaks = c(10, 20, 30, 40)) +
scale_y_continuous(name = "horsepower",
limits = c(50, 350),
breaks = seq(50, 350, by = 50))
Continuous axis scales
[Figure: horsepower (50 to 350) versus miles per gallon (10 to 40), colored by factor(am): 0, 1]
Example: color scale
Use scale_color_manual() to modify the colors associated with a factor
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = factor(am))) +
scale_color_manual(values = c("orange", "purple"))
Example: color scale
[Figure: scatterplot of hp versus mpg colored by factor(am): 0, 1, using the manual colors]
Example: modifying legend
Modifying legends depends on the type of scale (e.g. color, shape, size, etc)
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(aes(color = factor(am))) +
scale_color_manual(values = c("orange", "purple"),
name = "transmission",
labels = c('no', 'yes'))
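An alternative, sketched here under the assumption that descriptive labels are wanted, is to recode am as a labeled factor before plotting (in the mtcars help page, 0 is automatic and 1 is manual). The names mtcars2 and transmission are illustrative.

```r
library(ggplot2)

# Assumption: work on a copy and recode am into a labeled factor
mtcars2 <- mtcars
mtcars2$transmission <- factor(mtcars2$am, levels = c(0, 1),
                               labels = c("automatic", "manual"))

# a named vector ties each color to a level, regardless of level order
p_named <- ggplot(data = mtcars2, aes(x = mpg, y = hp)) +
  geom_point(aes(color = transmission)) +
  scale_color_manual(values = c(automatic = "orange", manual = "purple"))

# number of cars per transmission type
table(mtcars2$transmission)
```

With this approach the legend title and labels come from the data itself, so name and labels do not need to be repeated in every scale.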
Example: modifying legend
[Figure: scatterplot of hp versus mpg; legend titled "transmission" with labels no, yes]
Faceting
Faceting with facet_wrap()
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(color = "#3088f0") +
facet_wrap(~ cyl)
[Figure: hp versus mpg in three side-by-side panels, one per cyl value (4, 6, 8)]
Faceting with facet_grid()
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(color = "#3088f0") +
facet_grid(cyl ~ .)
[Figure: hp versus mpg in three stacked panels (rows), one per cyl value (4, 6, 8)]
Faceting with facet_grid()
ggplot(data = mtcars, aes(x = mpg, y = hp)) +
geom_point(color = "#3088f0") +
facet_grid(. ~ cyl)
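The formula in facet_grid() reads rows ~ columns, so two variables can be crossed at once. A minimal sketch, using am for the rows as an illustrative choice:

```r
library(ggplot2)

# rows are defined by am, columns by cyl
p_grid <- ggplot(data = mtcars, aes(x = mpg, y = hp)) +
  geom_point(color = "#3088f0") +
  facet_grid(am ~ cyl)

# number of cars falling in each panel of the grid
panel_counts <- table(am = mtcars$am, cyl = mtcars$cyl)
panel_counts
```

The table has the same layout as the grid of panels: two rows (am) by three columns (cyl).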
[Figure: hp versus mpg in three side-by-side panels (columns), one per cyl value (4, 6, 8)]
Layered Grammar
About "ggplot2"
- Key concept: layer (layered grammar of graphics)
- Designed to work in a layered fashion
  - Starting with a layer showing the data
  - Then adding layers of annotations and statistical transformations
- Core idea: independent components combined together
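The layered approach can be sketched by building a plot one step at a time. The object names p0 to p3 and the title text are illustrative.

```r
library(ggplot2)

# start with data and aesthetic mappings only (no layers yet)
p0 <- ggplot(data = mtcars, aes(x = mpg, y = hp))

# add a layer showing the data
p1 <- p0 + geom_point()

# add a layer with a statistical transformation (linear fit)
p2 <- p1 + geom_smooth(method = "lm")

# add an annotation (a title is not a layer)
p3 <- p2 + ggtitle("hp versus mpg")

# number of layers in each object
layer_counts <- sapply(list(p0, p1, p2, p3), function(p) length(p$layers))
layer_counts
```

Each step returns a new object, so earlier versions such as p1 remain available and unchanged.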
Some Concepts
- the data to be visualized
- a set of aesthetic mappings describing how variables are mapped to aesthetic attributes
- geometric objects, geoms, representing what you see on the plot (points, lines, etc)
- statistical transformations, stats, summarizing data in various ways
- scales that map values in the data space to values in an aesthetic space
- a coordinate system, coord, describing how data coordinates are mapped to the plane of the graphic
- a faceting specification describing how to break up the data into subsets and how to display those subsets
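These components can be written out explicitly in a single call. In the sketch below most of the pieces are the defaults ggplot2 would choose anyway; spelling them out is only meant to show where each concept lives. The object name p_full is illustrative.

```r
library(ggplot2)

p_full <- ggplot(data = mtcars,                         # data
                 mapping = aes(x = mpg, y = hp,
                               color = factor(cyl))) +  # aesthetic mappings
  geom_point(stat = "identity") +                       # geom and stat
  scale_x_continuous() +                                # scales
  scale_color_discrete() +
  coord_cartesian() +                                   # coordinate system
  facet_wrap(~ am)                                      # faceting specification
```

Dropping the stat, scale and coord lines should leave the same plot, since those are the defaults for this combination of geom and variables.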