ggplot2: A cool way of creating plots in R

Maria Novosolov


Your scientific graphic options

Why not only Excel?

• Excel is relatively limited in its support of scientific graphing

• Its options provide limited control over the output

• Limited selection of graph types

• Limited number of data points that can be plotted

Why ggplot2?

• Helps create good-looking graphs for publication

• The package design allows manipulation at every step of the way

• Create new graphics tailored for your problem

• Easy to create complex plots

The idea behind ggplot2

Combine the good parts of the other R plotting packages and leave out the bad parts

Some examples

Attention!!

ggplot2 does not suggest which graphics you should use to answer the questions you are interested in

The basics of ggplot2

• Works as a layer-by-layer system

• Based on the Grammar of Graphics (Wilkinson, 2005)

• The plots can be edited after they are ready

How it works

Two options for working with the package:

1. Easy – using qplot() (= quick plot)

2. Complicated – using ggplot()

qplot

• Basically very similar to the function plot() in R

• The first two arguments to qplot() are x and y

• An optional data argument (recommended)

• Example: qplot(x, y, data = mydata)

qplot

• You can change color, size, shape and transparency (these are all called aesthetics)

• These can either be left at their defaults or set manually to a fixed value using the function I()

• You can choose the type of plot (scatterplot, boxplot, histogram) with the geom argument
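The qplot() options above can be sketched as follows. This is only an illustration on the mpg data set that ships with ggplot2; the column choices and object names are assumptions made for the example. Note that qplot() is deprecated as of ggplot2 3.4.0, so recent versions emit a deprecation warning, although the calls still work.

```r
library(ggplot2)

# Mapped aesthetic: point colour varies with the 'class' column of mpg
p_mapped <- qplot(displ, hwy, data = mpg, colour = class)

# Fixed aesthetic: I() sets one colour and one transparency for all points
p_fixed <- qplot(displ, hwy, data = mpg, colour = I("blue"), alpha = I(0.5))

# The geom argument selects the type of plot
p_box  <- qplot(class, hwy, data = mpg, geom = "boxplot")
p_hist <- qplot(hwy, data = mpg, geom = "histogram", binwidth = 2)
```

Printing any of these objects (for example `print(p_box)`) draws the plot.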

ggplot

ggplot

• Works by creating the plot layer by layer

• A layer is composed of four parts:

  • data and aesthetic mapping

  • a statistical transformation (stat)

  • a geometric object (geom)

  • a position adjustment
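To make the four parts of a layer concrete, here is a minimal sketch that spells each of them out with layer() and then repeats the same plot with the usual shorthand. The mpg data set and the chosen columns are assumptions for illustration only.

```r
library(ggplot2)

# Every part of the layer written out explicitly
p_explicit <- ggplot() +
  layer(
    data     = mpg,                      # data
    mapping  = aes(x = displ, y = hwy),  # aesthetic mapping
    stat     = "identity",               # statistical transformation
    geom     = "point",                  # geometric object
    position = "identity"                # position adjustment
  )

# Shorthand: geom_point() uses stat = "identity" and
# position = "identity" by default
p_short <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()
```

Both objects should draw the same scatterplot; the shorthand is what is normally written in practice.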

Layers

• Iteratively update a plot

• Change a single feature at a time

• Think about the high-level aspects of the plot in isolation

• Instead of choosing a static type of plot, create new types of plots on the fly

• Cure against immobility

• Developers can easily develop new layers without affecting other layers
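A small sketch of updating a plot iteratively, one layer at a time. The object names (p, p1, p2, p3) and the mpg columns are assumptions chosen for the example.

```r
library(ggplot2)

# Start with data and aesthetic mapping only; nothing is drawn yet
p <- ggplot(mpg, aes(x = displ, y = hwy))

# Add one feature at a time; each step returns a new plot object
p1 <- p + geom_point()                                   # points
p2 <- p1 + geom_smooth(method = "lm", formula = y ~ x,
                       se = FALSE)                       # linear trend line
p3 <- p2 + labs(x = "Engine displacement", y = "Highway mpg")
```

Because each step is stored separately, any intermediate version (for example `p1`) can still be printed or extended in a different direction.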


Let's dive into the syntax

Anatomy of a plot

1. Data *

2. Aesthetic mapping *

3. Geometric object *

4. Statistical transformation

5. Scales

6. Facets

7. Themes

8. Position adjustments

* Mandatory layers

1. Data

In ggplot2, we always explicitly specify the data

qplot(displ, hwy, data = mpg)

2. Aesthetic Mapping

In ggplot land, aesthetic means "something you can see". Includes:

• Position (i.e., on the x and y axes)

• Color ("outside" color)

• Fill ("inside" color)

• Shape (of points)

• Line type

• Size

Aesthetic mappings are set with the aes() function
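A minimal sketch of the difference between mapping an aesthetic to a variable inside aes() and setting it to a fixed value outside aes(). The mpg columns and the fixed colour and shape are assumptions for illustration.

```r
library(ggplot2)

# Mapping: colour and size depend on columns of the data
p_map <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv, size = cyl)) +
  geom_point()

# Setting: the same fixed colour and shape for every point
p_set <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(colour = "red", shape = 17)
```

Mapped aesthetics get a legend automatically; fixed values do not, because they carry no information about the data.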

3. Geometric Object

• A geom can only display certain aesthetics

• A plot must have at least one geom; there is no upper limit
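As an illustration of using more than one geom, the sketch below draws boxplots and the raw points on the same plot. The data set, columns and styling values are assumptions for the example.

```r
library(ggplot2)

p_two_geoms <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot(fill = "grey90") +        # first geom: one box per class
  geom_point(shape = 1, alpha = 0.5)     # second geom: the individual cars
```

Here fill is used on the boxes and shape on the points, which reflects the idea that each geom understands its own set of aesthetics.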

Example

ggplot(data=mpg, aes(x=cty, y=hwy)) + geom_point()

• data=mpg: the data.frame to plot

• aes(x=cty, y=hwy): the aesthetic mappings

• geom_point(): what geom to use in plotting

4. Statistical Transformations

• Each geom has a default statistic, but these can be changed

• Some plot types (such as scatterplots) do not require transformations: each point is plotted at x and y coordinates equal to the original value

• Other plots, such as boxplots, histograms, prediction lines etc., require statistical transformations
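The sketch below shows a default statistic at work and one way to change it. The bin width, the helper data frame `counts` and the object names are assumptions for illustration.

```r
library(ggplot2)

# geom_histogram() bins and counts the data before drawing
p_hist <- ggplot(mpg, aes(x = hwy)) + geom_histogram(binwidth = 2)

# geom_bar() counts the rows per category by default (stat = "count")
p_count <- ggplot(mpg, aes(x = drv)) + geom_bar()

# If the counts are already computed, override the default statistic
counts <- as.data.frame(table(drv = mpg$drv))   # columns: drv, Freq
p_identity <- ggplot(counts, aes(x = drv, y = Freq)) +
  geom_bar(stat = "identity")
```

The last two plots should show the same bars; geom_col() is a shorthand for geom_bar(stat = "identity").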

5. Scales

• Control mapping from data to aesthetic attributes

• One scale per aesthetic

5. Scales

In ggplot2 scales include:

• position

• color and fill

• size

• shape

• line type

Modified with scale_<aesthetic>_<type>

5. Scales

Common Scale Arguments:

• name: the first argument gives the axis or legend title

• limits: the minimum and maximum of the scale

• breaks: the points along the scale where labels should appear

• labels: the labels that appear at each break
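A short sketch of the common scale arguments, following the scale_<aesthetic>_<type> naming pattern. The titles, limits, breaks and labels are assumed values chosen to fit the mpg data.

```r
library(ggplot2)

p_scales <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  scale_x_continuous(
    name   = "Engine displacement",           # axis title
    limits = c(1, 7),                         # minimum and maximum
    breaks = seq(1, 7, by = 2),               # where labels appear
    labels = c("1 L", "3 L", "5 L", "7 L")    # one label per break
  ) +
  scale_colour_discrete(
    name   = "Drive",                         # legend title
    labels = c("4wd", "front", "rear")        # one label per level of drv
  )
```

Each aesthetic gets its own scale: one for x, one for colour, and the default scale for y, which is left untouched here.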

5. Scales

Figure: partial combination matrix of available scales

6. Faceting

• Lay out multiple plots on a page

• Split data into subsets

• Plot subsets into different panels
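Faceting can be sketched with the two standard facet functions, facet_wrap() and facet_grid(). The faceting variables are assumptions chosen from the mpg data set.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()

# One panel per value of 'class', wrapped into rows and columns
p_wrap <- base + facet_wrap(~ class)

# A grid of panels: rows by 'drv', columns by 'cyl'
p_grid <- base + facet_grid(drv ~ cyl)
```

In both cases the data are split into subsets and every subset is drawn in its own panel with shared axes.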

7. Themes

• The ggplot2 theme system handles non-data plot elements such as:

  • Axis labels

  • Plot background

  • Facet label background

  • Legend appearance

• Specific theme elements can be overridden using theme()
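A minimal sketch of applying a complete theme and then overriding individual elements with theme(). The chosen theme, colours and sizes are assumptions for illustration only.

```r
library(ggplot2)

p_plain <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  facet_wrap(~ year)

p_themed <- p_plain +
  theme_bw() +                                               # complete theme
  theme(
    axis.title       = element_text(size = 12, face = "bold"),  # axis labels
    panel.background = element_rect(fill = "grey95"),           # plot background
    strip.background = element_rect(fill = "lightblue"),        # facet label background
    legend.position  = "bottom"                                 # legend appearance
  )
```

The theme only changes how the plot looks; the data, mappings and geoms of `p_plain` stay the same.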

8. Position adjustments

• Tweak positioning of geometric objects

• Avoid overlaps

Example: position_jitter

• Avoid overplotting by jittering points

ggplot(overplotted, aes(x, y)) + geom_point(position = position_jitter(width = 0.1, height = 0.1))
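To try the jitter example, a data frame is needed. The sketch below builds a hypothetical `overplotted` data set (500 points on a 5 x 5 integer grid, an assumption made only for this illustration) and compares the plot with and without jittering.

```r
library(ggplot2)

# Hypothetical data: many points share the same integer coordinates
set.seed(1)
overplotted <- data.frame(
  x = sample(1:5, 500, replace = TRUE),
  y = sample(1:5, 500, replace = TRUE)
)

# Without jitter: at most 25 distinct positions are visible
p_raw <- ggplot(overplotted, aes(x, y)) + geom_point()

# With jitter: each point is moved randomly by up to 0.1 in each direction
p_jitter <- ggplot(overplotted, aes(x, y)) +
  geom_point(position = position_jitter(width = 0.1, height = 0.1, seed = 1))
```

The seed argument of position_jitter() makes the random offsets reproducible; it can be left out if that is not needed.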

Summary

• Allows creating high-quality plots

• Many options

• Many blogs and webpages explaining how to do different plots in ggplot

• Disadvantage: it takes time to learn the grammar

• There is a book (recommended): ggplot2: Elegant Graphics for Data Analysis