Introduction to R
Lecture 1: Getting Started
Andrew Jaffe
8/30/10
Lecture 1
• Course overview
• What is R?
• Installing R
• Installing a text editor
• Interfacing text editor with R
• Writing scripts
• Using R as a calculator
• Assignment
About the Course
• Series of 7 seminars
• Covers the usage of R
– Platform for beginning analyses
– NOT covering statistics
– Good programming etiquette
• Bring your laptop – there will be breaks to allow you to practice the code
About the Course
• This seminar is 1 unit pass/fail
• To pass, attend 5 out of 7 seminars
• Very little outside work
About the Course
• Some learning objectives include:
– Importing/exporting data
– Data management
– Performing calculations
– Recoding variables
– Producing graphics
– Installing packages
– Writing functions
About the Course
• Course communication via E-mail
• Lectures and code will be hosted on my webpage
– http://www.biostat.jhsph.edu/~ajaffe/rseminar.html
About the Instructor
• 3rd year PhD student in Genetic Epi program, concurrent MHS in Bioinformatics
• Learned R five years ago; have been using it regularly for the last two
What is R?
• R is a language and environment for statistical computing and graphics
• R is an open source implementation of the S language, which was developed at Bell Laboratories
• R is both open source and open development
http://www.r-project.org/
What is R?
• Pros:
– Free
– Tons of packages, very flexible
– Multiple datasets at any given time
• Cons:
– Much more “programming” oriented
– Minimal interface
These are my personal opinions
What is R?
• Often a good first step for data cleaning and manipulation
• Then, export data to STATA or SAS for Epi analyses
What is R?
[Screenshot: the R console window and a script window]
Installing R - Windows
• Windows: click “base” and download
Installing R - Windows
• Click the link to the latest build
Installing R - Mac
• Mac: click the latest package’s .pkg file
Installing R
• Double click the downloaded file
• Hit ‘next’ a few times
• Use default settings
• Finish installing
Installing a Text Editor
• Windows: R’s built-in text editor is terrible
– It’s essentially Windows Notepad
– We will download a much better one
• Mac: R’s built-in text editor is sufficient
– Color coding, signals parenthesis closing, etc.
– I suggest using this until you think you need a better one
Installing a Text Editor
• I prefer Notepad++:
– http://notepad-plus-plus.org/
– Download the current version: http://download.tuxfamily.org/notepadplus/5.7/npp.5.7.Installer.exe
– Install on your computer using defaults
Interfacing with R
• Scripts: documents that contain reproducible R code and functions that you can send to the console (and save)
– Files are designated with the “.R” extension
– You can “source” scripts (more later)
• Console: Type commands directly into the console
– Good for looking at your data, trying things, and plotting
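To make the script/console distinction concrete, here is a minimal sketch. It writes a tiny script to a temporary file and runs it with source(). The script contents and the names `script_path`, `greeting`, and `two_plus_two` are made up for illustration; in practice you would write the script in your text editor and save it with the “.R” extension.

```r
# Create a throwaway script file (hypothetical content, for illustration only)
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "# a tiny demo script",
  "greeting <- 'hello from a script'",
  "two_plus_two <- 2 + 2"
), script_path)

# source() runs every line of the script, as if you had sent it to the console
source(script_path)

# The variables created by the script are now available in the console
greeting
two_plus_two
```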
Interfacing with R - Mac
• Mac: File New Script
• This opens the default text editor
• To send a line of code to the R console, press Apple+Enter when the cursor is anywhere on that line
• Highlight chunks of code and press Apple+Enter to send
Interfacing with R - Windows
• Using the default text editor, pressing Ctrl+R sends lines to the console
• However, we want to use Notepad++
• We need to download one more thing…
Interfacing with R - Windows
• “NppToR”: Notepad++ to R
• http://sourceforge.net/projects/npptor/
• It must be running when R and Notepad++ are open
• When properly configured, press F8 to send lines of code, or highlighted chunks, to the console
• I will help configure this after class today
Interfacing with R – Windows
• More detailed instructions for installing NppToR
• http://sourceforge.net/apps/mediawiki/npptor/index.php?title=Installing
Writing Scripts
• The comment symbol is # (pound) in R
• Comment liberally - you should be able to understand a script after not seeing it for 6 months
• Lines of #’s are useful to separate sections
• Useful for designating headers
Writing Scripts
#################
# Title: Demo R Script
# Author: Andrew Jaffe
# Date: 7/30/10
# Purpose: Demonstrate comments in R
#################

# this is a comment, nothing to the right of it gets read
# this # is still a comment – you can use as many #’s as you want

# sometimes you have a really long comment, like explaining what you
# are doing for a step in analysis. Take it to a second line
Writing Scripts
• Some common etiquette:
– You can use spaces (more generally “white space”) within functions and commands liberally as well
– Try to keep a reasonable number of characters per line – many commands can be broken into multiple lines
– More to come later…
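The etiquette points above can be sketched as follows. The variable names (`x`, `total`, `long_sum`) are only for illustration, and the layout shown is one reasonable style rather than a required one.

```r
# These two lines do exactly the same thing; the second is easier to read
x<-c(1,2,3,4)
x <- c(1, 2, 3, 4)

# A long command can continue on the next line, as long as R can tell the
# command is unfinished (for example, inside open parentheses or after an
# operator at the end of the line)
total <- sum(1, 2, 3,
             4, 5, 6)
total

long_sum <- 1 + 2 + 3 +
  4 + 5
long_sum
```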
R as a Calculator
• The R console functions as a full calculator
• Try to play around with it:
– +, -, *, / are add, subtract, multiply, and divide
– ^ or ** is power
– ( and ) control the order of operations
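A few lines to try in the console, as a quick sketch of the operators listed above (the numbers are arbitrary):

```r
2 + 3        # addition: 5
10 - 4       # subtraction: 6
6 * 7        # multiplication: 42
9 / 2        # division: 4.5
2^3          # power: 8
2 ** 3       # same as 2^3: 8

# Parentheses change the order of operations
2 + 3 * 4    # multiplication happens first: 14
(2 + 3) * 4  # parentheses force the addition first: 20
```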
Assignment
• The assignment… operator: assigning a value to a name
• R accepts two operators: “<-” and “=”
– E.g.: x=8 (remember whitespace!: x = 8, x <- 8)
• Variable names are case-sensitive
– E.g.: X and x are different
• Set x = 8, and try using calculator functions on x
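The exercise above might look like this in the console. The value 100 for `X` is an arbitrary choice, used only to show that `X` and `x` are separate variables.

```r
x <- 8     # assign with the arrow operator
x = 8      # the equals sign does the same thing here
x

# Use the calculator operators on x
x + 2
x * 3
x^2

# Names are case-sensitive: X is a different variable from x
X <- 100
x
X
```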
Assignment
• ‘Assignment’ literally puts whatever is on the right side of the operator into the variable on the left-hand side
– Note that although you can name variables almost anything, you might run into issues if you give them the same names as default R functions
– Notepad++ turns function names red/pink so you know…
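As a small illustration of the naming issue above, the sketch below reuses the name of the built-in function c() for a variable. This is just one possible example, chosen here for illustration. Nothing breaks immediately, but the code becomes confusing, which is why it is safer to pick a different name.

```r
# c() is the built-in function that combines values into a vector
c(1, 2, 3)

# Nothing stops you from also creating a variable called c ...
c <- 5
c            # now shows 5, not the function

# ... and R can still find the function when you call it with parentheses,
# but the code is confusing to read
combined <- c(1, 2, 3)
combined

# Remove the variable to get back to a clean state
rm(c)
```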
Examples of assignment, introducing R data
Enough to get R up and running if this is the only class you attend. We will see them in much more detail over the next three sessions
Assignment
status <- c("case", "case", "case", "control", "control", "control")
status
class(status)
table(status)
factor(status)
# alternatively: status <- c(rep("case", 3), rep("control", 3))
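The following sketch spells out what each of those calls gives back. The names `counts` and `status_f` are introduced here only so the results can be inspected.

```r
status <- c(rep("case", 3), rep("control", 3))

class(status)               # "character": a vector of text values

counts <- table(status)     # counts how often each value appears
counts
counts["case"]

status_f <- factor(status)  # stores the values as categories (levels)
levels(status_f)
class(status_f)             # "factor"
```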
Assignment
• web <- "http://www.biostat.jhsph.edu/~ajaffe/code/lec1_code.R"
– class(web)
– source(web)
• You also don’t have to save tables/data you find online to your disk (note read.table works for most things – below aren’t tables though)
– scan(web, what=character(0), sep = "\n")
– scan("http://www.google.com", what=character(0))
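To see the difference between scan() and read.table() without depending on a network connection, the sketch below uses a temporary local file as a stand-in for a web address. The file contents and the names `path`, `text_lines`, and `tab` are made up for illustration; both functions also accept a URL in place of the file path, as in the slide above.

```r
# A temporary local file stands in for a web address here
path <- tempfile(fileext = ".txt")
writeLines(c("id status", "1 case", "2 control"), path)

# scan() reads the file as plain text, one element per line
text_lines <- scan(path, what = character(0), sep = "\n")
text_lines
length(text_lines)

# read.table() reads the same file as a table with named columns
tab <- read.table(path, header = TRUE)
tab
```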
Assignment
mat <- matrix(c(1,2,3,4), nrow = 2, ncol = 2, byrow = TRUE) # this is sourced in
class(mat)
mat
mat + mat
mat * mat    # element-wise multiplication
mat %*% mat  # matrix multiplication
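For reference, here is the same matrix worked through by hand. The matrix is defined directly rather than sourced in, so the snippet runs on its own; the values in the comments are what each operation returns, written row by row.

```r
mat <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE)
mat
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4

mat + mat    # element-wise sum:     rows (2, 4) and (6, 8)
mat * mat    # element-wise product: rows (1, 4) and (9, 16)
mat %*% mat  # matrix product:       rows (7, 10) and (15, 22)
```

Note that * multiplies matching entries, while %*% performs true matrix multiplication, so the two results differ.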
Assignment
• class(dat) # dat is also sourced in
• head(dat)
• table(dat$sex, dat$status)
• …To be continued…
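If you want to try the calls above without sourcing the course file, a stand-in data frame works as a sketch. The values below are invented for illustration and are not the course data; only the column names `sex` and `status` are taken from the slide.

```r
# Hypothetical stand-in for the sourced 'dat' (made-up values)
dat <- data.frame(
  sex    = c("M", "F", "F", "M", "F", "M"),
  status = c("case", "case", "control", "control", "case", "control")
)

class(dat)    # "data.frame"
head(dat, 3)  # first three rows

# Cross-tabulate two columns: rows are sex, columns are status
tab <- table(dat$sex, dat$status)
tab
```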
Questions?