Applied Bioinformatics

Introduction to R, continued

Bing Zhang

Department of Biomedical Informatics

Vanderbilt University

Matrix subsetting and combining


Task                                     R code

Import data from a tabular file          data<-read.table("GSE8671_exp.txt",header=TRUE,sep="\t")
Convert data frame to matrix             data0<-as.matrix(data)
Get dimensions of the matrix             dim(data0)
Select discrete rows by index            data0[c(1,3,5,7,9),]
Select continuous rows by index          data0[5:10,]
Select discrete columns by index         data0[,c(1,3,5,7,9)]
Select continuous columns by index       data0[,5:10]
Select both rows and columns by index    data0[1:10,1:5]
Select one row by name                   data0["1438_at",]
Select both rows and columns by name     data0[c("1438_at","117_at"),c("GSM215052","GSM215079")]
Calculate variances for all rows         gene_variances<-apply(data0,1,var)
Calculate means for all rows             gene_means<-apply(data0,1,mean)
Combine columns (same number of rows)    combined<-cbind(data0,gene_means,gene_variances)
Select rows by output of a comparison    combined[gene_means>60000,]
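The operations in the table can be tried without the GSE8671 file. The following is a minimal sketch on a small synthetic matrix; the probe and sample names, the values, and the 500 cutoff are made up for illustration and are not taken from the course data.

```r
# Toy expression matrix: 6 probes (rows) x 4 samples (columns).
# All names and values here are hypothetical.
set.seed(1)
expr <- matrix(
  round(runif(24, min = 100, max = 1000)),
  nrow = 6,
  dimnames = list(paste0("probe_", 1:6), paste0("sample_", 1:4))
)

dim(expr)                  # number of rows, then number of columns
expr[c(1, 3, 5), ]         # discrete rows by index
expr[2:4, 1:2]             # continuous rows and columns by index
expr["probe_2", ]          # one row by name (returns a named vector)
expr[c("probe_2", "probe_5"), c("sample_1", "sample_4")]

# Row-wise summaries: the second argument 1 means "apply over rows".
probe_means <- apply(expr, 1, mean)
probe_vars  <- apply(expr, 1, var)

# cbind() needs the same number of rows; the vector names become column names.
combined <- cbind(expr, probe_means, probe_vars)

# A logical vector in the row position keeps the rows where it is TRUE.
# drop = FALSE keeps the result a matrix even if only one row matches.
high <- combined[probe_means > 500, , drop = FALSE]
```

Note that selecting rows by name only works when the matrix has row names, which is why the sketch sets `dimnames` explicitly.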

Save your work

R keeps its session state in hidden files in the startup directory (for example, .RData for the workspace image and .Rhistory for the command history)



Save before quit > q()

Save workspace image? [y/n/c]:

During a session > save.image()

Save your code to a file (e.g. diff.r), which can be executed in batch mode $ R CMD BATCH diff.r &

&: runs the program in the background

Screen output is written to diff.r.Rout
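Saving and restoring a workspace can also be done explicitly from code. The sketch below assumes a temporary file with a hypothetical name and two made-up objects; it uses save() for selected objects, whereas save.image() writes everything in the workspace.

```r
# Write to a temporary directory so the sketch does not touch real files.
workspace_file <- file.path(tempdir(), "session_sketch.RData")  # hypothetical name

gene_means <- c(a = 1.5, b = 2.5)     # made-up objects to save
sample_ids <- c("s1", "s2", "s3")

# save() writes only the named objects;
# save.image(file = workspace_file) would write the whole workspace.
save(gene_means, sample_ids, file = workspace_file)

# Simulate a fresh session by removing the objects, then restore them.
rm(gene_means, sample_ids)
restored <- load(workspace_file)      # load() returns the names it restored
```

Calling save.image() with no arguments writes to .RData in the current working directory, which is the file R offers to create when you answer y at the q() prompt.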


Install and load packages

CRAN packages

>6000 packages

Bioconductor packages

~1000 packages for the analysis of high-throughput genomics data


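Task                              R code

Install a CRAN package            install.packages("package name")
Install a Bioconductor package    source(""); biocLite("package name")
Load a package/library            library("package name")

biocLite is the legacy Bioconductor installer; current Bioconductor releases use BiocManager::install("package name").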
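In scripts it is common to install a package only when it is missing and then load it. The following is a minimal sketch of that pattern; load_or_install is a hypothetical helper name, not a function provided by R.

```r
# Hypothetical helper: load a package, installing it from CRAN only if missing.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Needs network access and a configured CRAN mirror.
    # For a Bioconductor package, BiocManager::install(pkg) would be used instead.
    install.packages(pkg)
  }
  # character.only = TRUE lets library() take the package name from a variable.
  library(pkg, character.only = TRUE)
  invisible(pkg %in% loadedNamespaces())
}

# "tools" ships with R, so this call never triggers an installation.
load_or_install("tools")
```

requireNamespace() only checks whether the package can be loaded; it does not attach it, which is why library() is still called afterwards.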

Graphics in R

R has very strong graphics capabilities

High quality, high reproducibility, lots of packages

On-screen graphics: works in the R GUI (both Windows and Mac)

On Linux, requires X11 (a windowing system for bitmap displays)

Output to a file: postscript, pdf, svg

jpeg, png, tiff, …


Start a pdf file      pdf("gse4183_clustering.pdf", width=10, height=15)
Generate a heatmap    …, Rowv=as.dendrogram(rhc), Colv=as.dendrogram(hc), ColSideColors=ann, cexRow=0.5, cexCol=0.5, col=greenred(256))
Close the file        dev.off()
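The open, plot, close sequence can be sketched end to end with base R only. This example is an assumption-laden illustration, not the original slide code: it uses stats::heatmap() and a colorRampPalette() green-to-red scale in place of the greenred() palette (which comes from the gplots package), and the data, the construction of hc, rhc and ann, and the output file name are all made up.

```r
# Synthetic data: 20 genes x 6 samples (made-up names and values).
set.seed(42)
expr <- matrix(rnorm(120), nrow = 20,
               dimnames = list(paste0("gene_", 1:20), paste0("sample_", 1:6)))

# Hierarchical clustering of samples (columns) and of genes (rows).
hc  <- hclust(dist(t(expr)))
rhc <- hclust(dist(expr))

# One annotation color per sample; the two groups are arbitrary.
ann <- rep(c("steelblue", "orange"), each = 3)

out_file <- file.path(tempdir(), "heatmap_sketch.pdf")  # hypothetical file name

pdf(out_file, width = 10, height = 15)   # open the PDF device (size in inches)
heatmap(expr,
        Rowv = as.dendrogram(rhc),
        Colv = as.dendrogram(hc),
        ColSideColors = ann,
        cexRow = 0.5, cexCol = 0.5,
        col = colorRampPalette(c("green", "black", "red"))(256))
dev.off()                                # close the device so the file is written
```

Nothing is written to the PDF until dev.off() is called, so forgetting it typically leaves an empty or unreadable file.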
