
An Introduction to R

Prof. Ke-Sheng Cheng

Dept. of Bioenvironmental Systems Eng.

National Taiwan University

The R-project

• R is free software (www.r-project.org).
• R is an implementation of the S language.
– S-Plus is a commercial implementation of S.
• R is an integrated software environment for data manipulation, calculation and graphical display. It provides:
– An efficient data handling and storage facility,
– A suite of operators for calculations on arrays, in particular matrices,
– A large, coherent, integrated collection of intermediate tools for data analysis,
– Graphical facilities for data analysis and display either directly at the computer or on hardcopy,
– A well developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
• R packages
– Standard packages
– Other packages available at the Comprehensive R Archive Network (CRAN)

04/10/23 Laboratory for Remote Sensing Hydrology and Spatial Modeling, Dept of Bioenvironmental Systems Engineering, National Taiwan Univ.


Downloading and Installing R

• http://www.r-project.org/


Starting an R session


Working Environment of R

• The working environment of R can be illustrated by the following graph:

[Figure: diagram of the R working environment, with boxes labeled Directory 1, Directory 2, Working Directory, Workspace, and Temporary memory]


Running R

• When you first start running R the default prompt is the “>” sign.


• Working directory
– When using R, you need to know and specify the working directory. This is done by clicking the Change dir button.
– One can specify different working directories for different projects.
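The working directory can also be inspected and changed from the console with getwd() and setwd(). A minimal sketch, in which tempdir() is only a stand-in for a real project folder:

```r
# Remember the current working directory so it can be restored
old.dir <- getwd()

# Switch to another directory (tempdir() is a placeholder for a project folder)
setwd(tempdir())
new.dir <- getwd()
print(new.dir)

# Return to the original working directory
setwd(old.dir)
```

Restoring the original directory at the end keeps later relative file names unaffected.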


• Getting help
– > help(…) and > help.search("….")


Executing commands from an external file

• R commands can be stored in an external file (for example, ksc.r) in the working directory. These commands can then be executed with the source command:
> source("ksc.r")
or by opening the file and running it (see below).
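As a small illustration of source(), the sketch below writes two commands to a script file and then executes it. The file contents and the temporary location are assumptions made for the example; in practice the file would sit in the working directory.

```r
# Write two commands to a script file (a temporary file stands in for ksc.r)
script <- file.path(tempdir(), "ksc.r")
writeLines(c("a <- 1:10", "total <- sum(a)"), script)

# Execute the commands stored in the file
source(script)

# Objects created by the script are now available in the session
print(total)
```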


Open and run an existing file


Objects and Workspace

• The entities that R creates and manipulates are known as objects. These may be variables, arrays of numbers, character strings, functions, or more general structures built from such components.
• During an R session, objects are created and stored by name. The R command
> objects() (alternatively, ls())
can be used to display the names of (most of) the objects which are currently stored within R.
• The collection of objects currently stored is called the workspace.


Data permanency and removing objects

• To remove objects the function rm is available:
> rm(x, y, z, ink, junk, temp, foo, bar)
• All objects created during an R session can be stored permanently in a file for use in future R sessions. At the end of each R session you are given the opportunity to save all the currently available objects. If you indicate that you want to do this, the objects are written to a file called ‘.RData’ in the current directory, and the command lines used in the session are saved to a file called ‘.Rhistory’.


• When R is started at a later time from the same directory it reloads the workspace from this file (.RData). At the same time the associated command history is reloaded.
• Remove all objects in the workspace
– rm(list=ls())
• Clear the screen
– Ctrl+L
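The workspace commands above can be tried in a short session. This is only a sketch: the object names and the file name demo.RData are hypothetical, and save() is used on selected objects so that nothing else in the session is touched (save.image() would write the whole workspace).

```r
# Create a few objects and list the workspace
x <- 1:5
y <- "flow"
junk <- 0
print(ls())

# Remove a single object by name
rm(junk)

# Save selected objects to a file in a temporary directory
ws.file <- file.path(tempdir(), "demo.RData")
save(x, y, file = ws.file)

# Remove them, then restore them from the file
rm(x, y)
load(ws.file)
print(x)
```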


Reading data from files

• The read.table() function


• The scan() function
• The read.csv() function
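The three reading functions can be compared on one small file. The sketch below first creates a tab-separated file with made-up values (the file name and column names are assumptions), then reads it back in each way.

```r
# Create a small tab-separated file to read back (hypothetical data)
in.file <- file.path(tempdir(), "flow_demo.txt")
writeLines(c("event1\tevent2", "1.5\t2.5", "3.0\t4.0"), in.file)

# read.table(): returns a data frame; header=TRUE uses the first line as names
d1 <- read.table(in.file, header = TRUE, sep = "\t")

# read.csv(): like read.table() but with header=TRUE and sep="," as defaults
d2 <- read.csv(in.file, sep = "\t")

# scan(): returns a plain vector; skip=1 jumps over the header line
v <- scan(in.file, sep = "\t", skip = 1)

print(d1)
print(v)
```

read.table() and read.csv() keep the column structure, while scan() reads the values row by row into a single vector.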


Output data to files

• write, write.table, write.csv

write(x,"output.txt",ncolumns=10,append=TRUE,sep="\t")

write(round(x,digits=2),"output.txt",ncolumns=10,append=TRUE,sep="\t")
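For tabular data, write.table and write.csv work on data frames rather than plain vectors. A minimal sketch with made-up values and hypothetical file names:

```r
# A small data frame to export (hypothetical values)
d <- data.frame(time = 1:3, flow = c(10.25, 12.5, 9.75))

out.txt <- file.path(tempdir(), "output_demo.txt")
out.csv <- file.path(tempdir(), "output_demo.csv")

# Tab-separated text without row names or quotes
write.table(d, out.txt, sep = "\t", row.names = FALSE, quote = FALSE)

# Comma-separated values
write.csv(d, out.csv, row.names = FALSE)

# Read the CSV back to check the round trip
d.back <- read.csv(out.csv)
print(d.back)
```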


Objects, their modes and attributes

• Intrinsic attributes: mode and length
– The entities R operates on are technically known as objects. Examples are vectors of numeric (real) or complex values, vectors of logical values and vectors of character strings.
– These vectors are known as “atomic” structures since their components are all of the same type, or mode, namely numeric, complex, logical, character and raw.
– By the mode of an object we mean the basic type of its fundamental constituents. This is a special case of a “property” of an object. Another property of every object is its length.


Atomic structures of R

– Vectors must have their values all of the same mode. Thus any given vector must be unambiguously either logical, numeric, complex, character or raw. (The only apparent exception to this rule is the special “value” listed as NA for quantities not available, but in fact there are several types of NA).
– Note that a vector can be empty and still have a mode. For example the empty character string vector is listed as character(0) and the empty numeric vector as numeric(0).
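These rules can be checked with mode(). The following sketch uses arbitrary example values:

```r
# Each atomic vector has exactly one mode
m.num <- mode(c(1.5, 2, 3))
m.chr <- mode(c("a", "b"))
m.lgl <- mode(c(TRUE, FALSE))
m.cpx <- mode(1 + 2i)
print(c(m.num, m.chr, m.lgl, m.cpx))

# Mixing types coerces everything to a single common mode
mixed <- c(1, "a", TRUE)
print(mode(mixed))

# NA comes in several types; plain NA is logical
na.modes <- c(mode(NA), mode(NA_real_), mode(NA_character_))
print(na.modes)

# Empty vectors still have a mode, and a length of zero
e.chr <- character(0)
print(c(mode(e.chr), length(e.chr)))
```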


Recursive structures of R

• R also operates on objects called lists, which are of mode list. These are ordered sequences of objects which individually can be of any mode.

• Lists are known as “recursive” rather than atomic structures since their components can themselves be lists in their own right.


• The other recursive structures are those of mode function and expression.

• Functions are the objects that form part of the R system along with similar user written functions.

• Expressions are objects which form an advanced part of R.
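A brief sketch of the three recursive modes, using made-up objects (Lst, sq and ex are names chosen for this example):

```r
# A list whose components have different modes, including another list
Lst <- list(id = 1:3, label = "event", inner = list(flag = TRUE))
print(mode(Lst))
print(mode(Lst$inner))

# Functions and expressions are the other recursive structures
sq <- function(v) v^2
ex <- expression(a + b)
print(c(mode(sq), mode(ex)))

# is.recursive() distinguishes these from atomic vectors
print(c(is.recursive(Lst), is.recursive(1:3)))
```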


An example of using a function

# AR2_Bootstrap.R
# Coded by KSC 08232011 at the University of Bristol
# AR modeling of the flow data series
x=read.csv("Nine_flow_events.csv",sep=",")
n.event=9
n.bt=1000 # number of bootstrap samples
alpha1=c();alpha2=c();alpha3=c();alpha0=c()
predct=c()
par.ar=matrix(rep(0,n.event*4),ncol=4,nrow=n.event)
file.name=paste("event",1:n.event,".txt",sep="")
bt.name=paste("bootstrap",1:n.event,".txt",sep="")
#------------------------------------------------------------------
# Function -- AR(2) Forecasting
# Returns the one-step predictions followed by the forecast errors.
forecast=function(obs,par1,par2,par3,predct){
  L=length(obs)
  u1=0;u2=0 # two zero start-up values placed before the series
  obs=c(u1,u2,obs)
  for (i in 1:L) predct[i]=par3+par1*obs[i+1]+par2*obs[i]
  err=obs[3:(L+2)]-predct
  out=c(predct,err)
  return(out)
}
#------------------------------------------------------------------

# AR(2) Modeling, forecasting and bootstrapping of individual series
for (i in 1:n.event) {
  event=x[[i]][!is.na(x[i])]
  # AR(2) Modeling ------------
  windows() # opens a new graphics window (Windows only)
  pacf(event)
  ar.event=arima(event,order=c(2,0,0))
  alpha1[i]=ar.event[[1]][1]
  alpha2[i]=ar.event[[1]][2]
  alpha3[i]=ar.event[[1]][3]
  alpha0[i]=(1-alpha1[i]-alpha2[i])*alpha3[i]
  par.ar[i,]=c(alpha0[i],alpha1[i],alpha2[i],alpha3[i])
  #
  # AR(2) Forecasting ---------
  out.4cast=forecast(event,alpha1[i],alpha2[i],alpha0[i],predct)
  err=out.4cast[(length(event)+1):(2*length(event))]
  err.star=err-mean(err)
  write(event,file.name[i],ncolumns=10,append=TRUE,sep="\t")
  write(out.4cast[1:length(event)],file.name[i],ncolumns=10,append=TRUE,sep="\t")
  write(err,file.name[i],ncolumns=10,append=TRUE,sep="\t")
  #

  # Model-based Time Series Bootstrapping --------------
  btsample=matrix(rep(0,n.bt*(2+length(err))),nrow=n.bt,ncol=2+length(err))
  for (j in 1:n.bt){
    epsilon=sample(err.star,size=length(err),replace=TRUE)
    for (k in 3:(2+length(err))){
      btsample[j,k]=alpha0[i]+alpha1[i]*btsample[j,k-1]+alpha2[i]*btsample[j,k-2]+epsilon[k-2]
    }
    write(btsample[j,3:(2+length(err))],bt.name[i],ncolumns=10,append=TRUE,sep="\t")
  }
  #
  # Plot observed and bootstrap sample series
  windows() # opens a new graphics window (Windows only)
  z=scan(bt.name[i],sep="\t")
  plot(0,0,type="n",xlim=c(0,length(event)),ylim=c(min(z),max(z)))
  dim(z)=c(length(event),n.bt)
  for (j in 1:n.bt) lines(1:length(event),z[,j],type="l")
  lines(1:length(event),event,type="l",col="red",lwd=3)
}
par.ar


An example using function ecdf

# ECDF_Plot.R
# Coded by KSC 09242011 -----------------
n.sample=9 # Number of samples
in.file=paste("CECP",1:n.sample,".txt",sep="")
windows() # opens a new graphics window (Windows only)
plot(0,0,type="n",xlim=c(-1,1),ylim=c(0,1))
for (i in 1:n.sample){
  x=scan(in.file[i],sep="\t")
  n.L=length(x)
  x1=x[1:(n.L/2)]
  x2=x[(1+(n.L/2)):n.L]
  x1.ecdf=ecdf(x1);x2.ecdf=ecdf(x2)
  u=seq(-1,1,by=0.005);v=x1.ecdf(u)
  lines(u,v,type="l",col=i,lwd=3)
  v1=round(mean(x1),digits=4)
  v2=round(sqrt(var(x1)),digits=4)
  v3=round(mean(x2),digits=4)
  v4=round(sqrt(var(x2)),digits=4)
  print(c(v1,v2,v3,v4))
}


• The functions mode(object) and length(object) can be used to find out the mode and length of any defined structure.


Changing the mode of an object

• as.character(x)
• as.integer(x)
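A short sketch of these two conversions, with arbitrary example values:

```r
z <- 0:9

# Numeric to character
digits <- as.character(z)
print(digits)

# Character back to integer
d <- as.integer(digits)
print(d)

# Values that cannot be converted become NA (R also issues a warning)
bad <- suppressWarnings(as.integer(c("12", "abc")))
print(bad)
```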


Changing the length of an object

• An “empty” object may still have a mode. For example
> e <- numeric()
makes e an empty vector structure of mode numeric.
• Once an object of any size has been created, new components may be added to it simply by giving it an index value outside its previous range.
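The effect of indexing beyond the current range can be seen in a small sketch (the values 17 and 1:10 are arbitrary):

```r
# Start from an empty numeric vector
e <- numeric()

# Assigning beyond the current range extends the vector; gaps are filled with NA
e[3] <- 17
print(e)

# A vector can also be truncated by assigning to its length
alpha <- 1:10
length(alpha) <- 3
print(alpha)
```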


• Other examples


The class of an object

• All objects in R have a class, reported by the function class. For simple vectors this is just the mode, for example "numeric", "logical", "character" or "list", but "matrix", "array", "factor" and "data.frame" are other possible values.
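The class of a few typical objects can be inspected as follows. The objects are made up for the example; note that in R 4.0.0 and later a matrix reports two classes, so inherits() is the safer check.

```r
print(class(c(1.5, 2.5)))
print(class(c(TRUE, FALSE)))
print(class("abc"))
print(class(list(1, "a")))

f <- factor(c("low", "high", "low"))
df <- data.frame(a = 1:2, b = c("x", "y"))
m <- matrix(1:6, nrow = 2)
print(class(f))
print(class(df))

# In R >= 4.0.0 class(m) is c("matrix", "array"), so test with inherits()
print(inherits(m, "matrix"))
```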


What is an object?

• Any entity R operates on is an object.
– Vector
– Matrix
– Array
– Dataframe
– List
– Function
– Expression
• Modes and classes of objects
– Numeric
– Complex
– Character
– Factor (a class rather than a mode)
– Logical
– Data.frame (a class rather than a mode)


Manipulating objects

• Vector assignment: concatenate values with c()


• If an expression is used as a complete command, the value is printed and lost.
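A minimal sketch of assignment versus a bare expression, using arbitrary numbers:

```r
# Assign a vector with the concatenate function c()
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# Used as a complete command, an expression is printed but not stored
1 / x

# To keep the value, assign it to a name
inv.x <- 1 / x

# c() also accepts vectors, so existing vectors can be combined
y <- c(x, 0, x)
print(length(y))
```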


Vector arithmetic

• Vectors can be used in arithmetic expressions, in which case the operations are performed element by element.
• Vectors occurring in the same expression need not all be of the same length. If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Shorter vectors in the expression are recycled as often as need be (perhaps fractionally) until they match the length of the longest vector. In particular a constant is simply repeated.
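The recycling rule is easiest to see with two short vectors (values chosen arbitrarily):

```r
x <- c(1, 2, 3, 4, 5, 6)
y <- c(10, 20)

# y is recycled three times to match the length of x
s <- x + y
print(s)

# A constant is simply repeated
d <- 2 * x + 1
print(d)
```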


• Example

• Arithmetic operations
– +, -, *, /, ^ (power), round, floor, ceiling
• Arithmetic functions
– log, exp, sin, cos, tan, sqrt, abs


• Statistical functions
– min, max, range, length, sum, mean, median
– quantile, var, prod, summary
– sort, order, rank


• Sort y with respect to increasing order of x, using order(x) as the index.
– x[order(x)] is the same as sort(x)
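Sorting one vector by another can be sketched with order(); x and y here are made-up paired values:

```r
x <- c(3, 1, 2)
y <- c("c", "a", "b")

# order(x) gives the positions that would sort x
idx <- order(x)
print(idx)

# Reorder y according to increasing x
y.sorted <- y[idx]
print(y.sorted)

# Indexing x by its own order reproduces sort(x)
print(identical(x[idx], sort(x)))
```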


Logical vectors

• As well as numerical vectors, R allows manipulation of logical quantities. The elements of a logical vector can have the values TRUE, FALSE, and NA (for “not available”).
• Logical vectors are generated by conditions.
• The logical operators are <, <=, >, >=, == for exact equality and != for inequality. In addition if c1 and c2 are logical expressions, then c1 & c2 is their intersection (“and”), c1 | c2 is their union (“or”), and !c1 is the negation of c1.
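A small sketch of conditions and logical operators on an arbitrary vector that contains one NA:

```r
x <- c(5, 13, 20, NA, 8)

# A condition produces a logical vector; NA stays NA
big <- x > 10
print(big)

# Combine conditions element by element
mid <- (x > 6) & (x < 15)
either <- (x < 6) | (x > 15)
print(mid)
print(either)

# Negation
print(!big)
```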


• Example

Why?


Missing values

• NA – not available
• NaN – not a number
• The function is.na(x) gives a logical vector of the same size as x with value TRUE if and only if the corresponding element in x is NA or NaN.
• The function is.nan(x) returns TRUE if and only if the corresponding element is NaN.


• Removing the missing values
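One common way to detect and remove missing values is logical indexing with is.na(). A minimal sketch with an arbitrary vector:

```r
x <- c(1, NA, 3, 0/0)

# is.na() is TRUE for both NA and NaN; is.nan() only for NaN
print(is.na(x))
print(is.nan(x))

# Keep only the non-missing values
x.clean <- x[!is.na(x)]
print(x.clean)

# Many functions can also skip missing values directly
print(mean(x, na.rm = TRUE))
```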


Character vectors

• Character vectors are used frequently in R, for example as plot labels. Where needed they are denoted by a sequence of characters delimited by the double quote character, e.g., "x-values", "New iteration results".
• The paste() function takes an arbitrary number of arguments and concatenates them one by one into character strings. Any numbers given among the arguments are coerced into character strings in the evident way, that is, in the same way they would be if they were printed.


• The arguments are by default separated in the result by a single blank character, but this can be changed by the named parameter, sep=string, which changes it to string, possibly empty.
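The behaviour of paste() and its sep argument can be sketched as follows (the labels are arbitrary):

```r
# Default separator is a single blank
labs1 <- paste("X", 1:3)
print(labs1)

# An empty separator joins the pieces directly
labs2 <- paste("event", 1:3, ".txt", sep = "")
print(labs2)

# Shorter arguments are recycled
labs3 <- paste(c("X", "Y"), 1:4, sep = "")
print(labs3)
```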


Set operations

• union(x, y)
• intersect(x, y)
• setdiff(x, y)
• is.element(el, set)
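A quick sketch of the four set functions on two small made-up vectors:

```r
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5)

u <- union(x, y)        # elements in x or y
i <- intersect(x, y)    # elements in both
d <- setdiff(x, y)      # elements of x that are not in y
e <- is.element(2, x)   # is 2 a member of x?

print(u)
print(i)
print(d)
print(e)
```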


Selecting and modifying subsets of an object using index vectors

• Subsets of a vector may be selected by appending to the name of the vector an index vector in square brackets, v[i].

• Such index vectors can be any of four distinct types:
– A logical vector
– A vector of positive integer quantities
– A vector of negative integer quantities. Such an index vector specifies the values to be excluded rather than included.
– A vector of character strings. This possibility only applies where an object has a names attribute to identify its components.


• An indexed expression can also appear on the receiving end of an assignment, in which case the assignment operation is performed only on those elements of the vector.
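The four kinds of index vector, and indexed assignment, can be sketched on one small named vector (the names and values are made up):

```r
x <- c(a = 5, b = NA, c = 12, d = 7)

# 1. Logical index: keep elements where the condition is TRUE
print(x[!is.na(x)])

# 2. Positive integers: select by position
print(x[c(1, 3)])

# 3. Negative integers: exclude by position
print(x[-2])

# 4. Character strings: select by name
print(x[c("a", "d")])

# Indexed assignment changes only the selected elements
x[is.na(x)] <- 0
print(x)
```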


Arrays and Matrices

• An array can be considered as a multiply subscripted collection of data entries.

• A dimension vector is a vector of non-negative integers. If its length is k then the array is k-dimensional, e.g. a matrix is a 2-dimensional array.


• A vector can be used by R as an array only if it has a dimension vector as its dim attribute. Suppose, for example, z is a vector of 1500 elements. The assignment
> dim(z) = c(3,5,100)
gives it the dim attribute that allows it to be treated as a 3 by 5 by 100 array.
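The same assignment can be tried directly; here z is simply filled with 1:1500 as placeholder data:

```r
# A plain vector of 1500 elements
z <- 1:1500
print(is.array(z))

# Giving it a dim attribute turns it into a 3 x 5 x 100 array
dim(z) <- c(3, 5, 100)
print(dim(z))
print(is.array(z))

# Elements are now addressed by three subscripts
print(z[2, 1, 1])
```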


• Creating a matrix
– Using dim
– Using matrix


• The values in the data vector give the values in the array in the same order as they would occur in FORTRAN, that is “column major order,” with the first subscript moving fastest and the last subscript slowest.

• For example if the dimension vector for an array, say a, is c(3,4,2) then there are 24 entries in a and the data vector holds them in the order a[1,1,1], a[2,1,1], ..., a[2,4,2], a[3,4,2].
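Column major order can be verified on the 3 by 4 by 2 example by filling the array with 1:24 (placeholder data), so that each entry equals its position in the data vector:

```r
# Build the 3 x 4 x 2 array from the data vector 1:24
a <- array(1:24, dim = c(3, 4, 2))

# The first subscript moves fastest, the last slowest
print(a[1, 1, 1])
print(a[2, 1, 1])
print(a[1, 2, 1])
print(a[3, 4, 2])

# matrix() fills by column too, unless byrow = TRUE is given
m1 <- matrix(1:6, nrow = 2)
m2 <- matrix(1:6, nrow = 2, byrow = TRUE)
print(m1)
print(m2)
```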


Array indexing. Subsections of an array

• Individual elements of an array may be referenced by giving the name of the array followed by the subscripts in square brackets, separated by commas.

• More generally, subsections of an array may be specified by giving a sequence of index vectors in place of subscripts; however if any index position is given an empty index vector, then the full range of that subscript is taken.
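A short sketch of element and subsection indexing, again on a placeholder array filled with 1:24:

```r
a <- array(1:24, dim = c(3, 4, 2))

# A single element
print(a[2, 3, 1])

# An empty index position takes the full range of that subscript
slab <- a[, , 1]
print(dim(slab))

# Index vectors select sub-blocks
block <- a[1:2, c(1, 4), 2]
print(block)
```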


Index matrices


List

• An R list is an object consisting of an ordered collection of objects known as its components.
• There is no particular need for the components to be of the same mode or type, and, for example, a list could consist of a numeric vector, a logical value, a matrix, a complex vector, a character array, a function, and so on.
• If Lst is a list, then the function length(Lst) gives the number of (top level) components it has.
• New lists may be formed from existing objects by the function list().
> Lst <- list(name_1=object_1, ..., name_m=object_m)
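A minimal sketch of building a list and extracting its components. The component names and values are hypothetical; the point is the difference between [[ ]] (or $), which returns the component itself, and [ ], which returns a sub-list.

```r
Lst <- list(name = "Fred", wife = "Mary", no.children = 3,
            child.ages = c(4, 7, 9))

# Number of top-level components
print(length(Lst))

# [[ ]] or $ extracts the component itself
ages <- Lst[["child.ages"]]
print(Lst$name)

# [ ] returns a sub-list containing the component
sub <- Lst["child.ages"]
print(class(ages))
print(class(sub))
```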


• Example of a list object


They are different.


x=read.csv("20110823_Flow_Series.csv",sep=",")
n.event=3
event.comb=c()
for (i in 1:n.event) event.comb=c(event.comb,x[[i]][!is.na(x[i])]) # combining all events into one series.


Empirical CDF, ecdf


• In R, ecdf(x) returns a function: the result can itself be called to evaluate the empirical CDF at any point.
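That the result of ecdf() is itself a function can be seen in a small sketch with four arbitrary data values:

```r
x <- c(1, 2, 3, 4)

# ecdf() returns a function object
Fn <- ecdf(x)
print(is.function(Fn))

# Calling it gives the proportion of data values <= the argument
print(Fn(2))
print(Fn(c(0, 2.5, 10)))
```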


Random number generation in R

• R commands for stochastic simulation (for the normal distribution)
– pnorm – cumulative probability
– qnorm – quantile function
– rnorm – generating a random sample of a specific sample size
– dnorm – probability density function

For other distributions, simply change the distribution name. For example, use punif, qunif, runif, and dunif for the uniform distribution, and ppois, qpois, rpois, and dpois for the Poisson distribution.
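The four prefixes can be tried side by side. The sketch below uses arbitrary parameter values and an arbitrary seed chosen only for illustration.

```r
set.seed(1)                           # arbitrary seed, for reproducibility

p <- pnorm(1.96)                      # P[Z <= 1.96] for the standard normal
q <- qnorm(0.975)                     # inverse of pnorm: about 1.96
d <- dnorm(0)                         # density at 0: 1/sqrt(2*pi)
z <- rnorm(1000, mean = 10, sd = 2)   # 1000 random draws from N(10, 2^2)

# The same naming pattern applies to other distributions.
u  <- runif(5, min = 0, max = 1)      # 5 uniform draws on [0, 1]
pp <- ppois(2, lambda = 3)            # P[X <= 2] for Poisson(3)
```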


Approximation of the Poisson distribution by normal distribution

Demonstration using stochastic simulation

• Using R

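$$\mathrm{Poisson}(20) \approx N\left(20, (\sqrt{20})^2\right)$$

$$P[X > 25] \approx 1 - \Phi\left(\frac{25 - 20}{\sqrt{20}}\right) = 1 - 0.8682238 = 0.131776$$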

Estimated by normal approximation of Poisson distribution
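The normal-approximation value can be obtained with `pnorm`. This is a minimal sketch, not the slide's original code; the variable names are illustrative.

```r
lambda <- 20

# Normal approximation: mean lambda, standard deviation sqrt(lambda)
p.normal <- 1 - pnorm(25, mean = lambda, sd = sqrt(lambda))
p.normal    # approximately 0.131776
```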


Poisson CDF by stochastic simulation

Estimated by stochastic simulation of Poisson distribution

Direct calculation using theoretical CDF of Poisson distribution.
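The two estimates named on this slide might be computed as follows. The sample size and seed are assumptions made for this sketch; the original screenshot is not available.

```r
set.seed(123)                    # arbitrary seed
n <- 100000                      # simulation size (an assumption)
x <- rpois(n, lambda = 20)       # simulated Poisson(20) sample

p.sim   <- mean(x > 25)                  # relative frequency of the event X > 25
p.exact <- 1 - ppois(25, lambda = 20)    # theoretical P[X > 25]

c(simulated = p.sim, exact = p.exact)
```

With a large sample the simulated frequency should sit close to the theoretical value, and both can be compared with the normal-approximation figure.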


Approximation by normal distribution

Poisson CDF by stochastic simulation
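A comparison of the two curves can be sketched numerically, assuming a Poisson mean of 20 as in the preceding slides. The sample size, seed, and evaluation grid are illustrative choices.

```r
set.seed(123)                                  # arbitrary seed
x  <- rpois(10000, lambda = 20)
Fn <- ecdf(x)                                  # Poisson CDF by simulation

k <- 0:40                                      # evaluation points
cdf.sim    <- Fn(k)
cdf.normal <- pnorm(k, mean = 20, sd = sqrt(20))
cdf.exact  <- ppois(k, lambda = 20)

max(abs(cdf.sim - cdf.exact))      # error due to simulation
max(abs(cdf.normal - cdf.exact))   # error of the normal approximation at integers
```

To draw the curves, `plot(Fn)` followed by `curve(pnorm(x, 20, sqrt(20)), add = TRUE)` overlays the normal CDF on the simulated one.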


Random experiment

• Using the “sample” function in R
• “sample” takes a sample of the specified size from the elements of x, either with or without replacement.
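A few random experiments expressed with `sample`. The experiments themselves (die, permutation, biased coin) are illustrative choices, not taken from the slides.

```r
set.seed(42)                                   # arbitrary seed

die  <- sample(1:6, size = 10, replace = TRUE) # ten rolls of a fair die
perm <- sample(1:6)                            # without replacement: a random permutation

# Unequal probabilities via the prob argument: a biased coin
coin <- sample(c("H", "T"), size = 5, replace = TRUE, prob = c(0.7, 0.3))
```

Sampling without replacement requires `size` to be no larger than `length(x)`; otherwise R raises an error.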


Sample quantiles
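Sample quantiles are computed with `quantile`. The data below are made up for illustration, and the sketch relies on R's default interpolation rule (`type = 7`); other rules can be selected through the `type` argument.

```r
x <- c(3, 1, 4, 1, 5, 9, 2, 6)                     # illustrative data

quantile(x)                                        # min, quartiles, and max
q90   <- quantile(x, probs = 0.9)                  # 90th percentile
q.med <- quantile(x, probs = 0.5, names = FALSE)   # 0.5 quantile

isTRUE(all.equal(q.med, median(x)))                # the 0.5 quantile is the median
```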


The boxplot in R

• boxplot(x,range=0)
• boxplot(x) [Default, range=1.5]
• boxplot(x,range=3)
• The parameter range determines how far the plot whiskers extend out from the box. If range is positive, the whiskers extend to the most extreme data point which is no more than range times the interquartile range from the box. A value of zero causes the whiskers to extend to the data extremes.
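The effect of `range` can be checked without drawing anything, since `boxplot(..., plot = FALSE)` returns the whisker positions in its `stats` component. The data here are an illustrative assumption, chosen so that the largest value is an outlier at the default setting but not at `range = 3`.

```r
x <- c(1:10, 20)    # illustrative data with one large value

# plot = FALSE returns the statistics without drawing; drop it to see the plot.
b0  <- boxplot(x, range = 0, plot = FALSE)
b15 <- boxplot(x, plot = FALSE)               # default range = 1.5
b3  <- boxplot(x, range = 3, plot = FALSE)

b0$stats[5, 1]    # upper whisker reaches the maximum, 20
b15$stats[5, 1]   # 10; the value 20 is flagged as an outlier in b15$out
b3$stats[5, 1]    # 20 lies within 3 * IQR of the box, so the whisker reaches it
```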
