Stochastic Models: Introduction to R

Walt Pohl

Universität Zürich, Department of Business Administration

February 28, 2013

What is R?

R is a freely available, general-purpose statistical package, developed by a team of volunteers on the Internet.

It is widely used among statisticians, and new statistical techniques are frequently first implemented in R.

It is less widely used by economists, who tend to prefer commercial statistical packages or Matlab.

Walt Pohl (UZH QBA) Stochastic Models February 28, 2013 2 / 1

R versus Excel

R has many more probability and statistical functions, either built in or available in free packages.

R is command-driven. You enter a sequence of commands to manipulate your data.

While everything in Excel is in terms of cells, R has several different data types: vectors, arrays, objects. You can define your own.

Normally you will create a ".R" command file that is separate from your data.

Note: Excel also has a separate command language, VBA.

R versus Matlab

The real target audience for Matlab is engineers. Matlab has many features useful for engineers but not useful for us.

The target application for R is statistics. R has many more statistical functions than Matlab.

Matlab started as a package for manipulating matrices, and added other features later. Non-matrix based operations are awkward.

R was designed for general-purpose programming from the beginning.

R versus Other Statistics Programs

R is free.

R is more command-driven and less GUI-driven.

R is very close to S-Plus.

R supports as broad an array of operations as any other statistics program.

R’s programming language is better designed than those of most of its competitors.

Since different packages are written by different volunteers, R is not as uniform as some other systems.

Important URLs

R home page – http://www.r-project.org/

Closest R mirror site – http://stat.ethz.ch/CRAN/

R tutorial – http://cran.r-project.org/doc/manuals/R-intro.html

Monte Carlo Simulation in R

R has many built-in probability distributions. For each supported distribution XXX, R comes with four functions:

dXXX – density function

pXXX – cumulative distribution function

qXXX – quantile function (inverse of the CDF)

rXXX – random draw

XXX = unif, norm, chisq, t, etc.

Example: For the normal distribution, we have dnorm, pnorm, qnorm, rnorm.
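
As a minimal sketch of how the four functions fit together for the normal distribution, the following uses only the base R functions named above. The seed and the number of draws are arbitrary choices made for illustration.

```r
# Density of the standard normal at 0, which equals 1 / sqrt(2 * pi)
d0 <- dnorm(0)

# CDF: P(Z <= 1.96) for a standard normal Z, about 0.975
p196 <- pnorm(1.96)

# The quantile function inverts the CDF, so this is about 1.96
q975 <- qnorm(0.975)

# Monte Carlo: estimate E[Z^2] (which is 1) from random draws
set.seed(1)            # arbitrary seed, only for reproducibility
z <- rnorm(100000)     # 100,000 standard normal draws
mc_estimate <- mean(z^2)
```

The estimate will not be exactly 1; it gets closer as the number of draws grows.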

Vectors in R

For us, the basic R datatype is a vector of numbers.

The c command creates vectors. Example: If you type c(1, 3, 4.5), R returns the vector (1, 3, 4.5).

You can assign vectors to variables, using the <- operator.

x <- c(1, 3, 4.5)

Vectors in R, cont’d

You can get the value of individual entries by using the [] operator.

x[3] will return 4.5.

You can also get subvectors by using ranges.

x[1:2] will return the vector 1, 3.

The length function allows you to refer to the end in a range: x[2:length(x)] will return the vector 3, 4.5.
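
The indexing rules above can be tried out directly. This short sketch reuses the vector x from the slides; the other variable names are only illustrative. Note that R indices start at 1.

```r
x <- c(1, 3, 4.5)

third     <- x[3]              # single entry: 4.5
first_two <- x[1:2]            # subvector: 1, 3
from_two  <- x[2:length(x)]    # everything from the second entry on: 3, 4.5
n         <- length(x)         # number of entries: 3
```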

Operations on Vectors

Where possible, any operation on vectors will be applied elementwise.

So if x and y are two vectors of the same length, then z = x * y will be the vector where z[i] = x[i] * y[i].

Likewise, log(x) will be the vector whose i-th entry is log(x[i]), etc.
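
A small sketch of elementwise operations, using made-up vectors. The last line shows that a single number is combined with every entry of a vector.

```r
x <- c(1, 2, 3)
y <- c(10, 20, 30)

z       <- x * y     # elementwise product: 10, 40, 90
log_x   <- log(x)    # log applied to each entry of x
shifted <- x + 1     # the scalar 1 is added to every entry: 2, 3, 4
```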

Sample Statistics

R has built-in functions for the usual sample statistics:

mean(x) – Mean of vector x

var(x) – Variance of vector x

sd(x) – Standard Deviation of vector x

quantile(x, q) – The q-quantile of vector x, where q is a probability between 0 and 1 (for example, q = 0.5 gives the median).
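
The functions above can be checked on a small made-up sample. One detail worth knowing: var and sd divide by n - 1, not n, and quantile returns a named vector, so unname is used here to strip the label.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up sample

m   <- mean(x)                   # 5
v   <- var(x)                    # sum((x - m)^2) / (length(x) - 1)
s   <- sd(x)                     # square root of the variance
med <- unname(quantile(x, 0.5))  # the 0.5-quantile is the median, 4.5
```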

Reading Data

The easiest way to import data into R is through CSV files. Excel can export files in this format.

The function read.csv reads a CSV file.

Example: apple <- read.csv("apple.csv") imports the file named "apple.csv" into the variable apple. The data is returned in the form of a data frame.

Data frames

A data frame is a named list of vectors. In the case of "apple.csv", we get four entries on the list:

DATE – end date of month.

RET – monthly return on Apple stock.

VWRETD – monthly return on CRSP value-weighted index.

rf – monthly risk-free rate.

You access a vector by using $. Example: apple$RET.
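
Since "apple.csv" itself is not reproduced here, the following sketch writes a tiny stand-in file with the same four column names and reads it back. The numbers are invented for illustration and are not real Apple or CRSP data; the variable names toy, path, and excess are likewise hypothetical.

```r
# Invented values, NOT real market data
csv_lines <- c(
  "DATE,RET,VWRETD,rf",
  "20120131,0.05,0.02,0.001",
  "20120229,-0.01,0.01,0.001",
  "20120330,0.03,-0.02,0.001"
)

path <- tempfile(fileext = ".csv")   # temporary file standing in for apple.csv
writeLines(csv_lines, path)

toy <- read.csv(path)                # returns a data frame

column_names <- names(toy)           # the names of the vectors in the list
returns      <- toy$RET              # one column, accessed with $
excess       <- toy$RET - toy$rf     # columns are ordinary vectors
```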

Regression

R has a very easy-to-use interface for regression: the lm function. For example, to fit the CAPM for Apple, we would use

lm(RET ~ VWRETD, data=apple)

The first argument uses the tilde operator to indicate that we want to regress RET on VWRETD.

The second argument indicates that the data comes from the apple data frame.

Regression cont’d

Printing the result of lm by itself shows only the coefficients. To get more detail, including t-stats, use

summary(lm(RET ~ VWRETD, data=apple))
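
To try the regression calls without the original data file, one option is to simulate returns. In this sketch the data frame sim, the seed, and the "true" intercept and slope (0.002 and 1.2) are all assumptions chosen for illustration; only lm, summary, and coef come from R itself.

```r
set.seed(42)                                  # arbitrary seed
n <- 200

# Simulated market and stock returns (not real data)
VWRETD <- rnorm(n, mean = 0.01, sd = 0.04)
RET    <- 0.002 + 1.2 * VWRETD + rnorm(n, sd = 0.05)
sim    <- data.frame(RET = RET, VWRETD = VWRETD)

fit <- lm(RET ~ VWRETD, data = sim)

estimates <- coef(fit)                        # intercept and slope
details   <- summary(fit)$coefficients        # estimates, std. errors, t values, p-values
```

The slope estimate should land near the assumed value of 1.2, up to sampling error.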

Built-In Mathematical Functions

R has various built-in mathematical functions:

exp(x) – e^x.

log(x) – natural logarithm, log x. (Use log(x, b) for log_b x.)

x^y – x raised to the power y.

sqrt(x) – √x.

Note these all work on vectors. exp(c(1, 2)) gives you c(2.718282, 7.389056).

Special Mathematical Values

Floating point supports some special values:

1/0 = Inf.

-1/0 = -Inf.

0/0 = NaN.

Mathematical operations are defined for these special values. For example, Inf + Inf = Inf, and Inf - Inf = NaN.
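
A short sketch of working with these values. Because a comparison such as NaN == NaN does not return TRUE, the base R functions is.nan, is.infinite, and is.finite are the usual way to test for them. The variable names are illustrative.

```r
pos_inf <- 1 / 0        # Inf
neg_inf <- -1 / 0       # -Inf
not_num <- 0 / 0        # NaN

sum_inf  <- Inf + Inf   # Inf
diff_inf <- Inf - Inf   # NaN

# Test for special values with functions rather than ==
nan_check    <- is.nan(not_num)                 # TRUE
inf_check    <- is.infinite(pos_inf)            # TRUE
finite_check <- is.finite(c(1, pos_inf, not_num))   # TRUE, FALSE, FALSE
```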

Defining Your Own Functions

You can define a function by using R’s function command:

f <- function(x) x^2

This creates a function that squares its argument, and assigns it to the variable f. Calling f(2) in R will return 4.

Functions can take vector arguments. So f(c(1, 2)) will return c(1, 4).
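
Functions can also take several arguments, give some of them default values, and contain more than one statement inside braces. The function mean_over_sd below is a hypothetical example written for this sketch, not something from the slides; the value of the last expression is what the function returns.

```r
f <- function(x) x^2

# Hypothetical helper: average excess return divided by its standard deviation
mean_over_sd <- function(ret, rf = 0) {
  excess <- ret - rf          # elementwise subtraction
  mean(excess) / sd(excess)   # last expression is the return value
}

squares <- f(c(1, 2))                          # 1, 4
ratio   <- mean_over_sd(c(0.01, 0.03, 0.02))   # rf defaults to 0
```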

Matrices

R also supports matrices. Use matrix(0, nrow=m, ncol=n) to create an m-by-n matrix of zeros. For example

g = matrix(0, nrow = 3, ncol = 4);

To access the element in the i-th row and j-th column, use [] with two numbers. For example

g[1,2] <- 3;

assigns 3 to g_{1,2}, the element in row 1 and column 2.
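
The matrix example can be extended slightly. Leaving one index empty selects a whole row or column, and dim reports the number of rows and columns. The variable names other than g are illustrative.

```r
g <- matrix(0, nrow = 3, ncol = 4)   # 3-by-4 matrix of zeros
g[1, 2] <- 3                         # set row 1, column 2

shape      <- dim(g)                 # 3 rows, 4 columns
first_row  <- g[1, ]                 # 0, 3, 0, 0
second_col <- g[, 2]                 # 3, 0, 0
```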

Logical Operations

R has the following basic logical operations.

==: equality

!=: not equal

<, >: less than, greater than

<=, >=: less than or equal, greater than or equal

They evaluate to TRUE or FALSE.

Logical Operations on Vectors

Logical operations work on vector arguments, and return a vector of TRUE or FALSE values.

Example: 1:10 > 5.

You can use the functions any or all to see if any or all of the entries in the vector are TRUE.
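
A sketch of the example above, with two common follow-ups that go slightly beyond the slide: summing a logical vector counts its TRUE entries, and a logical vector can be used inside [] to select the matching entries.

```r
x   <- 1:10
big <- x > 5            # FALSE for 1..5, TRUE for 6..10

some_big <- any(big)    # TRUE: at least one entry exceeds 5
all_big  <- all(big)    # FALSE: not every entry exceeds 5

count_big <- sum(big)   # TRUE counts as 1, so this is 5
selected  <- x[big]     # entries of x where big is TRUE: 6, 7, 8, 9, 10
```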

Control Structures

R supports the standard control structures found in most programming languages:

Branching: if

Definite iteration: for

Indefinite iteration: while

Control Structures: If

A statement like "if (test) code1 else code2" executes code1 if the test is TRUE, and code2 if the test is FALSE. (The "else code2" part can be omitted, which means do nothing when the test is FALSE.)

Example: if (0 == 0) print("is zero") else print("is not zero")
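
For longer branches, braces group several statements, and conditions can be chained with else if. The function describe_return is a hypothetical example made up for this sketch. When writing such code in a script, keep else on the same line as the closing brace of the preceding branch.

```r
# Hypothetical helper: label a single return as a gain, a loss, or flat
describe_return <- function(r) {
  if (r > 0) {
    "gain"
  } else if (r < 0) {
    "loss"
  } else {
    "flat"
  }
}

up   <- describe_return(0.05)
down <- describe_return(-0.02)
same <- describe_return(0)
```

The test inside if should be a single TRUE or FALSE; for a vector of conditions, combine them first with any or all.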

Control Structures: For

For allows you to do something a fixed number of times.

Example: for (i in 1:10) print(i);
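
A for loop can run over the entries of any vector, not only over 1:n. This sketch compounds a few invented returns into final wealth; the numbers and variable names are illustrative only.

```r
returns <- c(0.10, -0.05, 0.02)   # invented per-period returns
wealth  <- 1                      # starting wealth

for (r in returns) {
  wealth <- wealth * (1 + r)      # compound one period at a time
}

# The same result without a loop, using the built-in prod function
wealth_vectorized <- prod(1 + returns)
```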

Control Structures: While

While allows you to repeat something for as long as a condition remains TRUE, stopping once it becomes FALSE. (It may take forever.)

Example:

i = 10;
while (i > 0) {
  print(i);
  i = i - 1;
}

(Notice the use of braces here. This is because the body of the while loop contains multiple statements.)
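
While loops are a natural fit when the number of repetitions is not known in advance. As a sketch, the loop below keeps drawing uniform random numbers until one exceeds 0.9 and counts how many draws were needed. The threshold and the seed are arbitrary choices.

```r
set.seed(7)      # arbitrary seed, only for reproducibility
draws <- 0       # number of draws made so far
u     <- 0       # most recent draw

while (u <= 0.9) {
  u     <- runif(1)     # one draw from the uniform distribution on (0, 1)
  draws <- draws + 1
}
```

The loop ends as soon as a draw above 0.9 appears, so draws is random and differs from seed to seed.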

Writing Fast R Code

R is faster for vector operations than for loops.

Example:

x <- (1:1000)^2

is faster than

x <- rep(0, 1000); # create an array of all zeros.
for (i in 1:1000) {
  x[i] <- i^2;
}
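
One way to see the difference on your own machine is to time both versions with the base R function system.time. This is only a rough sketch: the vector length is an arbitrary choice, and the measured times depend on the machine and the R version, so no particular numbers are claimed here.

```r
n <- 100000   # arbitrary size, larger than the slide's 1000 so the timing is visible

# Vectorized version
time_vector <- system.time(x_vec <- (1:n)^2)

# Loop version
time_loop <- system.time({
  x_loop <- rep(0, n)
  for (i in 1:n) {
    x_loop[i] <- i^2
  }
})

same_result <- identical(x_vec, x_loop)   # both approaches give the same vector
```

Printing time_vector and time_loop shows the elapsed time of each approach.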
