Introduction to MATLAB - Dimitri Dimitropoulos, PhD

This four-week tutorial will introduce you to the most basic features of MATLAB. We will cover the following list of topics:

1. Introduction (Week 1)
2. Matrices in MATLAB (Week 1)
3. Some useful Commands (Week 1)
4. Plotting (Week 2)
5. More on Linear Algebra (Week 2)
6. Flow Control (Week 2)
7. Scripts and M-files (Week 2)
8. Functions in MATLAB (Week 2)
9. Numerical Methods in Economics Part I (Week 3)
10. Introduction to the Optimization Toolbox (Week 3)
11. Numerical Methods in Economics Part II (Week 4)

1. Introduction

Before we get introduced to the MATLAB computational environment, let's take a moment to familiarize ourselves with the MATLAB user-interface.

To start MATLAB double-click on the MATLAB icon on your desktop.

Once up and running, the MATLAB desktop appears, containing tools and graphical user interfaces for managing files, variables, and applications associated with MATLAB. Take a look around:

Current directory – Here is where your current MATLAB files are kept. To change the directory, use the dialog box labeled “Current Directory” on the toolbar.

Command history – This keeps a history of your recent commands.

Workspace – Here is where the variables in memory for your current session are kept.

Variable Editor – Allows you to view the contents of a selected variable.

Command Window – Here is where you type in your commands to MATLAB.

An important thing one always needs to know when using a complex piece of software is how to get help on various commands. The MATLAB documentation is a great place to look up various features of MATLAB and/or their syntax. From the drop-down menu, go to Help → Product Help.

2. Matrices in MATLAB

When you start MATLAB your workspace is empty. Let us define a simple scalar variable. In the Command Window, type in

x=1

and hit Return or Enter.

As you can see, MATLAB first performs the computation specified on the right-hand-side of the equals sign and then stores it in a new variable called x. MATLAB variable names must begin with a letter, but can then be followed by any combination of letters, digits, and underscores. MATLAB distinguishes between uppercase and lowercase characters, so X and x are not the same variable.

By default, MATLAB displays all of its computations on screen. However, often you will want MATLAB to perform a command silently – and you can suppress output by ending a command with a semicolon ( ; ) .

x = x+1;

MATLAB will perform the computation but does not display any output. This is particularly useful when you generate large matrices: displaying output requires processor time, and you should take all steps to avoid unnecessary waste of CPU time. For example, compare

Y=magic(1000)

versus

Y=magic(1000);

Scalars, vectors and matrices can be entered in MATLAB in several different ways:

1. Entered in directly
2. Generated using built-in functions
3. Built from other vectors or matrices
4. Loaded from external data files (Next week)
5. Created using your own functions in M-files (Next week)

The easiest (but least common) way to create a matrix is to enter it as an explicit list of elements, row by row: the entire list of elements is enclosed by square brackets, [ ]. Commas ( , ) are used to separate columns and semicolons ( ; ) are used to indicate the end of each row.

M=[1, 2, 3, 4; 5, 6, 7, 8; 9, 10, 11, 12; 13, 14, 15, 16 ]

Consequently, to enter a column vector we would use:

v1=[1; 2; 3; 4]

and to enter a row vector, we would use:

v2=[1, 2, 3, 4]

You can refer to elements of a vector or matrix by using round brackets, ( ). In general, to refer to the element in the ith row and jth column of the matrix M we type M(i,j). For example:

M(3,2)

We can also pick out particular rows or columns of matrices, by using a colon, : . To select the entire ith row of the matrix M type in M(i,:), e.g.

M(3,:)

To select the entire jth column of the matrix M type in M(:,j), e.g.

M(:,2)

The colon operator ( : ) is one of the most important MATLAB operators. It is the “all inclusive” or “series” operator (in mathematics we write this as three periods, . . .) and it occurs in several different forms.

The colon operator is also useful for creating vectors containing a sequence of numbers. To create a row vector containing the sequence of numbers beginning at L and ending at U, type in L:U. For example

v3= 1:10

We can also obtain a sequence of numbers with non-unit spacing by specifying an increment. To create a vector containing the sequence of numbers beginning at L and ending at U in increments of K, type in L:K:U. For example

v4= 0:0.2:5

The colon operator can also be used to pick out sequential parts of matrices. For example, to pick out rows 2 through 4 inclusive, and columns 1 through 3 inclusive:

Y=M(2:4,1:3)

In addition to picking out sequential rows or columns, we can also pick out parts of matrices consisting of selected columns and/or rows by using square brackets, [ ]. For instance, to pick out rows 1 and 3, and columns 1 and 4:

Z=M([1 3],[1 4])

In fact, what we are actually doing is using one vector/matrix to pick out parts of another - e.g.

r=2:4;
c=1:3;
Y=M(r,c);

or

r=[1,3];
c=[1,4];
Z=M(r,c);
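As a quick check, the indexing rules above can be collected in one place. The sketch below uses the 4x4 matrix M from this section; the keyword end (which stands for the last index along a dimension) is an addition that the text above does not cover.

```matlab
% The 4x4 example matrix used in this section
M = [1, 2, 3, 4; 5, 6, 7, 8; 9, 10, 11, 12; 13, 14, 15, 16];

e    = M(3,2);          % single element: row 3, column 2
row3 = M(3,:);          % entire 3rd row
col2 = M(:,2);          % entire 2nd column
Y    = M(2:4,1:3);      % rows 2 through 4, columns 1 through 3
Z    = M([1 3],[1 4]);  % rows 1 and 3, columns 1 and 4

% "end" refers to the last index along a dimension (not covered in the text)
last     = M(end,end);  % bottom-right element
lastrows = M(3:end,:);  % rows 3 through the last row
```

Remove the semicolons to see each result displayed in the Command Window.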

MATLAB has several predefined matrices – which will often come in handy. For instance

eye(N) – a (square) identity matrix of dimension N – e.g. M1=eye(3)

zeros(M,N) – a matrix of zeros with dimension MxN – e.g. M2=zeros(3,3)

ones(M,N) – a matrix of ones with dimension MxN – e.g. M3=ones(3,3)

rand(M,N) – a matrix of uniformly distributed (pseudo) random numbers with dimension MxN – e.g. A=rand(5,5)

randn(M,N) – a matrix of normally distributed (pseudo) random numbers with dimension MxN – e.g. B=randn(5,5)

mvnrnd(mu, S, N) – an Nxd matrix consisting of N random draws from a multivariate normal distribution with mean vector mu and covariance matrix S, where d=dim(mu) – e.g. C=mvnrnd([0,0], [1,0.5;0.5,1], 100)

Consequently, for predefined matrices of variable size, you can make an n-dimensional column vector by specifying a dimension of (n,1) – e.g.

v3=zeros(3,1)

or an n-dimensional row vector by specifying a dimension of (1,n) – e.g.

v4=zeros(1,4)

Matrices can also be built from other vectors or matrices using algebraic operations. Of course, standard matrix operations apply. In addition, since MATLAB defines matrices as two-dimensional numeric arrays, arithmetic operations on arrays can be done element by element. The list of operators includes

• + Addition
• - Subtraction
• * Scalar or Matrix Multiplication
• ^ Matrix Power
• .* Element-by-Element Multiplication
• ./ Element-by-Element Division
• .^ Element-by-Element Power
• ( ) Round brackets for order of operations.

Of course, THE ORDER OF OPERATIONS MATTERS – the rules of BEDMAS still apply in graduate school. Make sure to use round brackets ( ) whenever necessary.

Adding two scalars, vectors, or matrices is straightforward – use the plus operator (+). Similarly, we can subtract using the minus operator (–) .

ADD= A+B
SUB= A-B

But make sure the dimensions agree – otherwise you will get an error.

Scalar multiplication, where we multiply a vector or matrix by a scalar, is also simply done using the asterisk operator (*).

s= 2*v1
S= 10*A

In MATLAB, we can also do “scalar addition” and “scalar subtraction”, where we add a scalar to (or subtract a scalar from) each element of a matrix.

Sa= v1 + 10
SA= A + 0.5

or

Ss= v1 - 10
SS= A - 0.5

Using the asterisk operator (*), two vectors can also be multiplied together. But recall that a column vector and a row vector can be multiplied in two ways: (i) multiplying a row vector by a column vector gives the inner product (or scalar product):

ip=v2*v1

or (ii) multiplying a column vector by a row vector gives the outer product, which is a matrix:

op=v1*v2

Matrix multiplication is also straightforward: to multiply matrices we just use the asterisk (*). But remember the conformability rule of multiplication: NxM * MxK = NxK

A*B

Of course, as we all know, matrix multiplication is in general not commutative.

B*A

Further, vectors and matrices can be multiplied using the asterisk (*). Of course, the conformability rule still applies: multiplying a row vector by a matrix gives a row vector – recall, 1xN * NxM = 1xM

v2*M

And multiplying a matrix by a column vector produces a column vector – recall, NxM * Mx1 = Nx1

M*v1

Other combinations produce an error.

Finally, for square matrices, we can compute matrix powers using the caret (^).

A^2

A useful property of MATLAB is that it allows element-by-element operations – simply precede any operation with a period. Of course, here the matrices must be of the same size.

For example, to compute the element-by-element product of A and B, where each element of A is multiplied by the corresponding element of B (formally called the Hadamard product), we simply put a period in front of the asterisk symbol – i.e. use .*

A.*B

The Hadamard product IS commutative

B.*A

B.*A

Similarly, if we wanted to divide the matrix A by B element-by-element , so that each particular element of A is divided by the corresponding element of B, we simply put a period in front of the division symbol – i.e. use ./

A./B

Likewise, if we wanted to raise the matrix A to some power element-by-element , so that each particular element of A is raised to that power, we put a period in front of the caret symbol – i.e. use .^

A.^2
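To make the difference between the matrix operators and their element-by-element counterparts concrete, here is a small sketch. It uses two hand-picked 2x2 matrices (an assumption for illustration, rather than the random A and B defined earlier) so that the results are easy to verify by hand.

```matlab
A = [1 2; 3 4];
B = [5 6; 7 8];

AB  = A*B;     % matrix product:             [19 22; 43 50]
BA  = B*A;     % reversed order:             [23 34; 31 46]
H   = A.*B;    % element-by-element product: [ 5 12; 21 32]
Q   = A./B;    % element-by-element division
A2  = A^2;     % matrix power, A*A:          [ 7 10; 15 22]
A2e = A.^2;    % element-by-element power:   [ 1  4;  9 16]

isequal(AB, BA)       % false: matrix multiplication is not commutative
isequal(A.*B, B.*A)   % true: the Hadamard product is commutative
```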

In MATLAB lingo, the .SYMBOL operators are sometimes called “vectorized operators”.

Clearly, the dimension of vectors/matrices will be important, and we will often want to see the dimension of our matrix. In this case, we use the size( ) function. For example

size(Z)

will return the row dimension (1st element) and column dimension (2nd element) of Z. If we wanted to get only the row dimension, we would type

size(Z,1)

and if we wanted to get only the column dimension, we would type

size(Z,2)

Matrices can be built from existing vectors or matrices. That is, we can take a bunch of smaller vectors or matrices and put them into a single, big matrix. Simply enclose them in square brackets [ ]. Of course, we need to make sure that the rows and columns agree.

For example, consider the following column vectors

a= [1;2;3]
b= [4;5;6]
c= [7;8;9]

We can construct a matrix with these vectors as the corresponding columns.

W= [a, b, c]

Likewise, consider the following row vectors

d= 1:3
e= 4:6
f= 7:9

We can construct a matrix with these vectors as the corresponding rows.

X= [d; e; f]

We can even construct a matrix from other matrices

Y= [W, X; X, W]

And, we can use algebraic operations

Y= [a, a.^2, a.^3]

MATLAB is very useful for the purposes of linear algebra, and it has several predefined functions to manipulate matrices for this purpose.

To transpose a matrix use either the apostrophe ('), or the function transpose( ). For real matrices the two give the same result; for complex matrices the apostrophe also takes the complex conjugate.

MP=M'

or

MP=transpose(M)

To refer to the diagonal elements of a matrix, use the function diag( )

D=diag(M)

To find the determinant of a matrix use the function det( )

det(A)

To find the rank of a matrix, i.e. the number of linearly independent columns (or rows), we use rank( )

R=rank(A)

To find the inverse of a matrix use the function inv( )

IA=inv(A)

NB: Matrix inversion is one of the most computationally costly operations. Inverting “small” matrices (of dimension less than 1e4) is not too bad. But inverting “large” matrices (of order greater than 1e5) is computationally expensive – e.g. computing the inverse of a matrix with dimension greater than 1e8 can literally take hours, even days. Consequently, for the purposes of solving linear equations, MATLAB has more “advanced” methods that avoid computing the inverse explicitly (more on this later).

Consider now our first matrix M. For det(M) we get ans = 0 (up to roundoff), so M is singular. Consequently, if we try to compute the inverse with IM=inv(M) we will get the following message

Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.387779e-018.

Roundoff error has prevented the matrix inversion algorithm from detecting exact singularity. But the value of RCOND, which stands for reciprocal condition, is on the order of your computer's floating-point relative precision, so the computed inverse is probably of no use.
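One way to avoid being surprised by this warning is to test the matrix before inverting it. The following is a minimal sketch, not a rule from the text: it uses the built-in functions rank and rcond, and the thresholds (full rank, and rcond no smaller than the machine precision eps) are assumptions chosen for illustration.

```matlab
M = [1, 2, 3, 4; 5, 6, 7, 8; 9, 10, 11, 12; 13, 14, 15, 16];

r  = rank(M);    % number of linearly independent rows/columns
rc = rcond(M);   % estimate of the reciprocal condition number

% Illustrative rule of thumb (an assumption): only invert if M has
% full rank and rcond is not below the machine precision eps
if r < size(M,1) || rc < eps
    disp('M is (numerically) singular: inv(M) would not be reliable')
else
    IM = inv(M);
end
```

For this M the rank is 2 rather than 4, so the inverse is never attempted.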

Of course, MATLAB also has predefined functions to compute common manipulations of matrices outside the realm of linear algebra.

For example, to take the sum of all the elements of a vector use the function sum( ) - e.g.

s=sum(v1)

Or to take the product of all the elements of a vector use the function prod( ) - e.g.

p=prod(v2)

These two commands can also be applied to a matrix. In this case the sum or product is computed for each column (i.e. down the rows)

S=sum(A)
P=prod(A)
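If you need row sums or the sum of all elements instead, sum and prod accept a second argument giving the dimension to operate along. The sketch below goes slightly beyond the text above (the dimension argument and the W(:) idiom, which stacks all columns into one long vector, are additions), and uses a small hand-picked matrix so the results can be checked by hand.

```matlab
W = [1 2; 3 4];

colsum  = sum(W);      % sum of each column (down the rows): [4 6]
rowsum  = sum(W,2);    % sum of each row (across columns):   [3; 7]
total   = sum(W(:));   % sum of all elements:                10

colprod = prod(W);     % product of each column: [3 8]
rowprod = prod(W,2);   % product of each row:    [2; 12]
```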

There are MANY other predefined functions in MATLAB that can be applied to vectors or matrices (there are too many to list, but you can look them up). Some of the more popular ones are

exp(A) - the exponential of the elements of A, i.e. e^A, element-by-element

log(A) - the natural logarithm of the elements of A, i.e. ln(A), element-by-element

sqrt(A) - the square root of the elements of A, i.e. A.^(1/2), element-by-element

abs(A) - the absolute value of the elements of A, i.e. |A|, element-by-element

round(A) - rounds the elements of A to the nearest integers

Two popular predefined functions – especially in economics – are the max( ) and min( ) functions. These functions compute the maximum or minimum value of a vector, as well as the location(s) of the maximum or minimum.

For example, consider the following vector

v=[0;1;1;2;3;5;8;13;21;34]

This is just a column vector containing the first 10 numbers of the Fibonacci sequence. The largest element is obviously 34 and occurs in location 10.

If we want MATLAB to calculate the largest element of v and store it in the variable V, we would use the max function and type

V=max(v)

The max (and min) functions can return more than one output: both the maximum value of the vector and the location index of the maximizer. If we want MATLAB to calculate the largest number as well as the location of the maximum, and store them in the variables V and K respectively, we would type

[V,K]=max(v)

Note that this will result in two objects. V contains the maximum value and K contains the location of the maximum. Of course, if the maximum occurred at more than one location (and with numerical computation, the maximizer is often not unique) then K would be the location of the first instance of the maximum.

We can also apply the max and min functions to matrices. In this case, the max or min is computed for each column (i.e. down the rows).

For example, consider the following matrix (random keyboard hackings)

W=[2 4 2; 1 9 5; 6 1 0; 9 4 2; 1 8 7]

If we want MATLAB to calculate, for each column of W, the largest element in the column, we would type

V=max(W)

And if we want MATLAB to calculate the largest number in each column as well as the location of the maximum, and store them in the variables V and K respectively, we would type

[V,K]=max(W)

Again this will result in two objects. V contains the maximum value of each column. And, for each column, K contains the row location at which the maximum occurs.
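The column-wise behaviour is only the default. As a sketch that goes a little beyond the text, the example below also finds the maximum of each row (by passing the dimension as the third argument of max) and the overall maximum of the matrix together with its row and column position. The dimension argument, the W(:) idiom and the built-in ind2sub are additions not discussed above.

```matlab
W = [2 4 2; 1 9 5; 6 1 0; 9 4 2; 1 8 7];

[V, K]   = max(W);         % per column: V = [9 9 7], K = [4 2 5]
[Vr, Kr] = max(W, [], 2);  % per row:    Vr = [4; 9; 6; 9; 8]

% Overall maximum: stack the columns of W into one long vector first
[Vall, idx] = max(W(:));          % largest element and its linear index
[r, c] = ind2sub(size(W), idx);   % convert the linear index to (row, column)
```

Here the value 9 occurs twice; as noted above, max reports the first instance, which in column-stacked order is row 4 of column 1.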

APPLICATION: Maximizing a Function using a Grid-Search

One common application of MATLAB is to numerically solve a mathematical problem that has no analytic solution (e.g. a complicated optimization problem). For illustrative purposes, we consider here a maximization problem whose solution is closed form to show you how this works.

Consider the problem

$$\max_{X \in \mathbb{R}_+} \; X - X^2$$

This problem has a closed form solution: the associated FOC implies that the maximizer $X^*$ solves

$$1 - 2X^* = 0$$

or

$$X^* = 0.5$$

Hence, the maximum value of the objective is

$$X^* - (X^*)^2 = 0.5 - (0.5)^2 = 0.25$$

We can solve the same problem numerically using the computer. To do so, we first need to create the graph of the objective function, i.e. the set of ordered pairs {(x,y): y=f(x), x ∈ D}. Of course, the domain of the objective is a continuum, and a computer's memory is discrete. So we begin by making a discrete grid of the domain

x=(0:0.01:100)'

A discrete approximation to the graph of the objective function is then given by

y = x - x.^2

The vector y contains the value of f(x) at every point of the grid. We can then find the maximum value of f(x), as well as the location index of the argument which achieves the maximum, using the max() command

[y_star,l_star]=max(y)

Finally, to solve for the maximal argument

x_star = x(l_star)

The method we used to solve this problem is called a Grid-Search – we search for the solution over the entire grid of feasible values.
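The accuracy of a grid search depends on the grid itself: the answer can only be as good as the nearest grid point. The sketch below repeats the search above and then, as an illustrative assumption, runs it again on a coarser grid with a step of 0.3, which does not contain the true maximizer 0.5.

```matlab
% Fine grid (step 0.01), as in the text: 0.5 is a grid point
x = (0:0.01:100)';
y = x - x.^2;
[y_star, l_star] = max(y);
x_star = x(l_star);

% Coarse grid (step 0.3): the nearest candidates are 0.3 and 0.6
xc = (0:0.3:100)';
yc = xc - xc.^2;
[yc_star, lc_star] = max(yc);
xc_star = xc(lc_star);

fprintf('fine grid:   x* = %.4f, f(x*) = %.4f\n', x_star, y_star);
fprintf('coarse grid: x* = %.4f, f(x*) = %.4f\n', xc_star, yc_star);
```

The fine grid recovers the closed form solution, while the coarse grid settles on 0.6 with an objective value of 0.24. A finer grid costs more memory and computing time, so there is a trade-off.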

3. Some Useful Commands

There are some commands that you will need to know in order to use MATLAB effectively. Of course, if you are in need of more commands, you can always consult the Product Help under MATLAB's Help menu.

clear

• The clear command clears the variables in MATLAB's memory.

• By itself, using clear will remove ALL VARIABLES from memory.

• To clear only a select list of variables, for example the variables A, B and C, type clear A B C

save

• The save command allows you to save all the variables in memory into a .mat file (.mat files are MATLAB “data” files).

• The general syntax is save filename - this saves all variables in memory into a file named filename, where filename is an unquoted string specifying a file name (which can also include a file extension or a full path name).

• By default, MATLAB will save the file in the current directory; to save the file in a different directory you must specify the full path.

• If you want to save only a select set of variables, say only the variables X, Y and Z, use the command save filename X Y Z

• e.g. save workspace or save workspace.mat or save C:\Work\workspace.mat

load

• The load command allows you to load all the variables from a previously created .mat file into memory.

• The general syntax is load filename - this will load all variables from the file specified by filename, where filename is an unquoted string specifying a file name (which can also include a file extension or a full path name).

• By default, MATLAB assumes that the file filename is located in the current directory; to load a file in a different directory you must specify the full path.

• If filename has no extension, load looks for a file named filename.mat and treats it as a binary MAT-file. If filename has an extension other than .mat, load treats the file as ASCII data.

• If you want to load only a select set of variables, say only the variables X, Y and Z, use the command load filename X Y Z

• Of course, make sure there is no duplication between variables currently in memory and those you are trying to load – duplicates will be overwritten!

• e.g. load workspace or load workspace.mat or load C:\Work\workspace.mat
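One way to sidestep the overwriting problem mentioned above is to call load with an output argument, which places the file's variables in a structure instead of in the workspace. The following is a minimal sketch under stated assumptions: it uses the function forms save(...) and load(...) rather than the command forms shown above, and the file name is a throwaway one generated by the built-in tempname, chosen only for illustration.

```matlab
X = magic(3);
Y = [1; 2; 3];

fname = [tempname, '.mat'];   % throwaway file name (illustration only)
save(fname, 'X', 'Y');        % function form of: save filename X Y

X = 0;                        % X now has a different value in the workspace

S = load(fname);              % variables arrive as fields S.X and S.Y
X_saved = S.X;                % the saved copy; the workspace X is untouched

delete(fname);                % remove the throwaway file
```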

Importing Data

• The easiest way to import data into your MATLAB application is to use the Import Wizard.

• MATLAB can import various types of data files, including text files (.txt files and .dta) as well as spreadsheet files (.csv files and .xls files)

• To use the Import Wizard, from the drop-down menu, go to FILE → IMPORT DATA and then choose the folder/file you wish to import. The Import Wizard will then guide you.

Break

• Sometimes you may have typed in a command or have run a script (more on scripts later) that is taking too long to execute.

• In this case, you will want to stop (or break) MATLAB from running any more of the time-consuming process.

• To break, press and hold together CTRL and C.

Log Files

• Often, you will want to have a log file record the output of your current MATLAB session. This is done through the MATLAB diary() function.

• The general syntax to start your log is diary('filename.txt')

• MATLAB will open the file filename.txt and then append to it the output that appears on your screen from your current MATLAB session. This file will be saved in your current directory.

• To turn off the diary at the end of your session type diary off

• The more advanced user will want to name the log based on the date/time. This can be done with the sequence of code found below

t = clock;   % current date and time as [year month day hour minute seconds]
stamp = strcat( num2str(t(3)), num2str(t(2)), num2str(t(1)) );   % day, month, year
diary( strcat('log_of_', stamp, '.txt') );

This may also be a good place to say a few words about how MATLAB stores and displays numbers. By default, all variables are stored, and calculations are performed, in double precision (64 bits). And, by default, all output is displayed in short format, i.e. in fixed-point format with 4 digits after the decimal point.

You cannot change the precision of your MATLAB machine. However, you can change the way MATLAB displays output using the format command. Various options are available:

format short - fixed point format, with 4 digits after the decimal point.
format long - fixed point format, with 14 to 15 digits after the decimal point.
format short e - floating point format, with 4 digits after the decimal point.
format long e - floating point format, with 14 to 15 digits after the decimal point.
format short g - best of fixed or floating point, with 4 digits after the decimal point.
format long g - best of fixed or floating point, with 14 to 15 digits after the decimal point.

For instance:

format short
pi

vs

format short e
pi

vs

format long
pi

vs

format long e
pi

The format, once defined, applies to the rest of the session. To revert to the default format simply type

format

4. Graphing

A two-dimensional graph is a set of ordered pairs, often represented as a plot of values in the Cartesian plane. MATLAB can easily plot the values of one vector against another using the plot command.

Suppose the vector y contains the values of a certain function at the values of the independent variable given by the vector x. For example

x=(1:10)';
y=exp(x);

We can plot the values of y against the values of x using the plot command.

plot(x,y)

MATLAB will open the Figure Window, and then will plot the pairs of components of these two vectors and connect them by straight lines.

Various line types, plot symbols and colors can be obtained by including a character string as a third argument in the plot command. The general syntax is

plot(x, y,'s')

where s is an element, or any combination of elements, from the following table

Colours        Symbols         Lines
b  blue        .  point        -   solid
g  green       o  circle       :   dotted
r  red         x  x-mark       -.  dashdot
c  cyan        +  plus         --  dashed
m  magenta     *  star
y  yellow      s  square
k  black       d  diamond
w  white       p  pentagram

For example

plot(x, y,'g')

plot(x, y,'md')

plot(x, y,'c+:')

We can edit the plot from the Figure Window. For example, to adjust the axes / enter graph titles, use the drop-down menu and go to EDIT → Axes Properties.

Or, you can insert labels, titles, legends and arrows from the Insert drop-down menu.

And, obviously, you can save the figure from the File drop-down menu.

Suppose now we have a second set of points. For example

y2 = x.^2;

If we try to plot the second set of points

plot(x,y2)

we see that the new plot overwrote the old one in the plot window.

To have two sets of values plotted on the same graph you can either include both in the same call to the plot command, for instance:

plot(x,y,'o', x,y2,'+')

Alternatively, we can add the second plot as a new object to the first plot, without overwriting it, by using the command hold on (hold off will release the window; hold will toggle the window between on and off)

plot(x,y)

hold on

plot(x,y2)

To have several plots displayed side by side (and/or above each other) we create an array of plots with the subplot command. The general syntax is

subplot(r,c,n) , plot(x,y,'s')

where r is the number of rows in this array, c the number of columns, and n is the index of the panel to use for the current plot

subplot(2,2,1) , plot(x,y)
subplot(2,2,2) , plot(x,y2)
subplot(2,2,3) , plot(x,y,'g')
subplot(2,2,4) , plot(x,y,'m:')

MATLAB also allows us to graph 3-dimensional objects. This topic is a little advanced, and to learn everything about 3D plots would require its own separate tutorial; so, I will only provide a brief introduction.

Suppose we wished to graph the following function of two variables:

$$z = e^{-(x-2)^2-(y-2)^2} - e^{-(x+2)^2-(y+2)^2}$$

Formally, the graph of a two variable function is called a surface.

To plot a surface, the first thing we need to do is create a (discrete) grid of points in the (x,y) plane, called a mesh, at which the function will be calculated. We create the mesh using the command meshgrid( ).

x=-5:0.1:5;
y=-5:0.1:5;
[X,Y]=meshgrid(x,y);

Formally, this creates two matrices, X and Y, which together form the Cartesian product of the space defined by x and the space defined by y. The rows of the output array X are copies of the vector x, while the columns of the output array Y are copies of the vector y.

Now we calculate a matrix of values of the function at these points:

Z= exp( -(X-2).^2 -(Y-2).^2 ) - exp(-(X+2).^2 -(Y+2).^2 );

And to plot the surface, we use the command mesh( )

mesh(Z)

Again, we can interactively edit the 3D graph using the drop-down menus.
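Note that when mesh is called with Z alone, the horizontal axes show the row and column indices of Z rather than the actual x and y values. A minimal sketch of one way to show the true coordinates is to pass the mesh matrices as well, and to label the axes from the command line instead of from the menus. The label and title strings below are illustrative choices.

```matlab
x = -5:0.1:5;
y = -5:0.1:5;
[X,Y] = meshgrid(x,y);
Z = exp( -(X-2).^2 - (Y-2).^2 ) - exp( -(X+2).^2 - (Y+2).^2 );

mesh(X,Y,Z)                 % axes now run from -5 to 5 in both directions
xlabel('x');
ylabel('y');
zlabel('z');
title('Surface of z');      % illustrative title
```

The surface has a peak near (2,2) and a trough near (-2,-2), which is easier to read off once the axes carry the actual coordinates.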

5. Linear Algebra

Consider the system of linear equations Ax=b where A is an nxn matrix, b is an nx1 vector, and x is an nx1 vector of unknowns.

Provided the matrix A has full rank, a solution x* to this system exists. Moreover, this solution is unique. In particular, x*=inv(A)*b

Suppose we wanted to solve the system of equations Ax=b in MATLAB. And suppose A is square and has full rank.

A= rand(100,100);
b= rand(100,1);
rank(A)

The most straightforward way to compute the solution to this system is to use the inverse of A. We type

x_star = inv(A)*b

Of course, this requires the computer to calculate the inverse of A directly, which is often a computationally cumbersome task and is sometimes (for large matrices) not feasible. So MATLAB has alternative ways of computing the solution, using its “Matrix Division” methods. That is, we type

x_star = A\b

A\b returns the same answer (up to roundoff) as would inv(A)*b, except it is computed in a different way – in particular, using Gaussian elimination (which is much faster).
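To see that the two approaches agree, here is a small sketch with a hand-picked 3x3 system (an assumption for illustration, chosen so that the exact solution is a vector of ones) instead of the random 100x100 system above.

```matlab
A = [4 1 0; 1 4 1; 0 1 4];
b = [5; 6; 5];                     % chosen so that the exact solution is [1; 1; 1]

x_inv = inv(A)*b;                  % solution via the explicit inverse
x_div = A\b;                       % solution via matrix division

diff_norm = norm(x_inv - x_div);   % difference between the two solutions
resid     = norm(A*x_div - b);     % how well x_div satisfies A*x = b
```

Both diff_norm and resid should be zero up to roundoff. Checking the residual norm(A*x - b) is a useful habit whenever you solve a linear system numerically.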

6. Flow Control

Structural commands, or flow control, are the backbone of programming. In MATLAB, the most basic structural commands include

• “if” statements
• “for” loops
• “while” loops

But before we get into structural commands, some basic understanding of relational and boolean operations is needed. The set of relational and boolean operators includes

<   less than
>   greater than
<=  less than or equal to
>=  greater than or equal to
==  equal to
~=  not equal to

&   and
|   or

Relational operators perform element-by-element comparisons between two variables, and then return a logical array of the same size, with elements set to logical true (1) where the relation is true, and elements set to logical false (0) where it is not. The boolean operators for ''and''/''or'' allow us to combine more than one logical relation.

These relational operators can obviously be used in structural commands.

But they can also be used to create “indicator variables”. For example, consider the following two vectors

v5=10*rand(5,1)
v6=10*rand(5,1)

We can create a vector of 1s and 0s, 1 if true and 0 if false, based on an element-by-element comparison of these two vectors. For example, we may want to create a variable I with 1's wherever the elements of v5 are greater than the corresponding elements of v6. Then, simply put the logical statement into brackets - we type

I=(v5 > v6)

We can also make an indicator variable based on some operation on v5 and v6. For example, we may want to create a variable J with 1's wherever 2 times the elements of v5, minus the corresponding elements of v6, are greater than zero. We type

J=(2*v5-v6 > 0)
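Indicator variables become more useful once you combine them or use them to select elements. The sketch below uses two hand-picked vectors (an assumption, so that the results are reproducible, unlike the random v5 and v6 above). Using an indicator as an index, as in v5(I), is an addition not covered in the text.

```matlab
v5 = [3; 8; 1; 6; 5];
v6 = [4; 2; 9; 6; 1];

I = (v5 > v6);          % [0; 1; 0; 0; 1]
J = (2*v5 - v6 > 0);    % [1; 1; 0; 1; 1]

both   = I & J;         % 1 where both conditions hold
either = I | J;         % 1 where at least one condition holds

n_true = sum(I);        % number of elements of v5 that exceed v6
picked = v5(I);         % the elements of v5 for which I is true
```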

The first structural statement we will review is the if statement. The if statement evaluates a logical expression and executes a group of statements when the expression is true. An end keyword, which matches the if, terminates the last group of statements. The optional elseif and else keywords provide for the execution of alternative groups of statements.

The general form of the if statement is

if logical relation
    statements to be executed if the logical relation is true
end

We can extend the if statement to perform a command when the logical relation is false

if logical relation
    statements to be executed if the logical relation is true
else
    statements to be executed if the logical relation is false
end

or we can extend the if statement to perform a command based on many logical relations

if logical relation (1)
    statements to be executed if the logical relation (1) is true
elseif logical relation (2)
    statements to be executed if the logical relation (2) is true
...
else
    statements to be executed if all logical relations are false
end

Here is a simple example

I= round(10*rand)
J= round(10*rand)
if I == J
    A = I;
elseif I > J
    A = I;
else
    A = J;
end

The next two most important structural commands are loops.

The for loop repeats a group of statements a fixed, predetermined number of times. A matching end delineates the statements to be executed.

The general form of the for loop is

for i = 1 : N
    statements to be repeated
end

Basically, this statement will make a counter variable called “i”, which begins at 1 and ends at N, where N is a pre-specified number, and then the statements are repeated for each value of the counter. Of course, nothing prevents us from beginning at a number other than 1.

Here is an example of a simple for loop.

N=5
for i = 1 : N
    R(i) = i^2
end

We can also nest for loops within each other. Just make sure to name the counters differently. For example, consider a set of commands to make a matrix whose elements are the difference between the column and row numbers

N=5
M=5

for i = 1 : N
    for j = 1 : M
        R(i,j) = j - i ;
    end
end

NB: Here, I have used for loops to build matrices element-by-element. However, unless the matrix you want to build is directly related to the counter in the loop (and even then it's questionable), this should never be done in practice – computationally, this is extremely inefficient! Rather, figure out a way to use matrix operations to build the desired matrix.
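As an illustration of this advice, the two loop examples above can be rewritten without loops. The sketch below is one possible approach, not the only one: it uses meshgrid (introduced in the section on graphing) to build matrices of column and row indices. It also preallocates the loop version with zeros so that the matrix does not have to grow on every iteration. The variable names R_loop, R_vec, Jc and Ir are illustrative.

```matlab
N = 5;
M = 5;

% Loop version, with the result preallocated
R_loop = zeros(N,M);
for i = 1:N
    for j = 1:M
        R_loop(i,j) = j - i;
    end
end

% Vectorized version: Jc(i,j) = j and Ir(i,j) = i
[Jc, Ir] = meshgrid(1:M, 1:N);
R_vec = Jc - Ir;

% The squares from the first loop example, without a loop
R_sq = (1:N).^2;

isequal(R_loop, R_vec)   % true: both approaches give the same matrix
```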

The while loop repeats a group of statements an indefinite number of times, under the control of a logical condition. A matching end delineates the statements to be executed.

The general form of the while loop is

while logical expression
    statements to be repeated
end

Basically, this statement, on each repetition, checks whether the logical expression is true or not. If it is true, the statements are executed; if it is false, the loop terminates.

Here is an example of a while loop, which numerically computes the steady state of the dynamic equation given by X(t+1) = sqrt(X(t)) - X(t), with X(0) given.

X1 = 10;
CRIT = 1;
TOL = 1e-4;
counter = 1;

disp( 'Iteration Crit')

while CRIT > TOL
    X0 = X1;
    X1 = sqrt(X0) - X0;
    CRIT = norm(X1 - X0);
    counter = counter + 1;
    iter = [counter, CRIT]
end
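Two practical remarks on this example. First, a while loop whose stopping criterion is never met will run forever, so it is common to add a cap on the number of iterations. Second, with a starting value of 10 the first iterate sqrt(10) - 10 is negative, so MATLAB continues with complex numbers. The sketch below is a variation under stated assumptions: the starting value 0.5 (any value strictly between 0 and 1 keeps the iterates real), the cap MAXIT, and the short-circuit “and” operator && are all additions not taken from the text.

```matlab
X1      = 0.5;     % starting value in (0,1) keeps the iterates real (assumption)
CRIT    = 1;
TOL     = 1e-4;
MAXIT   = 100;     % illustrative cap on the number of iterations
counter = 0;

while CRIT > TOL && counter < MAXIT
    X0 = X1;
    X1 = sqrt(X0) - X0;
    CRIT = abs(X1 - X0);       % distance between successive iterates
    counter = counter + 1;
end

fprintf('Stopped after %d iterations at X = %.6f\n', counter, X1);
```

The steady states of this equation solve X = sqrt(X) - X, i.e. X = 0 or X = 0.25. From this starting value the iteration settles near 0.25.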

As before, we can nest while loops within each other. In fact, we can nest for, while, and if statements all together: this is how you make a program!

7. M-Files and Scripts

M-files are text files which contain MATLAB code. M-files come in two varieties: scripts and functions (more on functions later).

A script is a collection of MATLAB commands, to be executed sequentially, for the purposes of a particular computation. These commands operate on data in the workspace, so the variables they create become objects in the workspace of the session, and any variables they refer to must already be defined in your session.

In MATLAB, script files are saved with the file extension “.m”. This is why these files are called M-Files.

MATLAB has its own EDITOR (embedded with an automatic DEBUGGER) to create and modify m-files. To create a script file from scratch, we start by just issuing the command edit, which opens up a new EDITOR window

edit

Alternatively, we can use the menu, and go to FILE → New → Blank M-File.

For illustrative purposes, consider a short script file which will create an n×n matrix of zeros and then populate it with random numbers just above and below the diagonal, where n is to be determined by you.

Of course, every good program needs to be documented – so that you don't forget what you are doing and so others who may look at your program can understand what you are trying to do. Comment lines in an m-file are preceded by the percent symbol, %, and these lines will not be executed by the program.

%Get size of n from the user
n=input('Enter the dimensions of the matrix: ')

%Create matrix of zeros
B=zeros(n,n);

%Populating elements just above and below the diagonal
for i=1:n
    for j=1:n
        if ( j==i-1 & i>=2 ) | ( j==i+1 & i<n )
            B(i,j)=rand;
        end
    end
end
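For comparison, the same kind of matrix can be built without loops. This is a sketch under one assumption: n is fixed in the code rather than read with input, so that the snippet runs unattended.

```matlab
n = 6;  % fixed here for illustration; the script above reads n with input()

% diag(v, 1) places v on the first superdiagonal, diag(v, -1) on the
% first subdiagonal; each of these diagonals has n-1 entries
B = diag(rand(n-1, 1), 1) + diag(rand(n-1, 1), -1);

disp(B)
```

The random numbers differ from those the loop would draw, but the pattern is the same: zeros everywhere except just above and just below the diagonal.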

Once you have typed in the above commands in your script, use the drop down menu to save the file as “pop.m” in your “My Documents” directory.

To be able to use an m-file we have to tell MATLAB where to look for it. The command path tells us which directories MATLAB looks at (a lot of MATLAB functions are just m-files).

We can add a directory to the path using the drop-down menu. Go to FILE → Set Path. Then, from the Set Path Dialog Box, click on Add Folder and select the folder in which you saved your .m file. Finally, SAVE your appended path.

Now we can execute commands in the script just by calling the name of the m-file from the Command Window.

pop

The M-file we have created above is very simple. In your own work, you will most likely create much larger M-files and it is quite likely (even certain) that you will make mistakes. MATLAB has a powerful Debugger which considerably facilitates finding errors.

To run a script file with the assistance of the Debugger, while in the Editor Window just hit F5 on the keyboard. Or, we can use the drop-down menu and go to DEBUG → Run pop.m.

Another useful feature of MATLAB is its Profiler. This tool profiles the execution time of your scripts, broken down by each line of code. This allows you to sort through your sequence of MATLAB commands and optimize CPU time. To access the Profiler, from the Editor Window go to Tools → Open Profiler. Once the Profiler Window opens, click on Start Profiling to begin profiling your script.

8. Functions

A Function is a rule or procedure which maps a set of arguments into a value.

In MATLAB, functions can be single valued, or vector/matrix valued. In fact, the output of a function can be an array of matrices.

MATLAB allows users to define two different types of functions: (i) Anonymous Functions, and (ii) M-File Functions.

Anonymous functions give you a quick means of creating simple, one-line functions. The syntax for creating an anonymous function from an expression is

fhandle =@(input_args) expression

The syntax statement shown above constructs an anonymous function by the name fhandle (called the function handle) which takes as arguments input_args and performs the computation provided in expression .

Creating a function handle, fhandle, for an anonymous function gives you a means of invoking the function.

Leading off, the statement = @ declares to MATLAB the construction of a function handle.

Next input_args is a list of comma-separated arguments - e.g. (arg1, arg2, … ) - to be passed to the function.

Finally, expression represents the body of the function: the code that performs the main task your function is to accomplish. This consists of any single, valid MATLAB expression.

As an example, consider an anonymous function that calculates the square of a number.

squared =@(x) x.^2

When you call this function, MATLAB assigns the value you pass in to the variable x, and then uses x in the expression x.^2. Note that this function has been “vectorized” so it can be applied to scalars, vectors and matrices.

Once defined, you can use your anonymous function by specifying the variable that contains the handle, followed by a comma-separated argument list in parentheses, i.e. fhandle(arg1,arg2, … ). For our example

y=squared(2)

or

x=(0:10)'
y=squared(x)

or, for a matrix M,

Y=squared(M)

The example above shows an anonymous function based on a single argument. Of course, we can construct functions based on many arguments. For example

squ=@(x,y) x.^(2) + y.^(2)

In this case, to call the function we would use squ(x,y), e.g.

x=(1:10)'
y=(-1:-1:-10)'
z=squ(x,y)

And, of course, we can nest one function inside another function. For example

squared =@(x) x.^2
squ2 = @(x,y) squared(x) + squared(y)

It should be obvious that the functions squ and squ2 are functionally equivalent.
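A quick way to convince yourself of the equivalence is to evaluate both handles on the same inputs. The sketch below reuses the definitions and the vectors x and y from the text; the names z1 and z2 are introduced only for this check.

```matlab
squared = @(x) x.^2;
squ  = @(x,y) x.^(2) + y.^(2);
squ2 = @(x,y) squared(x) + squared(y);

x = (1:10)';
y = (-1:-1:-10)';

% Both handles should return the same vector; since y = -x here,
% that vector is 2*x.^2
z1 = squ(x, y);
z2 = squ2(x, y);

disp(isequal(z1, z2))
```

Note that squ2 stores the handle squared as it was when squ2 was created, so redefining squared afterwards does not change what squ2 computes.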

The second type of functions that can be defined in MATLAB are M-File Functions (named as such because they are written in separate code within an M-File)

M-File functions provide the highest degree of functionality in the language. They are the equivalent of subroutines in FORTRAN or procedures in PASCAL. They constitute the basis for complex programs and model prototyping.

M-File functions are used to construct functions which require more than one line of code for their computation. In addition, they can be used to perform a particular task (a subroutine) for a much larger script.

To begin the construction of an M-File function we need to open up a new M-File. Go to FILE → New → Function M-File. This will open up the M-File Editor.

The basic syntax for a M-File function is

function [output_args] = funname(input_args)
    expressions
end

An M-File function defines a function with the name funname which accepts as inputs the list of arguments given by input_args - e.g. (arg1, arg2, … ) - and returns as output output_args. Note, M-file functions can be array-valued; they can produce as output a list of scalars/vectors/matrices, e.g. [out1, out2, … ]. Input arguments are to be enclosed in round brackets; output values are to be enclosed in square brackets.

The command function declares that the M-File being used is of the function type.

Next output_args is a list of comma-separated values enclosed in square brackets - e.g. [out1, out2, … ] - to be computed by the function

Following which, = funname defines the handle by which the function is to be called.

Then, input_args is a list of comma-separated inputs enclosed in round brackets - e.g. (arg1, arg2, … ) - to be passed to the function.

expression represents the body of the function: the code that performs the main task your function is to accomplish in the construction of output_args. This consists of any list of MATLAB expressions.

An end command completes the function declaration

Note, by default, the variables passed into the function as arguments and those introduced within the body of the function are all local variables. That is, they exist only inside the function that uses them, and are saved in a workspace separate from the main workspace.

This is often useful – a sort of Object Oriented approach to programming.

However, sometimes you will want to share a single copy of a variable between the base workspace and a function, so that it can be manipulated by both. In this case, you will need to declare the variable as a global variable, both in the base workspace and in any function which plans to use it. For example, suppose we want to share the variable X between our base workspace and an M-File function. Then, we will type in our larger script file, and in our M-file function,

global X

Once you have finished programming an M-File function it must be saved – as an .m file – in order for you to be able to call it during a larger computation routine. From the M-File Editor window go to FILE → Save As.

The name of the m-file should (must) be the same as the function handle, funname, as defined in the first line of the M-file.

By default, the function will be saved in the Current Directory. But you can save the file in any directory which has been added (or that you will add) to the MATLAB Path.

Once saved, for you to call on an M-File function simply use its function handle, along with passing it the relevant input arguments.

funname(arg1, arg2, …)

We will see an example of M-file functions a little later on.
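In the meantime, here is a minimal sketch of the syntax described above, with two outputs. The function name colstats and its contents are hypothetical and chosen only for illustration; the file would be saved as colstats.m so that the file name matches the function name.

```matlab
function [m, s] = colstats(X)
%COLSTATS Column means and standard deviations (hypothetical example).
    n = size(X, 1);                       % number of rows (observations)
    m = sum(X, 1) / n;                    % 1-by-k vector of column means
    D = X - ones(n, 1) * m;               % deviations from the column means
    s = sqrt(sum(D.^2, 1) / (n - 1));     % sample standard deviations
end
```

Calling [m, s] = colstats([1 2; 3 4; 5 6]) from the Command Window returns m = [3 4] and s = [2 2]. The variables n and D are local to the function and never appear in the base workspace.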

9. Numerical Methods in Macroeconomics

The most common use of MATLAB in macroeconomics is to solve Dynamic Programming Problems. In this section, we cover two important examples: (1) Deterministic Growth Model and (2) Stochastic Growth Model.

Example: Deterministic Growth Model

Consider the planner's growth problem to choose a pair of infinite sequences {ct} and {kt+1} to solve

$$\max \sum_{t=0}^{\infty} b^t u(c_t) \quad \text{s.t.} \quad c_t + k_{t+1} = f(k_t), \quad k_0 \text{ given}$$

Using the constraint, we can write this as a choice over the sequence {kt+1} alone

$$\max \sum_{t=0}^{\infty} b^t u[f(k_t) - k_{t+1}], \quad k_0 \text{ given}$$

From our study of Dynamic Programming we know that we can equivalently write this problem using a Bellman Equation

$$v(k) = \max_y \; u[f(k) - y] + b\,v(y)$$

Even though the Bellman Equation is a functional equation (i.e. the unknown of the equation is a function, namely v), it is much easier to solve (especially by computer).

Namely, we make a guess of v, call it v0, and then create the sequence of functions defined by

$$v_{t+1}(k) = \max_y \; u[f(k) - y] + b\,v_t(y)$$

By the contraction mapping theorem, this sequence converges to the solution of the Bellman Equation.

For arbitrary functions u and f, there is no closed form solution to this problem. However, if we parametrize u and f, and then calibrate these parameters, we can solve the problem numerically, i.e. solve for the graph of v given by the ordered pairs {( k,v(k) ) : k in K}.

At the same time, by solving the Bellman equation, we derive the optimal policy g(k), which can be used to back out the optimal sequence {kt+1} through the iteration kt+1=g(kt).

Using what we have learned so far, let's open up an M-File and write a script to solve the planner's problem.
• First, let's parametrize the problem.
• Next, we discretize the state-space.
• Then, for each value of k in the state-space, we solve the associated DP problem.
• Of course, to solve for the entire function v(k) we need to do this for all k in the state-space together.
• To do this, we make a matrix containing values of the RHS of the Bellman equation, with element (i,j) = U( k(j) , y(i) ).
• That is, each column of the matrix fixes a value for k, and then computes the value of the RHS for every possible value of y.
• We then maximize down columns, i.e. maximize each vector U(:,j).

clear

%Calibration & Parametrization
b=0.9
a=1/4;
f=@(x) x.^a;
s=1/2;
u=@(k,K) sqrt( (f(k)-K>0) .* (f(k)-K) );

%State and Choice space
K=(0.001:0.001:0.3)';
n=size(K,1);
Y=K;

%Initial Guess
V1=zeros(n,1);

%Tolerance Level
TOL=0.001;
CRIT=1
ITER_COUNT=1

%Preallocating the matrix of RHS values
U=zeros(n,n);

%Iterating on the recursion v_t+1 = T(v_t)
while CRIT > TOL
    %Updating guess
    V0=V1;
    %Computing T(v_t) column by column
    for i=1:n
        %fixing k
        k=K(i,1)*ones(n,1);
        %computing the RHS for fixed k
        U(:,i)=u(k,Y)+b*V0;
    end
    %Maximizing down columns
    [V1,IND]=max(U);
    V1=V1';
    %Computing Convergence Criteria
    CRIT=norm(V1-V0);
    ITER_COUNT=ITER_COUNT+1
end

%Constructing g(K)
g=K(IND);

%Some Graphs
figure
plot(K,V1) %Graph of V
figure
plot(K,g) %Graph of g(K)
hold on
plot(K,K) %45 degree line
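The inner for loop that fills U column by column can itself be replaced by matrix operations. The sketch below performs one application of the Bellman operator both ways and compares the results. It reuses the grid and parameters of the script above; the names U_loop, U_vec and C are introduced here for illustration, and bsxfun is used so that the code does not rely on implicit expansion.

```matlab
a = 1/4;  b = 0.9;
K = (0.001:0.001:0.3)';        % same grid as the script above
n = size(K, 1);
V0 = zeros(n, 1);              % current guess of the value function

% Loop version: one column of U per value of k
U_loop = zeros(n, n);
for i = 1:n
    c = K(i)^a - K;                              % consumption for each choice y
    U_loop(:, i) = sqrt((c > 0) .* c) + b*V0;
end

% Matrix version: C(i,j) = f(K(j)) - K(i), rows index y and columns index k
C = bsxfun(@minus, (K.^a)', K);
U_vec = bsxfun(@plus, sqrt(max(C, 0)), b*V0);

% One step of the Bellman operator, maximizing down columns
[V1, IND] = max(U_vec);

% Largest absolute difference between the two constructions
disp(max(abs(U_loop(:) - U_vec(:))))
```

Since the consumption matrix C does not depend on the current guess, it could be computed once before the while loop, leaving only the addition of b*V0 and the max inside it.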

Example: Stochastic Growth Model

Consider the planner's stochastic growth problem, formulated as a dynamic problem

$$v(k,z) = \max_y \; u[z f(k) - y] + b\,E_z[v(y,z')]$$

Here, z is a shock that is observed at time t. Of course, the next-period value of the shock is unknown at time t, hence we need to compute an expected continuation value.

We assume that z follows a Markov process such that f(z'|x,z,y)=f(z'|z). Of course, this implies that f(z',y'|z,y)=0 for all y'!=y (but, as it turns out, the program we will write can handle the more difficult case as well).

For simplicity, we assume that the space of z is a discrete space (you will see how to deal with more complex cases in class). However, even with this discrete assumption, we will run into a dimensionality problem, so to make the problem simpler, we assume that z can take only two values, {ZL, ZH}. As such, its stochastic process can be described by a Markov matrix

P=[ PLL, PLH ; PHL , PHH]

Again, we solve this problem by iterating on the recursion vt+1=T(vt). Here v is a function of two variables, and we want to represent this using a vector. The issue is that we usually represent a function of two variables using a matrix.

How do we represent a function of two variables using a vector? Easy: we stack the matrix columns. Intuitively, what we are doing is calculating v(s) where s is a (k,z) pair from the Cartesian product KxZ.

Of course, this means that we translate the Markov matrix for the process of z' into a Markov matrix for the process of s=(y,z'), where y is not stochastic, but rather a choice variable. But this is easy: for a given y, this larger matrix will have Pz'z as an element only at its y coordinates, and zeros everywhere else.
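One compact way to build this larger matrix is a Kronecker product. The sketch below is an illustration under assumptions: a tiny grid (n = 4) and the transition probabilities 0.6 and 0.4 that the script further down uses. It checks that the Kronecker construction matches the block-by-block construction and that every row still sums to one.

```matlab
n = 4;                      % small grid size, for illustration only
P = [0.6, 0.4; 0.4, 0.6];   % two-state Markov matrix (values from the script below)

% Stacked transition matrix on s = (k, z): block (a, b) is P(a, b)*eye(n)
PS_kron = kron(P, eye(n));

% The same matrix assembled block by block
p1 = 0.6*eye(n);
p2 = 0.4*eye(n);
PS_blocks = [p1, p2; p2, p1];

disp(isequal(PS_kron, PS_blocks))
disp(sum(PS_kron, 2)')      % row sums of the stacked matrix
```

With this matrix, PS*V0 stacks the expected continuation values for the low and the high state, which is how the script uses it.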

Using what we have learned so far, let's open up an M-File and write a script to solve the planner's problem.

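clear

%Parameters and grid
b = 0.95
K=(0.001:0.001:1.5)';
n = size(K,1);

%Stacked state space s=(k,z): first n entries are z=zL, last n are z=zH
X=[K;K];
N=size(X,1);
zL=2;
zH=4;

%Markov matrix for the stacked state
p1=0.6*eye(n);
p2=0.4*eye(n);
PS=[p1,p2;p2,p1];

%Selector matrix
Os = ones(n,n);
Zs = zeros(n,n);
Ds = [Os, Zs; Zs, Os];

%Period utility for every (choice, state) pair
U=zeros(N,N);
for i = 1:N
    if i<=n
        x= X(i,1)*ones(N,1);
        U(:,i) = sqrt( (zL*(x.^(1/4)) -X>0) .* (zL*x.^(1/4)- X) );
    else
        x= X(i,1)*ones(N,1);
        U(:,i) = sqrt( (zH*(x.^(1/4))-X>0) .* (zH*x.^(1/4)- X) );
    end
end

%Iterating on the recursion v_t+1 = T(v_t)
V1=zeros(N,1);
CRIT = 1;
TOL = 1e-4
COUNT=1

while CRIT >TOL
    V0=V1;
    EV=(PS*V0)*ones(1,N);
    V=(U+b*EV).*Ds;
    [V1,I] = max(V);
    V1=V1';
    CRIT = max(abs(V1-V0))
    COUNT=COUNT+1
end

%Value and policy functions for each state of z
VL=V1(1:n,1);
VH=V1(n+1:2*n,1);
g=X(I);
gL=g(1:n,1);
gH=g(n+1:2*n,1);

%Graphs of the policy functions and the 45 degree line
plot(K,gL)
hold on
plot(K,gH)
hold on
plot(K,K)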

10. Introduction to the Optimization Toolbox

Consider the unconstrained minimization problem

$$\min_x f(x)$$

where the domain of f, X, is some compact and convex subset of R^N, and f: X→R is twice continuously differentiable (C^2).

We have considered these types of problems before: the RHS of our Bellman equation is a single-variable optimization problem. In our study of dynamic programming, we used a grid-search method to solve the problem: that is, we evaluated the objective function for every point in the domain of x, and then found the value of x* at which the function was the largest.

Grid-search methods work fine when the number of choice variables is just one or two. However, when we are considering optimization problems for three or more choice variables, we will run into a dimensionality problem – and so a grid-search method is not feasible. So, we need to find other numerical methods.

Assume the minimizer x* lies in the interior of X. Then x* solves the system of first order necessary conditions

g(x*)=0

where g is the (N-dimensional) gradient of f.

If g(x)=0 is a system of linear equations (e.g. Ax-b=0) then there is an analytical closed-form expression for x* (e.g. x*=A^{-1}b).

There may exist a closed form expression for x* even when g(x) is nonlinear. However, for the most interesting cases (e.g. in econometrics) this is quite uncommon. Instead, for non-linear models, we have to numerically solve the system g(x)=0. How do we do this?

To solve the nonlinear system g(x)=0 we use an iterative procedure:
• The algorithm is initialized with an initial guess x0
• At iteration k≥1 of the algorithm, our guess is updated to

xk+1 = D(xk)     (∗)

where D(·) is some function of x - typically constructed from 1st and 2nd derivatives of f.
• We iterate on equation (∗) until we converge to a fixed point of D(·)

Of course, the trick is to construct D(·) so that a fixed point of D(·) is indeed a solution to g(x)=0. How do we do this?

Newton's Method

Consider a first-order Taylor approximation of g(x*) around some arbitrary value x

g(x*) ≈ g(x) + H(x)(x*-x)

where H is the Hessian (NxN matrix) of f. Now, since g(x*) =0, it follows that

0 ≈ g(x) + H(x)(x*-x)

Solving for x* we get: x* ≈ x - H(x)^{-1}g(x)

Based on this idea, Newton proposed that D take the form D(x) = x - H(x)^{-1}g(x). That is, Newton iterations are based on

xk+1 = xk - H(xk)^{-1}g(xk)

Note that, at a fixed point x of D, we have

x = x - H(x)^{-1}g(x) ⇔ H(x)^{-1}g(x)=0 ⇔ g(x)=0

That is, a fixed point of D is a solution to g(x)=0.
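The iteration can be sketched in a few lines. Everything specific in the example below is an assumption made for illustration: the test function f(x) = exp(x1) - x1 + x2^2 (minimized at the origin), the starting value, the tolerance, and the cap max_iter on the number of iterations.

```matlab
% Hypothetical test problem: f(x) = exp(x1) - x1 + x2^2, minimized at (0, 0)
g = @(x) [exp(x(1)) - 1; 2*x(2)];      % gradient of f
H = @(x) [exp(x(1)), 0; 0, 2];         % Hessian of f

x = [1; 1];          % initial guess x0
tol = 1e-10;
max_iter = 50;       % guard against non-convergence
crit = Inf;
iter = 0;

while crit > tol && iter < max_iter
    step = H(x) \ g(x);      % solves H(x)*step = g(x) without forming inv(H)
    x = x - step;            % Newton update: x_{k+1} = x_k - H(x_k)^{-1} g(x_k)
    crit = norm(step);
    iter = iter + 1;
end

fprintf('Stopped after %d iterations at (%g, %g)\n', iter, x(1), x(2));
```

Using the backslash operator rather than inv is a design choice: it solves the linear system directly, which is generally cheaper and more accurate than forming the inverse.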

Some Comments:
• Newton's method requires computation of both the gradient and the Hessian of the objective function f.
• Often, these derivatives have closed form expressions and are straightforward to compute.
• However, in many cases - especially when f is a complicated function - coming up with closed form expressions of these derivatives may be burdensome, if not impossible.
• In these cases, we use a numerical approximation to these derivatives, based on their definitions as limits of finite differences.
• Note, while the above discussion was presented in terms of a minimization problem, it is straightforward to extend our analysis to maximization problems. Recall

maxx f(x) = - minx -f(x), with the same optimizer x*

• Thus, if you want to maximize f(x), you can equivalently minimize -f(x).
• Finally, Newton's method solves only a local optimization problem (local to the starting value x0). So, it works very well for optimization problems where the objective is strictly quasi-convex (the algorithm will always converge to the global optimum).
• For problems where the objective function is not well behaved, depending on the initial guess, the algorithm may not converge; and if it does converge, it will only find a local optimum.
• As such, a global search over different starting values will be necessary (global search algorithms are beyond the scope of this tutorial).
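The finite-difference idea mentioned in the comments above can be sketched as follows. The objective f, its analytic gradient g_exact, the evaluation point and the step size h are all assumptions chosen for illustration; the analytic gradient is included only so that the approximation can be checked.

```matlab
% Hypothetical objective and its analytic gradient, used only to check the approximation
f = @(x) x(1)^2 * x(2) + sin(x(2));
g_exact = @(x) [2*x(1)*x(2); x(1)^2 + cos(x(2))];

x = [1.5; -0.7];
h = 1e-6;                    % step size (an assumption; the best value is problem-specific)
N = numel(x);
g_num = zeros(N, 1);

for i = 1:N
    e = zeros(N, 1);
    e(i) = h;                                     % perturb coordinate i only
    g_num(i) = (f(x + e) - f(x - e)) / (2*h);     % central difference
end

% Numerical gradient next to the analytic one
disp([g_num, g_exact(x)])
```

A central difference costs two function evaluations per coordinate, which is one reason supplying analytic derivatives can pay off when f is expensive to evaluate.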

MATLAB's Optimization Toolbox provides us the means to numerically solve a minimization problem. The Optimization Toolbox has pre-programmed routines to solve many different types of problems, including

∙ fminunc - solves unconstrained minimization problems
∙ fminsearch - solves unconstrained minimization problems using derivative free methods
∙ fmincon - solves constrained minimization problems
∙ fsolve - solves systems of nonlinear equations
∙ linprog - solves linear programming problems
∙ quadprog - solves quadratic programming problems
∙ lsqnonlin - solves non-linear least squares problems

The syntax for these various routines is similar. Our discussion here will focus on fminunc.

fminunc attempts to find a minimum of a real-valued function of several variables, starting at an initial estimate. Let fun.m be an M-File function which computes the value of the function f for any given value of x. Let x0 be your initial guess of the optimum. Finally, let options be a structure of optimization options specified using optimset (more on this later). The general syntax for fminunc is

xstar = fminunc('fun',x0,options)

or

[xstar,fval,exitflag,output,grad,hessian] = fminunc('fun',x0,options)

This procedure will return:
∙ the solution vector, stored in xstar
∙ the optimized value of the objective, stored in fval
∙ a variable, exitflag, that describes the exit condition (a value of 1 means that the procedure terminated successfully because g(x)≈0, a value of 2 means that the procedure terminated successfully because norm(xk+1- xk)≈0, and a value of 0 means that the procedure failed: it stopped at the maximum number of iterations or function evaluations)
∙ a structure, called output, that contains information about the optimization
∙ the gradient of the objective, stored in the vector grad
∙ the Hessian of the objective, stored in the matrix hessian

NB: fminunc is a Quasi-Newton method. In particular, the medium-scale algorithm is a BFGS direction set method while the large-scale algorithm is a Trust Region method (precise descriptions of these are beyond the scope of this tutorial) . The computation of the iteration rules under these approaches are different than Newton's, but the idea behind them is similar.

If fun.m uses additional parameters in its computations - for example, a vector/matrix called data - you can pass them through fminunc. In this case, the syntax is

x = fminunc('fun',x0,options,data)

or

[x,fval,exitflag,output,grad,hessian] = fminunc('fun',x0,options,data)
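Another common way to hand extra data to an optimizer is to wrap the objective in an anonymous function that captures the data. The sketch below uses fminsearch rather than fminunc, so that it runs in base MATLAB without the toolbox; the same wrapping pattern should carry over to fminunc, which also accepts function handles. The sample data and the names ssd, obj and mstar are made up for illustration.

```matlab
data = [2.1; 3.9; 6.2; 8.0; 9.8];       % made-up sample, for illustration

% Objective with an extra argument: sum of squared deviations from m
ssd = @(m, d) sum((d - m).^2);

% Wrap it so the optimizer sees a function of m only; data is captured
% when the handle is created
obj = @(m) ssd(m, data);

m0 = 0;                                  % initial guess
mstar = fminsearch(obj, m0);

fprintf('minimizer: %g, sample mean: %g\n', mstar, mean(data));
```

The minimizer of a sum of squared deviations is the sample mean, which gives an easy check on the result.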

The structure options allows us to choose some of the particulars of the minimization algorithm, as set by optimset. There are many options to choose from, and covering all of them would require a class on its own. Consequently, you should definitely look at the MATLAB documentation to see how all of them work (in Product Help, type fminunc, and then click on "See Optimization Options").

The general syntax to set options is

options = optimset('procedure_name')
options = optimset(options, 'Op_1', value1, ... , 'Op_N', valueN)

Some popular options are (note, optimset ignores case in option names)

• 'Largescale' – switches between a large-scale algorithm and a medium-scale algorithm.
    • 'Largescale', 'off' – is for a medium-scale, BFGS direction set algorithm. Moreover, gradients and hessians are computed by MATLAB using finite differences (user defined gradients are optional, but not hessians).
    • 'Largescale', 'on' – is for a large-scale, trust-region algorithm. Here gradients must be computed by the user in fun.m (user defined hessians are optional).
    • If you choose a large-scale algorithm, you must add the option 'GradObj', 'on' to tell MATLAB that fun.m will calculate the gradient. (User-defined gradients for the medium scale algorithm are optional, using the same command.)
    • User defined Hessians are optional for the large-scale algorithm - if fun.m also computes the hessian, you can add the option 'Hessian', 'on'. (User-defined Hessians are not available for a medium scale algorithm.)
• 'Display' – selects what output the procedure should display.
    • 'Display', 'off' – displays no output
    • 'Display', 'iter' – displays the iteration count, number of function evaluations, value of f, stepsize in the BFGS algorithm, and the value of norm(g)
    • 'Display', 'final' – displays only the final output
• 'TolX' – Termination tolerance for the convergence criterion norm(xk+1- xk)≈0. By default, this is set to 1e-6. Personally, I often use 'TolX', 1e-4
• 'TolFun' – Termination tolerance for the convergence criterion g(x)≈0. By default, this is also set to 1e-6. Personally, I often use 'TolFun', 1e-4
• 'MaxIter' – Maximum number of iterations before the procedure terminates unsuccessfully. By default, this is set to 400. Personally, I usually don't change this. But when I do, I use 'MaxIter', 1000*dim(x)
• 'MaxFunEvals' – Maximum number of function evaluations before the procedure terminates unsuccessfully. By default, this is set to 100*dim(x). Note, for a medium-scale algorithm, each iteration uses at least 4 function evaluations. Personally, I usually don't change this. But when I do, I use 'MaxFunEvals', 1000*dim(x)

Note, you do not need to specify all/any of these options – it is fine to leave them at their default values. Moreover, you can specify more options than I have described – check the documentation for more options.

EXAMPLE: A simple optimization problem

Consider the unconstrained maximization problem

$$\max_{x_1, x_2} \; 4x_1 - x_1^2 - 6x_2 - x_2^2 - 25$$

It is straightforward to see that this has a global maximum of -12 at (x1*,x2*)=(2,-3). Let's write a program to solve for the maximum using fminunc.
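Because the objective is quadratic, its first-order conditions form a linear system, so the solution can be checked directly before running any optimizer. The sketch below does this; the names A, c, xstar and fstar are introduced here for illustration.

```matlab
% Objective from the example above
f = @(x) 4*x(1) - x(1)^2 - 6*x(2) - x(2)^2 - 25;

% The gradient is g(x) = c + A*x, with
A = [-2, 0; 0, -2];     % Hessian of f (constant, negative definite)
c = [4; -6];

% First-order conditions g(x) = 0  <=>  A*x = -c
xstar = A \ (-c);
fstar = f(xstar);

disp(xstar')
disp(fstar)
```

The negative definite Hessian confirms that the stationary point is a maximum, and the values of xstar and fstar give a benchmark against which to compare the output of fminunc.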

First, we will consider the use of a medium-scale algorithm.

Then, we will consider the use of a large-scale algorithm where only the gradient is provided.

Medium-Scale Algorithm: Finite Difference Gradient and Hessians

Step 1. Create a new M-File function to compute f(x).

function obj = fun(x)
%Parsing out x-vector
x1=x(1);
x2=x(2);

%Creating objective function
f = 4*x1 -x1.^2 - 6*x2 - x2.^2 -25;

%But for a maximization problem obj=-f
obj=-f;

end

Step 2. Create a new script to maximize f(x) using fminunc

clear

%Initializing Algorithm
x0=[100;100];

%Algorithm Options
options = optimset('fminunc');
options = optimset(options,'Display','iter', 'Largescale', 'off', 'TolX', 1e-4);

%Calling Minimization Algorithm
[x,fval,exitflag, output]=fminunc('fun',x0,options);

Large-Scale Algorithm: Analytic Gradient, Finite Difference Hessian

Step 1. Write an M-File function to compute f(x) and g(x)

function [f,g] = fun(x)
%Parsing out x-vector
x1=x(1);
x2=x(2);

%Creating objective function
f = -(4*x1 -x1.^2 - 6*x2 - x2.^2 -25);

%Creating gradient
if nargout > 1
    g(1) = -(4-2*x1);
    g(2) = -(-6-2*x2);
end

end

Step 2. Write a script to maximize f(x) using fminunc

clear

%Initializing Algorithm
x0=[100;100];

%Algorithm Options
options = optimset('fminunc');
options = optimset(options,'Display','iter','Largescale','on','GradObj','on');

%Calling Minimization Algorithm
[x,fval,exitflag,output]=fminunc('fun',x0,options);

11. Numerical Methods in Econometrics

In this section we will talk about

(1) Econometrics using extremum estimators (Maximum Likelihood as an example)
(2) Simulating probabilities

Maximum Likelihood

Consider the economic variables Y and X (these can be vectors). Suppose we believe that Y (the dependent variable) is generated from X (the independent variable) through the model M(θ) where θ is a vector of (unknown) economic parameters.

Our objective is to estimate the vector of parameters θ. Towards this end, suppose that we have collected a random sample {Yi, Xi : i =1,2,...N}. By random sample we mean that for each observation i in the sample, the data Yi has been generated from Xi through M(θ), independent of all other observations. (The case where the data is not a random sample is beyond the scope of this lecture, but it can be handled by ML as well.)

Fix a value for θ. For each observation in the sample, we can use the model M(θ) to write a probability statement about the chances of observing a value for the dependent variable of Yi given that the vector of independent variables is Xi. Denote this probability by P( Yi | Xi,θ). Given our random sampling assumption, it follows that the probability of observing our sample of data is given by the Likelihood Function

$$L(\theta) = \prod_{i=1}^{N} P(Y_i \mid X_i, \theta)$$

The maximum likelihood estimator of θ is defined as

$$\theta_{ML} = \arg\max_{\theta} L(\theta)$$

That is, the MLE is that vector of parameters which solves the FOCs (called the Likelihood Equations)

$$g(\theta_{ML}) = 0$$

where g( ) is the gradient of L( ). Why do we focus on MLE? Because it is asymptotically efficient: under standard regularity conditions, it has the lowest asymptotic variance among consistent estimators of θ.

In practice, the likelihood function is a bit difficult to work with (it is the product of many “very small” numbers). But recall that for any strictly increasing function f, the maximizer of f°L is the same as the maximizer of L.

So, in our search for the MLE we can use a convenient monotone transformation of the Likelihood function. In particular, we use the Log-Likelihood function, i.e. f= ln( ).

The Log-Likelihood function is defined as

$$l(\theta) = \sum_{i=1}^{N} \ln P(Y_i \mid X_i, \theta)$$

Hence, the (log-)likelihood equations that define the MLE are

$$\nabla l(\theta_{ML}) = 0$$

and the asymptotic distribution of the MLE is

$$\theta_{ML} \overset{A}{\sim} N\left[\theta, \; -H(\theta)^{-1}\right]$$

where H(θ) is the hessian of l( ) evaluated at θ.

APPLICATION: Probit Model for Binary Choices

Consider a consumer's demand for a new car. The problem we, as economists, face in studying demand systems for products such as automobiles is that consumer choices are necessarily discrete: a consumer either purchases a new car {Y=1}, or does not {Y=0}. As such, these types of consumer choices do not fit into the Neo-Classical Demand Model a la Marshall (1890). Instead, we use the Discrete Choice framework of McFadden (1978).

The utility from not purchasing a car is assumed to be zero. The utility a consumer, call him Consumer i, receives from purchasing a car is given by the function Ui(Xi), where Xi are exogenous variables affecting the consumer's choice. We assume that Ui consists of a systematic component δ(Xi) and a random component εi

Ui = δ(Xi,θ) - εi

where εi is an iid (across consumers) idiosyncratic taste shock. In the Probit model, we assume that εi has the Standard Normal Distribution, with CDF Φ.

For simplicity, assume that the systematic component is a linear function of the Xi's, so

Ui = Xi'θ - εi

Our goal is to estimate the parameters θ.

Suppose we have a random sample of consumers, where we recorded whether or not a consumer chose to purchase a new car as well as the variables that affected his choice, {Yi, Xi : i =1,2,...N}. Consider the choice problem facing consumer i. Consumer i will have bought a car iff

{Yi =1} <==> Ui > 0 <==> Xi'θ - εi > 0 <==> Xi'θ > εi

Hence, given Xi, the probability that consumer i has purchased a new car is

Pr{Yi =1} = Pr{Xi'θ > εi} = Φ(Xi'θ)

And, the probability that consumer i did not purchase a new car is

Pr{Yi =0} = Pr{Xi'θ < εi} = 1−Φ(Xi'θ)

More compactly, we can write these probabilities as

$$\Pr\{Y_i\} = \Phi(X_i'\theta)^{Y_i} \, [1 - \Phi(X_i'\theta)]^{1 - Y_i}$$

It follows that the log-Likelihood function for the sample is

$$l(\theta) = \sum_{i=1}^{N} Y_i \ln \Phi(X_i'\theta) + (1 - Y_i) \ln[1 - \Phi(X_i'\theta)]$$

It is straightforward to show that the gradient of this function is

$$\nabla l(\theta) = \sum_{i=1}^{N} \frac{[Y_i - \Phi(X_i'\theta)] \, \phi(X_i'\theta)}{\Phi(X_i'\theta)[1 - \Phi(X_i'\theta)]} X_i$$

and the Hessian is, in expectation given the Xi,

$$E[H(\theta)] = -\sum_{i=1}^{N} \frac{\phi(X_i'\theta)^2}{\Phi(X_i'\theta)[1 - \Phi(X_i'\theta)]} X_i X_i'$$

where φ is the Standard Normal density.

Example

Consider the following specification of consumer utility for autos

Ui= θ0 + θ1Dist_Worki + θ2Dist_Enti – θ3ln(Pricei) – εi

As an example of ML and Probit models, we will estimate the parameters for the above specification using some simulated data - the true parameter values are (25, 0.5, 0.25, -3). Download the following files from my website

Stata: dimitrid.webs.com\probitdata.dat
MATLAB: dimitrid.webs.com\probitdata.mat

For comparative purposes, load probitdata.dat into Stata, and then type in the command

probit y distW distE lp

Now, let's write a MATLAB program to execute the above estimation. For simplicity, let's first consider a medium scale algorithm where we let MATLAB compute the derivatives.

Step 1. Write an M-File function to compute L(theta,data)

function [obj] = prob(theta,data)
%Parsing out data
y=data(:,1);
kk=size(data,2);
X=data(:,2:kk);

%Calculating Probs
Xb=X*theta;
F=normcdf(Xb);
P=(F.^y) .* ( (1-F).^(1-y) );

%Calculating Log-Likelihood
l = log(P);
obj= -1*sum(l);

end
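The function normcdf used above comes from the Statistics Toolbox. If it is not available, the Standard Normal CDF can be written with erfc, which is part of base MATLAB. The sketch below is one possible variant of the same objective, not a replacement for prob.m; the names Phi and negll and the small data set are made up for illustration.

```matlab
% Standard normal CDF from base MATLAB's erfc
Phi = @(z) 0.5 * erfc(-z / sqrt(2));

% Negative probit log-likelihood; data = [y, X] as in prob.m above.
% 1 - Phi(z) is computed as Phi(-z) to limit rounding error in the tails.
negll = @(theta, data) -sum( data(:,1)     .* log(Phi( data(:,2:end)*theta)) ...
                          + (1-data(:,1)) .* log(Phi(-data(:,2:end)*theta)) );

% Made-up data, for illustration: 4 observations, constant plus one regressor
y = [1; 0; 1; 0];
X = [ones(4,1), [0.5; -1.2; 2.0; 0.3]];
val = negll([0; 0], [y, X]);

disp(val)   % at theta = 0 every probability is 1/2, so val equals 4*log(2)
```

Writing the log-likelihood as a sum of logs, rather than the log of the product form P, avoids raising probabilities to the powers y and 1-y, but both forms give the same value when all probabilities lie strictly between 0 and 1.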

Step 2. Write a script to maximize L(theta,data) using fminunc

clear

load probitdata
X = [c, distW, distE, lp];
data = [y,X];

%Initializing Algorithm by OLS
theta0=inv(X'*X)*X'*y;

%Algorithm Options
options = optimset('fminunc');
options = optimset(options,'Display','iter','Largescale','off');

%Calling Minimization Algorithm
[theta,fval,exitflag,output,grad,hess]=fminunc('prob',theta0,options,data);
V=inv(hess);
std = sqrt(diag(V));

%Presenting the Results
%(rows of a char array must have equal length, hence the padding in 'logP  | ')
cats = ['const | '; 'distW | '; 'distE | '; 'logP  | '];
results = [theta, std];
display('-------- coeff. st.err ------')
display([cats num2str(results,5)])

We could go on and write a medium-scale algorithm with analytic gradients, or a large-scale algorithm with analytic gradients and hessians (do this as homework). These are probably the routes to take when the likelihood function is not well behaved.

However, the likelihood function for the probit model is globally concave. In this case, why should we use a more complicated minimization procedure like fminunc, when Newton's method converges to the global optimum (and converges very fast)?

Of course, this means writing our own optimization routine.

Newton's Method – Analytic Gradients and Hessians

Step 1. Write an M-File function to compute L(theta,data) as well as g and H

function [l,G,H] = myprobit(theta,data)

%Parsing out data
y=data(:,1);
n=size(data,1);
kk=size(data,2);
X=data(:,2:kk);
k=size(X,2);

%Calculating probabilities
Xb=X*theta;
F=normcdf(Xb);
f=normpdf(Xb);

%Dropping "completely determined" observations
D = (F<1) & (F>0);
ind = find(D==1);
F = F(ind);
X = X(ind,:);
f = f(ind);
y = y(ind);
n=size(y,1);

%Calculating likelihood
P=(F.^y) .* ( (1-F).^(1-y) );
l = log(P);

%Calculating gradient
if nargout>1
    wi = (y-F).* (f./(F.*(1-F)));
    W = wi * ones(1,k);
    g = W.*X;
    G = sum(g,1)';
end

%Calculating Hessian
%(stored with a positive sign, i.e. minus the expected Hessian of l,
% so that the Newton update adds inv(H)*G)
if nargout>2
    vi = ( (f.^2) ./ (F.*(1-F)) );
    V = vi * ones(1,k);
    H=zeros(k,k);
    for i = 1:n
        x = X(i,:);
        H = H + V(i)*(x'*x);
    end
end

end

Step 2. Write a minimization routine

function [theta,der,hess] = newton(theta0,data)

%Convergence Tolerance
crit = 1;
tol = 1e-6;

%Initializing Algorithm
theta1 = theta0;
i = 1; %Iteration Counter

%Some Headers to display
display('Newton -- Iter ----- LLF ----- FOC ----')

while crit > tol

    %Updating guess
    theta0 = theta1;

    %Computing likelihood and its derivatives
    [L,der,hess] = myprobit(theta0,data);

    %Computing new guess
    theta1 = theta0 + inv(hess)*(der);

    %Convergence criteria
    crit = norm(theta1 - theta0);

    %Displaying iteration information
    iter = [i, sum(L), crit];
    disp(iter);
    i = i+1;

end

theta = theta0;
end

Step 3. Write a script to maximize L(theta,data) using Newton's method

clear
format short g
load probitdata
X = [c, distW, distE, lp];
data = [y,X];

%Initializing Algorithm by OLS
theta0 = inv(X'*X)*X'*y;

%Calling Newton Algorithm
[theta,grad,hess] = newton(theta0,data);
V = inv(hess);
std = sqrt(diag(V));

%Presenting the Results (the rows of cats must all have the same length)
cats = ['const | '; 'distW | '; 'distE | '; 'logP  | '];
results = [theta, std];
display('-------- coeff. st.err ------')
display([cats num2str(results,5)])
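As a rough check on the Newton routine, the same iterations can be run on simulated data where the true coefficients are known. The following is only a sketch: the sample size, the regressor, and theta_true are made up for illustration (they are not probitdata), and the normal CDF and PDF are written with erfc and exp so that no toolbox function is assumed.

```matlab
% Sketch: Newton (scoring) iterations for a probit on simulated data.
% Everything below is simulated for illustration; it is NOT probitdata.
n = 5000;
X = [ones(n,1), randn(n,1)];            % constant plus one regressor
theta_true = [0.5; -1];                 % hypothetical "true" coefficients
y = double(X*theta_true + randn(n,1) > 0);

% Standard normal CDF and PDF written with base functions
Phi = @(z) 0.5*erfc(-z/sqrt(2));
phi = @(z) exp(-z.^2/2)/sqrt(2*pi);

theta = zeros(2,1);                     % starting value
for iter = 1:50
    Xb = X*theta;
    F = Phi(Xb);
    f = phi(Xb);
    w = (y - F).*f./(F.*(1-F));         % score weights
    v = (f.^2)./(F.*(1-F));             % information weights
    G = X'*w;                           % gradient of the log-likelihood
    H = X'*(X.*repmat(v,1,size(X,2)));  % expected information matrix
    step = H\G;                         % same update as inv(H)*G
    theta = theta + step;
    if norm(step) < 1e-8
        break
    end
end

se = sqrt(diag(inv(H)));                % standard errors from the last H
disp([theta, se])
```

With a sample of this size the estimates should land close to theta_true, usually after only a handful of iterations. Seed the random number generator first if you need the same draws on every run.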

Simulation of Probabilities [1]

Let x be a random vector of dimension n with (joint) distribution F on the support X. We are often interested in computing the probability of an event (a logical relation about x), P(x ∈ EVENT), which is the integral of dF(x) over the set A = {x : x ∈ EVENT}.

The issues in computing such probabilities: (i) computing integrals is usually complicated; (ii) these are often multidimensional integrals, which further complicates the problem; and (iii) we may have to compute many such integrals, or compute a limited number of integrals but many times.

There exist numerical methods to compute integrals (for example, Gaussian quadrature; in MATLAB, the command quad uses adaptive Simpson quadrature and quadgk uses adaptive Gauss-Kronrod quadrature). However, such methods are computationally expensive even for integrals of low dimension, and doing this many times will not be feasible.

Instead, econometricians have developed ways to approximate such probabilities using simulation methods. There are many ways to do this: here, we will cover the most straightforward method – Frequency Simulation.

A Simple Idea: Simulating Expected Values

Let g: X → R^S be a (vector) valued function of these random variables. Consider the case where we want to compute the expectation of this function of x.

Again, this involves a multidimensional integral, and so may be a computationally challenging task.

Suppose we had real world data on the x's – in particular, a random sample {xi : i = 1,2,...,N}. For each observation in the sample, let's calculate yi = g(xi). Then, by the Law of Large Numbers, we know that the sample mean of the y's will provide us with a consistent estimate of the expected value E[g(x)].

Of course, we probably don't have real world data. However, this doesn't mean that all hope is lost! Instead, we can simulate such data using a Random Number Generator – that is, we will make “pseudo” data by taking random draws from the distribution of x. Let {xs : s = 1,2,...,S} be S simulations of the x's. Then, it is straightforward to compute the sample mean of g(xs) over the S simulated draws.

This is a frequency simulator of the expected value of g(x) – for S large-enough, it will provide a consistent estimate of E[g(x)], where the “simulation error” is decreasing in S (but not by root-S).
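The idea can be tried on a case where the answer is known. The sketch below assumes x is standard normal and picks g(x) = exp(x) purely for illustration, since then E[g(x)] = exp(1/2) exactly; the number of draws is arbitrary.

```matlab
% Sketch: frequency simulator of E[g(x)] with g(x) = exp(x), x ~ N(0,1).
% The choice of g and of the distribution is only an illustration.
S = 100000;                 % number of simulation draws
xs = randn(S,1);            % pseudo-data drawn from the distribution of x
ys = exp(xs);               % g evaluated at each draw
ybar = mean(ys);            % simulated expected value
truth = exp(0.5);           % closed form, available for this particular g
fprintf('simulated: %.4f   exact: %.4f\n', ybar, truth);
```

Rerunning with a larger S should bring the simulated value closer to the exact one.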

[1] NB: “Simulation” is often used in the context of “Monte Carlo” methods, where we simulate pseudo-data for some computational application (e.g. Markov chain Monte Carlo). We do not cover Monte Carlo simulation in this tutorial; however, the tools for these methods are the same as those we cover here, namely random number generators.

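Probability of an event:

$$P(x \in EVENT) = \int_A dF(x)$$

Expected value of g(x):

$$E[g(x)] = \int_X g(x)\, dF(x)$$

Sample mean from real world data:

$$\bar{y} = \frac{1}{N} \sum_{i=1}^{N} g(x_i) \xrightarrow{p} E[g(x)]$$

Sample mean from simulated data:

$$\bar{y} = \frac{1}{S} \sum_{s=1}^{S} g(x_s)$$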

MATLAB provides a variety of random number generators for many different distributions. In Product Help, search “Statistics Toolbox Random Number Generators”.

Moreover, MATLAB has various algorithms for generating random numbers – and each algorithm can generate numbers from one of a number of seeds. The general syntax to choose a particular algorithm or seed is

S = RandStream('algorithm_name', 'Seed', seed_number);
RandStream.setDefaultStream(S)
%In newer MATLAB releases this call is named RandStream.setGlobalStream(S)

More information about the various algorithms can be found by going to Product Help and searching for “Using the Random Number Generators”. For the list of algorithms, search “list (RandStream)”

Of course, not all distributions under the sun are available. However, you can still generate random numbers from an arbitrary distribution, although you will have to do it in a roundabout way.

In particular, let F( ) be an arbitrary distribution from which we want to draw random numbers. Recall that, by the definition of the CDF, F( ) is monotone (nondecreasing). If F( ) is also continuous and strictly increasing, its inverse exists. This means that if we take a probability p, we can back out the x associated with that probability:

x = F^(-1)(p)

So, for example, to generate S random numbers from F

1. First generate random probabilities – e.g. from the uniform distribution using p=rand(S,1)
2. Generate random numbers from F using its inverse – e.g. x = Finv(p)
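The two steps above can be sketched for a distribution whose inverse CDF has a closed form. The example assumes an exponential distribution with a hypothetical rate lambda, for which F(x) = 1 - exp(-lambda*x) and so F^(-1)(p) = -log(1-p)/lambda; Finv is just an illustrative name for that inverse.

```matlab
% Sketch: inverse-CDF sampling from an exponential distribution.
% lambda and the name Finv are chosen here for illustration only.
lambda = 2;                         % hypothetical rate parameter
S = 100000;                         % number of draws
Finv = @(p) -log(1 - p)/lambda;     % inverse of F(x) = 1 - exp(-lambda*x)

p = rand(S,1);                      % step 1: uniform random probabilities
x = Finv(p);                        % step 2: map them through the inverse CDF

fprintf('sample mean: %.4f   theoretical mean: %.4f\n', mean(x), 1/lambda);
```

The sample mean of the draws should be close to the theoretical mean 1/lambda.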

Simulating Probabilities

You might be thinking: “What does all this stuff on expected values have to do with calculating probabilities?” So let's get back to the problem at hand. As we will see, if you know how to simulate expected values, then you know how to simulate probabilities.

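Consider the probability P(x ∈ EVENT), the integral of dF(x) over A, where A = {x : x ∈ EVENT}. Define the indicator function 1{x ∈ EVENT}. This function takes the value 1 when x ∈ EVENT, and 0 everywhere else. It follows that, if we define a new distribution G by dG(x) = 1{x ∈ EVENT}dF(x), then G puts weight only on x's such that x ∈ EVENT. Thus the integral of dF(x) over A equals the integral of dG(x) over the whole support X.

But the integral of dG(x) over X is the integral of 1{x ∈ EVENT}dF(x) over X, which is the expected value of the indicator.

Thus, P(x ∈ EVENT) = E[1{x ∈ EVENT}].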

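In integral form, the steps of this argument are:

$$P(x \in EVENT) = \int_A dF(x)$$

$$P(x \in EVENT) = \int_A dF(x) = \int_X dG(x)$$

$$\int_X dG(x) = \int_X 1\{x \in EVENT\}\, dF(x) = E[1\{x \in EVENT\}]$$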

That is, computing probabilities is equivalent to computing the expected value of a properly defined indicator function. And we have already seen how to compute the expected value of an arbitrary function – so we are done.

In summary, to compute P(x ∈ EVENT)

1. Simulate S draws from the distribution of x
2. For each draw s, construct the indicator ys = 1{xs ∈ EVENT}
3. Compute the sample mean of the ys

This is a frequency simulator of P(x ∈ EVENT).
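The three steps can be sketched for an event whose probability is known. The example assumes x = (x1, x2) with independent standard normal components and takes the event to be x1 + x2 > 1; both are chosen only for illustration. Since x1 + x2 is normal with variance 2, the exact probability is 0.5*erfc(0.5).

```matlab
% Sketch: frequency simulator of P(x1 + x2 > 1), x1 and x2 iid N(0,1).
% The event and the distribution are chosen only for illustration.
S = 100000;                           % number of simulation draws
xs = randn(S,2);                      % step 1: each row is one draw of x
ys = double(xs(:,1) + xs(:,2) > 1);   % step 2: indicator of the event
phat = mean(ys);                      % step 3: sample mean of the indicators

exact = 0.5*erfc(0.5);                % P(Z > 1/sqrt(2)) for standard normal Z
fprintf('simulated: %.4f   exact: %.4f\n', phat, exact);
```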

Example: Probabilities in a Multinomial Logit Model with Random Coefficients (BLP, 1995)

Consider a model of consumer demand over automobiles. There are N possible brands to choose from. Of course, a consumer need not purchase an automobile, in which case his utility is zero (we call this option 0). In effect, the consumer has N+1 options available to him (so this is a discrete choice problem).

The utility a consumer gets from purchasing a particular brand of car depends on the characteristics xj of the car, his valuation of those characteristics bi, and an idiosyncratic taste shock eij. Specifically, the utility of consumer i when purchasing product j is given by

Uij = xj bi – a log(pj) + eij

The idiosyncratic taste shock eij is drawn from the type I extreme value distribution (the assumption that yields logit choice probabilities). The vector of characteristics xj includes a constant term (the per-se value of an automobile), as well as things like the car's horsepower (hp) and fuel efficiency (mpg). The key to this model is how consumers value these characteristics: for each characteristic k we assume

bki = bk + σk zki

where bk is the average consumer valuation of characteristic k, and zki is a random component of consumer i's taste for characteristic k, drawn from a standard normal distribution, so that σk scales the dispersion of tastes.

A consumer will purchase the product that yields him the highest utility. So, consumer i will purchase a car of brand j iff

Uij > Uih for all h = 0,1,...,N with h ≠ j

It follows that the market share of product j (i.e. the probability that a random consumer will purchase product j) is given by

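[Frequency simulator of P(x ∈ EVENT) from the summary of the previous subsection:]

$$\hat{P}(x \in EVENT) = \frac{1}{S} \sum_{s=1}^{S} 1\{x_s \in EVENT\}$$

[Market share of product j:]

$$S_j = \Pr(i \text{ chooses } j) = \int \frac{\exp[x_j b + x_j \sigma z_i - a \log p_j]}{1 + \sum_{h=1}^{N} \exp[x_h b + x_h \sigma z_i - a \log p_h]}\, d\Phi(z_i)$$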

This is a multidimensional integral (z is a vector) with no closed-form expression. Moreover, we probably want to compute these probabilities for each car model (N in total), which makes standard quadrature methods even less feasible.

However, it is straightforward to simulate these probabilities.

• Fix a number of simulations S

• For s = 1,2,...,S take a random draw zs from Φ, where dim(zs) = # of characteristics

• For each draw s, compute the integrand (the function we are integrating over) – call this ys

• Compute the sample mean of ys

As a numerical application, download the following data and have a look around

dimitrid.webs.com\automobiles.mat

Let's write a script to simulate the BLP purchase probabilities for the automobiles in this data set.

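The simulated market share of product j is:

$$S_j \approx \frac{1}{S} \sum_{s=1}^{S} \frac{\exp[x_j b + x_j \sigma z_s - a \log p_j]}{1 + \sum_{h=1}^{N} \exp[x_h b + x_h \sigma z_s - a \log p_h]}$$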

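clear

%Loading data - take a look around
load automobiles

%Choosing the random number algorithm and seed
%(in newer MATLAB releases the call is RandStream.setGlobalStream)
S = RandStream('mt19937ar', 'Seed', 12345);
RandStream.setDefaultStream(S)

%Stacking data into matrices
%(the price term enters as a*log(p) here, so the sign is carried by a)
X = [const, hp, mpg, log(p)];
b = [b0; bHP; bMPG; a];
s = [s0; sHP; sMPG; 0];
n = size(X,1);
k = size(b,1);

%Number of simulations (this overwrites the stream object S above)
S = 1000;

%Some useful matrices
zR = zeros(1,S);
oR = ones(1,S);
oC = ones(n,1);

%Simulating Zs (no random coefficient on log price)
Z = randn(k-1,S);
Z = [Z;zR];

%Computing Xb and Xz
Xb = X*b;
Xz = zeros(n,S);
for i = 1:S
    z = Z(:,i);
    Xz(:,i) = X*(s.*z);
end

%A matrix with S columns, with Xb in each column
XB = Xb*oR;

%Numerator of the logit probabilities (n-by-S)
D = exp(XB + Xz);

%Denominator: 1 for the outside option plus the sum over products (1-by-S)
N = 1+sum(D);

%A matrix with n rows, with N in each row
N = oC*N;

%The Ys
y = (D./N)';

%The simulated probabilities
Probs = ( sum(y)/S )';

plot(Probs)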
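For readers without the data file, the same computation can be sketched on made-up numbers. Everything in the block below is hypothetical: the characteristics, prices, and taste parameters are invented for illustration and are not the automobiles data. The loop is replaced by a single matrix product, and the share of the outside option is also simulated, so that all shares can be checked to sum to one.

```matlab
% Sketch: simulated random-coefficient logit shares on made-up data.
% All numbers are hypothetical and are NOT the automobiles data.
n = 5;                                   % number of products
S = 2000;                                % number of simulation draws
X = [ones(n,1), rand(n,1), rand(n,1), log(1 + rand(n,1))];  % [const, hp, mpg, log(p)]
b = [-1; 1; 0.5; -2];                    % hypothetical mean tastes (last: price coefficient)
s = [0.5; 0.3; 0.3; 0];                  % hypothetical taste dispersion (none on price)
k = size(X,2);

Z = [randn(k-1,S); zeros(1,S)];          % one column of taste draws per simulation
V = X*b*ones(1,S) + X*((s*ones(1,S)).*Z);  % n-by-S utilities net of the logit shock

num = exp(V);                            % numerator for each product and draw
den = 1 + sum(num,1);                    % the 1 is the outside option
y = num./(ones(n,1)*den);                % logit probabilities for each draw

Probs = mean(y,2);                       % simulated market shares
outside = mean(1./den);                  % simulated share of the outside option
disp([Probs; outside]')
```

Because the logit probabilities sum to one for every draw, the simulated shares plus the outside share also sum to one, which gives a quick sanity check on the code.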