Empirical Finance
Executive MSc in Investment and Risk Management Programme
Prof. Robert L. Kimmel
+65 6631 8579
EDHEC Business School
24-27 Mar 2011
22-24 Aug 2011
Singapore Campus
Kimmel (EDHEC Business School) Empirical Finance SingaporeMar/Aug 2011 1 / 563
Introduction
This course is about Empirical Finance.
What do the available data tell us about financial markets, and do they support or contradict the various theories we have developed to explain the behaviour of financial markets?
We will focus mainly on pricing, that is, how prices of financial assets are determined. It is possible to focus on other aspects of financial markets, e.g., trading volume.
The course will discuss both econometric techniques and the actual empirical findings.
Basic Principles
Basic Principles Probability and Distributions
Why is there even a subject matter called Empirical Finance?
1. Astronomers can predict the positions of the planets, and phenomena such as eclipses, with extreme accuracy, centuries in advance.
2. Meteorologists can predict the weather a few days in advance.
3. Can stock market analysts predict stock prices ten minutes in advance?
Humans have essentially no effect on the motion of the planets, and only a (possibly) very long-term effect on the weather. Prices of financial assets are set on a minute-to-minute basis by people.
How do they decide what the prices of financial assets should be?
The extent to which financial markets incorporate available information into asset prices (the degree of market efficiency) is very hotly debated, in both academic and industry circles.
There is no question, though, that events nobody knows about yet cannot be incorporated into asset prices.
The evolution of the macroeconomy, technological progress, and societal evolution are all very hard to predict, even by people who spend their whole lives studying such things. They are best modelled as random processes.
If the fundamental economic processes that affect asset prices are random, then the asset prices themselves are also random.
The fact that security prices are random has profound implications for investors: much of financial theory involves the investor's problem of trading off risk and average return.
However, it also has profound implications for those who study financial markets. Financial theories are generally about relations between average returns and various measures of risk. If we observe that the average returns of securities differ from what is predicted by a theory, what conclusion do we draw?
1. The theory is wrong.
2. The theory is right, but its predictions are not met exactly because of the random variation in asset prices.
Which is it?
Probability and statistics are absolutely fundamental to the study of financial markets.
Example: suppose there are three assets, X, Y, and Z. We have developed an economic theory that tells us what (on average) the returns of the assets ought to be. We then get a sample of monthly returns (annualised) of the three assets, over the last 20 year period. The results are as follows.

| Asset | X | Y | Z |
|---|---|---|---|
| Average return (predicted) | 8% | 10% | 12% |
| Average return (observed) | 6% | 16% | 14% |
| Standard deviation of return (observed) | 25% | 40% | 60% |

How do the predictions of the theory hold up? Do you have enough information to tell?
A probability distribution specifies the likelihood of each possible outcome of a random process. Probability distributions can be discrete or continuous.
When a random variable has a discrete probability distribution, there are either finitely many outcomes, or countably many.
Consider a six-sided die, each side labelled with a number from one to six. If each side is equally likely to come up when the die is rolled, then the probabilities $p_1, \ldots, p_6$ are all equal to 1/6.
Probabilities (in a discrete probability distribution) must satisfy two properties:
1. The probabilities must be zero or positive.
2. The probabilities must add up to one.
Do the probabilities specified above satisfy both of these constraints?
[Figure: Probability distribution of a six-sided die throw]
A discrete probability distribution can have infinitely many outcomes, each with positive probability.
Suppose we throw a coin with a "heads" and a "tails" side. The coin is fair, meaning each side has a probability of 1/2. Suppose we throw this coin repeatedly, and call $X$ the number of throws until the first head. What is the probability distribution of $X$?
There is a 1/2 probability that the first throw will be heads, so $p_1 = 1/2$. The probability that the second throw will be the first head is 1/4, so $p_2 = 1/4$. More generally, $p_i = (1/2)^i$. There is no limit to the value of $i$; it is possible (although not likely) that it will take a million, a billion, a trillion trillion trillion, etc. throws.
Do these probabilities satisfy the two rules?
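Each of the probabilities is clearly greater than zero, so we have no problem with negative probabilities. Do they add up to one?
$$\sum_{i=1}^{\infty} p_i = \sum_{i=1}^{\infty} \left(\frac{1}{2}\right)^i = 1$$
(For justification of the last step, see any reference on geometric infinite series.)
The probabilities are non-negative and add up to one, so they are valid probabilities. More generally, any distribution with
$$p_i = (1-p)^{i-1}\, p$$
for some $p \in (0, 1]$ is called a geometric distribution. (The value $p = 0$ is excluded, because every $p_i$ would then be zero and the probabilities could not add up to one.)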
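As a quick numerical check of the two rules, the sketch below sums the first terms of $p_i = (1-p)^{i-1} p$ using exact fractions. The helper name `geometric_pmf` and the cut-off of 60 terms are illustrative choices, not part of the course material.

```python
from fractions import Fraction

def geometric_pmf(i, p):
    """Probability that the first success occurs on trial i (i = 1, 2, ...)."""
    return (1 - p) ** (i - 1) * p

p = Fraction(1, 2)  # fair coin
terms = [geometric_pmf(i, p) for i in range(1, 61)]

# Rule 1: every probability is non-negative.
all_non_negative = all(t >= 0 for t in terms)

# Rule 2: the partial sum of the first n terms is 1 - (1 - p)**n,
# which approaches one as n grows.
partial_sum = sum(terms)
shortfall = 1 - partial_sum

print(all_non_negative)                    # True
print(shortfall == Fraction(1, 2) ** 60)   # True
```

The shortfall halves with every extra term, which is the numerical counterpart of the geometric-series argument.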
[Figure: Probability distribution of first head in coin throw example]
Continuous probability distributions have uncountably infinitely many possible outcomes.
Example: what is the amount of rainfall in the centre of Singapore on 22 June 2011, measured in millimetres?
This quantity could take any non-negative value: it could be zero (no rainfall at all), or any positive number. (Since water consists of molecules, the amount of rainfall is actually a discrete quantity; however, it is very well approximated by a continuous distribution.)
Continuous probability distributions are specified by a probability density function.
Example: the random variable $X$ has a uniform probability distribution on the interval [0, 1]. Then $X$ has the probability density function $f_X(x) = 1$.
The density function does not specify the probability of each outcome; each particular outcome is infinitely improbable (i.e., has probability of 0). But ranges of outcomes have positive probability; what is the probability that $X$ falls in the interval [0.2, 0.3]?
$$P(0.2 \le X \le 0.3) = \int_{0.2}^{0.3} f_X(x)\,dx = \int_{0.2}^{0.3} (1)\,dx = x\,\Big|_{0.2}^{0.3} = 0.1$$
Probability density functions must satisfy two rules:
1. They must be non-negative.
2. They must integrate to one.
Does this uniform probability distribution satisfy these constraints?
The uniform probability density on [0, 1] is obviously positive on this range. It also integrates to one:
$$\int_0^1 f_X(x)\,dx = \int_0^1 (1)\,dx = 1$$
Note that this integral is only taken over the range of possible values [0, 1]. We can instead take the probability density to be defined as 0 outside this range:
$$f_X(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & x < 0 \text{ or } x > 1 \end{cases}$$
We can then just integrate over the entire real line $(-\infty, +\infty)$, and the value of the integral is still one.
More generally, a uniform distribution can be defined on any range $[a, b]$, with $b > a$:
$$f_X(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & x < a \text{ or } x > b \end{cases}$$
Note that the probability density satisfies the two requirements; it is non-negative, and it integrates to one.
[Figure: Uniform distribution on [0, 1]]
Another example: the exponential distribution, with probability density function defined on the interval $[0, +\infty)$:
$$f_X(x) = \lambda e^{-\lambda x}, \quad \lambda > 0$$
Note that this is not a single distribution, but a family of many distributions, indexed by the parameter $\lambda$.
The exponential distribution has many applications; for example, it is used to model the time until a radioactive particle decays. It is sometimes used to model time to default in credit risk applications.
Does the exponential distribution satisfy the two requirements for a valid probability distribution?
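One way to answer the question, sketched here as a worked integral (standard calculus, not taken from the slides, and assuming the rate parametrisation $f_X(x) = \lambda e^{-\lambda x}$): the density is non-negative because $\lambda > 0$ and $e^{-\lambda x} > 0$, and it integrates to one:

```latex
\int_0^{+\infty} \lambda e^{-\lambda x}\,dx
  = \left[-e^{-\lambda x}\right]_0^{+\infty}
  = 0 - (-1)
  = 1 .
```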
[Figure: Exponential distribution with $\lambda = 0.5$]
Another example: the normal, or Gaussian, distribution. This distribution is defined for all real numbers (positive, zero, and negative), and has the density function:
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad \sigma > 0$$
Despite its somewhat odd appearance, the normal distribution arises in a very natural way in many, many applications, and is one of the most fundamental continuous distributions there is. It is often used to model returns of financial assets.
Note that the Gaussian distribution is actually a family of distributions, indexed by $\mu$ and $\sigma$. More on these parameters later.
Does the Gaussian distribution satisfy the two requirements for a valid probability distribution?
[Figure: Gaussian distribution with $\mu = 0.1$ and $\sigma = 0.25$]
We will often use summary statistics, which capture some (but not all) of the information in the probability distribution of a random variable.
One of the most important is the mean, or expected value. This is just the average outcome, weighted by probabilities.
$$\mathrm{E}[X] = \sum_{i=1}^{N} x_i p_i$$
where $x_i$ is the value of a particular outcome, and $p_i$ is its probability. The sum must be taken across all possible outcomes (the number of outcomes being denoted by $N$ here).
For a random variable with a continuous distribution, the mean is an integral over all possible outcomes (weighted by probability).
$$\mathrm{E}[X] = \int_{-\infty}^{+\infty} x\, f_X(x)\,dx$$
The expected values of the die and coin throw examples are 3.5 and 2, respectively. The uniform distribution on $[a, b]$ has an expected value of $(a+b)/2$. The exponential distribution has a mean of $1/\lambda$. The normal (Gaussian) distribution has a mean of $\mu$.
When there are infinitely many possible outcomes, the expected value may not even exist: what is the expected value of a random variable that has value 2 with probability 1/2, 4 with probability 1/4, etc.? The expected value also does not even have to be one of the possible outcomes; in the die throw example, the mean is 3.5, but no throw ever has this value.
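The three discrete cases mentioned above can be checked numerically. The following is a minimal sketch using exact fractions; the function name `expected_value` and the truncation point `n = 50` are illustrative assumptions.

```python
from fractions import Fraction

def expected_value(outcomes, probabilities):
    """Probability-weighted average of a finite list of outcomes."""
    return sum(x * p for x, p in zip(outcomes, probabilities))

# Six-sided die: outcomes 1..6, each with probability 1/6.
die_mean = expected_value(range(1, 7), [Fraction(1, 6)] * 6)

# Coin example (number of throws until the first head), truncated at n terms.
# The partial sums approach 2 as n grows.
n = 50
halves = [Fraction(1, 2) ** i for i in range(1, n + 1)]
coin_partial = expected_value(range(1, n + 1), halves)

# Value 2**i with probability (1/2)**i: every term contributes exactly 1,
# so the partial sum equals n and grows without bound.
divergent_partial = expected_value([2 ** i for i in range(1, n + 1)], halves)

print(die_mean)              # 7/2
print(float(coin_partial))   # very close to 2
print(divergent_partial)     # 50
```

Raising `n` leaves the first two results essentially unchanged, while the third keeps growing one-for-one with `n`, which is why that expected value does not exist.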
For a random variable $X$, any function $g(X)$ of $X$ is also a random variable, and we can contemplate its expected value. For example, if $X$ is the value of a die throw (1 through 6, with equal probability), what is the expected value of the squared outcome?
From the definition of an expected value:
$$\mathrm{E}[X^2] = \sum_{i=1}^{6} x_i^2\, p_i = \frac{1}{6}(1)^2 + \ldots + \frac{1}{6}(6)^2 = \frac{91}{6}$$
Similarly, $\mathrm{E}[X^3] = 441/6$ and $\mathrm{E}[X^4] = 2275/6$. (Try it.)
When there are infinitely many possible outcomes, the expected value of $X$ or a particular function of $X$ may not exist. However, for the coin throwing example, $\mathrm{E}[X^n]$ is well-defined for any integer $n \ge 0$. Can you find $\mathrm{E}[X]$ and $\mathrm{E}[X^2]$?
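For the coin example, one possible route is sketched below. It relies on two standard power-series identities, valid for $|x| < 1$, which are not derived in the slides, evaluated at $x = 1/2$:

```latex
\sum_{i=1}^{\infty} i\,x^{i} = \frac{x}{(1-x)^{2}}, \qquad
\sum_{i=1}^{\infty} i^{2}x^{i} = \frac{x(1+x)}{(1-x)^{3}}
\quad\Longrightarrow\quad
\mathrm{E}[X] = \frac{1/2}{(1/2)^{2}} = 2, \qquad
\mathrm{E}[X^{2}] = \frac{(1/2)(3/2)}{(1/2)^{3}} = 6 .
```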
We care not just about the expected value (or average outcome), but also how large deviations from the average tend to be. The variance of a random variable is one such measure. For discrete and continuous random variables, respectively, the variance is:
$$\mathrm{Var}[X] = \sum_{i=1}^{N} p_i\,(x_i - \mathrm{E}[X])^2$$
$$\mathrm{Var}[X] = \int_{-\infty}^{+\infty} f_X(x)\,(x - \mathrm{E}[X])^2\,dx$$
In both cases, we can express the variance as an expected value:
$$\mathrm{Var}[X] = \mathrm{E}\left[(X - \mathrm{E}[X])^2\right] = \mathrm{E}[X^2] - (\mathrm{E}[X])^2$$
The last step follows from the definitions of expected value and variance, although the algebra is tedious.
What is the variance of $X$ in the die throw example? One method: go straight to the definition of variance:
$$\mathrm{Var}[X] = \sum_{i=1}^{N} p_i\,(x_i - \mathrm{E}[X])^2 = \frac{1}{6}(1 - 3.5)^2 + \ldots + \frac{1}{6}(6 - 3.5)^2 = \frac{35}{12}$$
Another method: find the variance in terms of quantities we have already calculated:
$$\mathrm{Var}[X] = \mathrm{E}[X^2] - (\mathrm{E}[X])^2 = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{35}{12}$$
Both methods give the same answer, which is not a coincidence.
What is the variance in the coin throwing example?
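The two methods for the die can be reproduced with exact arithmetic. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)

mean = sum(p * x for x in outcomes)                 # E[X] = 7/2
second_moment = sum(p * x ** 2 for x in outcomes)   # E[X^2] = 91/6

# Method 1: directly from the definition of variance.
var_definition = sum(p * (x - mean) ** 2 for x in outcomes)

# Method 2: E[X^2] - (E[X])^2.
var_shortcut = second_moment - mean ** 2

print(var_definition, var_shortcut)   # 35/12 35/12
```

Applying the second method to the coin example, with $\mathrm{E}[X] = 2$ and $\mathrm{E}[X^2] = 6$ (values that follow from standard geometric-series identities), would give a variance of $6 - 2^2 = 2$.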
When there are infinitely many outcomes, variance (like expected value) may not exist. For example, a Student's T distribution with 2 degrees of freedom has an expected value of 0, but its variance does not exist.
For most distributions we deal with, both mean and variance are well-defined. For the exponential distribution, the variance is:
$$\mathrm{Var}[X] = \frac{1}{\lambda^2}$$
(Can you prove it?)
For the normal (Gaussian) distribution, the variance is:
$$\mathrm{Var}[X] = \sigma^2$$
(Proof of this result is more difficult.)
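For the exponential case, a sketch of one possible proof is given below. It assumes the density $f_X(x) = \lambda e^{-\lambda x}$ on $[0, +\infty)$ and evaluates both integrals by parts (the intermediate steps are omitted):

```latex
\begin{aligned}
\mathrm{E}[X]   &= \int_0^{+\infty} x\,\lambda e^{-\lambda x}\,dx = \frac{1}{\lambda}, \\
\mathrm{E}[X^2] &= \int_0^{+\infty} x^2\,\lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^2}, \\
\mathrm{Var}[X] &= \mathrm{E}[X^2] - (\mathrm{E}[X])^2
                 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2}
                 = \frac{1}{\lambda^2}.
\end{aligned}
```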
Variance is, by construction, zero or positive. (It is only zero if the random variable is always equal to its mean.) It is never negative.
The mean, or expected value, of a random variable can be expressed in the same units as the random variable itself; however, variance is not so convenient. For example, suppose the annual return of a security has a normal distribution, with $\mu = 0.1$ and $\sigma = 0.4$. Then the mean (or average) return is 0.1, or 10%, but its variance is 0.16; the units are percent squared per year squared. We therefore will often use standard deviation instead of variance:
$$\mathrm{SD}[X] \equiv \sqrt{\mathrm{Var}[X]}$$
Standard deviation, like variance, is always zero or positive, but is in the same units as the original random variable. In the example above, the standard deviation of the security's return is 40% per year.
In financial and economic applications, mean and variance are used all the time. Less often, so-called higher order moments are used, e.g., the third and fourth (centred) moments:
$$\mathrm{E}\left[(X - \mathrm{E}[X])^3\right] = \mathrm{E}[X^3] - 3\,\mathrm{E}[X^2]\,\mathrm{E}[X] + 2\,(\mathrm{E}[X])^3$$
$$\mathrm{E}\left[(X - \mathrm{E}[X])^4\right] = \mathrm{E}[X^4] - 4\,\mathrm{E}[X^3]\,\mathrm{E}[X] + 6\,\mathrm{E}[X^2]\,(\mathrm{E}[X])^2 - 3\,(\mathrm{E}[X])^4$$
Like variance, these quantities are not in the most convenient units, so they are often converted to dimensionless quantities, skewness and kurtosis.
Skewness and kurtosis are defined as:
$$\mathrm{Skew} \equiv \frac{\mathrm{E}\left[(X - \mathrm{E}[X])^3\right]}{(\mathrm{Var}[X])^{3/2}} \qquad \mathrm{Kurt} \equiv \frac{\mathrm{E}\left[(X - \mathrm{E}[X])^4\right]}{(\mathrm{Var}[X])^2} - 3$$
The kurtosis (sometimes called excess kurtosis) has 3 subtracted out to make a normal distribution have a kurtosis of 0; any distribution with positive kurtosis is therefore more kurtotic than a normal distribution.
Skewness is related to the symmetry of a distribution, and kurtosis is related to the probability of extreme values.
Skewness can take any value, positive or negative. Any symmetric distribution (e.g., the normal distribution, the uniform distribution, or the die throwing example) has skewness of zero.
If a distribution has most of its probability near the mean, but also has a small amount of probability of extremely high values, then the distribution will have positive skewness. If the extreme values are low instead of high, then the skewness will be negative.
Income distributions in most countries have positive skewness: most people earn an amount around the median, but a very small number of people typically earn very high incomes.
The skewness of the exponential distribution is 2; the skewness of the distribution in the coin throwing example is $3/\sqrt{2}$. (Can you derive these results?)
Kurtosis has to do with the probability of extreme observations. If a random variable is almost always close to the mean but, with some small probability, can take on a very large value (above or below the mean), then the distribution has high kurtosis.
The lowest possible value of kurtosis is $-2$; there is no maximum value of kurtosis. It is possible for the skewness and the kurtosis of a distribution not to exist.
The exponential distribution has a kurtosis of 6; the uniform distribution has a kurtosis of $-1.2$. The Gaussian distribution has a kurtosis of zero. The coin throwing example has a kurtosis of 6.5, and the die throwing example has a kurtosis of $-222/175$. (Can you derive these results?)
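The discrete results can be checked numerically from the definitions of skewness and (excess) kurtosis. The sketch below is illustrative only: the helper names are assumptions, and the coin distribution is truncated at 200 throws, beyond which the remaining probability is negligible.

```python
from fractions import Fraction
import math

def central_moment(outcomes, probs, k):
    """k-th central moment of a discrete distribution."""
    mean = sum(p * x for x, p in zip(outcomes, probs))
    return sum(p * (x - mean) ** k for x, p in zip(outcomes, probs))

def skewness(outcomes, probs):
    var = central_moment(outcomes, probs, 2)
    return float(central_moment(outcomes, probs, 3)) / float(var) ** 1.5

def excess_kurtosis(outcomes, probs):
    var = central_moment(outcomes, probs, 2)
    return central_moment(outcomes, probs, 4) / var ** 2 - 3

# Six-sided die.
die_x = list(range(1, 7))
die_p = [Fraction(1, 6)] * 6
die_kurt = excess_kurtosis(die_x, die_p)

# Coin example, truncated at 200 throws.
coin_x = list(range(1, 201))
coin_p = [Fraction(1, 2) ** i for i in coin_x]
coin_skew = skewness(coin_x, coin_p)
coin_kurt = float(excess_kurtosis(coin_x, coin_p))

print(die_kurt)                      # -222/175
print(skewness(die_x, die_p))        # 0.0
print(coin_skew, 3 / math.sqrt(2))   # both approximately 2.1213
print(coin_kurt)                     # approximately 6.5
```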
[Figure: Exponential vs. Gaussian distribution]
[Figure: Exponential vs. Gaussian distribution, right tail]
[Figure: Gaussian vs. Student's T distribution]
Basic Principles Estimation and Inference
Problem: we do not know the distribution of random events.
1. For the coin throwing example, it seems like the probability of "heads" is 0.5. Are you sure? Maybe it is a trick coin.
2. For a security return, we know the future return is random (i.e., we cannot predict it in advance with perfect accuracy). But what is its probability distribution?
If we have historical data (e.g., we have observed the coin being thrown repeatedly, or we have historical returns for a security), we can use this data to learn something about the probabilities of different outcomes. (Is there an implicit assumption here?)
Estimation of the entire probability distribution of a random variable is a very difficult problem. (Easy for some special cases, like the coin throwing example.) We will focus on estimating quantities such as the mean and variance of a random variable.
How do we estimate the mean (expected value) of a random variable, such as the outcome of a coin throw, or the future return of a security?
An extremely general method: take the sample average of the available observations. Suppose we have observed $N$ realisations of the random variable $X$, denoted by $X_1, \ldots, X_N$. Then we can estimate the average with:
$$\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$$
Is this a good way to estimate the expected value of a random variable?
Example: probability of "heads" with a coin throw.
Call the value of a coin throw $X = 1$ if it comes up heads, and $X = 0$ otherwise. Call $p$ the probability of heads. Then:
$$\mathrm{E}[X] = \sum_{i=1}^{2} x_i p_i = p \cdot 1 + (1 - p) \cdot 0 = p$$
So estimating the expected value of $X$ is the same thing as estimating the probability of heads. Compute the sample mean by throwing the coin $N$ times, counting each heads as 1, and each tails as 0. Count up the number of heads, and divide by $N$. This is $\bar{X}$, the sample mean.
Will the sample average be equal to the true average (i.e., the expected value)?
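A small simulation gives a feel for the answer. The sketch below assumes a fair coin (`true_p = 0.5`) and a fixed random seed, both chosen purely for illustration.

```python
import random

def sample_mean(observations):
    """Arithmetic average of the observed values."""
    return sum(observations) / len(observations)

random.seed(0)          # fixed seed so the run is reproducible
true_p = 0.5            # assumed "true" probability of heads
n_throws = 10_000

# Each throw is coded 1 for heads and 0 for tails.
throws = [1 if random.random() < true_p else 0 for _ in range(n_throws)]
p_hat = sample_mean(throws)

print(p_hat)            # close to, but generally not exactly, 0.5
```

Changing the seed changes the estimate slightly, which is the sense in which the sample average is itself random.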
Example: expected return of a security.
Collect historical returns for the last $N$ months. Add them all up, and divide by $N$:
$$\bar{R} = \frac{1}{N} \sum_{i=1}^{N} R_i$$
This method is very commonly used to estimate expected returns of broadly diversified portfolios; it is used less often to try to estimate the expected returns of individual securities. (Any idea why?)
Will the sample average return be equal to the true expected return?
What are the statistical properties of the sample mean?
First, we will need a few basic results. Let $X$ and $Y$ be random variables, and let $a$, $b$, and $c$ be constants. Then:
$$\mathrm{E}[X + Y] = \mathrm{E}[X] + \mathrm{E}[Y]$$
$$\mathrm{E}[aX] = a\,\mathrm{E}[X]$$
$$\mathrm{E}[a + bX + cY] = a + b\,\mathrm{E}[X] + c\,\mathrm{E}[Y]$$
These results are true for both discrete and continuous random variables, and follow directly from the definition of expected value. (The derivation is a little tedious though.)
The first two results are just special cases of the third, which can be generalized; let $X_1, \ldots, X_N$ be random variables, and let $a_0, \ldots, a_N$ be constants. Then:
$$\mathrm{E}\left[a_0 + \sum_{i=1}^{N} a_i X_i\right] = a_0 + \sum_{i=1}^{N} a_i\,\mathrm{E}[X_i]$$
This last result will be extremely useful in analysing the statistical properties of the sample mean.
Note that the sample mean is itself a random variable; sometimes it will be higher than the true mean, and sometimes it will be lower. We can find its expected value, just like we can with any other random variable:
$$\mathrm{E}[\bar{X}] = \mathrm{E}\left[\frac{1}{N}\sum_{i=1}^{N} X_i\right] = \mathrm{E}\left[\sum_{i=1}^{N} \frac{1}{N} X_i\right] = \sum_{i=1}^{N} \frac{1}{N}\,\mathrm{E}[X_i] = \sum_{i=1}^{N} \frac{1}{N}\,\mathrm{E}[X] = \mathrm{E}[X]$$
So the expected value of the sample average is equal to the true average: if you estimate the true mean with the sample mean, then on average, you will get it right!
We would also like to examine how precise the estimate tends to be: how much can the sample average deviate from the true average? However, we need some additional tools first.
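The claim that the sample mean is right on average can be illustrated by repeating the estimation many times. The sketch below uses throws of a fair die (true mean 3.5); the sample size, number of repetitions, and seed are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(1)
sample_size = 10        # N observations per sample
n_samples = 20_000      # number of repeated samples

# Draw many independent samples of die throws and record each sample mean.
sample_means = [
    statistics.fmean([random.randint(1, 6) for _ in range(sample_size)])
    for _ in range(n_samples)
]

average_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.pstdev(sample_means)

print(average_of_means)   # close to the true mean of 3.5
print(spread_of_means)    # individual sample means still scatter around 3.5
```

The second number is a first look at the precision question: any single sample mean can miss the true mean by a noticeable amount even though the average across samples is on target.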
Let $X$ and $Y$ be random variables. The joint distribution tells us the probabilities of different possible outcomes of $X$ and of $Y$ individually, but it also tells us how $X$ and $Y$ are related. Suppose there are $M$ possible values of $X$, and $N$ possible values of $Y$. Then the joint probability $p_{i,j}$ is the probability that $X$ will take the value $x_i$, and $Y$ will simultaneously take the value $y_j$.
The joint probabilities of $X$ and $Y$ must satisfy the same two restrictions that all probabilities must satisfy: they must be non-negative, and they must add up to one.
We can also consider the probabilities of either $X$ or $Y$, considered alone. For example, let $p^{(X)}_1, \ldots, p^{(X)}_M$ be the probabilities of the $M$ possible values of $X$, and let $p^{(Y)}_1, \ldots, p^{(Y)}_N$ be the probabilities of the $N$ possible values of $Y$. Then these two sets of probabilities are called the marginal probabilities of $X$ and $Y$.
There is a relation between the marginal probabilities and the joint probabilities. Specifically:
$$p^{(X)}_i = \sum_{j=1}^{N} p_{i,j} \qquad p^{(Y)}_j = \sum_{i=1}^{M} p_{i,j}$$
Suppose $X$ and $Y$ can each take on the values $-1$, 0, or $+1$, and do so with the following probabilities:

| | X = −1 | X = 0 | X = +1 |
|---|---|---|---|
| Y = −1 | 0.20 | 0.10 | 0.00 |
| Y = 0 | 0.20 | 0.05 | 0.20 |
| Y = +1 | 0.10 | 0.00 | 0.15 |

What are the marginal probabilities of $X$ and $Y$?
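The marginal probabilities can be computed by summing the joint table along each direction. The sketch below reads the table with columns as values of $X$ and rows as values of $Y$ (an assumption about the layout of the original slide); the variable names are illustrative.

```python
from fractions import Fraction

values = (-1, 0, 1)
rows = {  # keyed by y; entries ordered by x = -1, 0, +1
    -1: ("0.20", "0.10", "0.00"),
     0: ("0.20", "0.05", "0.20"),
     1: ("0.10", "0.00", "0.15"),
}
joint = {(x, y): Fraction(rows[y][i]) for y in values for i, x in enumerate(values)}

# Marginal of X: sum over all y for each x; marginal of Y: sum over all x for each y.
marginal_x = {x: sum(joint[(x, y)] for y in values) for x in values}
marginal_y = {y: sum(joint[(x, y)] for x in values) for y in values}

print({x: float(p) for x, p in marginal_x.items()})  # {-1: 0.5, 0: 0.15, 1: 0.35}
print({y: float(p) for y, p in marginal_y.items()})  # {-1: 0.3, 0: 0.45, 1: 0.25}
```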
We can also specify the joint probability density function $f_{X,Y}(x, y)$ for two random variables with a continuous distribution.
The probability that $X \in [a, b]$ and $Y \in [c, d]$ is:
$$P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f_{X,Y}(x, y)\,dy\,dx$$
In either the discrete or the continuous case, expected values are defined analogously to the case of a single random variable:
$$\mathrm{E}[g(X, Y)] = \sum_{i=1}^{M} \sum_{j=1}^{N} p_{i,j}\, g(x_i, y_j)$$
$$\mathrm{E}[g(X, Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\, g(x, y)\,dy\,dx$$
We say the discrete random variables $X$ and $Y$ are independent if:
$$p_{i,j} = p^{(X)}_i\, p^{(Y)}_j$$
If $X$ and $Y$ are continuous, then they are independent if:
$$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$$
Intuitively, $X$ and $Y$ are independent if knowledge of $X$ tells you nothing about the probability of different outcomes of $Y$, and vice-versa.
We define the covariance between $X$ and $Y$ as:
$$\mathrm{Cov}[X, Y] \equiv \mathrm{E}\left[(X - \mathrm{E}[X])(Y - \mathrm{E}[Y])\right] = \mathrm{E}[XY] - \mathrm{E}[X]\,\mathrm{E}[Y]$$
Covariance is a measure of how the two random variables are related; e.g., if it is positive, then when $X$ is above its mean value, $Y$ also tends to be above its mean value.
If two random variables are independent, then their covariance is zero. (Proof?) However, it is possible for random variables to have a covariance of zero, but not be independent.
Other useful properties of covariance are:
$$\mathrm{Cov}[X, Y] = \mathrm{Cov}[Y, X] \qquad \mathrm{Cov}[X, X] = \mathrm{Var}[X]$$
These follow immediately from the definition.
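A small constructed example shows how covariance can be zero without independence. The distribution below is hypothetical (it does not come from the slides): $X$ takes the values $-1$, 0, $+1$ with equal probability, and $Y = X^2$, so $Y$ is completely determined by $X$.

```python
from fractions import Fraction

# Hypothetical example: X is -1, 0 or +1 with equal probability, and Y = X**2.
xs = (-1, 0, 1)
p = Fraction(1, 3)
joint = {(x, x ** 2): p for x in xs}   # only these (x, y) pairs have positive probability

e_x = sum(prob * x for (x, y), prob in joint.items())
e_y = sum(prob * y for (x, y), prob in joint.items())
e_xy = sum(prob * x * y for (x, y), prob in joint.items())
cov = e_xy - e_x * e_y

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0 = p                      # P(X = 0) = 1/3
p_y0 = p                      # P(Y = 0) = P(X = 0) = 1/3
p_joint_00 = joint[(0, 0)]    # 1/3

print(cov)                          # 0
print(p_joint_00 == p_x0 * p_y0)    # False
```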
The units of covariance are not particularly useful, so one may prefer correlation:
$$\mathrm{Corr}[X, Y] \equiv \frac{\mathrm{Cov}[X, Y]}{\mathrm{SD}[X]\,\mathrm{SD}[Y]}$$
Correlation is not well-defined if either $X$ or $Y$ has a standard deviation of zero. But otherwise, correlation is dimensionless, and is bounded between its maximum value of $+1$ and its minimum value of $-1$.
Correlation and covariance have the same sign, that is, they are both positive, both negative, or both zero.
If two random variables have a correlation of zero, we say they are uncorrelated. This does not necessarily mean that they are independent!
Example: $X$ and $Y$ have a bivariate normal distribution:
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sqrt{\sigma_X^2 \sigma_Y^2 (1 - \rho^2)}} \exp\left(-\frac{(x - \mu_X)^2 \sigma_Y^2 - 2\rho\,(x - \mu_X)(y - \mu_Y)\,\sigma_X \sigma_Y + (y - \mu_Y)^2 \sigma_X^2}{2\left[\sigma_X^2 \sigma_Y^2 (1 - \rho^2)\right]}\right)$$
This distribution has the following properties:
$$\mathrm{E}[X] = \mu_X \qquad \mathrm{E}[Y] = \mu_Y$$
$$\mathrm{Var}[X] = \sigma_X^2 \qquad \mathrm{Corr}[X, Y] = \rho \qquad \mathrm{Var}[Y] = \sigma_Y^2$$
Note that, if $\rho = 0$, then $X$ and $Y$ are independent. (Can you show it?) For this particular distribution, $X$ and $Y$ are independent if and only if they are uncorrelated.
This result does not generalise to other distributions! It is not true even for normal distributions; $X$ and $Y$ can each have a marginal normal distribution and a correlation of zero, but not be independent. (Can you construct an example?)
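One commonly cited construction, offered here as a sketch rather than as the answer intended by the slides: let $X$ be standard normal, let $S$ be an independent random sign ($+1$ or $-1$ with equal probability), and set $Y = SX$. By symmetry $Y$ is also standard normal, and $\mathrm{E}[XY] = \mathrm{E}[S]\,\mathrm{E}[X^2] = 0$, yet $|Y| = |X|$ always. The simulation below illustrates this; the seed and sample size are arbitrary.

```python
import random

random.seed(2)
n = 20_000

# X is standard normal; S is an independent random sign; Y = S * X.
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [random.choice((-1.0, 1.0)) * x for x in xs]

def sample_corr(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

corr_xy = sample_corr(xs, ys)

# Dependence: |Y| always equals |X|, so knowing X pins down Y up to its sign.
same_magnitude = all(abs(x) == abs(y) for x, y in zip(xs, ys))

print(corr_xy)          # close to zero
print(same_magnitude)   # True
```

The pair $(X, Y)$ here is not bivariate normal, which is why the equivalence between zero correlation and independence does not apply.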
[Figure: Two standard Gaussian distributions, zero correlation]
[Figure: Two standard Gaussian distributions, correlation of +0.5]
[Figure: Two standard Gaussian distributions, correlation of −0.5]
The following properties of variance follow from the definition. (Can you derive them?) Let $X$ and $Y$ be random variables, and let $a$, $b$, and $c$ be constants. Then:
$$\mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}[X, Y]$$
$$\mathrm{Var}[aX] = a^2\,\mathrm{Var}[X]$$
$$\mathrm{Var}[a + bX + cY] = b^2\,\mathrm{Var}[X] + c^2\,\mathrm{Var}[Y] + 2bc\,\mathrm{Cov}[X, Y]$$
The first two are special cases of the third.
More generally, if $X_1, \ldots, X_N$ are random variables and $a_0, \ldots, a_N$ are constants:
$$\mathrm{Var}\left[a_0 + \sum_{i=1}^{N} a_i X_i\right] = \sum_{i=1}^{N} a_i^2\,\mathrm{Var}[X_i] + 2\sum_{i=1}^{N}\sum_{j=i+1}^{N} a_i a_j\,\mathrm{Cov}[X_i, X_j]$$
The presence of the covariance terms has very profound implications for portfolio choice. What is the above result if the $X_1, \ldots, X_N$ are all uncorrelated with each other?
At this point, it may be useful to specify some properties of covariances. Let $X$, $Y$, $U$, and $V$ be random variables, and let $a$, $b$, $c$, $d$, $f$, and $g$ be constants. Then:
$$\mathrm{Cov}[a + bX + cY,\; d + fU + gV] = bf\,\mathrm{Cov}[X, U] + bg\,\mathrm{Cov}[X, V] + cf\,\mathrm{Cov}[Y, U] + cg\,\mathrm{Cov}[Y, V]$$
For both variances and covariances, adding a constant to the arguments has no effect.
The previous result may also provide some insight into why constants that appear multiplicatively inside a variance must be squared when they are taken outside:
$$\mathrm{Var}[bX] = \mathrm{Cov}[bX, bX] = b^2\,\mathrm{Cov}[X, X] = b^2\,\mathrm{Var}[X]$$
We will state and use a number of statistical results in this section and the next without proof; if you want to fill in the proofs, the above property of covariance will often be useful. This result generalises to arbitrary linear combinations of random variables in the obvious way.
We can now further analyse the statistical properties of the sample mean. Specifically, we would like to find its variance. At this point, we assume the $X_1, \ldots, X_N$ are independent of each other. (Is this a reasonable assumption?)
$$\mathrm{Var}\left[\bar{X}\right] = \mathrm{Var}\left[\frac{1}{N}\sum_{i=1}^{N} X_i\right] = \frac{1}{N^2}\sum_{i=1}^{N}\mathrm{Var}[X_i] = \frac{1}{N}\mathrm{Var}[X]$$
The standard deviation of the sample mean is:
$$\mathrm{SD}\left[\bar{X}\right] = \sqrt{\mathrm{Var}\left[\bar{X}\right]} = \frac{1}{\sqrt{N}}\,\mathrm{SD}[X]$$
From the above results, we can reach the not very surprising conclusion that, the more observations we have, the better an estimate of the true mean $\bar{X}$ is. On average, it is right; furthermore, the more observations we have, the less likely $\bar{X}$ is to deviate widely from the true mean.
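The $1/\sqrt{N}$ scaling can be checked with a small simulation. The sketch below is only an illustration: it assumes independent uniform(0, 1) draws (whose standard deviation is $\sqrt{1/12}$), and the number of trials and the seed are arbitrary.

```python
import random
import statistics

def sample_mean_sd(n_obs, n_trials=2000, seed=1):
    """Simulate n_trials sample means of n_obs uniform(0, 1) draws; return their SD."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        draws = [rng.random() for _ in range(n_obs)]
        means.append(sum(draws) / n_obs)
    return statistics.stdev(means)

sd_x = (1.0 / 12.0) ** 0.5          # SD of a single uniform(0, 1) draw
results = {}
for n_obs in (25, 100, 400):
    simulated = sample_mean_sd(n_obs)
    predicted = sd_x / n_obs ** 0.5  # SD[X] / sqrt(N)
    results[n_obs] = (simulated, predicted)
    print(n_obs, round(simulated, 4), round(predicted, 4))
```

Each fourfold increase in the number of observations should roughly halve the standard deviation of the sample mean.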
Example: coin throwing.
Recall our method of estimating the probability a coin comes up heads: throw the coin $N$ times, count the number of heads, and divide by $N$. The resulting number (which is the sample mean) is an estimate of the probability of heads.
On average, the sample mean is an accurate estimate of the true mean. But if you throw a coin 1,000 times, will it always come up heads 500 times, even if it is a fair coin? Suppose it comes up heads 550 times; is this evidence that it is a trick coin?
Recall that heads receives a value of 1, and tails receives a value of 0. The average value is $p$, where $p$ is the probability of heads.
What is the variance of a single coin throw?
$$\mathrm{E}\left[X^2\right] = p(1)^2 + (1 - p)(0)^2 = p$$
$$\mathrm{Var}[X] = \mathrm{E}\left[X^2\right] - (\mathrm{E}[X])^2 = p - p^2 = p(1 - p)$$
What is the variance of the sample average?
$$\mathrm{Var}\left[\bar{X}\right] = \frac{1}{N}\mathrm{Var}[X] = \frac{p(1 - p)}{N}$$
We don't know the value of $p$, so we don't know the variance of the sample mean.
However, note that $p(1 - p)$ takes a maximum value of $1/4$ at $p = 1/2$. So we know for sure that:
$$\mathrm{Var}\left[\bar{X}\right] \le \frac{1}{4N} \qquad \mathrm{SD}\left[\bar{X}\right] \le \frac{1}{2\sqrt{N}}$$
For $N = 1{,}000$, we have $\mathrm{E}\left[\bar{X}\right] = 0.5$ and $\mathrm{SD}\left[\bar{X}\right] \approx 0.01581$.
Suppose after 1,000 throws, we observe heads 550 times. Is the coin fair? The sample mean $\bar{X}$ is 0.55. If the coin is fair, then $p = 0.5$, and $\mathrm{E}\left[\bar{X}\right] = 0.5$ and $\mathrm{SD}\left[\bar{X}\right] \approx 0.01581$. There are two possibilities:
1 The coin is not fair, and comes up heads more often than tails.
2 The coin is fair, but came up heads more often than tails just due to chance.
Which is it?
When data are generated by a random process, we can never know anything with absolute certainty. However, we may be able to come to a conclusion with high probability.
We now construct a test statistic, of the form:
$$Z = \frac{\bar{X} - \mu_0}{\sigma}$$
where $\bar{X}$ is the sample mean (i.e., the mean estimated from the data), $\mu_0$ is the hypothesized mean (in this case, 0.5, since we are testing whether the coin is fair), and $\sigma$ is the standard deviation of the quantity being tested. Since 550 throws out of 1,000 came up heads, $\bar{X} = 0.55$, vs. the hypothesized value of $\mu_0 = 0.5$. We have calculated $\sigma = 0.01581$. So the test statistic is:
$$Z = \frac{\bar{X} - \mu_0}{\sigma} = \frac{0.55 - 0.50}{0.01581} = 3.16$$
Intuitively, the observed outcome (550 heads) is 3.16 standard deviations above the mean outcome, if the coin were fair. Could this have happened by chance?
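The arithmetic behind the test statistic can be reproduced in a few lines. This is only a sketch of the calculation on the slide; the function and variable names are illustrative.

```python
import math

def z_statistic(sample_mean, hypothesised_mean, sd_of_sample_mean):
    """Number of standard deviations between the sample mean and the hypothesised mean."""
    return (sample_mean - hypothesised_mean) / sd_of_sample_mean

n_throws = 1000
n_heads = 550
p0 = 0.5                                    # hypothesised probability of heads

sd_single = math.sqrt(p0 * (1 - p0))        # SD of one throw under the hypothesis
sd_mean = sd_single / math.sqrt(n_throws)   # SD of the sample mean
z = z_statistic(n_heads / n_throws, p0, sd_mean)

print(round(sd_mean, 5), round(z, 2))       # 0.01581 3.16
```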
Certainly 550 heads could have happened by chance; 600 heads, 900 heads, or 999 heads, or even 1,000 heads could have happened by chance. But how likely is it? We can get some idea of how probable an outcome is, due to chance, even if the hypothesis being tested is true, using a result known as Chebyshev's inequality.
This result states that a random variable takes values at least $k$ standard deviations away from the mean with a probability that is at most $1/k^2$. For $k \le 1$, the bound $1/k^2$ is at least 1, but we knew that already, since nothing can happen with probability greater than one. But for two standard deviations, Chebyshev's inequality tells us that such outcomes can happen with probability of at most $1/4$; depending on the actual distribution, the true probability might be smaller. Outcomes three standard deviations away from the mean happen with probability of at most $1/9$, etc.
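As a quick illustration, the Chebyshev bound can be tabulated for a few values of $k$. The helper name below is made up for this sketch; capping the bound at 1 simply reflects that no probability exceeds one.

```python
def chebyshev_bound(k):
    """Upper bound on P(|X - mean| >= k standard deviations), capped at 1."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1.0 / k ** 2)

for k in (0.5, 1, 2, 3, 3.16):
    print(k, round(chebyshev_bound(k), 4))
```

The bound holds for any distribution with finite variance, which is why it is usually far from tight for a specific distribution.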
In this case, the probability of getting a realised value of $\bar{X}$ that is $k = 3.16$ standard deviations away from the mean is at most $1/k^2 = 0.10$. So 550 heads could have occurred by chance, even if the coin is fair; but the probability that the outcome would be 50 or more coin throws away from the expected value of 500 is at most 0.10.
Are you willing to conclude that the coin is not fair, based on this test? If not, how extreme would the outcome have to be in order to convince you that the coin is not fair?
In fact, the actual probability of an outcome as extreme as 550 heads, assuming the coin is fair, is quite a bit smaller than 0.10. The exact distribution of the outcome is known in this case; it is called the binomial distribution. However, the binomial distribution is a bit unwieldy for large values of $N$, so we will resort to an approximation.
Central Limit Theorem: when the number of observations is large, the distribution of the sample mean $\bar{X}$ is approximately normal, regardless of the distribution of $X$. (Requires existence of finite mean and variance.)
If a random variable has a normal distribution, then any linear function of that random variable also has a normal distribution. (Can you prove it?) The sample mean, $\bar{X}$, has a normal distribution (approximately) by the central limit theorem. Recall the test statistic:
$$Z = \frac{\bar{X} - \mu_0}{\sigma}$$
The test statistic $Z$ is a linear function of $\bar{X}$ (note the other quantities in the expression above are not random), and therefore also has approximately a normal distribution.
What are the mean and standard deviation of the test statistic $Z$? (Assume the hypothesis, that $\mathrm{E}[X] = 0.5$, is true.)
$$\mathrm{E}[Z] = \mathrm{E}\left[\frac{\bar{X} - \mu_0}{\sigma}\right] = \frac{\mathrm{E}\left[\bar{X}\right] - \mu_0}{\sigma} = \frac{\mu_0 - \mu_0}{\sigma} = 0$$
$$\mathrm{Var}[Z] = \mathrm{Var}\left[\frac{\bar{X} - \mu_0}{\sigma}\right] = \frac{1}{\sigma^2}\mathrm{Var}\left[\bar{X} - \mu_0\right] = \frac{1}{\sigma^2}\mathrm{Var}\left[\bar{X}\right] = \frac{1}{\sigma^2}\sigma^2 = 1$$
$$\mathrm{SD}[Z] = \sqrt{\mathrm{Var}[Z]} = \sqrt{1} = 1$$
The test statistic $Z$ thus has approximately a normal distribution, with mean of 0 and variance of 1. (This is not a coincidence; the test statistic was designed to have these properties.)
We can now use the test statistic to determine how likely an outcome of 550 heads is, if the coin is fair.
Basic properties of a normal distribution:
1 The realised value is within one standard deviation of the mean with probability 0.683.
2 The realised value is within two standard deviations of the mean with probability 0.954.
3 The realised value is within three standard deviations of the mean with probability 0.997.
These statistics are determined by integrating over the appropriate range of the density function for the normal distribution.
For example, to find the second result, we can calculate:
$$\mathrm{Prob}(\mu - 2\sigma \le X \le \mu + 2\sigma) = \int_{\mu - 2\sigma}^{\mu + 2\sigma} \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\, dx$$
The integral above cannot be found in closed form; however, it can be evaluated numerically. (A closed-form approximation that is known to be accurate to at least 15 decimal places does exist.)
Many books have tables of the value of integrals of the normal density function for different ranges, and many software packages can also calculate it. By any of these methods, we can determine that an observation at least 3.16 standard deviations from the mean occurs with probability of only 0.00159.
In other words, if you were to throw a fair coin 1,000 times, the combined probability that you would get either
1 550 heads or more
2 450 heads or fewer
is only 0.00159, and the probability that the number of heads will fall between 450 and 550 is 0.99841. (These probabilities are based on an approximation, that the sample mean has a normal distribution. The approximation is fairly accurate in this case.)
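One way to evaluate such normal probabilities without tables is the complementary error function, using the identity $\mathrm{Prob}(|Z| \ge z) = \mathrm{erfc}(z/\sqrt{2})$ for a standard normal $Z$. The sketch below relies on Python's standard `math.erfc`; the helper name is illustrative. For $z = 3.16$ it gives a value of roughly 0.0016, in line with the figure quoted above (small differences in the last digit depend on how $z$ is rounded).

```python
import math

def two_sided_normal_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))

p_coin = two_sided_normal_p_value(3.16)
print(p_coin)

# Probability of falling within k standard deviations of the mean.
for k in (1, 2, 3):
    print(k, round(1.0 - two_sided_normal_p_value(k), 3))
```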
[Figure: Coin Throw Example, 1,000,000 Trials, 1,000 Throws Each Trial]
[Figure: Coin Throw Example, Standardised Distribution]
Since the distribution of $\bar{X}$ is approximately normal for a large number of coin throws, the probability that the number of heads would differ from the mean value by at least 50 is approximately 0.00159.
The true value (based on the exact distribution of $\bar{X}$, which in this example is binomial) is 0.00173; the assumption of normality leads to some inaccuracy, but not too much.
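The exact binomial probability can be computed directly with integer arithmetic, since every sequence of throws of a fair coin is equally likely. The following is a minimal sketch (the function name is illustrative); it should give a value of about 0.0017, consistent with the exact figure quoted above.

```python
from math import comb

def two_sided_binomial_p_value(n, deviation):
    """Exact P(|heads - n/2| >= deviation) for n throws of a fair coin."""
    centre = n / 2
    favourable = sum(comb(n, k) for k in range(n + 1) if abs(k - centre) >= deviation)
    return favourable / 2 ** n   # exact integer counts, converted to a float at the end

p_exact = two_sided_binomial_p_value(1000, 50)
print(p_exact)
```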
So, if the coin were fair, the expected number of heads would be 500, and a realised value as far away as 550 would occur with probability of less than 0.002; the probability that the number of heads would be closer to 500 is more than 0.998.
Does 550 heads seem very likely to occur just by chance? Are you willing to declare that the coin is not fair?
Whether we use the approximate probability of 0.00159 (based on the normal approximation) or the exact probability of 0.00173 (based on the binomial distribution), this number has a name: it is often called the p-value. A p-value is simply the probability that, under the hypothesis being tested, data as extreme as what has been observed would occur just by chance. The p-value in this example is rather extreme: a result this extreme (50 or more heads away from the expected value of 500) should occur just by chance, if the coin were fair, fewer than two times out of a thousand. If the coin were fair, we have just observed quite a remarkable coincidence. It is possible that the coin is fair, but it doesn't seem very likely.
We will now try to formalise this idea.
We have an hypothesis: the coin is fair, and the probability of heads is 0.5.
We also have evidence: 550 heads out of 1,000 coin throws.
There are two types of errors we can make here:
1 Type I Error: we reject the hypothesis (that is, conclude that the coin is not fair) when it in fact is fair.
2 Type II Error: we fail to reject the hypothesis (concluding the coin is fair) when it is in fact not fair.
It is impossible to avoid both types of errors completely. All we can do is trade the probability of one off against the other.
The nearly universal convention in finance and economics (which is completely arbitrary) is to set the probability of a Type I Error at 0.05.
Hypothesis: the coin is fair (the probability of heads is 0.5).
Evidence: 550 heads from 1,000 coin throws.
If the hypothesis is true, the probability of getting a deviation from the mean this large is only 0.00159 (using the normal approximation; the exact p-value is 0.00173).
Since this probability is less than 0.05, we reject the hypothesis, and conclude the coin is not fair.
Could we have just made a Type I error?
Yes, we could have just made a Type I error. The only way to avoid Type I errors (incorrect rejection of an hypothesis that is true) is never to reject any hypothesis. If one takes that approach, one is likely to commit quite a lot of Type II errors (failure to reject an hypothesis which is false).
When the hypothesis is true, if we use a cut-off of 0.05 (as we did in this example), we are likely to reject the hypothesis (incorrectly) one time in every twenty. If this risk of Type I error is unacceptably large, we can lower our cut-off; for example, we could reject the hypothesis only if the p-value is less than 0.02. Then we will only commit a Type I error one time in every fifty, which is an improvement. However, this comes at a price: the probability of a Type II error goes up. We will fail to reject an hypothesis that is false more often, if we decrease our cut-off value. There is no way around this trade-off.
One could take the approach of trying to assess how costly Type I and Type II errors are, and changing the cut-off value accordingly. For example, consider a medical test that is designed to detect the early stages of a curable disease. If our hypothesis is "the patient is healthy", then a Type I error is a false positive: concluding that the patient is sick, when in fact the patient is healthy. A Type II error is a false negative: failure to detect the disease, when the patient in fact has it.
If the test is very sensitive, there will be very few false negatives (very few Type II errors), but there will also be a lot of false positives (lots of Type I errors). If the test is adjusted so that it is not so sensitive, then there will be fewer false positives, but more false negatives. So how sensitive should we make the test?
If we conclude that the cost of a Type II error is very high (a sick patient fails to get treatment, wrongly believing s/he is healthy), whereas the Type I error is less costly (a healthy patient has some rather anxious moments, and undergoes some additional testing/treatment before it is realised that there was a false positive), then we should make the test very sensitive. If the costs are different (for example, maybe the disease is not so serious, and the treatment is expensive, painful, and largely ineffective), then we should make the test less sensitive.
This type of analysis is used frequently in some disciplines, such as engineering. It has largely gone out of fashion in financial analysis, where arbitrary benchmarks (such as 0.05 probability of a Type I error) are commonplace.
Basic Principles Testing Pricing Models
Returning to the three securities mentioned earlier:
Asset                                      X      Y      Z
Average return (predicted)                 8%     10%    12%
Average return (observed)                  6%     16%    14%
Standard deviation of return (observed)    25%    40%    60%
Recall that the observed quantities were estimated from 20 years of monthly returns data. Can we safely conclude that the securities do not conform to the predictions of the theory?
This problem is much more difficult than the coin throwing example.
Assume the predictions of the model are correct; then the deviations of the observed average returns from the predicted average returns are just due to the random variation of the data. We already know:
$$\mathrm{E}\left[\bar{X}\right] = 8\% \qquad \mathrm{E}\left[\bar{Y}\right] = 10\% \qquad \mathrm{E}\left[\bar{Z}\right] = 12\%$$
But we need to know the standard deviations as well:
$$\mathrm{SD}\left[\bar{X}\right] = \,? \qquad \mathrm{SD}\left[\bar{Y}\right] = \,? \qquad \mathrm{SD}\left[\bar{Z}\right] = \,?$$
There were 20 years of monthly data, so $N = 240$, and $\sqrt{240} \approx 15.49$. Therefore:
$$\mathrm{SD}\left[\bar{X}\right] = \frac{\mathrm{SD}[X]}{15.49} \qquad \mathrm{SD}\left[\bar{Y}\right] = \frac{\mathrm{SD}[Y]}{15.49} \qquad \mathrm{SD}\left[\bar{Z}\right] = \frac{\mathrm{SD}[Z]}{15.49}$$
The problem is that we do not know the standard deviations of $X$, $Y$, and $Z$; we can only estimate them from the data. Estimates were included in the table, but how these were determined was not specified.
The usual way of estimating the variance of a random variable (which can then be used to estimate the variance of the sample average) is as follows:
$$s^2_{XX} = \frac{1}{N - 1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2$$
Note that, in order to calculate $s^2_{XX}$, we must first calculate $\bar{X}$. The presence of the $N - 1$ (instead of $N$) in the denominator may seem puzzling; this is a correction to account for the fact that the mean is not known exactly, but must be estimated with $\bar{X}$.
The sample variance $s^2_{XX}$ is itself a random variable; what are its statistical properties?
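We have all the tools we need to find its mean and variance, although the algebra can be tedious.
$$\mathrm{E}\left[s^2_{XX}\right] = \mathrm{E}\left[\frac{1}{N - 1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2\right]$$
$$= \frac{1}{N - 1}\sum_{i=1}^{N}\left(\mathrm{E}\left[X_i^2\right] - 2\,\mathrm{E}\left[X_i \bar{X}\right] + \mathrm{E}\left[\bar{X}^2\right]\right)$$
$$= \frac{1}{N - 1}\sum_{i=1}^{N}\left(\mathrm{Var}[X] + \mathrm{E}[X]^2 - \frac{2}{N}\mathrm{Var}[X] - 2\,\mathrm{E}[X]^2 + \frac{1}{N}\mathrm{Var}[X] + \mathrm{E}[X]^2\right)$$
$$= \mathrm{Var}[X]$$
Can you fill in the missing steps?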
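The unbiasedness result can also be checked without any algebra in a small case. The sketch below is an illustration under simple assumptions: it takes a fair coin (values 0 and 1, true variance $1/4$), enumerates every equally likely sample of size 4, and averages the variance estimate exactly using fractions, once with divisor $N - 1$ and once with divisor $N$. The function name and the choice of sample size are arbitrary.

```python
from fractions import Fraction
from itertools import product

def average_variance_estimate(outcomes, n, divisor_offset):
    """Exact average of sum((x - mean)^2) / (n - divisor_offset) over all
    equally likely samples of size n drawn from `outcomes`."""
    total = Fraction(0)
    count = 0
    for sample in product(outcomes, repeat=n):
        mean = Fraction(sum(sample), n)
        squares = sum((Fraction(x) - mean) ** 2 for x in sample)
        total += squares / (n - divisor_offset)
        count += 1
    return total / count

outcomes = (0, 1)   # fair coin: tails = 0, heads = 1, true variance = 1/4
n = 4
unbiased = average_variance_estimate(outcomes, n, 1)   # divisor N - 1
biased = average_variance_estimate(outcomes, n, 0)     # divisor N
print(unbiased, biased)   # 1/4 3/16
```

With divisor $N - 1$ the average estimate equals the true variance; with divisor $N$ it is too small by the factor $(N - 1)/N$.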
The following results can also be derived, with considerable difficulty:
$$\mathrm{Var}\left[s^2_{XX}\right] = (\mathrm{SD}[X])^4\left(\frac{2}{N - 1} + \frac{\mathrm{Kurt}[X]}{N}\right)$$
$$\mathrm{Cov}\left[\bar{X}, s^2_{XX}\right] = \frac{\mathrm{Skew}[X]\,(\mathrm{SD}[X])^3}{N}$$
If $X$ happens to have a normal distribution, then its skewness and kurtosis (here meaning excess kurtosis) are each equal to zero, the sample mean and variance are uncorrelated with each other, and the variance of $s^2_{XX}$ has a very simple form.
We will not prove these results, but if $X$ has a normal distribution, then $\bar{X}$ also has a normal distribution, and $s^2_{XX}$ has a (scaled) chi-square distribution.
Returning to the example, consider security $X$. We have a theory that predicts its expected return is 8%, but when we estimate the mean with $\bar{X}$, it is 6%. The estimated standard deviation (we will use the notation $s_X$) is 25%.
We would like to construct a test statistic:
$$Z = \frac{\bar{X} - \mu_0}{\mathrm{SD}\left[\bar{X}\right]} = \frac{\sqrt{N}\left(\bar{X} - \mu_0\right)}{\mathrm{SD}[X]}$$
If the hypothesis is correct, then the expected value of $\bar{X}$ is 8% and its standard deviation is $\mathrm{SD}[X]/\sqrt{240}$ (recall that there are 240 monthly observations). The test statistic then has a mean of zero, and a standard deviation of one. If $X$ has a normal distribution, then $Z$ also has a normal distribution; even if $X$ isn't normal, then by the central limit theorem, $Z$ is approximately normal for large $N$.
[Figure: Z-statistic for Stock Return Example, 1,000,000 Trials]
The test statistic $Z$ is therefore ideal, except for one little problem: it is infeasible. We don't know $\mathrm{SD}[X]$, and can only estimate it. Note that this situation is different from the coin throwing example; there, under the hypothesis (that the coin is fair, and the probability of heads is 1/2), we knew the standard deviation of a coin throw. Here, we don't; the hypothesis tells us what the value of the mean ought to be, but is silent with respect to the variance and standard deviation.
Instead, we must use the estimated standard deviation, rather than the actual, to form our test statistic:
$$t = \frac{\sqrt{N}\left(\bar{X} - \mu_0\right)}{s_X}$$
Because the standard deviation used in our test statistic is estimated, the distribution of the test statistic is not normal, even if $X$ is. Under the assumption of normality for $X$, the test statistic $t$ has a Student's t distribution with $N - 1$ degrees of freedom.
The t-distribution approaches a standard normal distribution (i.e., a normal distribution with a mean of zero and a standard deviation of one) as the degrees of freedom become large. When there are many data observed, the uncertainty in the estimate of the mean remains much larger than the uncertainty in the estimate of the standard deviation, and the t statistic approaches the distribution it would have if the standard deviation were known with certainty: a standard normal. When the number of data observations is small, though, the deviation from normality can be very significant.
[Figure: T Distribution with Various Degrees of Freedom]
[Figure: T-statistic for Stock Return Example, 1,000,000 Trials]
[Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 240]
[Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 480]
[Figure: T-statistic with Non-Gaussian Returns, 1,000,000 Trials, T = 960]
The test statistic for security $X$ is then:
$$t = \frac{\sqrt{N}\left(\bar{X} - \mu_0\right)}{s_X} = \frac{\sqrt{240}\,(6\% - 8\%)}{25\%} \approx -1.24$$
Since the number of degrees of freedom is quite large, we can simply treat the t-statistic as if it were normally distributed. A test statistic of $-1.24$ corresponds to a p-value of approximately 0.215; that is, if the hypothesis were true, there is still a probability of 0.215 that the sample average return of the security would differ from the hypothesized value by at least 2%.
If we use the 0.05 cut-off for p-values, as is common practice in finance, we cannot reject the hypothesis that $\mathrm{E}[X] = 8\%$. The risk that we are making a Type I error is too high.
Do the other securities provide evidence against the model?
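The calculation for security $X$ can be sketched in a few lines, treating the t-statistic as approximately normal as described above. The function names are illustrative; the same two functions could be applied to the figures for $Y$ and $Z$ in the table.

```python
import math

def t_statistic(sample_mean, hypothesised_mean, sample_sd, n_obs):
    """sqrt(N) * (sample mean - hypothesised mean) / estimated standard deviation."""
    return math.sqrt(n_obs) * (sample_mean - hypothesised_mean) / sample_sd

def two_sided_normal_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z (large-sample approximation to the t)."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Security X: observed mean 6%, predicted mean 8%, estimated SD 25%, 240 months.
t_x = t_statistic(0.06, 0.08, 0.25, 240)
p_x = two_sided_normal_p_value(t_x)
print(round(t_x, 2), round(p_x, 3))   # -1.24 0.215
```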
Basic Principles Multivariate Tests
Is there anything wrong with what we are doing here?
It doesn't make any sense to test the securities one at a time. Suppose the model we are testing is actually true: it correctly describes the expected returns of all securities. If we go out and test its predictions one security at a time, then for each test we conduct, there is a 0.05 probability (assuming 95% confidence) of a Type I error. If, for example, we test a model for Japanese stock returns, and decide to conduct a statistical test for each of the 225 stocks in the Nikkei 225 index, that is 225 chances to have a Type I error. How likely is it that at least some of the stocks will appear to violate the predictions of the model, just by chance, even though the model is true?
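A rough answer can be sketched under one strong simplifying assumption, namely that the individual tests are independent of one another (stock returns generally are not, so this is only indicative). In that case the probability of at least one Type I error in $n$ tests is $1 - (1 - 0.05)^n$. The function name below is illustrative.

```python
def prob_at_least_one_type_i_error(n_tests, alpha=0.05):
    """Probability of at least one false rejection across n_tests tests,
    ASSUMING the tests are independent (a simplification)."""
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (1, 3, 10, 225):
    print(n_tests, prob_at_least_one_type_i_error(n_tests))
```

Under this assumption, with 225 tests a spurious rejection somewhere is a near certainty.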
What we really ought to do is perform a single statistical test of all the securities simultaneously. For example, we could consider a test statistic along the lines of the following:
$$F = t_X^2 + t_Y^2 + t_Z^2 = \frac{\left(\bar{R}_X - \mu_{0,X}\right)^2}{\hat{\sigma}^2_{\bar{R}_X}} + \frac{\left(\bar{R}_Y - \mu_{0,Y}\right)^2}{\hat{\sigma}^2_{\bar{R}_Y}} + \frac{\left(\bar{R}_Z - \mu_{0,Z}\right)^2}{\hat{\sigma}^2_{\bar{R}_Z}}$$
Intuitively, this statistic has some advantages: it is big when the t-statistics for the individual assets are big, it places more weight on violations of the theory's predictions for assets which have small standard deviations, etc. It also seems like it has a distribution that can be calculated: it is the sum of three squared t distributions. But are these t distributions independent?
The test statistic just proposed doesn't work if we can't be sure that the returns of the three assets are independent (or at least uncorrelated). We can fix this defect, but first, we will need to be able to estimate covariances from historical data. The usual way of estimating the covariance between $X$ and $Y$ is:
$$s^2_{XY} = \frac{1}{T - 1}\sum_{t=1}^{T}\left(X_t - \bar{X}\right)\left(Y_t - \bar{Y}\right)$$
This estimator is unbiased, i.e., $\mathrm{E}\left[s^2_{XY}\right] = \mathrm{Cov}[X, Y]$. Derivation of its variance (and covariance with other statistics) is very difficult.
The $T - 1$ divisor, instead of $T$, is often a point of confusion. $T - 1$ is used to make our estimate unbiased. Some just use $T$, but if you estimate covariance (or variance) this way, then your estimate is biased; it tends to be a little too small, on average. For large $T$, it doesn't matter very much.
Some software products are quite inconsistent about which divisor they use, $T - 1$ or $T$. For example, a spreadsheet product produced by a software company based in Redmond, Washington, USA, uses $T - 1$ in the VAR function, but $T$ in the COVAR function. Therefore, even though $\mathrm{Cov}[X, X] = \mathrm{Var}[X]$ by definition, this software package returns different values for VAR(A1:A10) and COVAR(A1:A10,A1:A10). When you have a piece of software do these sorts of calculations for you, make sure it is doing what you think it is doing.
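The effect of the divisor is easy to check directly. In the sketch below, the `covariance` helper and the data are made up for illustration; the comparison uses Python's standard `statistics.variance` (divisor $T - 1$) and `statistics.pvariance` (divisor $T$).

```python
import statistics

def covariance(xs, ys, divisor_offset=1):
    """Sample covariance with divisor T - divisor_offset
    (offset 1 gives the unbiased estimator, offset 0 divides by T)."""
    t = len(xs)
    mean_x, mean_y = sum(xs) / t, sum(ys) / t
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (t - divisor_offset)

data = [0.02, -0.01, 0.03, 0.00, 0.01]   # made-up returns, for illustration only

print(covariance(data, data, 1), statistics.variance(data))    # both use T - 1
print(covariance(data, data, 0), statistics.pvariance(data))   # both use T
```

The covariance of a series with itself matches the variance only when both are computed with the same divisor.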
When we need to estimate a correlation from historical data, we will do so as follows:
$$\hat{\rho} = \frac{s^2_{XY}}{s_X s_Y}$$
The little hat over the $\rho$ indicates that the quantity is the estimated, rather than true correlation.
We now return to the problem of constructing a joint test statistic. For convenience, we will call the assets $X_1, \ldots, X_N$. It is convenient to arrange the means of the assets in a column vector, and the variances and covariances in a matrix:
$$\mu = \begin{pmatrix} \mathrm{E}[X_1] \\ \vdots \\ \mathrm{E}[X_N] \end{pmatrix} \qquad \Sigma = \begin{pmatrix} \mathrm{Var}[X_1] & \cdots & \mathrm{Cov}[X_1, X_N] \\ \vdots & \ddots & \vdots \\ \mathrm{Cov}[X_N, X_1] & \cdots & \mathrm{Var}[X_N] \end{pmatrix}$$
The sample equivalents are:
$$\hat{\mu} = \begin{pmatrix} \bar{X}_1 \\ \vdots \\ \bar{X}_N \end{pmatrix} \qquad \hat{\Sigma} = \begin{pmatrix} s^2_{11} & \cdots & s^2_{1N} \\ \vdots & \ddots & \vdots \\ s^2_{N1} & \cdots & s^2_{NN} \end{pmatrix}$$
where, through a slight abuse of previous notation, $s^2_{ij}$ is the sample covariance of $X_i$ and $X_j$.
We will need three linear algebra operations to construct a reasonable test statistic: matrix multiplication, matrix transposition, and matrix inversion.
In case these operations are not familiar, we will start with multiplication of a row vector by a column vector. To perform this operation, we just multiply each element in one of the vectors by its corresponding element in the other vector, and add the products all up:
$$\begin{pmatrix} x_1 & \cdots & x_N \end{pmatrix} \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix} = \sum_{i=1}^{N} x_i y_i$$
The number of elements in the two vectors must be the same; otherwise the product is undefined.
More generally, we can find the product of any two matrices, provided the number of columns in the first matrix is equal to the number of rows in the second matrix. The product of a $K \times M$ matrix and an $M \times N$ matrix is a $K \times N$ matrix. The element in row $i$ and column $j$ of the product is row $i$ of the first matrix multiplied by column $j$ of the second matrix:
$$\begin{pmatrix} x_{11} & \cdots & x_{1M} \\ \vdots & \ddots & \vdots \\ x_{K1} & \cdots & x_{KM} \end{pmatrix} \begin{pmatrix} y_{11} & \cdots & y_{1N} \\ \vdots & \ddots & \vdots \\ y_{M1} & \cdots & y_{MN} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{M} x_{1i} y_{i1} & \cdots & \sum_{i=1}^{M} x_{1i} y_{iN} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{M} x_{Ki} y_{i1} & \cdots & \sum_{i=1}^{M} x_{Ki} y_{iN} \end{pmatrix}$$
The inner dimensions of the two matrices must match, or the product is undefined.
Many of the rules of ordinary multiplication do not apply to matrix multiplication; for example, matrix multiplication is not commutative.
A numeric example of matrix multiplication:
$$\begin{pmatrix} 3 & 5 & -2 \\ 4 & 1 & 0 \end{pmatrix} \begin{pmatrix} 6 & 1 \\ -8 & 4 \\ 2 & 1 \end{pmatrix} = \begin{pmatrix} -26 & 21 \\ 16 & 8 \end{pmatrix}$$
Given the large number of operations involved, it is not a bad idea to have a computer available before multiplying even relatively modestly sized matrices together. For example, to multiply a $5 \times 8$ matrix by an $8 \times 3$ matrix requires 120 multiplications and 105 additions.
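The row-by-column rule translates directly into code. The following is a minimal sketch using plain Python lists of rows (the function name is illustrative; in practice one would normally use a numerical library). It reproduces the numeric example above and shows that reversing the order of the factors gives a matrix of a different shape.

```python
def mat_mul(a, b):
    """Multiply matrix a (K x M) by matrix b (M x N), both given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    n_cols = len(b[0])
    # Element (i, j) is row i of a multiplied by column j of b.
    return [[sum(row[m] * b[m][j] for m in range(len(b))) for j in range(n_cols)]
            for row in a]

a = [[3, 5, -2],
     [4, 1, 0]]
b = [[6, 1],
     [-8, 4],
     [2, 1]]

print(mat_mul(a, b))   # [[-26, 21], [16, 8]]
print(mat_mul(b, a))   # a 3 x 3 matrix, so the order of the factors matters
```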
Kimmel (EDHEC Business School) Empirical Finance SingaporeMar/Aug 2011 104 / 563
Basic Principles Multivariate Tests
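Transpose is a very simple operation, usually denoted by either a T or a prime superscript, i.e., $C^T$ or $C'$. The matrix is flipped around, so that the rows become columns and the columns become rows:
$$\begin{pmatrix} x_{11} & \cdots & x_{1N} \\ \vdots & \ddots & \vdots \\ x_{M1} & \cdots & x_{MN} \end{pmatrix}^T = \begin{pmatrix} x_{11} & \cdots & x_{M1} \\ \vdots & \ddots & \vdots \\ x_{1N} & \cdots & x_{MN} \end{pmatrix}$$
A numeric example:
$$\begin{pmatrix} 1 & 3 & -2 \\ -8 & 0 & 4 \end{pmatrix}^T = \begin{pmatrix} 1 & -8 \\ 3 & 0 \\ -2 & 4 \end{pmatrix}$$
It doesn't get much easier than matrix transposition.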
Matrix operations can be used to avoid cumbersome algebraic expressions involving large numbers of assets. For example, consider $N$ assets, with returns $R_1, \ldots, R_N$, and a portfolio with share $a_1$ invested in the first asset, $a_2$ invested in the second asset, and so on, up to $a_N$ invested in asset $N$. (The weights $a_i$ should add up to one.) What is the variance of the return of this portfolio?
$$\mathrm{Var}[a_1 R_1 + \ldots + a_N R_N] = \sum_{i=1}^{N} \sum_{j=1}^{N} a_i a_j \, \mathrm{Cov}[R_i, R_j]$$
Arranging the $a_1, \ldots, a_N$ in a column vector $a$, the returns $R_1, \ldots, R_N$ in a column vector $R$, and the variances and covariances of returns in a matrix $\Sigma$, we can express the above as:
$$\mathrm{Var}[a^T R] = a^T \Sigma a$$
(Try it!) This expression is valid for any number of assets.
Numeric example: suppose the returns of three assets have the covariance matrix:
$$\Sigma = \begin{pmatrix} 0.040 & 0.012 & 0.020 \\ 0.012 & 0.090 & 0.036 \\ 0.020 & 0.036 & 0.160 \end{pmatrix}$$
What is the variance of the return of a portfolio that is 0.2 invested in the first asset, 0.6 in the second asset, and 0.1 invested in the third asset?
$$\begin{pmatrix} 0.2 \\ 0.6 \\ 0.1 \end{pmatrix}^T \begin{pmatrix} 0.040 & 0.012 & 0.020 \\ 0.012 & 0.090 & 0.036 \\ 0.020 & 0.036 & 0.160 \end{pmatrix} \begin{pmatrix} 0.2 \\ 0.6 \\ 0.1 \end{pmatrix} = 0.0436$$
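As a quick check, the portfolio variance can be computed both from the double sum over covariances and from the matrix expression. The sketch below assumes NumPy and uses the weights exactly as given in the example (note that they sum to 0.9, not one; the text does not say where the remaining 0.1 is held).

```python
import numpy as np

# Covariance matrix and weights from the numeric example.
sigma = np.array([[0.040, 0.012, 0.020],
                  [0.012, 0.090, 0.036],
                  [0.020, 0.036, 0.160]])
a = np.array([0.2, 0.6, 0.1])

# Matrix form: a^T Sigma a.
var_matrix = float(a @ sigma @ a)

# Double-sum form: sum over i and j of a_i * a_j * Cov[R_i, R_j].
n = len(a)
var_sum = sum(a[i] * a[j] * sigma[i, j] for i in range(n) for j in range(n))

print(round(var_matrix, 4), round(var_sum, 4))
```

Both forms give the same number, which is the point of the matrix notation: the double sum with $N^2$ terms collapses into one short expression.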
Matrix inversion, usually denoted by a $-1$ superscript, as in $C^{-1}$, is a rather difficult operation. The inverse of a matrix satisfies the condition:
$$C \, C^{-1} = C^{-1} C = I$$
where $I$ is the identity matrix, which has 1 for each element on the diagonal, and 0 everywhere else:
$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$
If a matrix is not square (i.e., does not have the same number of rows and columns), it does not have an inverse.
Square matrices may or may not have inverses, although covariance matrices usually do. Specifically, every matrix $\Sigma$ that is the covariance matrix of some set of random variables $R$ is automatically positive semidefinite:
$$\mathrm{Var}[a^T R] = a^T \Sigma a \geq 0 \quad \forall a$$
Such a matrix is also positive definite if it satisfies the stronger condition:
$$\mathrm{Var}[a^T R] = a^T \Sigma a > 0 \quad \forall a \neq 0$$
A covariance matrix has an inverse if and only if it is positive definite. That is, if the only portfolio of assets that is risk-free (i.e., has variance of zero) is the portfolio with weight zero on every asset, then the covariance matrix of the asset returns is positive definite.
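One way to see the condition at work is to compare an invertible covariance matrix with a singular one. The sketch below assumes NumPy and tests positive definiteness through the smallest eigenvalue; the function names and the two-asset matrix (a hypothetical pair of perfectly correlated assets) are illustrative assumptions, not part of the course material.

```python
import numpy as np


def min_eigenvalue(cov):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(cov).min())


def is_positive_definite(cov, tol=1e-12):
    """True if every eigenvalue exceeds tol (tol guards against round-off)."""
    return min_eigenvalue(cov) > tol


# Covariance matrix of the three assets from the portfolio variance example.
sigma = np.array([[0.040, 0.012, 0.020],
                  [0.012, 0.090, 0.036],
                  [0.020, 0.036, 0.160]])

# Hypothetical pair of perfectly correlated assets with standard
# deviations 0.2 and 0.3, so the covariance is 0.2 * 0.3 = 0.06.
sigma_singular = np.array([[0.04, 0.06],
                           [0.06, 0.09]])

# The nonzero weight vector (3, -2) has zero variance under the second
# matrix: 9 * 0.04 - 12 * 0.06 + 4 * 0.09 = 0.
a = np.array([3.0, -2.0])
risk_free_variance = float(a @ sigma_singular @ a)

print(is_positive_definite(sigma))
print(is_positive_definite(sigma_singular))
print(risk_free_variance)
```

The second matrix admits a risk-free combination with nonzero weights, so it is only positive semidefinite and has no inverse.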
Numeric examples: matrix inversion is actually rather easy for diagonal matrices, i.e., those in which the off-diagonal elements are all zero:
$$\begin{pmatrix} 5 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 0.2 & 0.0 & 0.0 \\ 0.0 & 0.5 & 0.0 \\ 0.0 & 0.0 & 1.0 \end{pmatrix}$$
Note that the inverse is also diagonal, and its diagonal elements are just the reciprocals of the diagonal elements in the original matrix.
Things are a bit more complicated in general:
$$\begin{pmatrix} 3 & 6 & 1 \\ 4 & 7 & -2 \\ 6 & 13 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} 1.6250 & 0.8125 & -1.1875 \\ -0.7500 & -0.3750 & 0.6250 \\ 0.6250 & -0.1875 & -0.1875 \end{pmatrix}$$
(Try verifying the inverses.)
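For readers who prefer to let a computer do the verification, the following sketch (assuming NumPy) multiplies each matrix by its claimed inverse and compares the product with the identity matrix.

```python
import numpy as np

D = np.diag([5.0, 2.0, 1.0])
A = np.array([[3.0, 6.0, 1.0],
              [4.0, 7.0, -2.0],
              [6.0, 13.0, 0.0]])

D_inv = np.linalg.inv(D)
A_inv = np.linalg.inv(A)

# Inverse quoted in the text for the general matrix.
A_inv_text = np.array([[1.6250, 0.8125, -1.1875],
                       [-0.7500, -0.3750, 0.6250],
                       [0.6250, -0.1875, -0.1875]])

# A matrix times its inverse should be the identity, in either order.
checks = {
    "diagonal": np.allclose(D @ D_inv, np.eye(3)),
    "general, A times inverse": np.allclose(A @ A_inv_text, np.eye(3)),
    "general, inverse times A": np.allclose(A_inv_text @ A, np.eye(3)),
    "matches numpy": np.allclose(A_inv, A_inv_text),
}
print(checks)
```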
Recall the example of the three securities, which were used to test a model of expected returns. We have no information on the covariances between the three asset returns; suppose these are all estimated at exactly zero (not very likely, but assume so for purposes of the discussion). We can arrange the sample mean returns in a vector, and the hypothesized mean returns in another vector:
$$\hat{\mu} = \begin{pmatrix} 6\% \\ 16\% \\ 14\% \end{pmatrix} \qquad \mu_0 = \begin{pmatrix} 8\% \\ 10\% \\ 12\% \end{pmatrix}$$
The estimated variances and covariances can be arranged in a matrix:
$$\hat{\Sigma} = \begin{pmatrix} 0.0625 & 0 & 0 \\ 0 & 0.16 & 0 \\ 0 & 0 & 0.36 \end{pmatrix}$$
The proposed joint test statistic can be expressed as:
$$F = (\hat{\mu} - \mu_0)^T \hat{\Sigma}^{-1} (\hat{\mu} - \mu_0)$$
where $\hat{\mu}$ is the vector of sample mean returns, $\mu_0$ is the vector of hypothesized mean returns, and $\hat{\Sigma}$ is the estimated covariance matrix.
At an intuitive level, this test statistic has some good properties. When any of the assets have an estimated expected return that is far from the hypothesized value, this tends to make the test statistic large. Furthermore, it gives more weight to assets whose mean is estimated more accurately. If an asset has a small (estimated) variance of return, when $\hat{\Sigma}$ is inverted, the corresponding element is large, giving more weight to the deviation of this asset's average return from the hypothesized value. Assets with large variance of return require larger differences between the observed and hypothesized returns to have the same effect on the test statistic.
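The weighting described here is easy to see numerically in the diagonal case. The sketch below assumes NumPy and uses the sample means, hypothesized means, and diagonal covariance matrix of the three-security example; the variable names are illustrative.

```python
import numpy as np

mu_hat = np.array([0.06, 0.16, 0.14])   # sample mean returns
mu_0 = np.array([0.08, 0.10, 0.12])     # hypothesized mean returns
sigma_hat = np.diag([0.0625, 0.16, 0.36])

deviation = mu_hat - mu_0

# Quadratic form (mu_hat - mu_0)^T Sigma_hat^{-1} (mu_hat - mu_0).
# np.linalg.solve avoids forming the inverse explicitly.
statistic = float(deviation @ np.linalg.solve(sigma_hat, deviation))

# With a diagonal covariance matrix, each asset contributes its squared
# deviation divided by its variance.
contributions = deviation**2 / np.diag(sigma_hat)

print(contributions)
print(statistic)
```

The first and third assets both deviate from the hypothesized mean by two percentage points, but the first asset, with the smaller variance, contributes 0.0064 to the statistic against roughly 0.0011 for the third.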
This test statistic works just as well when the asset returns are correlated; the only modification we will make is to add a scaling factor:
$$F = \frac{T (T - N)}{N (T - 1)} (\hat{\mu} - \mu_0)^T \hat{\Sigma}^{-1} (\hat{\mu} - \mu_0)$$
where $T$ is (as before) the number of observations, and $N$ is the number of assets. Under an assumption of normality (the asset returns have the multivariate normal distribution), this test statistic has an F distribution.
An F distribution has two degrees of freedom parameters; the first is $N$, and the second is $T - N$. This is sometimes written $F_{N,T-N}$. Tables of the F distribution are widely available in statistics books and other references; many software packages can calculate them.
[Figure: F-statistic for Stock Return Example, 1,000,000 Trials]
When $T$ is very large, the assumption of multivariate normality is not particularly important. Recall that, for our application, the first degrees of freedom parameter is $N$, and the second is $T - N$. As $d_2$ approaches $+\infty$, $d_1$ times an $F_{d_1,d_2}$ variable approaches a chi-square distribution with $d_1$ degrees of freedom; since $d_2 = T - N$ approaches $+\infty$ as $T$ becomes very large, this gives the limiting distribution of the test statistic (scaled by $N$) for very large $T$. However, the test statistic approaches this limit, for very large $T$, even if the data are not multivariate normally distributed.
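The convergence can be illustrated by simulation. The sketch below assumes NumPy and estimates 95% cut-off values from random draws, so the numbers it prints are approximate; the number of draws, the seed, and the choices of $d_2$ are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 500_000
d1 = 3

# 95% cut-off of a chi-square(d1) variable divided by d1.
chi2_cutoff = float(np.quantile(rng.chisquare(d1, n_draws), 0.95)) / d1

# Simulated 95% cut-offs of F(d1, d2) for increasing d2.
f_cutoffs = {
    d2: float(np.quantile(rng.f(d1, d2, n_draws), 0.95))
    for d2 in (10, 120, 237, 100_000)
}

print("chi-square / d1:", round(chi2_cutoff, 3))
for d2, cutoff in f_cutoffs.items():
    print("F with d2 =", d2, ":", round(cutoff, 3))
```

As $d_2$ grows, the simulated F cut-offs fall towards the scaled chi-square cut-off, which is the tabulated value for infinitely many degrees of freedom in the second parameter.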
[Figure: Chi-square Distribution with Various Degrees of Freedom]
[Figure: F Distribution and Limiting Chi-square Distribution]
A test procedure is therefore:
1 Estimate the sample means, sample variances, and sample covariances of the asset returns from historical data.
2 Arrange the sample means into a vector, and the sample variances and covariances into a matrix.
3 Also arrange the hypothesized values of the mean returns into a vector.
4 Calculate the test statistic F.
5 Determine the p-value of this statistic, using tables from a book, software, or some other source.
6 If the p-value is small enough (e.g., smaller than 0.05 for a 95% confidence test), then reject the hypothesis that the model is correct.
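The steps above can be sketched as a single function. This is a minimal illustration rather than a definitive implementation: it assumes NumPy, assumes the sample covariance matrix uses the $T - 1$ divisor (NumPy's default, which the text does not specify), and estimates the p-value by simulating F draws instead of looking it up in a table. The function name `joint_mean_test` and the simulated data are hypothetical.

```python
import numpy as np


def joint_mean_test(returns, mu_0, n_sim=200_000, seed=0):
    """Joint F-test that the mean returns equal mu_0.

    returns: array of shape (T, N), one row per observation.
    mu_0: hypothesized mean returns, length N.
    Returns (f_stat, p_value); the p-value is a Monte Carlo estimate.
    """
    returns = np.asarray(returns, dtype=float)
    mu_0 = np.asarray(mu_0, dtype=float)
    T, N = returns.shape

    # Steps 1-3: sample means, sample covariance matrix, hypothesized means.
    mu_hat = returns.mean(axis=0)
    sigma_hat = np.cov(returns, rowvar=False)  # assumes the T - 1 divisor

    # Step 4: scaled quadratic form.
    deviation = mu_hat - mu_0
    quad = deviation @ np.linalg.solve(sigma_hat, deviation)
    f_stat = T * (T - N) / (N * (T - 1)) * quad

    # Step 5: share of simulated F(N, T - N) draws at least as large.
    rng = np.random.default_rng(seed)
    p_value = float(np.mean(rng.f(N, T - N, size=n_sim) >= f_stat))
    return float(f_stat), p_value


# Hypothetical data: 240 observations of 3 independent normal returns
# whose true means equal the hypothesized values.
rng = np.random.default_rng(42)
mu_0 = np.array([0.08, 0.10, 0.12])
sample = rng.normal(loc=mu_0, scale=[0.25, 0.40, 0.60], size=(240, 3))

f_stat, p_value = joint_mean_test(sample, mu_0)
reject = p_value < 0.05   # step 6
print(f_stat, p_value, reject)
```

In practice a statistics package would return the exact p-value from the F distribution; the simulation is used here only to keep the sketch dependent on NumPy alone.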
Numeric example: suppose the (estimated) covariance matrix for the three assets is:
$$\hat{\Sigma} = \begin{pmatrix} 0.0625 & -0.0200 & 0.0300 \\ -0.0200 & 0.1600 & 0.0240 \\ 0.0300 & 0.0240 & 0.3600 \end{pmatrix}$$
(Are these numbers consistent with the standard deviations reported earlier?)
Can we reject, with 95% confidence, the predictions of the model?
The test statistic is:
$$F = \frac{240 \, (240 - 3)}{3 \, (240 - 1)} \begin{pmatrix} 6\% - 8\% \\ 16\% - 10\% \\ 14\% - 12\% \end{pmatrix}^T \begin{pmatrix} 0.0625 & -0.0200 & 0.0300 \\ -0.0200 & 0.1600 & 0.0240 \\ 0.0300 & 0.0240 & 0.3600 \end{pmatrix}^{-1} \begin{pmatrix} 6\% - 8\% \\ 16\% - 10\% \\ 14\% - 12\% \end{pmatrix} \approx 2.066$$
The relevant F distribution has 3 and 237 degrees of freedom. Many tables for the F distribution do not actually show p-values for different values of the F statistic, but rather a single cut-off value for tests of each confidence level. From a table for 95% confidence tests, we find that the cut-off value for an F distribution with 3 and 120 degrees of freedom is 2.6802, and for 3 and infinitely many degrees of freedom, it is 2.6049. For 3 and 237 degrees of freedom, it must be somewhere in between.
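The arithmetic behind the statistic can be reproduced as follows. The sketch assumes NumPy and takes the two tabulated cut-off values quoted in the text as given; it does not compute the exact cut-off for 237 degrees of freedom.

```python
import numpy as np

T, N = 240, 3
mu_hat = np.array([0.06, 0.16, 0.14])
mu_0 = np.array([0.08, 0.10, 0.12])
sigma_hat = np.array([[0.0625, -0.0200, 0.0300],
                      [-0.0200, 0.1600, 0.0240],
                      [0.0300, 0.0240, 0.3600]])

deviation = mu_hat - mu_0
quad = deviation @ np.linalg.solve(sigma_hat, deviation)
scale = T * (T - N) / (N * (T - 1))
f_stat = float(scale * quad)

# 95% cut-off values quoted in the text for (3, infinity) and (3, 120)
# degrees of freedom; the (3, 237) cut-off lies between them.
cutoff_low, cutoff_high = 2.6049, 2.6802
reject = f_stat > cutoff_high          # above both bounds: reject
fail_to_reject = f_stat < cutoff_low   # below both bounds: cannot reject

print(round(f_stat, 3), reject, fail_to_reject)
```

Because the statistic falls below the lower of the two tabulated cut-offs, the decision does not depend on where exactly the cut-off for 237 degrees of freedom lies.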
If the F-statistic is above the cut-off value (between 2.6049 and 2.6802), then the p-value is below 0.05, and we can reject the hypothesis (correctness of the model) with 95% confidence. If the F-statistic is below the cut-off value, then the p-value is above 0.05, and we cannot reject the hypothesis. (Recall that this does not mean the hypothesis is true; it means we have not found sufficient evidence to conclude that the hypothesis is false.)
The F-statistic is 2.066, which is well below the cut-off value, so we cannot reject the hypothesis with 95% confidence. (We cannot reject it with 90% confidence either; the p-value is 0.1053.)
So despite the fact that a t-test rejects the hypothesis for one of the assets individually, a joint test based on an F-statistic fails to reject the hypothesis. We have not seen enough evidence to convince us, with 95% confidence, that the model is false.
It is worthwhile in a discussion of hypothesis testing to warn against the dangers of data mining.
In some disciplines, data mining is considered a good thing; one can even take a course to learn how to do it. In finance and economics, if someone tells you that you are data mining, that person is not paying you a compliment.
What is data mining? Recall that, even if an hypothesis is true, there is a certain probability of committing a Type I error (rejecting the hypothesis even when it is true). For example, suppose you believe that the level of the high tide has an effect on stock market returns. The reality is that your theory is wrong, and the tides have no effect on the stock market; however, you don't know this.
So, you gather some data on the tides and the stock market, and perform a statistical test of your hypothesis. Following common practice, you reject the hypothesis "the tides have no effect on the stock market" if the p-value of your statistical test is 0.05 or less. There is then a one in twenty chance that you will reject the hypothesis, and conclude that the tides do have an effect on the stock market (even though they don't).
Data mining refers to the practice of performing statistical test after statistical test, until finding one that rejects, and then reporting only the last test. This is a recipe for finding spurious results: chances are good that the result you report will be a Type I error, rather than a legitimate result.
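The arithmetic behind this warning is simple to sketch. The code below assumes NumPy, and it assumes that the tests are independent and that each p-value is uniformly distributed when the hypothesis is true (neither assumption is stated in the text); the function names are hypothetical.

```python
import numpy as np


def prob_false_rejection(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent tests of true
    hypotheses rejects at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_data_mining(n_tests, alpha=0.05, n_researchers=50_000, seed=0):
    """Share of simulated researchers who find at least one rejection.

    Under a true hypothesis a p-value is uniform on [0, 1], so each
    test is simulated by a single uniform draw.
    """
    rng = np.random.default_rng(seed)
    p_values = rng.uniform(size=(n_researchers, n_tests))
    return float(np.mean((p_values <= alpha).any(axis=1)))


for n_tests in (1, 5, 20, 50):
    print(n_tests, prob_false_rejection(n_tests), simulate_data_mining(n_tests))
```

With a single test the chance of a spurious rejection is the familiar one in twenty, but under these assumptions it rises to roughly 64% after twenty tests.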
The pressure to find results is enormous, both in academic and industry circles. Failure to find a result may mean no publication in academia, and no clients in industry. The incentives to engage in data mining are huge, and many engage in it, either fully aware of what they are doing, or having successfully deluded themselves into believing that what they are doing is legitimate.
A rule of thumb is the following: if you can't think of a reasonable economic story for the statistical result you have found, that should be a warning sign that the result is the product of data mining.
Testing the CAPM
Testing the CAPM Conditional Probabilities
We need to look at the relation between the returns of multiple securities; the notion of conditional probabilities is absolutely central to the analysis.
The probability of an event very likely depends on how much information one has. For example, it is much easier to forecast the value of a stock (or the weather, or an election) one day in advance than it is three years in advance. The reason is, over the past three years, a great deal has happened that affects the value of the stock (or the weather, or the outcome of the election). However, if you are making your forecast one day in advance, then you know almost everything that will affect the variable you are forecasting during the last three years; the only information you are missing pertains to the one remaining day. If you are making your forecast three years in advance, you are doing so with much less information.
Probabilities therefore depend on an information set; people with different information have different probabilities for the same event. In some contexts, the idea of the information set is left implicit; however, we will sometimes need to make it explicit.
We will often deal with the situation of two distinct information sets, with one being a strict subset of the other. Probabilities based on the more informative information set are then called conditional probabilities, and those based on the less informative information set are called unconditional or marginal probabili