Applied Statistics
Vincent JEANNIN - ESGF 4IFM
Q1 2012
vinzjeannin@hotmail.com
Summary of the session (est. 4.5h)
• Reminders of last session
• Multiple regression
• Introduction to econometrics
• Estimations
• Games: beat the statistics
Reminders of last session
3 Methods
• Historical
• Parametric
• Monte Carlo
Options: what to look at to calculate the VaR?
4 risk factors:
• Underlying price
• Interest rate
• Volatility
• Time
4 answers:
• Delta/Gamma approximation, knowing the distribution of the underlying
• Rho approximation, knowing the distribution of the underlying rate
• Vega approximation, knowing the distribution of implied volatility
• Theta (time decay)
Yes, but… do the underlying price, rate and volatility vary independently?
Might be a bit more complicated than expected…
Portfolio scale: what to look at to calculate the VaR?
Big question, is the VaR additive?
NO! Keywords for the future: covariance, correlation, diversification
$VAR(aX + bY) = a^2 VAR(X) + b^2 VAR(Y) + 2ab\,COV(X, Y)$
Parametric VaR on 2 assets?
$P(X \le -1.645 \cdot \sigma + \mu) = 0.05$
$P(X \le -2.326 \cdot \sigma + \mu) = 0.01$
          Mean   SD      Weight
Asset 1   0      2.34%   50%
Asset 2   0      1.50%   50%
Correlation: 0.59
What is the VaR (95%)?
2.83%
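A minimal R sketch of the two-asset calculation, assuming normal returns with zero means as on the slide. Variable names are illustrative and not taken from the course material.

```r
# Inputs from the slide: 50/50 weights, SDs of 2.34% and 1.50%, correlation 0.59
w     <- c(0.5, 0.5)
sigma <- c(0.0234, 0.0150)
rho   <- 0.59

# VAR(aX + bY) = a^2 VAR(X) + b^2 VAR(Y) + 2ab COV(X, Y)
cov_xy   <- rho * sigma[1] * sigma[2]
port_var <- w[1]^2 * sigma[1]^2 + w[2]^2 * sigma[2]^2 + 2 * w[1] * w[2] * cov_xy
port_sd  <- sqrt(port_var)

# 95% one-sided normal quantile (about 1.645) times the portfolio SD
var_95 <- qnorm(0.95) * port_sd

# Weighted sum of the standalone VaRs, i.e. ignoring diversification
var_sum <- qnorm(0.95) * sum(w * sigma)

round(100 * c(portfolio = var_95, undiversified = var_sum), 2)
```

The undiversified figure comes out larger than the portfolio VaR, which is one way to see that VaR is not additive when the correlation is below 1.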
Linear regression model
Minimise the sum of the squared vertical distances between the observations and the linear approximation
$y = f(x) = ax + b$
Residual $\varepsilon$
OLS: Ordinary Least Squares
Minimising residuals
$E = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(y_i - (a x_i + b)\right)^2$
$a = \frac{Cov_{xy}}{\sigma_x^2}$
$b = \bar{y} - a\bar{x}$
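As a quick check of the two OLS formulas, here is a small R sketch on simulated data (the data and names are made up for illustration). The slope and intercept computed by hand should match what `lm` returns.

```r
set.seed(1)
x <- rnorm(50)
y <- 2 * x + 1 + rnorm(50, sd = 0.5)   # simulated data, not from the course

a <- cov(x, y) / var(x)        # slope: covariance over variance of x
b <- mean(y) - a * mean(x)     # intercept: the line goes through the means

fit <- lm(y ~ x)               # built-in OLS fit
rbind(by_hand = c(intercept = b, slope = a),
      lm      = unname(coef(fit)))
```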
$r = \frac{Cov_{xy}}{\sigma_x \sigma_y}$ Value between -1 and 1
$R^2 = \frac{\text{Regression dispersion}}{\text{Total dispersion}}$
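A short R sketch, on simulated data, of the correlation coefficient and of R-Squared as a ratio of dispersions. In a simple regression with an intercept the two are linked by $R^2 = r^2$. All names here are illustrative.

```r
set.seed(2)
x <- rnorm(60)
y <- 1.5 * x + rnorm(60)               # simulated data, not from the course

r <- cov(x, y) / (sd(x) * sd(y))       # correlation coefficient

fit      <- lm(y ~ x)
reg_disp <- sum((fitted(fit) - mean(y))^2)   # dispersion explained by the regression
tot_disp <- sum((y - mean(y))^2)             # total dispersion
r2       <- reg_disp / tot_disp

c(r = r, r_squared = r2, r_times_r = r^2)
```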
Differentiation can happen before the OLS
What do you suggest?
Let's create a new variable: the natural logarithm, ln(·), of the original variable
Magic!
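A sketch of the idea in R, assuming the original relationship is exponential (an assumption made for this example, since the slide's data are not available). Taking the logarithm first makes the relationship linear, so OLS can then be applied. The names `y` and `ln_y` are hypothetical.

```r
set.seed(3)
x <- 1:50
y <- exp(1 + 0.3 * x + rnorm(50, sd = 0.1))   # simulated exponential growth

ln_y <- log(y)          # the new variable
fit  <- lm(ln_y ~ x)    # OLS on the transformed data
coef(fit)               # should be close to intercept 1 and slope 0.3
```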
New idea… No intercept
Only one parameter to estimate: the slope β (written a in the formulas)
Minimising residuals
$E = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a x_i)^2$
When is E minimal?
When the partial derivative with respect to a is 0
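$E = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - a x_i)^2$
Quick high school reminder if necessary…
$(y_i - a x_i)^2 = y_i^2 - 2 a x_i y_i + a^2 x_i^2$
$\frac{\partial E}{\partial a} = \sum_{i=1}^{n} \left(-2 x_i y_i + 2 a x_i^2\right) = 0$
$\sum_{i=1}^{n} \left(x_i y_i - a x_i^2\right) = 0$
$a \cdot \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$
$a = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$
$a = \frac{\overline{xy}}{\overline{x^2}}$ (dividing numerator and denominator by n)
Any better?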
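The closed-form slope of the no-intercept model can be checked in R against `lm` with the intercept removed. The data below are simulated for illustration only.

```r
set.seed(5)
x <- runif(40, 1, 10)
y <- 3 * x + rnorm(40)          # simulated data with a true slope of 3

a <- sum(x * y) / sum(x^2)      # slope of the regression through the origin

fit0 <- lm(y ~ 0 + x)           # "0 +" drops the intercept
c(by_hand = a, lm = unname(coef(fit0)))
```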
Multiple regressions
$y = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_n X_n + \varepsilon$
More than one explanatory variable
Choosing factors can be difficult
Much tougher without software
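With software the fit itself is one line. A minimal R sketch with two simulated explanatory variables (names and coefficients are invented for the example):

```r
set.seed(8)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.5)   # simulated data

fit <- lm(y ~ x1 + x2)      # multiple regression by OLS
coef(fit)                   # estimates of a0, a1, a2
summary(fit)$r.squared      # share of dispersion explained
```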
Explanatory variables must not depend on each other
Financial methods such as APT (Arbitrage Pricing Theory) try to use pure and independent factors
Used a lot in economics
R-Squared is very often very poor
Ratio Investment / GDP, World Bank, developing countries
$Y = 19.5 - 5.8\,Corruption + 6.3\,CorruptionPrediction + 2\,School - 1.1\,GDP - 2\,Distortion$
Let's discuss…
• Corruption: current corruption
• CorruptionPrediction: future corruption
• School: level of education
• GDP: GDP
• Distortion: how badly policies are run
Opposite effects of the two corruption variables
Is there any logic in this?
The current level of corruption decreases investment
The future level of corruption increases investment
Investors learn how to live with corruption…
R-Squared is 0.24, very poor…
How to find the right model?
• General to specific: start with a comprehensive model including all the likely explanatory variables, then simplify it.
• Specific to general: begin with a simple model that is easy to understand, then add explanatory variables to improve the model's explanatory power.
Golden rules
Be logical
Aim for the best R-Squared
Do not overcomplicate
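The last two rules pull in opposite directions: adding a variable never lowers R-Squared, even when the variable is pure noise. The R sketch below illustrates this on simulated data. Using the adjusted R-Squared as a guard is a common practice brought in for this example, not something stated on the slide.

```r
set.seed(13)
n         <- 100
x1        <- rnorm(n)
noise_var <- rnorm(n)                 # unrelated to y by construction
y         <- 1 + 2 * x1 + rnorm(n)

small <- lm(y ~ x1)
full  <- lm(y ~ x1 + noise_var)

r2  <- c(small = summary(small)$r.squared,     full = summary(full)$r.squared)
adj <- c(small = summary(small)$adj.r.squared, full = summary(full)$adj.r.squared)
rbind(r_squared = r2, adjusted = adj)   # adjusted R2 penalises the extra variable
```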
Introduction to econometrics
What is a model?
Observation = Model + ε, with ε being a white noise
3 steps
Identify
Fit
Forecast
3 components
Trend
Seasonality
Residual
Stationary series are easier to forecast… Transform it!
A series is stationary if the mean and the variance are stable
Which one is more likely to be stationary?
Properties of stationary series
$(X_1, X_2, X_3, \dots, X_n)$ and $(X_2, X_3, X_4, \dots, X_{n+1})$ have the same distribution
The distribution is not time dependent
Rare occurrence
Stationarity accepted if
$E(X_t) = \mu$: constant over time
$Cov(X_t, X_{t-n})$: depends only on n
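A small R illustration on simulated data: a random walk is not stationary, but its first difference is the underlying white noise. The lag-1 autocorrelation drops from close to 1 to close to 0 after differencing. The series and names are invented for the example.

```r
set.seed(42)
eps <- rnorm(5000)     # white noise
rw  <- cumsum(eps)     # random walk: X_t = X_(t-1) + eps_t
d   <- diff(rw)        # first difference: X_t - X_(t-1)

# Lag-1 autocorrelation before and after differencing
acf_rw <- acf(rw, lag.max = 1, plot = FALSE)$acf[2]
acf_d  <- acf(d,  lag.max = 1, plot = FALSE)$acf[2]
c(random_walk = acf_rw, differenced = acf_d)
```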
About the residuals…
They should be a white noise!
Normality: get a first idea from the skewness and the kurtosis
Proper tests: KS (normality), Durbin-Watson and Portmanteau (autocorrelation), …
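# TReg is assumed to be a fitted lm model from the earlier slides
eps <- resid(TReg)
# KS test against a normal with the residuals' own mean and SD
ks.test(eps, "pnorm", mean = mean(eps), sd = sd(eps))
# 2x2 grid for the four diagnostic plots of the model
layout(matrix(1:4, 2, 2))
plot(TReg)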
lag.plot(DATA$Val, 9, do.lines=FALSE)
Differencing the series seems worth trying
Check ACF/PACF for autocorrelation
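A hedged R sketch of what this check can look like, using a simulated AR(1) series since the course data are not available here. For such a series the ACF decays gradually while the PACF cuts off after lag 1.

```r
set.seed(7)
x <- arima.sim(model = list(ar = 0.7), n = 2000)   # simulated AR(1), illustrative only

acf_vals  <- acf(x,  lag.max = 10, plot = FALSE)$acf   # first entry is lag 0
pacf_vals <- pacf(x, lag.max = 10, plot = FALSE)$acf   # first entry is lag 1

round(acf_vals[2:4], 2)    # lags 1 to 3: gradual decay
round(pacf_vals[1:3], 2)   # lags 1 to 3: only lag 1 stands out
```

Dropping `plot = FALSE` draws the usual correlograms with their confidence bands.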
Auto Regressive model AR(n)
$X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_n X_{t-n} + \varepsilon_t$
$\varphi_i$: parameters of the model
$\varepsilon_t$: white noise
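To make the model concrete, the R sketch below simulates an AR(2) series with known parameters and fits it back with `arima`. The parameter values are arbitrary choices for the example.

```r
set.seed(11)
phi <- c(0.5, 0.3)                                 # true AR parameters (arbitrary)
x   <- arima.sim(model = list(ar = phi), n = 2000) # simulated AR(2) with zero mean

fit <- arima(x, order = c(2, 0, 0))   # AR(2): no differencing, no MA part
coef(fit)   # ar1 and ar2 estimate phi; "intercept" is the series mean, not c
```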
Estimations
Small sample: Binomial Distribution
$f(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}$
Large sample: Normal Distribution
$N\left(np, \sqrt{np(1-p)}\right)$
n is the size of the sample, x the number of individuals with the particular characteristic
$E(X) = np$
$V(X) = np(1-p)$
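An R sketch comparing the exact binomial distribution with its normal approximation, for an arbitrary example of n = 100 and p = 0.5. The half-unit continuity correction used in the approximation is an addition for this example and does not appear on the slide.

```r
n <- 100
p <- 0.5
x <- 0:n

pmf <- choose(n, x) * p^x * (1 - p)^(n - x)   # binomial probabilities f(x)

mean_x <- sum(x * pmf)                 # should equal n * p
var_x  <- sum((x - mean_x)^2 * pmf)    # should equal n * p * (1 - p)

exact  <- sum(pmf[x <= 55])                                    # P(X <= 55), exact
approx <- pnorm(55.5, mean = n * p, sd = sqrt(n * p * (1 - p)))  # normal approximation
c(mean = mean_x, variance = var_x, exact = exact, approx = approx)
```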
Binomial Distribution
Estimate a proportion: $F = \frac{X}{n}$
$E(F) = p$, $V(F) = \frac{p(1-p)}{n}$
Normal approximation
$F \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$
Standardisation possible
$F^* = \frac{F - p}{\sqrt{p(1-p)/n}}$, $F^* \sim N(0, 1)$
Normal approximation works only if
$np \ge 5$ and $n(1-p) \ge 5$
Let's look for p with a 95% confidence interval
$P(p_1 < p < p_2) = 0.95$
Easy to solve!
$P(F - 1.96 \cdot \sigma \le p \le F + 1.96 \cdot \sigma) = 0.95$
52 heads out of 100 tosses…
$F \sim N(?, ?)$
$F \sim N(0.52, 0.04996)$
95% confidence interval
$p_1 = 0.42$
$p_2 = 0.62$
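The coin example can be reproduced in a few lines of R. This sketch assumes the normal approximation applies and uses the observed frequency in place of the unknown p when computing the standard deviation.

```r
heads <- 52
n     <- 100

f     <- heads / n                  # observed frequency
sd_f  <- sqrt(f * (1 - f) / n)      # about 0.04996
z     <- qnorm(0.975)               # about 1.96 for a two-sided 95% interval

ci <- c(lower = f - z * sd_f, upper = f + z * sd_f)
round(ci, 2)
```

Since 0.5 lies inside the interval, 52 heads out of 100 gives no reason to doubt the coin.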
Mean estimation
Problem: the SD of the actual population is unknown
The estimated mean then follows a Student's t distribution
Similar to the normal distribution
Student's properties
• It is symmetric about its mean
• It has a mean of zero
• It has a standard deviation and variance greater than 1
• There are actually many t distributions, one for each degree of freedom
• As the sample size increases, the t distribution approaches the normal distribution
• It is bell shaped
• The t-scores can be negative or positive, but the probabilities are always positive
Normal-ish distribution in a discrete environment with a confidence interval
Student's Statistic
$S = \sigma \sqrt{\frac{n}{n-1}}$
$P\left(\bar{x} - \frac{S}{\sqrt{n}}\, t_{\alpha/2} < \mu < \bar{x} + \frac{S}{\sqrt{n}}\, t_{\alpha/2}\right) = 0.95$
Degrees of freedom: n-1
IPO Premiums
IPO1   12%
IPO2   15%
IPO3   13%
IPO4   18%
IPO5   20%
IPO6    5%
$\bar{x}$ = 13.83%
SD: $\sigma$ = 4.81%
S = 5.27%
DF = 5
t = 2.571
$\mu_1$ = 19.36%
$\mu_2$ = 8.30%
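The IPO figures can be reproduced with the following R sketch. The premiums come from the slide; the variable names are illustrative.

```r
premium <- c(12, 15, 13, 18, 20, 5)   # IPO premiums in percent
n       <- length(premium)

x_bar   <- mean(premium)                         # 13.83
sigma   <- sqrt(mean((premium - x_bar)^2))       # SD with divisor n: 4.81
s       <- sigma * sqrt(n / (n - 1))             # corrected SD: 5.27 (same as sd(premium))
t_crit  <- qt(0.975, df = n - 1)                 # 2.571 with 5 degrees of freedom

ci <- x_bar + c(lower = -1, upper = 1) * t_crit * s / sqrt(n)
round(ci, 2)
```

`t.test(premium)$conf.int` should return the same interval.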
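Is a frequency difference significant?
$F_1 \sim N\left(p_1, \sqrt{\frac{p_1(1-p_1)}{n_1}}\right)$, $F_2 \sim N\left(p_2, \sqrt{\frac{p_2(1-p_2)}{n_2}}\right)$
$d = F_1 - F_2$
$E(d) = E(F_1) - E(F_2)$
$V(d) = V(F_1) + V(F_2)$ (assumption of independence)
$d \sim N\left(p_1 - p_2, \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}\right)$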
Observations
• 100 friendly takeovers, 80 successes
• 60 hostile takeovers, 50 successes
Is the difference significant? 95% confidence
Friendly 80%
Hostile 83%
Global frequency
$\hat{p} = \frac{n_1 F_1 + n_2 F_2}{n_1 + n_2} = \frac{80 + 50}{100 + 60} = 81.25\%$
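$t^* = \frac{F_1 - F_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
$t^* = -0.52298$
If $-1.96 < t^* < 1.96$, the frequencies can be considered the same at the 95% confidence level
Here $t^*$ lies inside that interval: the frequencies are considered equal
Their difference is not significant
The actual difference is due to the fluctuation of samples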
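A short R sketch of the takeover comparison, using the counts from the slides. Variable names are illustrative.

```r
n1 <- 100; s1 <- 80    # friendly takeovers and successes
n2 <- 60;  s2 <- 50    # hostile takeovers and successes

f1    <- s1 / n1
f2    <- s2 / n2
p_hat <- (s1 + s2) / (n1 + n2)          # global frequency: 0.8125

t_star <- (f1 - f2) / sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))

significant <- abs(t_star) > qnorm(0.975)   # compare with 1.96
c(t_star = t_star, significant = significant)
```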
Is a SD difference significant?
Fisher Snedecor distribution
$S_x^2$, $S_y^2$: sample variances
$\sigma_x^2$, $\sigma_y^2$: total variances
$\frac{S_x^2}{S_y^2} \cdot \frac{\sigma_y^2}{\sigma_x^2} \sim F(n_x - 1, n_y - 1)$
You want to test $\sigma_x^2 = \sigma_y^2$
$\frac{S_x^2}{S_y^2} \sim F(n_x - 1, n_y - 1)$
$\frac{S_x^2}{S_y^2} \sim F(5, 4)$
95% confidence interval, F-Table
If the SDs are equal, $\frac{S_x^2}{S_y^2} < 6.26$ (at 95% CI)
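The F-table value can be obtained directly in R. In the sketch below the two sample SDs are hypothetical numbers chosen for illustration (samples of 6 and 5 observations); only the critical value comes from the slides.

```r
crit <- qf(0.95, df1 = 5, df2 = 4)   # F-table value for F(5, 4): about 6.26

# Hypothetical sample SDs, not from the course
s_x <- 0.021
s_y <- 0.015
ratio <- s_x^2 / s_y^2               # 1.96

c(critical = crit, ratio = ratio, equal_sd_plausible = ratio < crit)
```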
Games: Beat the Statistics
Is Martingale safe?
Bet on 2:1, double when you lose…
Risk of ruin?
Bet on 2:1
Is this really 2:1? $\frac{18}{37} = 0.4865$
Obvious how the casino is making money!
The probability of the casino winning is always bigger than the probability of the player winning!
You'll be right with a martingale… Eventually! But when?
The longest series recorded as of 2011 is 26 reds in Las Vegas, Nevada
You were on black, hoping for the reversal, and you began with $2
At the 27th round you need
2^27 = $134,217,728
And don't forget you have already lost
2^1 + 2^2 + … + 2^26 = $134,217,726
Casinos limit stakes
Your pocket may not be deep enough anyway!
And if you win at the 27th roll, you made…
$2
Quite risky…
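The arithmetic of the doubling sequence, as a small R sketch (names are illustrative):

```r
first_stake <- 2
losses      <- 26                                    # 26 reds in a row

stakes_lost <- first_stake * 2^(0:(losses - 1))      # 2, 4, ..., 2^26
next_stake  <- first_stake * 2^losses                # stake needed at round 27: 2^27
total_lost  <- sum(stakes_lost)                      # everything lost so far
net_gain    <- next_stake - total_lost               # profit if round 27 wins

format(c(next_stake = next_stake, total_lost = total_lost, net_gain = net_gain),
       big.mark = ",")
```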
"No one can possibly win at roulette unless he steals money from the table while the croupier isn't looking." - Albert Einstein
Binomial approach
$P(x) = C_n^x\, p^x (1-p)^{n-x}$
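As an illustration of the binomial formula, the R sketch below computes the probability of 26 reds in 26 spins, assuming a single-zero wheel with 18 red pockets out of 37. Applying the formula to this particular streak is a choice made for the example.

```r
p <- 18 / 37     # probability of red on one spin
n <- 26
x <- 26          # number of reds

manual  <- choose(n, x) * p^x * (1 - p)^(n - x)   # binomial formula
builtin <- dbinom(x, size = n, prob = p)          # same thing with the built-in

c(manual = manual, builtin = builtin)   # a tiny probability, yet not zero
```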
$255, $1 flat bet
$255, $1 start, martingale: double when you lose
Ruin takes at least 255 losing bets for the flat bet
Ruin takes only 8 losing bets in a row for the martingale (1 + 2 + 4 + … + 128 = 255)
Comparison over 1,000,000 simulations, 100 rounds maximum
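A rough R sketch of such a comparison. It assumes an even-money bet won with probability 18/37, treats "ruin" as being unable to cover the next stake, and runs 10,000 sessions instead of 1,000,000 to keep it quick. The function name and its arguments are made up for this example.

```r
# Returns TRUE if the player cannot cover the next stake within `rounds` bets
simulate_session <- function(bankroll = 255, rounds = 100,
                             p_win = 18 / 37, martingale = FALSE) {
  stake <- 1
  for (i in seq_len(rounds)) {
    if (stake > bankroll) return(TRUE)       # ruin: next bet cannot be placed
    if (runif(1) < p_win) {
      bankroll <- bankroll + stake           # win: collect and reset the stake
      stake <- 1
    } else {
      bankroll <- bankroll - stake           # loss
      if (martingale) stake <- 2 * stake     # martingale doubles after a loss
    }
  }
  FALSE
}

set.seed(2012)
n_sim <- 10000
ruin_flat <- mean(replicate(n_sim, simulate_session(martingale = FALSE)))
ruin_mart <- mean(replicate(n_sim, simulate_session(martingale = TRUE)))
c(flat = ruin_flat, martingale = ruin_mart)   # share of ruined sessions
```

With a $1 flat bet the player cannot lose $255 in 100 rounds, so the flat ruin rate is zero by construction, while the martingale is exposed to any run of 8 losses.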
Conclusion
Multiple Regression
Econometrics
Estimations
Statistics & Games