43
Quantile Regression

Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Embed Size (px)

Citation preview

Page 1: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Page 2: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• The Problem• The Estimator• Computation• Properties of the Regression• Properties of the Estimator• Hypothesis Testing• Bibliography• Software

Page 3: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

The Problem

Page 4: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Problem– The distribution of Y, the “dependent” variable,

conditional on the covariate X, may have thick tails.– The conditional distribution of Y may be asymmetric.– The conditional distribution of Y may not be

unimodal.

Neither regression nor ANOVA will give us robust results. Outliers are problematic, the mean is pulled toward the skewed tail, multiple modes will not be revealed.

Page 5: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Problem– ANOVA and regression provide information

only about the conditional mean.– More knowledge about the distribution of the

statistic may be important.– The covariates may shift not only the location

or scale of the distribution, they may affect the shape as well.

Page 6: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

0

5

10

15

20

25

Five Treatments, 100 Patients

1 2 3 4 5

Raw Data

Treatment

Page 7: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

1 2 3 4 5

Treatment

0

4

8

12

16

Sco

re

Five Treatments, 100 PatientsMeans with Error Bars

Page 8: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Quantiles

0

4

8

12

16

20

0 1 2 3 4 5 6

Treatment

Sco

re

10th %tile

25th %tile

50th %tile

75th %tile

90th %tile

Mean

Page 9: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

The Estimator

Page 10: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Ordinarily we specify a quadratic loss function. That is, L(u) = u2

• Under quadratic loss we use the conditional mean, via regression or ANOVA, as our predictor of Y for a given X=x.

Page 11: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Definition: Given p ∈ [0, 1]. A pth quantile of a random variable Z is any number ζp such that Pr(Z< ζ p ) ≤ p ≤ Pr(Z ≤ ζ p ). The solution always exists, but need not be unique.Ex: Suppose Z={3, 4, 7, 9, 9, 11, 17, 21} and p=0.5 then Pr(Z<9) = 3/8 ≤ 1/2 ≤ Pr(Z ≤ 9) = 5/8

Page 12: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Quantiles can be used to characterize a distribution:– Median– Interquartile Range– Interdecile Range

– Symmetry = (ζ.75- ζ.5)/(ζ.5- ζ.25)

– Tail Weight = (ζ.90- ζ.10)/(ζ.75- ζ.25)

Page 13: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Suppose Z is a continuous r.v. with cdf F(.), then Pr(Z<z) = Pr(Z≤z)=F(z) for every z in the support and a pth quantile is any number ζ p such that F(ζ p) = p

• If F is continuous and strictly increasing then the inverse exists and ζ p =F-1(p)

Page 14: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression• Definition: The asymmetric absolute loss function is

• Where u is the prediction error we have made and I(u) is an indicator function of the sort u)0u(Ip

u)0u(I)p1()0u(IpLp

0uif0

0uif1

)0u(I

Page 15: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionAbsolute Loss vs. Quadratic Loss

0

0.5

1

1.5

2

2.5

3

-2 -1 0 1 2

Quad

p=.5

p=.7

Page 16: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Proposition: Under the asymmetric absolute loss function Lp a best predictor of Y given X=x is a pth conditional quantile. For example, if p=.5 then the best predictor is the median.

)x(p

Page 17: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression• Definition: A parametric quantile regression model is correctly specified if, for example,

That is, is a particular linear combination of the independent variable(s) such that

where F( ) is some univariate distribution.

x),x(q)x(p

)x)x((F)xYPr(

)x|)x((F)xX)x(YPr(p

p

pp

x

Page 18: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

.25

x)x(25.

Page 19: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• Definition: A quantile regression model is identifiable if

has a unique solution.

)xY(LEmin pF,

Page 20: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• A family of conditional quantiles of Y given X=x.– Let Y=α + βx + u with α = β = 1

x

Y

90%

75%

50%

25%

10%

Page 21: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Daily High Temperature

0

5

10

15

20

25

30

35

40

45

50

0 10 20 30 40 50

Yesterday

To

day

Page 22: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

5 10 15 20

20

40

60

80Cool Yesterday (n=259)

Temperature Today

Freq

uenc

y75

1

X 1

18.47.6 X 0

Page 23: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

15 20 25 30 35 40 45

20

40

60

80Hot Yesterday (n=259)

Temperature Today

Freq

uenc

y61

6

X 1

42.5514 X 0

Page 24: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression• Quantiles at .9, .75, .5, .25, and .10. Today’s temperature fitted to a quartic on yesterday’s temperature.

Temperature Quantiles

0

10

20

30

40

50

60

5 15 25 35 45

Yesterday

To

da

y

Page 25: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Computation of the Estimate

Page 26: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionEstimation

• The quantile regression coefficients are the solution to

• The k first order conditions are

)1(xyxysgnpn

1min

n

1i

'ii

'ii2

121

)2(0xˆxysgn2

1

2

1p

n

1 n

1iip

'ii

Page 27: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression Estimation

• The fitted line will go through k data points.• The # of negative residuals ≤ np ≤ # of neg

residuals + # of zero residuals• The computational algorithm is to set up the

objective function as a linear programming problem

• The solution to (1) – (2) [previous slide] need not be unique.

Page 28: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Properties of the Regression

Page 29: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionProperties of the regression

• Transformation equivarianceFor any monotone function, h(.),

since P(T<t|x) = P(h(T)<h(t)|x). This is especially important where the response variable has been censored, I.e. top coded.

))x|(Q(h)x|(Q T)T(h

Page 30: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Properties

• The mean does not have transformation equivariance since Eh(Y) ≠ h(E(Y))

Page 31: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionEquivariance

• Practical implications

gularsinnonAx,y,ˆAxA,y,ˆ

x,y,ˆx,xy,ˆ

0x,y,ˆx,y,1ˆ

0x,y,ˆx,y,ˆ

1

k

Page 32: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Equivariance

• (i) and (ii) imply scale equivariance

• (iii) is a shift or regression equivariance

• (iv) is equivariance to reparameterization of design

Page 33: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionProperties

• Robust to outliers. As long as the sign of the residual does not change, any Yi may be changed without shifting the conditional quantile line.

• The regression quantiles are correlated.

Page 34: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Properties of the Estimator

Page 35: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionProperties of the Estimator

• Asymptotic Distribution

• The covariance depends on the unknown f(.) and the value of the vector x at which the covariance is being evaluated.

1iiiii

1iii

L

xx)x|0(fExxExx)x|0(fE)1(

where

,0Nˆn

Page 36: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionProperties of the Estimator

• When the error is independent of x then the coefficient covariance reduces to

where

1

2u

xxE0f

1

n

1i ii xxn

1)xx(E

Page 37: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionProperties of the Estimator

• In general the quantile regression estimator is more efficient than OLS

• The efficient estimator requires knowledge of the true error distribution.

Page 38: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionCoefficient Interpretation

• The marginal change in the Θth conditional quantile due to a marginal change in the jth element of x. There is no guarantee that the ith person will remain in the same quantile after her x is changed.

ij

ii

x

x|yQ

Page 39: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

Hypothesis Testing

Page 40: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionHypothesis Testing

• Given asymptotic normality, one can construct asymptotic t-statistics for the coefficients

• The error term may be heteroscedastic. The test statistic is, in construction, similar to the Wald Test.

• A test for symmetry, also resembling a Wald Test, can be built relying on the invariance properties referred to above.

Page 41: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Heteroscedasticity

• Model: yi = βo+β1xi+ui , with iid errors.

– The quantiles are a vertical shift of one another.

• Model: yi = βo+β1xi+σ(xi)ui , errors are now heteroscedastic.– The quantiles now exhibit a location shift as

well as a scale shift.

• Khmaladze-Koenker Test Statistic

Page 42: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile RegressionBibliography

• Koenker and Hullock (2001), “Quantile Regression,” Journal of Economic Perspectives, Vol. 15, Pps. 143-156.

• Buchinsky (1998), “Recent Advances in Quantile Regression Models”, Journal of Human Resources, Vo. 33, Pps. 88-126.

Page 43: Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression

• S+ Programs - Lib.stat.cmu.edu/s

• www.econ.uiuc.edu/~roger

• http://Lib.stat.cmu.edu/R/CRAN

• TSP

• Limdep