Finite Mixtures of Quantile and M-quantile regression models

Marco Alfò (1), Nicola Salvati (2), M.G. Ranalli (3)

(1) Sapienza Università di Roma, (2) Università di Pisa, (3) Università di Perugia

Workshop on “Recent Advances in Quantile and M-quantile Regression”

Università di Pisa, July 15th, 2016


Essential References

Alfò, M., Salvati, N., Ranalli, M.G. (2016). Finite mixtures of quantile and M-quantile regression models. Statistics and Computing.

Tzavidis, N., Salvati, N., Schmid, T., Flouri, E., Midouhas, E. (2016). Longitudinal analysis of the Strengths and Difficulties Questionnaire scores of the Millennium Cohort Study children in England using M-quantile random-effects regression. Journal of the Royal Statistical Society: Series A.


The presentation at a glance

Data are seldom i.i.d. and without outliers!

Dependent Observations (multilevel, longitudinal, panel data)

Quantile and M-quantile regression models

Introducing Finite Mixtures (nonparametric distribution for the random effects)

Maximum Likelihood Estimation

Multivariate extension


Outline

1 Introduction on Finite Mixtures
    Dependent Observations
    Finite mixtures of regression models

2 Finite Mixtures for Quantile and M-Quantile regression models
    Likelihood Inference (focus on MQ)

3 Applications
    Pain Labor Data & Treatment of lead-exposed children
    The Millennium Cohort Study (Joint work with M.F. Marino & N. Tzavidis)

4 Conclusions


Hierarchically structured data

Regression model for multilevel data

$$E(y_{ij} \mid x_{ij}, b_i) = x_{ij}'\beta + w_{ij}' b_i, \quad i = 1, \dots, n, \; j = 1, \dots, r_i$$

$y_{ij}$: observed response variable

$x_{ij} = (x_{ij1}, \dots, x_{ijp})'$: vector of explanatory variables, with $x_{ij1} \equiv 1$

Linear Models (for ease of notation) → GLMs


$w_{ij}$ is a subset of $x_{ij}$ containing the $p_1 \le p$ variables whose effects are assumed to be individual-specific

the effects $b_i$, $i = 1, \dots, n$, vary across individuals according to a distribution $h(\cdot)$


Likelihood

Local independence assumption

$$L(\Phi) = \prod_{i=1}^{n} \left\{ \int_{\mathcal{B}} \prod_{j=1}^{r_i} f(y_{ij} \mid x_{ij}, b_i) \, dH(b_i) \right\}$$

$\Phi$: global set of parameters

$f(\cdot)$: the Gaussian density

$H(\cdot)$: the random coefficient cdf, with $\mathcal{B}$ the corresponding support

In the general case, the integral defining the likelihood cannot be computed analytically (GQ, aGQ, MCML, composite likelihood, etc.)


Nonparametric distribution for the random coefficients

Leave h(·) unspecified

Approximate $h(\cdot)$ by a discrete distribution on $G < n$ locations $\{b_1, \dots, b_G\}$, with associated probabilities $\pi_k = \Pr(b_i = b_k)$, $i = 1, \dots, n$ and $k = 1, \dots, G$:

$$b_i \sim \sum_{k=1}^{G} \pi_k \delta_{b_k}$$

where $\delta_\theta$ is a one-point distribution putting a unit mass at $\theta$.


Comparing the Likelihoods

Nonparametric distribution for the random effects

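$$L(\Phi) = \prod_{i=1}^{n} \left\{ \sum_{k=1}^{G} \prod_{j} f(y_{ij} \mid x_{ij}, b_k) \, \pi_k \right\} =: \prod_{i=1}^{n} \left\{ \sum_{k=1}^{G} \prod_{j} f_{ijk} \, \pi_k \right\}$$

Parametric distribution for the random effects

$$L(\Phi) = \prod_{i=1}^{n} \left\{ \int_{\mathcal{B}} \prod_{j=1}^{r_i} f(y_{ij} \mid x_{ij}, b_i) \, dH(b_i) \right\}$$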


$\Phi = \{\beta, b_1, \dots, b_G, \pi_1, \dots, \pi_G\}$

$f_{ijk}$ is the distribution of the response variable for the j-th measurement in the i-th cluster when the k-th component of the finite mixture, $k = 1, \dots, G$, is considered

The likelihood resembles that of a finite mixture of Gaussian distributions
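As an illustration of the finite mixture likelihood, the following sketch evaluates log L(Φ) for a Gaussian random-intercept model. The single covariate, the function names and the toy data are assumptions made for the example, not part of the talk.

```python
import math

def normal_logpdf(y, mean, sigma):
    """Log-density of N(mean, sigma^2) evaluated at y."""
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_loglik(y, x, beta, locations, probs, sigma):
    """log L(Phi) = sum_i log sum_k pi_k prod_j f(y_ij | x_ij, b_k).

    y, x      : per-cluster lists, y[i][j] and x[i][j]
    beta      : common slope (hypothetical single covariate)
    locations : the G random-intercept locations b_1, ..., b_G
    probs     : the G masses pi_1, ..., pi_G (summing to one)
    """
    total = 0.0
    for yi, xi in zip(y, x):
        per_component = []
        for b_k, pi_k in zip(locations, probs):
            log_f = sum(normal_logpdf(yij, b_k + beta * xij, sigma)
                        for yij, xij in zip(yi, xi))
            per_component.append(math.log(pi_k) + log_f)
        total += logsumexp(per_component)
    return total

# Toy data: two clusters with three measurements each (made up).
y = [[1.0, 1.5, 0.8], [4.1, 3.9, 4.4]]
x = [[0.0, 1.0, -1.0], [0.0, 1.0, -1.0]]
value = mixture_loglik(y, x, beta=0.2, locations=[1.0, 4.0],
                       probs=[0.5, 0.5], sigma=0.5)
```

The product over j is accumulated on the log scale before the sum over components, which avoids underflow when clusters are large.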


Regression model

semi-parametric approximation to a fully parametric, possibly continuous, distribution for the random coefficients

a model-based clustering approach, where the population of interest is assumed to be divided into G homogeneous sub-populations that differ in the values of the regression parameters

Considering the k-th component of the mixture,

$$E(y_{ij} \mid x_{ij}, b_k) = x_{ij}'\beta + w_{ij}' b_k.$$


Estimation of model parameters (1)

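The score function can be written as the posterior expectation of the score function corresponding to a standard LM:

$$S(\Phi) = \frac{\partial \log L(\Phi)}{\partial \Phi} = \sum_{i=1}^{n} \sum_{k=1}^{G} \tau_{ik} \sum_{j} \frac{\partial \log f_{ijk}}{\partial \Phi},$$

where the weights

$$\tau_{ik} = \frac{\prod_{j} f_{ijk} \, \pi_k}{\sum_{l} \prod_{j} f_{ijl} \, \pi_l}$$

represent the posterior probabilities of component membership.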
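The posterior weights can be computed from the per-cluster log-densities. This is a minimal sketch; the function name and the input layout (one row of summed log f_ijk per cluster) are assumptions for illustration.

```python
import math

def posterior_weights(log_f, probs):
    """tau_ik = pi_k * prod_j f_ijk / sum_l pi_l * prod_j f_ijl.

    log_f[i][k] : sum over j of log f_ijk for cluster i and component k
    probs[k]    : prior mass pi_k
    """
    tau = []
    for row in log_f:
        a = [math.log(p) + lf for p, lf in zip(probs, row)]
        m = max(a)                        # subtract the maximum before exp
        w = [math.exp(v - m) for v in a]
        s = sum(w)
        tau.append([v / s for v in w])
    return tau

# Hypothetical log-densities for two clusters and two components.
tau = posterior_weights([[-3.0, -3.0], [-2000.0, -2001.0]], [0.4, 0.6])
```

Working with logarithms keeps the weights well defined even when the products over j are far too small to be represented directly, as in the second row.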


Estimation of model parameters (2)

The likelihood equations are essentially weighted sums of the likelihood equations for a standard LM, with weights $\tau_{ik}$.

The basic EM algorithm alternates between solving these equations for a given set of weights and updating the weights as a function of the current parameter estimates.
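The alternation described above can be sketched for a Gaussian random-intercept model with one covariate. This is not the authors' implementation: the model, the starting values and the function name are assumptions, and the M-step updates one block of parameters at a time (an ECM variant), each block being a weighted LM solution.

```python
import math

def ecm_random_intercept(y, x, G, n_iter=200, tol=1e-8):
    """Fit y_ij = b_k + beta * x_ij + e_ij, e_ij ~ N(0, sigma^2), where the
    intercept of cluster i equals b_k with probability pi_k."""
    n = len(y)
    n_obs = sum(len(yi) for yi in y)
    sum_x2 = sum(xij * xij for xi in x for xij in xi)  # assumed non-zero
    # Hypothetical starting values: locations spread over the cluster means.
    cluster_means = sorted(sum(yi) / len(yi) for yi in y)
    b = [cluster_means[(2 * k + 1) * n // (2 * G)] for k in range(G)]
    pi, beta, sigma2 = [1.0 / G] * G, 0.0, 1.0
    history = []
    for _ in range(n_iter):
        # E-step: posterior weights tau_ik, computed on the log scale.
        tau, loglik = [], 0.0
        for yi, xi in zip(y, x):
            a = []
            for k in range(G):
                ss = sum((yij - b[k] - beta * xij) ** 2
                         for yij, xij in zip(yi, xi))
                a.append(math.log(max(pi[k], 1e-300)) - 0.5 * ss / sigma2
                         - 0.5 * len(yi) * math.log(2.0 * math.pi * sigma2))
            m = max(a)
            w = [math.exp(v - m) for v in a]
            s = sum(w)
            tau.append([v / s for v in w])
            loglik += m + math.log(s)
        history.append(loglik)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
        # M-step: weighted LM updates, one block of parameters at a time.
        pi = [sum(t[k] for t in tau) / n for k in range(G)]
        for k in range(G):
            num = sum(t[k] * sum(yij - beta * xij for yij, xij in zip(yi, xi))
                      for t, yi, xi in zip(tau, y, x))
            den = sum(t[k] * len(yi) for t, yi in zip(tau, y))
            if den > 0.0:
                b[k] = num / den
        beta = sum(t[k] * xij * (yij - b[k])
                   for t, yi, xi in zip(tau, y, x)
                   for k in range(G)
                   for yij, xij in zip(yi, xi)) / sum_x2
        sigma2 = sum(t[k] * (yij - b[k] - beta * xij) ** 2
                     for t, yi, xi in zip(tau, y, x)
                     for k in range(G)
                     for yij, xij in zip(yi, xi)) / n_obs
    return {"pi": pi, "b": b, "beta": beta, "sigma": math.sqrt(sigma2),
            "loglik": history}
```

The returned `loglik` list holds the observed-data log-likelihood at each iteration; it should never decrease, which is a convenient check when adapting the sketch.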


Quantile and M-Quantile regression models for dependent observations

Linear Quantile Random Effect models (Geraci & Bottai, 2007, 2014; Liu & Bottai, 2009)

$$Q_q(y_{ij} \mid x_{ij}, b_{i,q}) = x_{ij}'\beta_q + w_{ij}' b_{i,q}$$

Linear M-Quantile Random Effect models (Tzavidis et al., 2016)

$$MQ_q(y_{ij} \mid x_{ij}, b_{i,q}, \psi, c) = x_{ij}'\beta_q + w_{ij}' b_{i,q}$$

Note that both fixed and random coefficients vary with $q \in (0, 1)$

Random effects are normally distributed


Finite Mixtures of Q and MQ regression models

Approximate the distribution of the random coefficients through a discrete distribution defined on a finite, G-dimensional, set of locations. Then, conditional on k,

$$Q_q(y_{ij} \mid x_{ij}, b_{k,q}) = x_{ij}'\beta_q + w_{ij}' b_{k,q}$$

$$MQ_q(y_{ij} \mid x_{ij}, b_{k,q}, \psi, c) = x_{ij}'\beta_q + w_{ij}' b_{k,q}$$

for $k = 1, \dots, G$.

Each component of the mixture is characterised by a different (sub-)vector of regression coefficients, $b_{k,q}$, $k = 1, \dots, G$

Note that the distribution of bk,q may vary with quantiles


Estimation of model parameters (focus on MQ)

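$$L(\Phi_q) = \prod_{i=1}^{n} \left\{ \sum_{k=1}^{G} \prod_{j} f_q(y_{ij} \mid x_{ij}, b_{k,q}) \, \pi_{k,q} \right\}.$$

$\Phi_q = \{\beta_q, b_{1,q}, \dots, b_{G,q}, \sigma_q, \pi_{1,q}, \dots, \pi_{G,q}\}$

$f_q(\cdot)$ is the ALID (Asymmetric Least Informative Density, Bianchi et al., 2015):

$$f_q(\cdot) = \frac{1}{B_q(\sigma_q, c)} \exp\{-\rho_q(\cdot)\}$$

$B_q(\sigma_q, c)$ is a normalising constant that ensures the density integrates to one

$\rho_q(\cdot)$ is the Huber loss function.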
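To make the density concrete, the sketch below assumes the q-tilted Huber loss $\rho_q(u) = 2\,|q - I(u < 0)|\,\rho_c(u)$ applied to the standardised residual $u = (y - \mu)/\sigma$, with $\rho_c$ the usual Huber function. The slides do not spell out this form, so it is an assumption, as are the default c = 1.345 and the function names. The normalising constant follows by integrating exp{-ρ_q(u)} piecewise under that assumption.

```python
import math

def huber_tilted_loss(u, q, c=1.345):
    """Assumed form: rho_q(u) = 2 * |q - I(u < 0)| * Huber_c(u)."""
    w = q if u >= 0 else 1.0 - q
    a = abs(u)
    huber = 0.5 * u * u if a <= c else c * a - 0.5 * c * c
    return 2.0 * w * huber

def std_normal_cdf(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def alid_norm_const(sigma, q, c=1.345):
    """B_q(sigma, c) = sigma * integral of exp(-rho_q(u)) over the real line."""
    total = 0.0
    for w in (q, 1.0 - q):  # right side uses weight q, left side 1 - q
        central = math.sqrt(math.pi / w) * (std_normal_cdf(c * math.sqrt(2.0 * w)) - 0.5)
        tail = math.exp(-w * c * c) / (2.0 * c * w)
        total += central + tail
    return sigma * total

def alid_logpdf(y, mu, sigma, q, c=1.345):
    """Log of f_q(y) = exp(-rho_q((y - mu) / sigma)) / B_q(sigma, c)."""
    u = (y - mu) / sigma
    return -huber_tilted_loss(u, q, c) - math.log(alid_norm_const(sigma, q, c))
```

With q = 0.5 and a very large c the loss reduces to u²/2, so the density collapses to the Gaussian one and the constant to σ√(2π), which gives a quick sanity check.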


Missing data approach

$$z_{ik,q} = \begin{cases} 1 & \text{if unit } i \text{ is in component } k \text{ of the mixture} \\ 0 & \text{otherwise} \end{cases}$$

$P(z_{ik,q} = 1) = \pi_{k,q} = P(b_{i,q} = b_{k,q})$

$z_{i,q} = (z_{i1,q}, \dots, z_{iG,q})'$, $i = 1, \dots, n$, are considered as missing data

Complete data log-likelihood

Had we observed $(y_i, z_{i,q})$ for each i, the log-likelihood for the complete data would have been:

$$\ell_c(\Phi_q) = \sum_{i=1}^{n} \sum_{k=1}^{G} z_{ik,q} \left\{ \log\left[f_q(y_i \mid \beta_q, b_{k,q}, \sigma_q)\right] + \log(\pi_{k,q}) \right\}$$


Maximum Likelihood via the EM algorithm – E-step

Expected value of $\ell_c(\Phi_q)$ over $z_{i,q}$, conditional on the observed data and the current parameter estimates:

$$Q(\Phi_q \mid \Phi_q^{(t)}) = E_{\Phi_q^{(t)}}\left[\ell_c(\Phi_q) \mid y_i\right] = \sum_{i=1}^{n} \sum_{k=1}^{G} \tau_{ik,q}^{(t+1)} \left\{ \log\left[f_q(y_i \mid \beta_q, b_{k,q}, \sigma_q)\right] + \log(\pi_{k,q}) \right\}.$$

That is, the unobservable indicators are replaced by their conditional expectations, which at iteration (t + 1) are given by

$$\tau_{ik,q}^{(t+1)} = \frac{\pi_{k,q}^{(t)} f_{ik,q}(\Phi_q^{(t)})}{\sum_{l} \pi_{l,q}^{(t)} f_{il,q}(\Phi_q^{(t)})}, \quad i = 1, \dots, n, \; k = 1, \dots, G.$$


Maximum Likelihood via the EM algorithm – M-step

Maximise the function $Q(\cdot)$ w.r.t. $\Phi_q$ to update the parameter estimates. The updates $\Phi_q^{(t+1)}$ are then defined as the solutions to the following score equations:

$$\frac{\partial Q(\Phi_q \mid \Phi_q^{(t)})}{\partial \Phi_q} = 0,$$

which are equivalent to the score equations for the observed data, $S(\Phi_q) = 0$.


Standard Errors

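Oakes' (1999) identity

$$I(\hat\Phi_q) = -\left\{ \underbrace{\left.\frac{\partial^2 Q(\Phi_q \mid \hat\Phi_q)}{\partial \Phi_q \, \partial \Phi_q'}\right|_{\Phi_q = \hat\Phi_q}}_{A} + \underbrace{\left.\frac{\partial^2 Q(\Phi_q \mid \hat\Phi_q)}{\partial \Phi_q \, \partial \hat\Phi_q'}\right|_{\Phi_q = \hat\Phi_q}}_{B} \right\}$$

A: conditional expectation of the complete-data Hessian given the observed data (EM)

B: first derivative of the conditional expectation of the complete-data score given the observed data (numDeriv in R)

Sandwich: $\mathrm{Cov}(\hat\Phi_q) = I(\hat\Phi_q)^{-1} V(\hat\Phi_q) I(\hat\Phi_q)^{-1}$, where $V(\hat\Phi_q) = \sum_{i=1}^{n} S_i(\hat\Phi_q) S_i(\hat\Phi_q)'$.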


Classical Datasets

Applications

Univariate response (Alfò, Salvati, Ranalli, Stat. Comp., 2016)

Pain Labor Data

Treatment of lead-exposed children

Multivariate response (Joint work with M.F. Marino & N. Tzavidis)

The Millennium Cohort Study


The Millennium Cohort Study

Longitudinal study on children’s emotional/behavioural problems measured via the Strengths and Difficulties Questionnaire (SDQ)

n = 9021 children born in the UK between Sept. 2000 and Sept. 2001

First information collected when children were around 9 months old. Waves 2, 3, 4 took place around ages 2, 5, and 7


Outcome variables

internalising SDQ, i-SDQ (emotional problems): total score on 5 emotional symptom items + 5 peer problem items (0-20)

externalising SDQ, e-SDQ (behavioural problems): total score on 5 conduct problem items + 5 hyperactivity items (0-20)

[Figure: observed distributions of the i-SDQ and e-SDQ scores, one panel per outcome; horizontal axes from 0 to 20]


Multivariate Extension

yijh, h = 1, 2 observed outcomes

The joint conditional distribution for unit i is

$$f_q(y_i \mid \beta_q, b_{i,q}, \sigma_q) = \prod_{h=1}^{H} \prod_{j} f_q(y_{ijh} \mid \beta_{h,q}, b_{ih,q}, \sigma_{h,q}).$$

Conditional independence assumption


Covariates

ALE11: number of potentially Adverse Life Events (0-11)

SED4: family poverty score measured on the SED scale (0-4)

KESSM: maternal depression score measured on the Kessler scale (0-24)

IMD: neighbourhood deprivation rank measured by the Index of Multiple Deprivation, with lower values corresponding to higher deprivation (1-10)

Age: child’s age

Maternal education: no qualification (baseline), degree, GCSE

Ethnicity: non-white (baseline), white

Gender: female (baseline), male

Stratification: advantaged (baseline), ethnic, disadvantaged


Modeling details

Focus on more severe emotional and behavioural problems, i.e. q ∈ {0.50, 0.75, 0.90}

Discrete random intercepts to account for dependence

Age is centered around its mean and a squared effect is also considered

ALE11, SED4, KESSM, and IMD are centered around their individual means to separate between-individual and within-individual effects

BIC is used to select the optimal number of components (G = 1, . . . , 15)
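The centering of the time-varying covariates around individual means can be sketched as follows. The function name and the toy ALE11 counts are made up for illustration; the two outputs correspond to the "mean" and deviation terms that would enter the model as separate regressors.

```python
def between_within(values_by_individual):
    """Split a time-varying covariate into an individual mean (between
    effect) and the deviations from that mean (within effect)."""
    means, deviations = [], []
    for values in values_by_individual:
        m = sum(values) / len(values)
        means.append(m)
        deviations.append([v - m for v in values])
    return means, deviations

# Hypothetical ALE11 counts for three children over four waves.
ale11 = [[0, 1, 1, 2], [3, 3, 4, 2], [0, 0, 0, 0]]
means, devs = between_within(ale11)
```

By construction the deviations of each individual sum to zero, so the coefficient on the mean captures differences between children and the coefficient on the deviation captures changes within a child over time.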


Discrete distributions of random effects

[Figure: estimated cdf of the discrete random intercepts plotted against their locations, for q = 0.50, 0.75 and 0.90; one panel for i-SDQ and one for e-SDQ]

Higher dispersion for e-SDQ intercepts

The probability of higher components increases with q

Random intercept distribution is quite far from symmetry and unimodality


Model for the M-median

                  i-SDQ            e-SDQ
                Est     se       Est     se
Age           -0.02   0.04     -0.45   0.05
Age^2          0.07   0.01      0.21   0.02
ALE11 mean     0.09   0.23      0.19   0.04
ALE11          0.06   0.02      0.09   0.06
SED4 mean      0.12   0.05      0.17   0.14
SED4          -0.04   0.06     -0.01   0.07
Kessm mean     0.17   0.08      0.23   0.09
Kessm          0.08   0.01      0.11   0.02
Degree        -0.66   0.74     -1.17   0.44
Gcse          -0.41   0.34     -0.50   0.27
White         -0.31   0.11      0.17   0.16
Male           0.05   0.12      0.75   0.16
IMD mean      -0.02   0.04     -0.04   0.04
IMD           -0.00   0.03     -0.03   0.04
Ethnic st.     0.18   0.10     -0.05   0.22
Disadv st.     0.07   0.39      0.11   0.32
σu             1.72             2.52

Both i-SDQ and e-SDQ decrease over time until children are 5 years old and start to increase afterwards

Adverse life events (ALE11) and maternal depression (KESSM) are positively associated with both responses

Family poverty (SED4) seems to affect i-SDQ only

White children have lower i-SDQ

Males have higher e-SDQ


Model for M-q = 0.75

                  i-SDQ            e-SDQ
                Est     se       Est     se
Age           -0.01   0.01     -0.47   0.01
Age^2          0.08   0.01      0.24   0.01
ALE11 mean     0.19   0.04      0.32   0.06
ALE11          0.08   0.02      0.10   0.03
SED4 mean      0.12   0.05      0.23   0.07
SED4          -0.03   0.04      0.00   0.05
Kessm mean     0.24   0.01      0.26   0.02
Kessm          0.10   0.01      0.13   0.01
Degree        -0.78   0.12     -1.40   0.18
Gcse          -0.48   0.11     -0.60   0.15
White         -0.34   0.12      0.42   0.22
Male           0.17   0.05      0.97   0.10
IMD mean      -0.05   0.02     -0.05   0.02
IMD           -0.01   0.02     -0.03   0.03
Ethnic st.     0.22   0.13     -0.05   0.25
Disadv st.     0.06   0.07      0.18   0.12
σu             1.73             2.54

ALE11, SED4, and KESSM positively affect both responses, and their impact is higher than at q = 0.50

Males have more severe internalising and externalising problems than females

Children living in less deprived areas (higher IMD) have lower i-SDQ and e-SDQ


Model for M-q = 0.90

                  i-SDQ            e-SDQ
                Est     se       Est     se
Age            0.04   0.01     -0.46   0.02
Age^2          0.09   0.01      0.25   0.01
ALE11 mean     0.37   0.06      0.51   0.08
ALE11          0.10   0.03      0.10   0.04
SED4 mean      0.21   0.08      0.34   0.10
SED4          -0.05   0.06      0.01   0.07
Kessm mean     0.35   0.02      0.36   0.03
Kessm          0.13   0.02      0.16   0.02
Degree        -1.05   0.14     -1.65   0.21
Gcse          -0.63   0.13     -0.75   0.19
White         -0.42   0.13      0.37   0.24
Male           0.35   0.09      1.25   0.14
IMD mean      -0.09   0.02     -0.07   0.03
IMD           -0.00   0.04     -0.03   0.04
Ethnic st.     0.14   0.16     -0.18   0.25
Disadv st.    -0.02   0.11      0.25   0.18
σu             1.70             2.40

The effect of ALE11, SED4, maternal depression (KESSM), and neighbourhood deprivation (IMD) becomes much stronger for high SDQ scores

Severe problems are less likely with higher maternal educational levels

The effect of race and gender becomes more evident for higher percentiles


Conclusions

We have developed Q and MQ regression models that can deal with dependent observations: the dependence among observations from the same individual is modelled via individual-specific discrete random parameters

By setting the tuning constant c to a large value, we get Finite Mixtures of Expectile regression models

A nonparametric distribution for the random effects is more in the spirit of Q and MQ models

It is possible to carry out ML inference and obtain analytical SEs

It can be extended to handle Multivariate outcomes
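The remark on the tuning constant c can be checked numerically. The sketch assumes the q-tilted Huber loss $\rho_q(u) = 2\,|q - I(u < 0)|\,\rho_c(u)$, a form the slides do not state explicitly. For large c it coincides with the asymmetric squared (expectile) loss, while for c close to zero it is proportional to the quantile check loss. Function names are hypothetical.

```python
def huber_tilted_loss(u, q, c):
    """Assumed form: rho_q(u) = 2 * |q - I(u < 0)| * Huber_c(u)."""
    w = q if u >= 0 else 1.0 - q
    a = abs(u)
    huber = 0.5 * u * u if a <= c else c * a - 0.5 * c * c
    return 2.0 * w * huber

def expectile_loss(u, q):
    """Asymmetric squared loss: |q - I(u < 0)| * u^2."""
    return (q if u >= 0 else 1.0 - q) * u * u

def check_loss(u, q):
    """Quantile check loss: |q - I(u < 0)| * |u|."""
    return (q if u >= 0 else 1.0 - q) * abs(u)

residuals = [-3.0, -0.5, 0.0, 0.7, 3.0]
q = 0.9
large_c = [huber_tilted_loss(u, q, 1e3) for u in residuals]
small_c = [huber_tilted_loss(u, q, 1e-6) / (2e-6) for u in residuals]
```

Here `large_c` matches `expectile_loss` exactly because every residual lies inside the quadratic region, and `small_c` matches `check_loss` up to a term of order c.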


Future developments

Consider time-varying random parameters to model sources of unobserved heterogeneity that evolve over time, e.g. via Latent Markov Models (Farcomeni, 2012)

Extension to zero-inflated data

Extension to count data

Application in the small area estimation setting (focus is on prediction, rather than estimation)
