Flexible Regression and Smoothing: The gamlss.family Distributions

Bob Rigby, Mikis Stasinopoulos

Dipartimento di Scienze Cliniche e di Comunità, Università degli Studi di Milano, Italy, January 2018

Bob Rigby, Mikis Stasinopoulos Flexible Regression and Smoothing 2016 1 / 58


Outline

1 Types of distribution in GAMLSS

2 Continuous distributions
    Skewness and kurtosis
    Types of continuous distributions
    Distributions on ℝ
    Distributions on ℝ⁺
    Distributions on ℝ(0,1)
    Summary of methods of generating distributions
    Theoretical comparison
    Types of tails
    Fitting a distribution

3 End


Types of distribution in GAMLSS

Different types of distribution in GAMLSS

1 continuous distributions: f_Y(y | θ), usually defined on (−∞, +∞), (0, +∞) or (0, 1).

2 discrete distributions: P(Y = y | θ), defined on y = 0, 1, 2, ..., n, where n is a known finite value or n is infinite, i.e. usually discrete (count) values.

3 mixed distributions: (finite mixture distributions) mixtures of continuous and discrete distributions, i.e. continuous distributions where the range of Y has been expanded to include some discrete values with non-zero probabilities.
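The three types can be checked numerically. The sketch below uses base R only, not the gamlss.dist functions. It assumes a gamma parameterisation with mean µ and variance σ²µ² for the continuous case and, for the mixed case, a point probability ν at zero in front of a gamma density (a zero adjusted gamma). The helper name d_ga and the parameter values are illustrative choices.

```r
# Continuous: gamma density with mean mu and variance sigma^2 * mu^2
# (assumed parameterisation; implemented here with base R dgamma).
d_ga <- function(y, mu = 1, sigma = 0.5) {
  dgamma(y, shape = 1 / sigma^2, scale = mu * sigma^2)
}
total_continuous <- integrate(d_ga, lower = 0, upper = Inf)$value

# Discrete: Poisson probabilities on y = 0, 1, 2, ... (truncated at 200,
# beyond which the probabilities are negligible for lambda = 5).
total_discrete <- sum(dpois(0:200, lambda = 5))

# Mixed: point probability nu at y = 0 plus (1 - nu) times a gamma density.
nu <- 0.2
mass_at_zero  <- nu
mass_positive <- integrate(function(y) (1 - nu) * d_ga(y), 0, Inf)$value
total_mixed   <- mass_at_zero + mass_positive

print(c(continuous = total_continuous,
        discrete   = total_discrete,
        mixed      = total_mixed))
```

In each case the total probability is one. In the mixed case part of it sits on the single value y = 0.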


Different types of distribution: demos

1 demo.GA()

2 demo.PO()

3 demo.BI()

4 demo.BE()

5 demo.BEINF()

Example of continuous distribution: SEP1

Figure: SEP1 probability density functions f(y) for y from −4 to 4, panels (a) to (d).

Example of discrete distribution: Sichel

Figure: Sichel probability functions f(y) for y from 0 to 20, in four panels: SICHEL(mu = 5, sigma = 11.04, nu = 0.98), SICHEL(mu = 5, sigma = 1.151, nu = 0), SICHEL(mu = 5, sigma = 0.9602, nu = −1) and SICHEL(mu = 5, sigma = 43.9, nu = −3).

Example of mixed distribution: ZAGA

Figure: Zero adjusted gamma (ZAGA) distribution in two panels: the pdf ("Zero adjusted GA", with a point probability marked at zero) and the c.d.f. F(y) ("Zero adjusted Gamma c.d.f.").

Continuous distributions

Skewness and kurtosis

Figure: Different types of continuous distributions, shown as pdfs f(y): negative skewness, positive skewness, platykurtosis and leptokurtosis.

Types of continuous distributions

(−∞, ∞): the real line, ℝ

(0, ∞): the positive real line, ℝ⁺

(0, 1): the unit interval, ℝ(0,1) (not containing zero or one)

Distributions on ℝ

Location and scale family

For these distributions, if

Y ∼ D(µ, σ, ν, τ)

then

ε = (Y − µ)/σ ∼ D(0, 1, ν, τ),

i.e. Y = µ + σε, so Y is a scaled and shifted version of the random variable ε.

All (−∞, ∞) distributions in GAMLSS except exGAUS(µ, σ, ν) are location-scale families.
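In density terms the location-scale relation reads f_Y(y) = f_ε((y − µ)/σ)/σ. The base R sketch below checks this for the logistic and normal cases. It assumes dlogis() and dnorm() are parameterised like LO(µ, σ) and NO(µ, σ), and the values of µ and σ are arbitrary.

```r
# Location-scale property: if eps = (Y - mu) / sigma, then
# f_Y(y) = f_eps((y - mu) / sigma) / sigma.
mu    <- 2
sigma <- 1.5
y     <- seq(-4, 8, by = 0.5)

# Logistic case: density with (mu, sigma) versus rescaled standard density.
direct_logistic   <- dlogis(y, location = mu, scale = sigma)
rescaled_logistic <- dlogis((y - mu) / sigma) / sigma
max_diff_logistic <- max(abs(direct_logistic - rescaled_logistic))

# Normal case: same comparison.
direct_normal   <- dnorm(y, mean = mu, sd = sigma)
rescaled_normal <- dnorm((y - mu) / sigma) / sigma
max_diff_normal <- max(abs(direct_normal - rescaled_normal))

print(c(logistic = max_diff_logistic, normal = max_diff_normal))
```

Both differences are zero up to floating point error, so µ only shifts the curve and σ only stretches it.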

Location-scale family

Figure: Gumbel pdfs f(y) for y from −4 to 4. (a) Changing the location parameter: GU(0,1), GU(−1,1), GU(1,1). (b) Changing the scale parameter: GU(0,1), GU(0,0.7), GU(0,1.5).

Two parameter on ℝ

Two parameter distributions on (−∞, ∞):

Gumbel: GU(µ, σ), left skew;

Logistic: LO(µ, σ), leptokurtic;

Normal: NO(µ, σ) and NO2(µ, σ);

Reverse Gumbel: RG(µ, σ), right skew.

Two parameter on ℝ

Figure: pdfs f(y) for y from −4 to 4. (a) NO and LO distributions: NO(0,1), LO(0,1). (b) NO, GU and RG distributions: NO(0,1), GU(0,1), RG(0,1).

Three parameter on ℝ

exponential Gaussian: exGAUS(µ, σ, ν) for modelling right skew data;

normal family: NOF(µ, σ, ν) for modelling mean and variance relationships following the power law;

power exponential: PE(µ, σ, ν) and PE2(µ, σ, ν) for modelling lepto and platy kurtotic data;

t family: TF(µ, σ, ν) and TF2(µ, σ, ν) for modelling leptokurtic data;

skew normal: SN1(µ, σ, ν) and SN2(µ, σ, ν) for modelling skewness in data.

Three parameter on ℝ: skew normal type 1

Figure: Skew normal type 1 pdfs f(y) for y from −4 to 4. Left panel: ν = −100, −3, −1, 0. Right panel: ν = 100, 3, 1, 0.

Three parameter on ℝ: PE and TF distributions

Figure: pdfs f(y) for y from −4 to 4. (a) Power exponential, with ν = 1, 2, 10, 100. (b) The t family.

Four parameter on ℝ

exponential generalised beta type 2: EGB2(µ, σ, ν, τ), skewness and leptokurtosis;

generalised t: GT(µ, σ, ν, τ), kurtosis;

Johnson’s SU: JSU(µ, σ, ν, τ) and JSUo(µ, σ, ν, τ), skewness and leptokurtosis;

normal-exponential-t: NET(µ, σ, ν, τ), robust modelling of location and scale;

skew exponential power: SEP1(µ, σ, ν, τ), SEP2(µ, σ, ν, τ), SEP3(µ, σ, ν, τ) and SEP4(µ, σ, ν, τ), skewness and lepto-platy kurtosis;

sinh-arcsinh: SHASH(µ, σ, ν, τ), SHASHo(µ, σ, ν, τ) and SHASHo2(µ, σ, ν, τ), skewness and lepto-platy kurtosis;

skew t: ST1(µ, σ, ν, τ), ST2(µ, σ, ν, τ), ST3(µ, σ, ν, τ), ST4(µ, σ, ν, τ), ST5(µ, σ, ν, τ) and SST(µ, σ, ν, τ), skewness and leptokurtosis.

Example of four parameter on ℝ: SEP1

Figure: SEP1 pdfs f(y) for y from −4 to 4, panels (a) to (d).

The NET distribution

Figure: The NET distribution pdf f_Y(y) for y from −4 to 4, with its regions labelled normal, exponential and t-distribution.

Summary of GAMLSS family distributions defined on (−∞, +∞)

Distributions        family       no par.   skewness      kurtosis
Exp. Gaussian        exGAUS       3         positive      -
Exp. G. beta 2       EGB2         4         both          lepto
Gen. t               GT           4         (symmetric)   lepto
Gumbel               GU           2         (negative)    -
Johnson’s SU         JSU, JSUo    4         both          lepto
Logistic             LO           2         (symmetric)   (lepto)
Normal-Expon.-t      NET          2,(2)     (symmetric)   lepto
Normal               NO-NO2       2         (symmetric)
Normal Family        NOF          3         (symmetric)   (meso)

Summary of GAMLSS family distributions defined on (−∞, +∞) (continued)

Distributions        family         no par.   skewness      kurtosis
Power Expon.         PE-PE2         3         (symmetric)   both
Reverse Gumbel       RG             2         positive
Sinh Arcsinh         SHASH,         4         both          both
Skew Exp. Power      SEP1-SEP4      4         both          both
Skew t               ST1-ST5, SST   4         both          lepto
t Family             TF             3         (symmetric)   lepto

Distributions on ℝ⁺

Continuous distribution on ℝ⁺: scale family

If a random variable is distributed as

Y ∼ D(µ, σ, ν, τ)

and

ε = Y/µ ∼ D(1, σ, ν, τ),

then Y has a scale family distribution and the random variable

Y = µε

is a scaled version of the random variable ε. µ does not affect the shape.
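The scale family property can be checked with the Weibull distribution. This is a base R sketch using dweibull() and qweibull(). It assumes that WEI(µ, σ) corresponds to scale µ and shape σ, and the numbers are arbitrary.

```r
# Scale family check: if eps = Y / mu, then f_Y(y) = f_eps(y / mu) / mu.
# Assumption: mu is the Weibull scale and sigma the Weibull shape.
mu    <- 2
sigma <- 1.5
y     <- seq(0.1, 6, by = 0.1)

density_direct   <- dweibull(y, shape = sigma, scale = mu)
density_rescaled <- dweibull(y / mu, shape = sigma, scale = 1) / mu
max_density_diff <- max(abs(density_direct - density_rescaled))

# Quantiles scale in the same way: q_Y(p) = mu * q_eps(p).
p <- c(0.1, 0.5, 0.9)
max_quantile_diff <- max(abs(qweibull(p, shape = sigma, scale = mu) -
                             mu * qweibull(p, shape = sigma, scale = 1)))

print(c(density = max_density_diff, quantile = max_quantile_diff))
```

Both differences vanish up to rounding: changing µ stretches the horizontal axis but leaves the shape, which is governed by σ, unchanged.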

Continuous distribution on ℝ⁺: Weibull pdf

Figure: Weibull pdfs f(y) for y from 0 to 6 with σ = 0.5, 1 and 6, in two panels: µ = 1 and µ = 2.

Continuous distribution on ℝ⁺: one and two parameters

exponential: EXP(µ);

gamma: GA(µ, σ), a member of the exponential family;

inverse gamma: IGAMMA(µ, σ);

inverse Gaussian: IG(µ, σ), a member of the exponential family;

log-normal: LOGNO(µ, σ) and LOGNO2(µ, σ);

Pareto: PARETO(µ, σ), PARETO2o(µ, σ) and GP(µ, σ), for heavy tails;

Weibull: WEI(µ, σ), WEI2(µ, σ) and WEI3(µ, σ), used in survival analysis.

Continuous distribution on ℝ⁺: Weibull survival and hazard functions

Figure: Weibull functions for y from 0 to 6 with σ = 0.5, 1 and 6. (a) Survival function S(y). (b) Hazard function h(y).

Continuous distribution on ℝ⁺: three parameters

Box-Cox Cole and Green: BCCG(µ, σ, ν), also known as the LMS method in centile estimation;

generalised gamma: GG(µ, σ, ν);

generalised inverse Gaussian: GIG(µ, σ, ν);

log-normal family: LNO(µ, σ, ν), based on the standard Box-Cox transformation

$$
Z = \begin{cases} \dfrac{Y^{\nu} - 1}{\nu} & \text{if } \nu \neq 0, \\ \log(Y) & \text{if } \nu = 0. \end{cases} \tag{1}
$$
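The Box-Cox transformation in (1) is short enough to write out directly. The function below is an illustrative base R sketch; box_cox is a hypothetical name, not a gamlss function. It also shows that the ν ≠ 0 branch approaches log(Y) as ν tends to zero, which is why the two branches fit together.

```r
# Box-Cox transformation of a positive variable y with power nu.
box_cox <- function(y, nu) {
  stopifnot(all(y > 0))
  if (nu == 0) log(y) else (y^nu - 1) / nu
}

y <- c(0.5, 1, 2, 4)
z_log   <- box_cox(y, 0)      # nu = 0: the log branch
z_small <- box_cox(y, 1e-6)   # nu close to 0: nearly identical to log(y)
z_one   <- box_cox(y, 1)      # nu = 1: a shift, y - 1

print(rbind(z_log, z_small, z_one))
```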

Continuous distribution on ℝ⁺: four parameters

Box-Cox power exponential: BCPE(µ, σ, ν, τ) and BCPEo(µ, σ, ν, τ), skewness and platy-lepto kurtosis;

Box-Cox t: BCT(µ, σ, ν, τ) and BCTo(µ, σ, ν, τ), skewness and leptokurtosis;

generalised beta type 2: GB2(µ, σ, ν, τ), skewness and platy-lepto kurtosis.

Continuous distribution on ℝ⁺: four parameters, BCT

Figure: Box-Cox t distribution BCT(µ, σ, ν, τ). Parameter values µ = 1, σ = (0.15, 0.20, 0.50), ν = (−4, 0, 2), τ = (1, 5, 100).

Continuous GAMLSS family distributions defined on (0, +∞)

Distributions          family      no par.   skewness     kurtosis
BCCG                   BCCG        3         both
BCPE                   BCPE        4         both         both
BCT                    BCT         4         both         lepto
Exponential            EXP         1         (positive)   -
Gamma                  GA          2         (positive)   -
Gen. Beta type 2       GB2         4         both         both
Gen. Gamma             GG-GG2      3         positive     -
Gen. Inv. Gaussian     GIG         3         positive     -
Inv. Gaussian          IG          2         (positive)   -
Log Normal             LOGNO       2         (positive)   -
Log Normal family      LNO         2,(1)     positive
Reverse Gen. Extreme   RGE         3         positive     -
Weibull                WEI-WEI3    2         (positive)   -

Transformation from (−∞, ∞) to (0, +∞)

Any distribution for Z on (−∞, ∞) can be transformed to a corresponding distribution for Y = exp(Z) on (0, +∞).

For example: from the t distribution to the log t distribution:

gen.Family("TF", type="log")
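What the log transformation does to the density can be sketched in base R without the generated functions. With Y = exp(Z), the change of variable gives f_Y(y) = f_Z(log y)/y. The helpers d_tf, d_log_tf and p_log_tf below are hypothetical stand-ins for a t family density and its log-transformed version, written with dt() and pt(). They are not the functions created by gen.Family().

```r
# t family density with location mu, scale sigma and nu degrees of freedom
# (illustrative stand-in built on base R dt()).
d_tf <- function(z, mu = 0, sigma = 1, nu = 10) {
  dt((z - mu) / sigma, df = nu) / sigma
}

# Change of variable Y = exp(Z): f_Y(y) = f_Z(log(y)) / y for y > 0.
d_log_tf <- function(y, mu = 0, sigma = 1, nu = 10) {
  d_tf(log(y), mu, sigma, nu) / y
}

# Corresponding cdf: P(Y <= q) = P(Z <= log(q)).
p_log_tf <- function(q, mu = 0, sigma = 1, nu = 10) {
  pt((log(q) - mu) / sigma, df = nu)
}

# The area under the density on (0.5, 3) should match the cdf difference.
area     <- integrate(d_log_tf, lower = 0.5, upper = 3)$value
cdf_diff <- p_log_tf(3) - p_log_tf(0.5)
print(c(area = area, cdf_diff = cdf_diff))
```

The agreement between the two numbers confirms that dividing by y is the only adjustment needed when moving from Z to Y = exp(Z).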

Positive response: log T distribution

Figure: Log t family pdfs dlogTF(x, mu, sigma, nu). (a) mu = −5, −1, 0, 1, 2 with sigma = 1, nu = 10. (b) sigma = 0.5, 1, 2, 5 with mu = 0, nu = 10. (c) nu = 1000, 10, 2, 1 with mu = 0, sigma = 0.5.

Positive response: log T distribution

Figure: Four panels for the log t family with mu = 0: the pdf dlogTF(x, mu = 0), the cdf plogTF(x, mu = 0), the inverse cdf qlogTF(x, mu = 0), and a histogram of Y (frequency against Y).

Truncating from (−∞, ∞) to (0, +∞)

Any distribution for Z on (−∞, ∞) can be truncated to a corresponding distribution for Y on (0, +∞).

For example: from the SST(µ, σ, ν, τ) distribution to the flipped-SST(µ, σ, ν, τ) distribution:

gen.trun(0,"SST",type="left")
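Truncation rescales the original density by the probability that remains after the cut: for truncation to (a, b), f_Y(y) = f_Z(y)/(F_Z(b) − F_Z(a)) inside the interval and zero outside. The base R sketch below illustrates this with a normal Z instead of the SST; d_trunc is a hypothetical helper, not the function generated by gen.trun().

```r
# Density of Z truncated to the interval (a, b).
# dfun and pfun are the density and cdf of Z; extra arguments are passed on.
d_trunc <- function(y, dfun, pfun, a = 0, b = Inf, ...) {
  inside     <- y > a & y < b
  norm_const <- pfun(b, ...) - pfun(a, ...)
  ifelse(inside, dfun(y, ...) / norm_const, 0)
}

# Left truncation at 0: a distribution on (0, Inf).
area_left <- integrate(d_trunc, 0, Inf,
                       dfun = dnorm, pfun = pnorm,
                       mean = 0.5, sd = 1)$value

# Truncation on both sides: a distribution on (0, 1).
area_both <- integrate(d_trunc, 0, 1,
                       dfun = dnorm, pfun = pnorm, b = 1,
                       mean = 0.5, sd = 1)$value

print(c(left = area_left, both = area_both))
```

Both truncated densities integrate to one, so each is a proper distribution on its restricted range.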

Truncating from (−∞, ∞) to (0, +∞)

Figure: Left truncated SST pdfs dSSTtr(x, mu, sigma, nu, tau) for x from 0 to 4. (a) mu = 0.1, 1, 2 with sigma = 0.5, nu = 1, tau = 10. (b) sigma = 0.5, 1, 2 with mu = 2, nu = 1, tau = 10. (c) nu = 0.1, 1, 2 with mu = 2, sigma = 0.5, tau = 10. (d) tau = 3, 5, 100 with mu = 2, sigma = 0.5, nu = 1.

Distributions on ℝ(0,1)

Continuous GAMLSS family distributions defined on (0, 1)

Distributions             family     no par.   skewness   kurtosis
Beta                      BE         2         (both)     -
Beta original             BEo        2         (both)     -
Logit normal              LOGITNO    2         (both)     -
Generalized beta type 1   GB1        4         (both)     (both)

Transformation from (−∞, ∞) to (0, 1)

Any distribution for Z on (−∞, ∞) can be transformed to a corresponding distribution for Y = 1/(1 + e^{−Z}) on (0, 1).

For example: from the t distribution to the logit t distribution:

gen.Family("TF", "logit")
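For the inverse logit transformation Y = 1/(1 + e^{−Z}), the change of variable gives f_Y(y) = f_Z(logit(y))/(y(1 − y)). The following base R sketch checks this against the cdf; d_tf_std, d_logit_tf and p_logit_tf are hypothetical helper names built on dt(), pt() and qlogis(), not the functions generated by gen.Family().

```r
# t family density with location mu, scale sigma and nu degrees of freedom
# (illustrative stand-in built on base R dt()).
d_tf_std <- function(z, mu = 0, sigma = 1, nu = 10) {
  dt((z - mu) / sigma, df = nu) / sigma
}

# Y = 1 / (1 + exp(-Z)), so Z = logit(Y) = qlogis(Y) and
# f_Y(y) = f_Z(logit(y)) / (y * (1 - y)) for 0 < y < 1.
d_logit_tf <- function(y, mu = 0, sigma = 1, nu = 10) {
  d_tf_std(qlogis(y), mu, sigma, nu) / (y * (1 - y))
}

# Corresponding cdf: P(Y <= q) = P(Z <= logit(q)).
p_logit_tf <- function(q, mu = 0, sigma = 1, nu = 10) {
  pt((qlogis(q) - mu) / sigma, df = nu)
}

area_logit     <- integrate(d_logit_tf, lower = 0.2, upper = 0.7)$value
cdf_diff_logit <- p_logit_tf(0.7) - p_logit_tf(0.2)
print(c(area = area_logit, cdf_diff = cdf_diff_logit))
```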

logit T distribution

Figure: Logit t family pdfs dlogitTF(x, mu, sigma, nu) for x from 0 to 1. (a) mu = −5, −1, 0, 1, 5 with sigma = 1, nu = 10. (b) sigma = 0.5, 1, 2, 5 with mu = 0, nu = 10. (c) nu = 1000, 10, 5, 1 with mu = 0, sigma = 1.

logit T distribution

Figure: Four panels for the logit t family with mu = 0: the pdf dlogitTF(x, mu = 0), the cdf plogitTF(x, mu = 0), the inverse cdf qlogitTF(x, mu = 0), and a histogram of Y (frequency against Y on 0 to 1).

Truncating from (−∞, ∞) to (0, 1)

Any distribution for Z on (−∞, ∞) can be truncated to a corresponding distribution for Y on (0, 1).

For example: from the SST(µ, σ, ν, τ) distribution to the trun-SST(µ, σ, ν, τ) distribution:

gen.trun(c(0,1),"SST",type="both")

Truncated SST on (0, 1)

Figure: Truncated SST pdfs dSSTtr(x, mu, sigma, nu, tau) for x from 0 to 1. (a) mu = 0.1, 0.5, 0.9 with sigma = 0.1, nu = 1, tau = 10. (b) sigma = 0.1, 0.2, 0.5 with mu = 0.5, nu = 1, tau = 10. (c) nu = 0.1, 2, 0.5 with mu = 0.5, sigma = 0.2, tau = 10. (d) tau = 3, 4, 10 with mu = 0.5, sigma = 0.2, nu = 1.

Summary of methods of generating distributions

1 univariate transformation from a single random variable (LOGNO, BCCG, BCPE, BCT, EGB2, SHASHo, inverse log and logit transformations ...)

2 transformation from two or more random variables (TF, ST2,exGAUS)

3 truncation distributions

4 a (continuous or finite) mixture of distributions (TF, GT, GB2, EGB2)

5 Azzalini type methods (SN1, SEP1, SEP2, ST1, ST2)

6 splicing distributions (SN2, SEP3, SEP4, ST3, ST4, NET)

7 stopped sums

8 systems of distributions

Theoretical comparison

Comparison of properties of distributions

1 Explicit pdf, cdf and inverse cdf

2 Explicit centiles and centile based measures (e.g. median)

3 Explicit moment based measures (e.g. mean)

4 Explicit mode(s)

5 Continuity of the pdf and its derivatives with respect to y

6 Continuity of the pdf with respect to µ, σ, ν and τ

7 Flexibility in modelling skewness and kurtosis

8 Range of Y

Theoretical comparison of distributions

OK, we have a lot of distributions in GAMLSS. Do we need any more?

Do we need all of them, or are some of them redundant?

Is there any way to compare them theoretically?

We can compare the tail behaviour of the distributions.

Since most of them are location-scale families, we can compare their flexibility in terms of skewness and kurtosis.

Comparing tails: real line

Figure: log(pdf) against x, for x from −10 to 10, for the Normal, Cauchy and Laplace distributions.

Comparing tails: positive real line

Figure: log(pdf) against x, for x from 0 to 30, for the exponential, Pareto 2 and log Normal distributions.

Types of tails

There are three main forms for log f_Y(y) for a tail of Y:

Type I: $-k_2 (\log |y|)^{k_1}$,

Type II: $-k_4 |y|^{k_3}$,

Type III: $-k_6 e^{k_5 |y|}$;

decreasing the k's results in a heavier tail.
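The tail forms can be made concrete with the three real line distributions compared earlier. The base R sketch below evaluates the log pdf far out in the right tail. The tail orders in the comments are the standard asymptotic forms of the standard normal, Laplace and Cauchy densities; the Laplace log density is written by hand because base R has no Laplace function.

```r
y <- c(5, 10, 20, 40)

# Normal: log f(y) behaves like -y^2 / 2, i.e. Type II with k3 = 2.
logpdf_normal  <- dnorm(y, log = TRUE)

# Standard Laplace, f(y) = exp(-|y|) / 2: log f(y) behaves like -|y|,
# i.e. Type II with k3 = 1.
logpdf_laplace <- -log(2) - abs(y)

# Cauchy: log f(y) behaves like -2 * log|y|, i.e. Type I with k1 = 1.
logpdf_cauchy  <- dcauchy(y, log = TRUE)

print(cbind(y, logpdf_normal, logpdf_laplace, logpdf_cauchy))
```

At every value of y the Cauchy log density is the largest and the normal the smallest, which is the ordering from heaviest to lightest tail.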

Types of tails: classification (real line)

Types of tails: classification (positive real line)

Fitting a distribution

How to fit distributions in R

optim() or mle(): require initial parameter values;

gamlssML(): mle using a variation of the mle() function;

gamlss(): mle using the RS, CG or mixed algorithms;

histDist(): mle using the RS, CG or mixed algorithms, for a univariate sample only; plots a histogram of the data with the fitted distribution;

fitDist(): fits a set of distributions and chooses the one with the smallest GAIC.
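The first route in the list, direct maximisation with optim(), can be sketched in base R. The example below fits a normal distribution to simulated data by minimising the negative log-likelihood. The data, starting values and the log-scale parameterisation of σ are illustrative assumptions and are not taken from the slides.

```r
set.seed(1)
y <- rnorm(500, mean = 2, sd = 0.5)   # simulated data, for illustration only

# Negative log-likelihood. sigma is parameterised as exp(par[2]) so that the
# optimiser works on an unconstrained scale.
neg_loglik <- function(par, y) {
  -sum(dnorm(y, mean = par[1], sd = exp(par[2]), log = TRUE))
}

# optim() needs initial parameter values; these are deliberately rough.
start <- c(1, 0)
fit   <- optim(start, neg_loglik, y = y, method = "BFGS")

mu_hat    <- fit$par[1]
sigma_hat <- exp(fit$par[2])
print(c(mu = mu_hat, sigma = sigma_hat, deviance = 2 * fit$value))
```

For the normal distribution the maximum likelihood estimates are known in closed form (the sample mean and the square root of the mean squared deviation), which gives a simple check on the optimiser. Twice the minimised negative log-likelihood is the global deviance that the gamlss functions report.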

Strength of glass fibres data

Data summary:

R data file: glass in package gamlss.data, of dimensions 63 × 1

source: Smith and Naylor (1987)

strength: the strength of glass fibres (the units of measurement are not given)

purpose: to demonstrate the fitting of a parametric distribution to the data

conclusion: a SEP4 distribution fits adequately

Strength of glass fibres data: SBC

Figure: Histogram titled "glass strength": frequency against glass$strength, for strength from 0.5 to 2.0.

Strength of glass fibres data: creating truncated distribution

library(gamlss)     # also attaches gamlss.dist and gamlss.data (for the glass data)
library(gamlss.tr)
data(glass)
# generate a t family (TF) distribution left truncated at 0
gen.trun(par = 0, family = TF)
# A truncated family of distributions from TF has been generated
# and saved under the names:
# dTFtr pTFtr qTFtr rTFtr TFtr
# The type of truncation is left and the truncation parameter is 0

Strength of glass fibres data: fitting the models

> m1 <- fitDist(strength, data = glass, k = 2, extra = "TFtr")  # AIC
> m2 <- fitDist(strength, data = glass, k = log(length(glass$strength)), extra = "TFtr")  # SBC
> m1$fit[1:8]
    SEP4     SEP3   SHASHo     EGB2      JSU    BCPEo      ST2     BCTo
27.68790 27.98191 28.01800 28.38691 30.41380 31.17693 31.40101 31.47677
> m2$fit[1:8]
    SEP4     SEP3   SHASHo     EGB2       GU     WEI3      JSU       PE
36.26044 36.55444 36.59054 36.95945 38.19839 38.69995 38.98634 39.00792


Strength of glass fibres data: Results

histDist(glass$strength, SEP4, nbins = 13,

main = "SEP4 distribution",

method = mixed(20, 50))

Strength of glass fibres data: SBC

Figure: Histogram of glass$strength (0.5 to 2.5) with the fitted density f() superimposed, titled "SEP4 distribution".

Strength of glass fibres data: summary

Call: gamlss(formula = y~1, family = FA)

Mu Coefficients:
(Intercept)
      1.581
Sigma Coefficients:
(Intercept)
     -1.437
Nu Coefficients:
(Intercept)
    -0.1280
Tau Coefficients:
(Intercept)
     0.3183

Degrees of Freedom for the fit: 4 Residual Deg. of Freedom 59
Global Deviance: 19.8111
AIC: 27.8111
SBC: 36.3836
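The AIC and SBC in this summary follow from the global deviance by the usual penalised form GAIC(k) = global deviance + k × (degrees of freedom), with k = 2 for the AIC and k = log(n) for the SBC, matching the k values passed to fitDist() earlier. The short sketch below redoes the arithmetic with the numbers printed above and n = 63 from the data description. The function gaic is a hypothetical helper, not the package function.

```r
global_deviance <- 19.8111   # from the fitted model summary
df_fit          <- 4         # mu, sigma, nu and tau intercepts
n_obs           <- 63        # number of glass fibre observations

# Generalised Akaike information criterion with penalty k per parameter.
gaic <- function(k) global_deviance + k * df_fit

aic <- gaic(2)            # k = 2
sbc <- gaic(log(n_obs))   # k = log(n)

print(round(c(AIC = aic, SBC = sbc), 4))
```

Because log(63) is larger than 2, the SBC penalises each of the four parameters more heavily than the AIC does, which is why the two criteria can rank candidate distributions differently.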

End

For more information see www.gamlss.org