Parametric Distributions - Trinity College Dublin

Definitions

• A random variable, X, is a map from the result of an experiment or observation to the real numbers.

• The cumulative distribution function (cdf) of a random variable is defined through the probability measure as F_X(z) = P(X ≤ z).

• This is often written F(z).

Properties of F

• F() is non-decreasing.

• F() tends to 0 on the left-hand side (as z → −∞) and increases to 1 on the right-hand side (as z → +∞).

• Note that F() is right continuous.

• For any such F(), a random variable with that cdf can be constructed (Skorokhod representation).
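The properties above can be checked numerically for any of R's built-in cdfs. The sketch below uses the standard Normal and a Binomial(9, 0.5) purely as illustrative choices; the grid and the offset of 0.001 are arbitrary assumptions.

```r
# Evaluate the standard Normal cdf on a grid and check the listed properties.
z  <- seq(-8, 8, length.out = 1601)
Fz <- pnorm(z)

nondecreasing <- all(diff(Fz) >= 0)  # F never decreases along the grid
left_limit    <- pnorm(-Inf)         # value on the far left-hand side
right_limit   <- pnorm(Inf)          # value on the far right-hand side

# Right continuity in a discrete case: the Binomial(9, 0.5) cdf jumps at x = 3.
# The value at the jump agrees with the value just to the right, not the left.
at_jump    <- pbinom(3, size = 9, prob = 0.5)
from_right <- pbinom(3 + 0.001, size = 9, prob = 0.5)
from_left  <- pbinom(3 - 0.001, size = 9, prob = 0.5)

print(c(nondecreasing = nondecreasing, left = left_limit, right = right_limit))
print(c(at_jump = at_jump, from_right = from_right, from_left = from_left))
```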

pdf

• Where F(x) can be written as the integral from minus infinity to x of some function, f(z),

– Then f(z) is termed a density (or pdf).

– Where this is expressible as a discrete sum, the discrete function f(j) is also termed a pdf.

• A pdf will tell us which values of the R.V. are most likely.
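The link between pdf and cdf can be illustrated in R by integrating (or summing) a built-in density and comparing the result with the matching cdf function. The evaluation points 1.3 and 4, and the choice of Normal and Binomial(9, 0.5), are assumptions made only for this sketch.

```r
# Continuous case: the area under the Normal pdf up to x0 should match pnorm(x0).
x0        <- 1.3
area      <- integrate(dnorm, lower = -Inf, upper = x0)$value
cdf_value <- pnorm(x0)

# Discrete case: the sum of the Binomial pdf up to j0 should match pbinom(j0).
j0           <- 4
partial_sum  <- sum(dbinom(0:j0, size = 9, prob = 0.5))
cdf_discrete <- pbinom(j0, size = 9, prob = 0.5)

print(c(area = area, cdf = cdf_value))
print(c(partial_sum = partial_sum, cdf = cdf_discrete))
```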

Important Note

• Thus, this idea is very general.

• Lots of F()s are possible.

• A closed functional form for F() and f() is not required.

• Exercise:

– Draw some ‘possible’ cdfs.

– Check that they fulfil the conditions.

– A note about empirical cdfs.
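As one possible illustration of the note on empirical cdfs, R's ecdf() builds a step-function cdf directly from data, with no functional form assumed. The data below are simulated (a hypothetical sample of 50 Gamma values), so the numbers carry no meaning beyond the demonstration.

```r
set.seed(1)
obs  <- rgamma(50, shape = 2, rate = 2)  # hypothetical observations
Fhat <- ecdf(obs)                        # empirical cdf as an R function

# Check the cdf conditions on a grid that covers the data.
grid <- seq(min(obs) - 1, max(obs) + 1, length.out = 500)
vals <- Fhat(grid)

is_nondecreasing <- all(diff(vals) >= 0)
starts_at_zero   <- Fhat(min(obs) - 1) == 0
ends_at_one      <- Fhat(max(obs)) == 1

plot(Fhat, main = "Empirical cdf of 50 simulated values")
print(c(is_nondecreasing, starts_at_zero, ends_at_one))
```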

Example

• A lecturer is thinking of doing building work in his house, but is waiting to hear about the profits from a venture he was involved in before deciding whether to proceed.

• He knows he will get at least €10,000 net from the project.

• Things are going well, and it is likely that the actual returns will be around €20,000.

• There is an outside chance that €40,000 could be returned, but this is unlikely.

Example II

• A lecturer takes about 22 minutes to cycle to work.

• On a good day, and pedaling hard, he can make it in 15 minutes. The fastest he has done it is 12 minutes.

• It would take 90 minutes to walk, so this is a realistic upper bound for cycling.

Example III

• An SS MSISS student is going on a J1 trip to America for 90 days.

• They expect to “socialise” with friends about three times a week.

• Socialising results in a severe hangover about 25% of the time.

• How many paracetamol are likely to be required for the duration of the visit? (Assume 1 tablet per hangover.)
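One way to approach this example is with a Binomial model. The sketch below assumes a fixed number of outings (90 days at three per week, rounded to a whole number) and independent hangovers with probability 0.25; both are modelling assumptions rather than part of the problem statement, and the 95% level is an arbitrary choice.

```r
days             <- 90
outings_per_week <- 3
p_hangover       <- 0.25

# Assumed fixed number of outings over the trip, rounded to an integer.
n_outings <- round(days / 7 * outings_per_week)

# Expected number of hangovers (and so tablets) under Binomial(n_outings, 0.25).
expected_tablets <- n_outings * p_hangover

# Smallest number of tablets that covers the need with probability >= 0.95.
tablets_95 <- qbinom(0.95, size = n_outings, prob = p_hangover)

print(c(outings = n_outings, expected = expected_tablets, cover_95 = tablets_95))
```

Treating the number of outings as random as well (for example Poisson) would widen the spread; the fixed-count version is just the simplest starting point.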

Parametric Forms

• Over the years, mathematicians have examined functions that have the properties described.

• Many of these have arisen through considering combinations of other simple functions.

• These functions have parameters, which can be modified to change the shape of the curve.

• However, the overall functional form stays the same.

Advantages of Parametric Distributions

• Properties and behaviours well understood.

• Moments can readily be calculated.

• Black box software available.

• Can readily communicate models to colleagues.

• Sufficiently flexible for most purposes.

• As realistic as empirical functions and may be more physically justifiable.

Disadvantages

• May not exactly match the application (a compromise between ease of use and tool availability).

• Results may be sensitive to distributional assumptions.

• Sometimes easy to program without a full understanding of what is going on: the downside of a black box.

Some Models

• Bernoulli - Br(x|q) - dbinom(size = 1)

• Binomial - Bi(x|q,n) - dbinom()

• Poisson - Pn(x|l) - dpois()

• Beta - Be(x|a,b) - dbeta()

• Uniform - Un(x|a,b) - dunif()

• Gamma - Ga(x|a,b) - dgamma()

• Exponential - Ex(x|q) - dexp()

• Normal - N(x|m,s) - dnorm()
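The R functions listed above are the density members of a wider naming convention: the prefixes d, p, q and r give the pdf, cdf, quantile function and random draws respectively. The sketch below shows this for the Poisson and also checks two relationships implied by the list; the parameter values are arbitrary choices for illustration.

```r
# d/p/q/r prefixes, shown for the Poisson with rate 2.
pois_pdf      <- dpois(3, lambda = 2)    # P(X = 3)
pois_cdf      <- ppois(3, lambda = 2)    # P(X <= 3)
pois_quantile <- qpois(0.5, lambda = 2)  # smallest x with P(X <= x) >= 0.5
set.seed(42)
pois_draws    <- rpois(5, lambda = 2)    # five simulated values

# Bernoulli is Binomial with size = 1: P(X = 1) equals the parameter q.
bern_one <- dbinom(1, size = 1, prob = 0.3)

# Exponential with rate 2 coincides with Gamma with shape 1 and rate 2.
exp_val   <- dexp(1.5, rate = 2)
gamma_val <- dgamma(1.5, shape = 1, rate = 2)

print(c(pdf = pois_pdf, cdf = pois_cdf, quantile = pois_quantile))
print(pois_draws)
print(c(bernoulli = bern_one, exponential = exp_val, gamma = gamma_val))
```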

Binomial

• Bi(x|q,n)

• Pdf f(x) =

– $\binom{n}{x}\, q^{x} (1-q)^{n-x}$

• E(x) = nq

• Var(x) = nq(1-q)

• Graph for n=9 and q=0.5.
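The graph for n = 9 and q = 0.5 can be regenerated in R, and the moment formulas checked by direct summation. This is a sketch: the plotting options are arbitrary choices.

```r
n <- 9
q <- 0.5
x <- 0:n

f       <- dbinom(x, size = n, prob = q)          # built-in Binomial pdf
by_hand <- choose(n, x) * q^x * (1 - q)^(n - x)   # the formula written out

mean_x <- sum(x * f)                 # should agree with n * q
var_x  <- sum((x - mean_x)^2 * f)    # should agree with n * q * (1 - q)

# Vertical bars, since the variable only takes integer values.
plot(x, f, type = "h", lwd = 2, xlab = "x", ylab = "f(x)",
     main = "Binomial pdf, n = 9, q = 0.5")
print(c(mean = mean_x, variance = var_x))
```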

Binomial

• Cdf

• This is a step function, since x can only take integer values.
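A sketch of how the step-function cdf could be drawn in R, assuming the same n = 9 and q = 0.5 as the pdf graph. stepfun() is used here so that the function is right continuous, with the jump value taken at each integer.

```r
n <- 9
q <- 0.5

# Value 0 to the left of x = 0, then pbinom(k) on each interval [k, k + 1).
cdf <- stepfun(0:n, c(0, pbinom(0:n, size = n, prob = q)))

plot(cdf, verticals = FALSE, xlab = "x", ylab = "F(x)",
     main = "Binomial cdf, n = 9, q = 0.5")

# The cdf is flat between integers: 2.5 gives the same value as 2.
print(c(at_2 = cdf(2), at_2.5 = cdf(2.5), at_3 = cdf(3)))
```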

Normal (Gaussian)

• N(x|m,s)

• Pdf f(x) =

– $c \exp\{-0.5\, s^{-2} (x-m)^{2}\}$

• E(x) = m

• Var(x) = $s^{2}$

• Graph for m=0 and s=1.0.
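The constant c is what makes the density integrate to 1; for the Normal it is 1/(s·sqrt(2π)). The sketch below writes the formula out for m = 0 and s = 1, compares it with dnorm(), and redraws the graph. The grid limits are an arbitrary choice.

```r
m <- 0
s <- 1

c_const <- 1 / (s * sqrt(2 * pi))   # normalising constant c
x       <- seq(-4, 4, length.out = 801)

by_hand  <- c_const * exp(-0.5 * s^(-2) * (x - m)^2)  # formula written out
built_in <- dnorm(x, mean = m, sd = s)                # R's Normal pdf

plot(x, by_hand, type = "l", xlab = "x", ylab = "f(x)",
     main = "Normal pdf, m = 0, s = 1")
print(max(abs(by_hand - built_in)))  # difference between the two versions
```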

Normal

• Cdf

• This is smooth since the underlying rv is continuous.

• Note that neither 0 nor 1 is reached in the plotted region.

Choosing Models

• Thus, for example, if one is interested in a smoothly varying quantity, such as response rate, then one might consider ‘modeling’ it using a Normal distribution.

• If an ‘expert’ tells you that response rate is likely to be around 7%, but could go from 5% to 9%, neither of which is very likely, what values of parameters for a Normal model might represent this ‘belief’?
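One possible way to explore this question in R is to fix m = 7 (in percentage points) and see how much probability different values of s place between 5 and 9. The candidate values of s, and the reading of 5 to 9 as a central 95% interval, are assumptions made for this sketch; the expert's statement does not pin them down.

```r
m <- 7                           # centre of the belief, in percentage points
s_candidates <- c(0.5, 0.8, 1.0) # hypothetical standard deviations to compare

# Probability that the response rate falls between 5% and 9% for each s.
inside <- pnorm(9, mean = m, sd = s_candidates) -
          pnorm(5, mean = m, sd = s_candidates)

# Alternative: assume 5 to 9 is a central 95% interval and solve for s.
s_95 <- (9 - m) / qnorm(0.975)

print(data.frame(s = s_candidates, prob_5_to_9 = inside))
print(s_95)
```

Smaller values of s make 5% and 9% more extreme, which matches "neither of which is very likely" more strongly.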

R

• Access via web page - also on lab machines.

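• Command line interface.

x <- seq(0, 5, length.out = 1000)  # 1000 equally spaced points on [0, 5]

y <- dgamma(x, 2, 2)  # Gamma pdf with shape 2 and rate 2, evaluated at x

plot(x, y, type = "l", col = 1)  # line plot; col = 1 is black

• Sets up a vector, x, taking 1000 equally spaced values between 0 and 5.

• Sets up y to be the Gamma pdf (shape 2, rate 2) evaluated at each value of x.

• Plots y as a function of x, as a line plot, in black.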

Norm (7,0.5) vs (7,0.8)
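A sketch of how this comparison could be redrawn in R. It assumes the second parameter is the standard deviation, consistent with the N(x|m,s) notation used earlier; the plotting range, colours and line types are arbitrary choices.

```r
x <- seq(4, 10, length.out = 1000)

y_narrow <- dnorm(x, mean = 7, sd = 0.5)  # Normal(7, 0.5)
y_wide   <- dnorm(x, mean = 7, sd = 0.8)  # Normal(7, 0.8)

# Draw the narrower density first, since it has the taller peak.
plot(x, y_narrow, type = "l", col = 1, lty = 1,
     xlab = "Response rate (%)", ylab = "Density")
lines(x, y_wide, col = 2, lty = 2)
legend("topright", legend = c("Normal(7, 0.5)", "Normal(7, 0.8)"),
       col = 1:2, lty = 1:2)
```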

Issues

• What if the ‘belief’ says that high response rates are more likely than low ones (skew)?

• Can you draw a density that might match?

• What if there is likely to be a response rate of around 6%, but if by chance a marketing stunt that is being run next week gets air time on radio, then the rate will be around 10%?

Exercise

• Write down a pdf for

– Skewed distribution

– Truncated distribution

– Mixture of distributions

• Show (in outline) that there exists a random variable, which has as its pdf the quantity that you have written down.
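As a starting point for this exercise, the sketch below writes down one candidate of each type in R and checks numerically that it integrates to 1. A non-negative function that integrates to 1 gives a valid cdf, so a random variable with that pdf exists by the representation result mentioned earlier. All parameter values, including the mixture weight of 0.8, are hypothetical choices.

```r
# Skewed: a Gamma density with shape 2 and rate 2, right-skewed on (0, Inf).
f_skew <- function(x) dgamma(x, shape = 2, rate = 2)

# Truncated: Normal(7, 0.8) restricted to [5, 9] and renormalised.
f_trunc <- function(x) {
  mass <- pnorm(9, 7, 0.8) - pnorm(5, 7, 0.8)  # probability kept after truncation
  ifelse(x >= 5 & x <= 9, dnorm(x, 7, 0.8) / mass, 0)
}

# Mixture: around 6 with weight w, around 10 with weight 1 - w.
w     <- 0.8
f_mix <- function(x) w * dnorm(x, 6, 0.5) + (1 - w) * dnorm(x, 10, 0.5)

total_skew  <- integrate(f_skew, 0, Inf)$value
total_trunc <- integrate(f_trunc, 5, 9)$value
total_mix   <- integrate(f_mix, 2, 14)$value  # range holds almost all the mass

print(c(skew = total_skew, truncated = total_trunc, mixture = total_mix))
```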

Gamma Distribution

• Ga(x|a,b) – shape a and rate b

• Pdf f(x) =

– $c\, x^{a-1} \exp(-bx)$

• E(x) = a/b

• Var(x) = $a/b^{2}$

• Graph for a=2, b=2.
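For the Gamma, the normalising constant is c = b^a / Γ(a). The sketch below checks the formula against dgamma() for a = 2 and b = 2, and recovers the mean and variance by numerical integration; the grid is an arbitrary choice.

```r
a <- 2   # shape
b <- 2   # rate

c_const <- b^a / gamma(a)           # normalising constant c
x       <- seq(0.01, 5, length.out = 500)

by_hand  <- c_const * x^(a - 1) * exp(-b * x)   # formula written out
built_in <- dgamma(x, shape = a, rate = b)      # R's Gamma pdf

# Mean and variance by numerical integration, to compare with a/b and a/b^2.
mean_x <- integrate(function(t) t * dgamma(t, shape = a, rate = b), 0, Inf)$value
var_x  <- integrate(function(t) (t - a / b)^2 * dgamma(t, shape = a, rate = b),
                    0, Inf)$value

print(c(mean = mean_x, variance = var_x))
```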

Use in modeling

• Thus, instead of fixing deterministic aspects of the model, we can allow inputs to be defined by parametric distributions.

• We still need to fix the parameters of the distributions, but this may be much more realistic than fixing values.

• Elicitation is the term given to the assignment of parameters based on ‘expert opinion.’

Method

• Thus, we have the following method at the modeling step:

– Determine a ‘realistic’ model for the situation (conditional on particular values of inputs.)

– Examine which inputs have the biggest impact on the output variable of main interest.

– Model the uncertainty of the inputs through a probability distribution.

– Examine the impact on outputs.

Practicalities

• This can be done by:

– Examining the moments of the combinations of random variables.

– Analytically (gives an exact answer, but messy).

– Simulation from the distributions of interest.
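The moment and simulation routes can be compared on a toy case. The sketch below assumes two independent inputs, a Normal(0.06, 0.006) rate and a Gamma(shape 10, rate 12) quantity, and takes their product as a purely hypothetical output; the seed and number of draws are arbitrary.

```r
set.seed(2024)
n_sims <- 100000

x <- rnorm(n_sims, mean = 0.06, sd = 0.006)  # first uncertain input
y <- rgamma(n_sims, shape = 10, rate = 12)   # second uncertain input
z <- x * y                                   # hypothetical output of interest

# Moment route: for independent inputs, E(XY) = E(X) E(Y).
moment_mean <- 0.06 * (10 / 12)

# Simulation route: summarise the simulated outputs directly.
sim_mean     <- mean(z)
sim_interval <- quantile(z, c(0.025, 0.975))

print(c(moment = moment_mean, simulated = sim_mean))
print(sim_interval)
```

The moment route gives the mean cheaply, while simulation also gives the whole distribution of the output, such as the interval printed above.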

Simulation

• In order to ‘simulate’ values from the distribution of interest we need a system of generating random numbers.

• It suffices to be able to generate numbers from a Uniform[0,1) distribution.

• If this can be done, then any random variable can be simulated.

• Example: Normsinv(Rand())
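Normsinv(Rand()) is a spreadsheet formula: it passes a uniform random number through the inverse of the standard Normal cdf. An R equivalent, as a sketch, is qnorm(runif(n)); the same uniform draws can be pushed through any other quantile function. The Gamma(2, 2) target, seed and sample size below are arbitrary choices.

```r
set.seed(1)
u <- runif(100000)   # uniform random numbers strictly between 0 and 1

z <- qnorm(u)                        # standard Normal draws via the inverse cdf
g <- qgamma(u, shape = 2, rate = 2)  # Gamma(2, 2) draws from the same uniforms

# Sample moments should be close to the theoretical ones:
# Normal: mean 0, sd 1; Gamma(2, 2): mean a/b = 1.
print(c(mean_z = mean(z), sd_z = sd(z), mean_g = mean(g)))
```

In practice rnorm() and rgamma() do this job directly; the inverse-cdf version is shown only to make the principle visible.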

Exercise

• Examine each of the distributions listed earlier in lectures.

• For each one, you should produce a pdf and cdf for various parameters of interest.

• These graphs can readily be constructed in R. E.g., dnorm, pnorm.

Exercise II

• For the Norseman problem, examine the impact of a response rate which is unknown, but a priori believed to be Normal, with mean 6% and standard deviation 0.6%.

• Additionally, you might consider the impact of Gamma distributed orders, with shape 10 and rate 12.