Estimating Parameters for Incomplete Data

William White

Insurance Agent

• Auto Insurance Agency
• Task
– Claims in a week: 294 340 384 457 680 855 974 1193 1340 1884 2558 9743
– Boss: “Is this a good representation of the population?”

Insurance Agent

• Things to think of.
– How should it look?
– The distribution should be skewed right.

[Figure: histogram of the claims 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. x-axis: $ per Claim; y-axis: Frequency of Claims.]

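Insurance Agent

• Exponential Distribution
– If the parameter is 1
– If the parameter is .0001

[Figure: exponential density curves for the two parameter values. The first is drawn over claim amounts up to about 10, the second over claim amounts up to about 80,000.]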

Insurance Agent

• How can we estimate the value of the parameter?
– Find an estimator
• What is an estimator?
– Uses sample data to approximate the actual population parameters

Estimator

• What do we need to look for?
– Consistent
• As the sample size grows, the value of the estimator converges to the population value.

[Figure: estimate plotted against sample size, with the true parameter and the error marked.]
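Consistency can be illustrated with a small simulation. This is only a sketch: the population mean of 1000, the seed, and the sample sizes are hypothetical values chosen for illustration, and the exponential model is assumed.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN = 1000.0  # hypothetical population mean, chosen for illustration


def sample_mean(n):
    """Mean of n simulated exponential claims with mean TRUE_MEAN."""
    # random.expovariate takes the rate, which is 1 / mean
    draws = [random.expovariate(1.0 / TRUE_MEAN) for _ in range(n)]
    return sum(draws) / n


# Absolute error of the estimate for increasingly large samples
errors = {n: abs(sample_mean(n) - TRUE_MEAN) for n in (10, 1000, 100000)}
for n, err in errors.items():
    print(f"n = {n:>6}: |estimate - true mean| = {err:.2f}")
```

The error does not have to shrink at every step, because each run is random, but for large samples it should be small relative to the mean.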

Estimator

• What do we need to look for?
– Efficient
• For a fixed sample size, a more efficient estimator has less variability.
• For example, with normally distributed data, sample means have less variability than sample medians.

[Figure: sampling distributions of the sample median and the sample mean.]
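The comparison between the mean and the median can be checked by simulation. The sketch below assumes standard normal data, a sample size of 25, and 4000 repetitions, all chosen for illustration. For heavy-tailed distributions the ordering can reverse, so this is not a general rule.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def estimator_spread(n=25, reps=4000):
    """Variance of the sample mean and of the sample median over many samples."""
    means, medians = [], []
    for _ in range(reps):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]  # assumed normal data
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)


var_mean, var_median = estimator_spread()
print(f"variance of sample means:   {var_mean:.4f}")
print(f"variance of sample medians: {var_median:.4f}")
```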

Estimator

• What do we need to look for?
– Unbiased
• The expected value of the estimator equals the population parameter: averaged over many repeated samples, the estimates center on the true value.

[Figure: two plots of the estimate against sample size, each with the true parameter marked.]
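Unbiasedness is about the average over repeated samples, not about one large sample. A minimal sketch, assuming exponential claims with a hypothetical mean of 1000 and deliberately small samples of size 5:

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

TRUE_MEAN = 1000.0   # hypothetical population mean
SAMPLE_SIZE = 5      # small on purpose: unbiasedness does not need large samples
REPETITIONS = 20000


def one_estimate():
    """Sample mean of one small simulated sample."""
    draws = [random.expovariate(1.0 / TRUE_MEAN) for _ in range(SAMPLE_SIZE)]
    return sum(draws) / SAMPLE_SIZE


estimates = [one_estimate() for _ in range(REPETITIONS)]
average_estimate = sum(estimates) / REPETITIONS
print(f"average of {REPETITIONS} estimates: {average_estimate:.1f}")
```

Individual estimates vary widely at this sample size, yet their average should sit close to the true mean.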

Maximum Likelihood Estimator

Sir Ronald A. Fisher (1890-1962)
– Maximum Likelihood Estimator (MLE)
– Introduced to solve the problems of estimation
– Written in 1912
– Completed in 1922

Maximum Likelihood Estimator

• Characteristics of the MLE
– Very versatile
– Applies to most types of data
– Simple
• Can be very efficient with little calculation

Maximum Likelihood Estimator

• Uses the likelihood function
– Finds the probability of obtaining the sample results that were actually obtained
– Product of the probability density functions (pdf) of independent random variables
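In symbols, the product described above can be sketched as follows. The notation is an assumption, since the slide's own formulas did not survive: x_1, ..., x_n is the sample, θ the unknown parameter, and f the pdf.

```latex
L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta)
```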

Maximum Likelihood Estimator

[Figure: density curve; y-axis: Probability.]

Maximum Likelihood Estimator

• Likelihood function
– Sample data (claims): 294 340 384 457 680 855 974 1193 1340 1884 2558 9743
– What parameter is most likely for our sample?
– If we knew the parameter, we could evaluate the density of each claim. Note that this is the probability density, not the probability.

Maximum Likelihood Estimator

• Likelihood function
– Probability density function
• Our samples are identically distributed
– Restate: If we had a value for the parameter, what is the likelihood that we would get this sample set?
– Because the events are independent of each other, the likelihood of the whole sample is the product of the individual densities. Each factor is a probability density, not a probability.

Maximum Likelihood Estimator

• Likelihood function
– Each factor is the probability density, not the probability.
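The likelihood of the claims can be evaluated directly for a few candidate parameter values. This sketch assumes the exponential density is written with its mean, here called `theta`. Both the name and the candidate values 1000, 1725, and 5000 are chosen for illustration.

```python
import math

claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]


def exponential_pdf(x, theta):
    """Density of an exponential distribution with mean theta (assumed form)."""
    return math.exp(-x / theta) / theta


def likelihood(theta):
    """Product of the densities of all claims for one candidate theta."""
    return math.prod(exponential_pdf(x, theta) for x in claims)


for theta in (1000, 1725, 5000):
    print(f"theta = {theta}: likelihood = {likelihood(theta):.3e}")
```

The values are tiny because twelve small densities are multiplied together. Only their relative size matters when comparing candidates.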

Maximum Likelihood Estimator

• What makes our product maximized?

[Figure: density curve; y-axis: Probability.]

Maximum Likelihood Estimator

• Loglikelihood Function
– Taking the product can be cumbersome
– Often easier due to properties of logarithms
– Do logarithms change our evaluation?
• No. Because the logarithm is an increasing function, the likelihood and the loglikelihood reach their maximum at the same parameter value.
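A grid search shows that the likelihood and the loglikelihood pick the same parameter value. The sketch assumes an exponential density with mean `theta`, and the grid from 500 to 5000 in steps of 25 is a hypothetical choice.

```python
import math

claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]


def likelihood(theta):
    """Product of exponential densities (mean theta) over all claims."""
    return math.prod(math.exp(-x / theta) / theta for x in claims)


def log_likelihood(theta):
    """Sum of log densities: the logarithm turns the product into a sum."""
    return sum(-math.log(theta) - x / theta for x in claims)


grid = range(500, 5001, 25)  # candidate values for theta
best_by_likelihood = max(grid, key=likelihood)
best_by_log_likelihood = max(grid, key=log_likelihood)
print(best_by_likelihood, best_by_log_likelihood)
```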

Maximum Likelihood Estimator

• Example using the Exponential Distribution

Maximum Likelihood Estimator

• With calculus we can find the MLE by taking the derivative of the loglikelihood, setting it equal to 0, and solving for the parameter. (We can use the second derivative to check that it is a maximum.)

Because this is our estimate for the population parameter, we are also concluding that the sample mean is an estimate for the population mean.
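The derivative step can be written out for the exponential example. This is a sketch under an assumed parametrization, since the slide's own formulas did not survive: θ denotes the mean of the exponential distribution, x_1, ..., x_n the claims, and ℓ the loglikelihood.

```latex
\begin{aligned}
L(\theta) &= \prod_{i=1}^{n} \frac{1}{\theta} e^{-x_i/\theta}
           = \theta^{-n} \exp\!\Big(-\frac{1}{\theta}\sum_{i=1}^{n} x_i\Big) \\
\ell(\theta) &= \ln L(\theta) = -n \ln\theta - \frac{1}{\theta}\sum_{i=1}^{n} x_i \\
\ell'(\theta) &= -\frac{n}{\theta} + \frac{1}{\theta^{2}}\sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x} \\
\ell''(\hat{\theta}) &= \frac{n}{\hat{\theta}^{2}} - \frac{2}{\hat{\theta}^{3}}\sum_{i=1}^{n} x_i
  = -\frac{n}{\hat{\theta}^{2}} < 0
\end{aligned}
```

If the distribution is instead written with a rate λ = 1/θ, the same steps give λ̂ = 1/x̄.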

What Do We Think?

• Let’s use our claims with the Exponential Distribution: sample mean = 1725.2
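The sample mean can be reproduced from the claims. As an additional illustration, the sketch also computes the share of claims that an exponential model with this mean would place below the smallest observed claim. The variable names are hypothetical.

```python
import math

claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

# MLE of the exponential mean is the sample mean
theta_hat = sum(claims) / len(claims)

# Share of claims the fitted model expects below the smallest observed claim:
# for an exponential with mean theta, P(X < x) = 1 - exp(-x / theta)
prob_below_smallest = 1 - math.exp(-min(claims) / theta_hat)

print(f"sample mean: {theta_hat:.1f}")
print(f"P(claim < {min(claims)}) under the fitted model: {prob_below_smallest:.3f}")
```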

What Do We Think?

• Why are there no claims below 294?

[Figure: fitted density over the claims 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. x-axis: $ per Claim; y-axis: Probability of Claim.]

Deductible

• We forgot there is a $250 deductible!
– No one is going to file a claim if the damage is not worth $250.
• Incomplete data: truncated

10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743

Incomplete Data

• The MLE also works with incomplete data.
• Incomplete data occurs when specific observations are either lost or are not recorded exactly.
• Two Types
– Truncated data
• When observations beyond a set value are excluded entirely, so they are never recorded.
– Censored data
• When the number of observations is known, but the exact values of some observations are unknown.

Incomplete Data

• Truncated Data
– Vehicle insurance with a deductible of $250
– Claims are filed only when the loss is greater than $250

10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743
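Truncation from below can be mimicked by filtering the complete list of losses at the deductible. This is a small sketch with hypothetical variable names. In practice the insurer only ever sees the filtered list.

```python
DEDUCTIBLE = 250

all_losses = [10, 12, 16, 17, 22, 25, 27, 33, 35, 39, 45, 47, 53, 57, 65, 71,
              81, 89, 99, 103, 115, 122, 139, 140, 156, 185, 194, 225, 243,
              294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

# Losses at or below the deductible never become claims, so they vanish
filed_claims = [x for x in all_losses if x > DEDUCTIBLE]
unseen_losses = len(all_losses) - len(filed_claims)

print(filed_claims)
print(f"{unseen_losses} losses are never reported")
```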

Incomplete Data

• This is an example of data that is truncated from below, or on the left, since the data below the set value, $250, is truncated.
• Data is truncated from above, or on the right, when values above a set value are excluded.

[Figure: density curve with $250 and $5,000 marked; the truncated region is annotated "= undefined". x-axis: $ per Claim; y-axis: Probability of Claim.]

Incomplete Data

• Censored data
– Policy Limit
• All values above $1,000 are set equal to $1,000.

Before: 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743

After: 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1000 1000 1000 1000 1000
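Censoring from above keeps every observation but caps its recorded value at the policy limit. A minimal sketch with hypothetical variable names:

```python
POLICY_LIMIT = 1000

all_losses = [10, 12, 16, 17, 22, 25, 27, 33, 35, 39, 45, 47, 53, 57, 65, 71,
              81, 89, 99, 103, 115, 122, 139, 140, 156, 185, 194, 225, 243,
              294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

# Every loss is still counted, but values above the limit are recorded as the limit
recorded = [min(x, POLICY_LIMIT) for x in all_losses]
censored_count = sum(1 for x in all_losses if x > POLICY_LIMIT)

print(recorded[-7:])
print(f"{censored_count} of {len(all_losses)} values are censored")
```

Unlike truncation, the number of observations is unchanged. Only the exact size of the largest losses is lost.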

Incomplete Data

• This example would be considered censored from above, or on the right, since the data above the set value, $1,000, is censored.
• Data is censored from below, or on the left, when values below a set value are censored.

[Figure: density curve with $500 and $1,000 marked; values above the limit are annotated "= $1,000". x-axis: $ per Claim; y-axis: Probability of Claim.]

Incomplete Data

• Estimate with deductible and policy limit: 294 340 384 457 680 855 974 1000 1000 1000 1000 1000
• What are we estimating for?
– We want to estimate for our entire sample using truncated and censored data: 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743
– We want our estimate to be unbiased.

Incomplete Data

• Estimating with incomplete data
– Group X- modified values, claim amount
– Group Y- modified values, amount paid

Group X- 294 340 384 457 680 855 974 1000 1000 1000 1000 1000

Group Y- 44 90 134 207 430 605 724 750 750 750 750 750
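Both groups follow from the filed claims once the policy limit and the deductible are applied. The sketch below uses hypothetical variable names and reproduces the two lists above.

```python
DEDUCTIBLE = 250
POLICY_LIMIT = 1000

claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

# Group X: claim amounts, capped at the policy limit
group_x = [min(x, POLICY_LIMIT) for x in claims]

# Group Y: amounts paid, which are the capped claims minus the deductible
group_y = [x - DEDUCTIBLE for x in group_x]

print("Group X:", group_x)
print("Group Y:", group_y)
```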

Incomplete Data

[Figure: probability plot with 1 marked on the vertical axis and y, 250, 250 + y, and 750 marked on the horizontal axis.]

Group Y- 44 90 134 207 430 605 724 750 750 750 750 750

Incomplete Data

• Estimating with incomplete data

[Figure: probability plot with 1 marked on the vertical axis and y, 250, 250 + y, and 750 marked on the horizontal axis.]

Group Y- 44 90 134 207 430 605 724 750 750 750 750 750

Incomplete Data

• Solving with incomplete data

[Figure: probability plot with y and 750 marked on the horizontal axis.]

Group Y- 44 90 134 207 430 605 724 750 750 750 750 750
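The slide's formulas did not survive, so the following is a reconstruction under stated assumptions rather than the original derivation: an exponential model with mean θ, survival function S(x) = e^{-x/θ}, deductible d = 250, policy limit u = 1000, payments y_1, ..., y_n from Group Y, and k = 7 payments below the maximum of 750.

```latex
\begin{aligned}
\text{uncensored payment } y_i:&\quad
  \frac{f(y_i + d)}{S(d)}
  = \frac{\tfrac{1}{\theta} e^{-(y_i + d)/\theta}}{e^{-d/\theta}}
  = \frac{1}{\theta} e^{-y_i/\theta} \\
\text{censored payment } u - d:&\quad
  \frac{S(u)}{S(d)} = e^{-(u - d)/\theta} \\
\ell(\theta) &= -k \ln\theta - \frac{1}{\theta}\sum_{i=1}^{n} y_i \\
\hat{\theta} &= \frac{1}{k}\sum_{i=1}^{n} y_i = \frac{5984}{7} \approx 854.86
\end{aligned}
```

The sum runs over all twelve payments, including the five censored at 750, while only the seven uncensored payments contribute a factor of 1/θ.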

Incomplete Data

Group Y- 44 90 134 207 430 605 724 750 750 750 750 750
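The estimate from Group Y can be computed numerically. This is a sketch under assumptions, not the presenter's spreadsheet: it assumes an exponential model with mean `theta`, treats payments of 750 as censored at the policy limit, and divides each likelihood term by the probability of exceeding the deductible to account for truncation.

```python
import math

DEDUCTIBLE = 250
MAX_PAYMENT = 750  # policy limit of 1000 minus the deductible

payments = [44, 90, 134, 207, 430, 605, 724, 750, 750, 750, 750, 750]


def log_likelihood(theta):
    """Loglikelihood for left-truncated, right-censored exponential data."""
    total = 0.0
    for y in payments:
        if y < MAX_PAYMENT:
            # Exact loss y + d: log density minus log P(loss > d)
            total += -math.log(theta) - (y + DEDUCTIBLE) / theta + DEDUCTIBLE / theta
        else:
            # Censored loss: log P(loss > limit) minus log P(loss > d)
            total += -(MAX_PAYMENT + DEDUCTIBLE) / theta + DEDUCTIBLE / theta
    return total


# Closed form for the exponential case: total paid / number of uncensored payments
uncensored = sum(1 for y in payments if y < MAX_PAYMENT)
theta_hat = sum(payments) / uncensored

print(f"estimated mean: {theta_hat:.2f}")
print(f"loglikelihood at the estimate: {log_likelihood(theta_hat):.3f}")
```

The closed form relies on the exponential distribution being memoryless. For other models the loglikelihood would have to be maximized numerically.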

What’s Our Result?

Boss, “Is this a good representation of the population?”

What do we need to tell the boss? The estimated mean is $854.86. If we compare this to the mean of our complete data set, $565.05, we observe that our estimate is too high. This may mean that a considerable number of accidents fall below the deductible.
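The comparison can be reproduced from the complete data set. The sketch uses hypothetical variable names and takes the estimated mean of 854.86 from the slide.

```python
DEDUCTIBLE = 250
ESTIMATED_MEAN = 854.86  # value reported on the slide

all_losses = [10, 12, 16, 17, 22, 25, 27, 33, 35, 39, 45, 47, 53, 57, 65, 71,
              81, 89, 99, 103, 115, 122, 139, 140, 156, 185, 194, 225, 243,
              294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

complete_mean = sum(all_losses) / len(all_losses)
below_deductible = sum(1 for x in all_losses if x < DEDUCTIBLE)

print(f"complete-data mean: {complete_mean:.2f}")
print(f"estimate minus complete-data mean: {ESTIMATED_MEAN - complete_mean:.2f}")
print(f"losses below the deductible: {below_deductible} of {len(all_losses)}")
```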

What’s Our Result?

•  The results show that it is a good representation of our received claims, but it is not a good representation for our population.

Incomplete Data

• Why should we use the MLE?
– “One of the major attractions of this estimator is that it is almost always available. That is, if you can write an expression for the desired probabilities, you can execute this method. If you cannot write and evaluate an expression for probabilities using your model, there is no point in postulating that model in the first place because you will not be able to use it to solve your problem.” (Klugman, Panjer, and Willmot)

Thanks!

• Dr. Troy Riggs, Project Advisor

• Dr. Matt Lunsford, Seminar Instructor

References

Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. Loss Models: From Data to Decisions. New York: John Wiley and Sons, Inc, 1998.

---. Loss Models: From Data to Decisions. 2nd ed. New York: John Wiley and Sons, Inc, 2004.

Myung, In Jae. "Tutorial on Maximum Likelihood Estimation." Journal of Mathematical Psychology 47 (2003): 93.