Statistics for Market Research A Brief Refresher Course by Brian Neill

Statistics Review

DESCRIPTION

Brief Review of Market Research concepts for Market Research

Page 1: Statistics Review

Statistics for Market Research

A Brief Refresher Course

by Brian Neill

Page 2: Statistics Review

Contents

Quick Review of Basic Concepts in Statistics
– Mode, Median, Mean, Sampling, Normal Curve

Intermediate Concepts
– Confidence Intervals, Probability Sampling

Formulas & Calculations

CAGR Explained

Page 3: Statistics Review

Purpose

To provide a brief refresher on the basics of statistics as they pertain to Market Research. Geared towards those with previous training in statistics.

Page 4: Statistics Review

Statistics Basic Concepts

Page 5: Statistics Review

Definitions & Basic Concepts

Mean - the average of all the data points.

Mode - the most common data point to occur
– (e.g. $4.99 might be the most common price at Wal-Mart)

Median - the middle number in a ranking of data points (e.g. 7 students ranked from tallest to shortest; the 4th one’s height is the median)

Parent Population - the totality of cases being studied (e.g. all the Volvo owners in the world). This is the group of people or things we are trying to find out about.

It’s not practical to measure the entire parent population, so Sampling is used to reflect the Parent Population (e.g. we interview a sample of Volvo owners).
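A minimal sketch of the three averages, using Python's standard `statistics` module. The heights and prices are made-up values chosen only to mirror the examples on this slide.

```python
from statistics import mean, median, mode

# Hypothetical heights (cm) of 7 students; values are invented for illustration.
heights_cm = [150, 155, 158, 160, 163, 170, 171]

# Hypothetical shelf prices; 4.99 occurs most often.
prices = [4.99, 2.49, 4.99, 9.99, 4.99, 2.49]

mean_height = mean(heights_cm)       # sum of the values divided by their count
median_height = median(heights_cm)   # the 4th of 7 values once ranked
most_common_price = mode(prices)     # the value that occurs most often

print(mean_height, median_height, most_common_price)
```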

Page 6: Statistics Review

Sampling used to measure Parent Population

[Figure: a sample drawn from the parent population]

A sample is taken from the parent population. Measurements are taken on the sample. If sampling was done correctly, the measurements are representative of the Parent Population.

Page 7: Statistics Review

Definitions & Basic Concepts

Sampling Distribution - the distribution a statistic would show across repeated samples. If we sample correctly, we can generalize our findings to the “real world” population we are studying.

e.g. We take a sample of Volvo owners. We ask them how many miles they drove in 1999. (The average of their answers is a Statistic - a measurement that we compute from the sample)

If we did the sampling correctly, we can make inferences about ALL Volvo drivers (the Parent Population).

Page 8: Statistics Review

Definitions & Basic Concepts

Probability samples
– Each population element has a known chance of being included in the sample. We can then make inferences about the Parent Population.

Non-probability samples
– Samples chosen using personal judgement. Since selection is non-random, there is no way of estimating the probability that any element is sampled.
– Cannot estimate the adequacy of the sample result. Therefore you cannot be sure your data reflects the entire population.
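As a rough illustration of a probability sample, the sketch below draws a simple random sample with Python's `random` module. The sampling frame of 10,000 owner IDs and the sample size of 200 are assumptions made up for the example.

```python
import random

# Hypothetical sampling frame: 10,000 owner IDs (invented for illustration).
population = list(range(10_000))
sample_size = 200

rng = random.Random(42)  # fixed seed so the draw can be repeated
sample = rng.sample(population, sample_size)  # draws without replacement

# In a simple random sample every element has the same known chance of selection.
inclusion_probability = sample_size / len(population)
print(inclusion_probability)
```

Because the chance of selection is known, results from such a sample can be generalized; a hand-picked list of respondents offers no such probability.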

Page 9: Statistics Review

Definitions & Basic Concepts: the Normal Distribution

Central Limit Theorem: Take a big enough random sample (probability sample) and the distribution of your sample mean will look like a normal curve.
– You can calculate the mean, the variance, and confidence limits.
– You can make inferences about the Parent Population.

To make inferences using statistics, your sample must be “large enough” and it must be random.
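The Central Limit Theorem can be seen in a small simulation. This is only a sketch: the skewed exponential population, the sample sizes of 2 and 50, and the 2,000 repeats are all assumptions chosen for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)

# Hypothetical, strongly skewed parent population (exponential with mean 1).
population = [rng.expovariate(1.0) for _ in range(100_000)]

def sample_means(n, repeats=2_000):
    """Return the means of `repeats` random samples of size n."""
    return [mean(rng.sample(population, n)) for _ in range(repeats)]

small = sample_means(2)    # means of tiny samples still look skewed
large = sample_means(50)   # means of larger samples cluster around the true mean

# The spread of the sample means shrinks as the sample size grows.
print(round(stdev(small), 3), round(stdev(large), 3))
```

Plotting a histogram of `large` would show the familiar bell shape even though the parent population is far from normal.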

Page 10: Statistics Review

Definitions & Basic Concepts

How large a random sample do you need?

Answer: It depends on how much variation you see in the group you are studying.
– If there is a lot of variation, you will need a larger sample size.
– If there is little variation, you need a smaller sample size.
– Larger populations usually have larger variations, though, so population size has an indirect effect at times.

THE SIZE OF THE PARENT POPULATION HAS NO DIRECT EFFECT ON THE SIZE OF SAMPLE NEEDED!

Page 11: Statistics Review

Definitions & Basic Concepts

Example of Little Variation: Average diameter of U.S. dime coins. These coins are made to exacting standards, so even though there are many millions of dimes (i.e. a large parent population), you would need only a small sample to test diameter.

Example of Wide Variation: Average height of the City of Dallas’ 8,200 employees. Much greater variance than the size of U.S. dimes, so a bigger random sample is needed to test.

Example of Wider Variation: Average height of City of Dallas employees and their children. Even greater variance, so an even bigger sample is needed.

Page 12: Statistics Review

Practical Example

Pop Quiz: In a Concept test we talk to 10 Companies. We find that they would spend an average of $1 million on product x this year.

Q: Can we conclude that Companies in the U.S. would, on average, spend $1 million on product x?

Nein! (NO!)

1) The sample is not large enough (10 Companies is not enough, unless there is little variation among Company attitudes, which seems unlikely).

2) Plus, your sample of respondents was not chosen at random.

Page 13: Statistics Review

Intermediate Concepts

Onward

and

Upward

Page 14: Statistics Review

Confidence Intervals

If a sample is large and randomly selected, the distribution of its mean looks like a normal curve. Normal curves have helpful properties...

In any Normal curve, 95 percent of the values are within 1.96 Standard Deviations (σ) of the mean.

• 68.26 percent of the sample means will be within 1.0 Standard Deviation (σ) of the population mean.

Note: S is our estimate of the standard deviation of the parent population; σ is the actual standard deviation.
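These two properties of the normal curve can be checked with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(round(share_within(1.0), 4))   # 0.6827
print(round(share_within(1.96), 4))  # 0.95
```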

Page 15: Statistics Review

Confidence Intervals

[Figure: The Normal Curve, marked at 1, 2 and 3 standard deviations from the mean. 1.96 standard deviations S (σ) = 95% of all observations.]

Page 16: Statistics Review

Example of Confidence Intervals

Let’s say we wanted to find the average number of miles a Volvo is driven in a year (to plan our new warranty).

We randomly sample a large group of Volvo drivers and find that the average number of miles driven is 17,000 miles.

• (But there was considerable variance)

– We choose a range of 1.96S = 3,000 miles. This means we can be 95% sure that the TRUE average number of miles is within plus or minus 3,000 miles of 17,000.

– So our 95% confidence interval is 14,000 to 20,000 miles per year.
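The arithmetic behind this interval can be sketched in a few lines. The function name `confidence_interval` is hypothetical; the standard error is backed out from the slide's statement that 1.96S equals 3,000 miles.

```python
def confidence_interval(sample_mean, standard_error, z=1.96):
    """Return (lower, upper) limits: sample mean plus or minus z standard errors."""
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# 1.96 * S = 3,000 miles implies S is roughly 1,531 miles.
standard_error = 3000 / 1.96

low, high = confidence_interval(17_000, standard_error)
print(round(low), round(high))  # 14000 20000
```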

Page 17: Statistics Review

Confidence Intervals

[Figure: The Normal Curve. 1.96 standard deviations S (σ) = 95% of all observations. Est. Mean = 17,000; 95% chance the true mean falls between 14,000 and 20,000.]

Page 18: Statistics Review

Example of Confidence Intervals

Suppose we decide this is too big of a spread. So we will accept less certainty in order to narrow our range.

So we decide to go with 68.2% confidence (1 std deviation away from the average), where 1S is about 1,500 miles.

Then we can say that there is a 68.2% chance that the true average is between 15,500 and 18,500 miles per year.
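The trade-off between confidence and width can be sketched with `statistics.NormalDist`. The helper `margin_for_confidence` is a made-up name, and the standard error is the one implied by the earlier 95% example (1.96S = 3,000 miles). The unrounded 68.2% margin comes out a little above the 1,500 miles used on this slide.

```python
from statistics import NormalDist

standard_error = 3000 / 1.96  # implied by the 95% example (1.96 * S = 3,000)

def margin_for_confidence(confidence, standard_error):
    """Half-width of the interval for a two-sided confidence level such as 0.95."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * standard_error

# Lower confidence gives a narrower interval around the same estimate.
for confidence in (0.682, 0.90, 0.95):
    print(confidence, round(margin_for_confidence(confidence, standard_error)))
```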

Page 19: Statistics Review

Confidence Intervals

[Figure: The Normal Curve. 1 standard deviation S (σ) = 68.2% of all observations. Est. Mean = 17,000; 68% chance the true mean falls between 15,500 and 18,500.]

Page 20: Statistics Review

Example of Confidence Intervals

Alternative: If the Marketing department at Volvo did not want to sacrifice precision and confidence, they could go out and re-test using a much larger sample of respondents.

Remember, where there is a lot of variance, you need a bigger random sample to make up for it.

Page 21: Statistics Review

Formulas & Calculations

“oooh me brain hurts”

Mr. Gumby—character on “Monty Python’s

Flying Circus”

Page 22: Statistics Review

Finding Confidence Intervals

When the Population Variance is unknown:

First find the sample variance Š², by

Š² = Σ(X - Avg.)² / (n - 1)

Then find the Standard error of estimate of the mean

S = Š / √n

For 95% confidence, use 1.96 standard errors S, so the limits equal: Average - 1.96S and Average + 1.96S
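A step-by-step sketch of these calculations in Python. The eight mileage figures are invented for illustration; a real study would need a much larger sample for the 1.96 multiplier to be appropriate.

```python
import math

# Hypothetical annual mileages from a small sample (made-up values).
miles = [12_000, 15_500, 17_000, 18_200, 21_000, 16_300, 19_000, 14_000]

n = len(miles)
average = sum(miles) / n

# Sample variance: squared deviations from the average, summed, divided by n - 1.
sample_variance = sum((x - average) ** 2 for x in miles) / (n - 1)
sample_std_dev = math.sqrt(sample_variance)

# Standard error of the mean: sample standard deviation divided by sqrt(n).
standard_error = sample_std_dev / math.sqrt(n)

# 95% confidence limits: average plus or minus 1.96 standard errors.
lower = average - 1.96 * standard_error
upper = average + 1.96 * standard_error
print(round(lower), round(average), round(upper))
```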

Page 23: Statistics Review

Sample Size - When finding an average or mean

When the variance is unknown, use

n = (z²/r²) * C²

where
z = confidence level expressed in std deviations (e.g. 1.96 for 95%)
r = precision desired, in the same units as C
C = your estimate of the variation of the sample (one standard deviation away from the mean equals how many miles)

Page 24: Statistics Review

Sample Size - When finding an average or mean

Example: we want to find the average miles driven by Volvo owners to within plus/minus 500 miles, we want 95% confidence in the results, and we think 1 std deviation will equal 3,000 miles.

n = (2²/500²) * 3000²    (rounding z = 1.96 up to 2)
n = 144

We need a sample of 144 to get this level of accuracy in estimating the average miles.
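The same calculation as a small Python function. The name `sample_size_for_mean` is hypothetical; the formula is written as (z·C/r)², which is algebraically the same as (z²/r²)·C², and the result is rounded up to a whole respondent.

```python
import math

def sample_size_for_mean(z, precision, std_dev):
    """n = (z^2 / r^2) * C^2, computed as (z * C / r)^2 and rounded up."""
    n = (z * std_dev / precision) ** 2
    return math.ceil(round(n, 9))  # round first so float noise cannot add a respondent

print(sample_size_for_mean(2, 500, 3000))     # 144, matching the slide
print(sample_size_for_mean(1.96, 500, 3000))  # 139, using the unrounded z
```

Halving the desired precision to plus/minus 250 miles would quadruple the required sample, since precision enters the formula squared.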

Page 25: Statistics Review

Sample Size - When finding a proportion

When the variance is unknown

n = (z²/r²) * Π(1-Π)

where
z = confidence level expressed in std deviations (e.g. 1.96 for 95%)
r = precision desired, expressed as a proportion
Π = this is not “pi”. It’s your estimate of the proportion of the population that has the characteristic you are looking for (e.g. percent of Companies in our sales area who are interested in a product).

That’s right, you must estimate the very quantity you are trying to find, in order to determine sample size!

Page 26: Statistics Review

Sample Size - When finding a proportion

For example, suppose we want to find how many Companies in our Sales area will want to buy product Q. We want 95% confidence and a precision of plus/minus 5 percent. We guess that 20% will be interested.

n = (z²/r²) * Π(1-Π)
n = (2²/0.05²) * (0.2)(1-0.2)
n = 256

We must survey a sample of at least 256 Companies to get this level of accuracy in estimating the proportion that want product Q.
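A sketch of the proportion formula in Python, with the hypothetical function name `sample_size_for_proportion`. The second call is an added illustration, not from the slides: guessing 50% gives the largest possible sample for a given precision, so it is a safe fallback when no estimate is available.

```python
import math

def sample_size_for_proportion(z, precision, proportion):
    """n = (z^2 / r^2) * P * (1 - P), rounded up to a whole respondent."""
    n = (z / precision) ** 2 * proportion * (1 - proportion)
    return math.ceil(round(n, 9))  # round first so float noise cannot add a respondent

print(sample_size_for_proportion(2, 0.05, 0.2))  # 256, matching the slide
print(sample_size_for_proportion(2, 0.05, 0.5))  # 400, the worst-case guess
```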

Page 27: Statistics Review

CAGR Explained Measuring Growth Rate

Page 28: Statistics Review

Measuring Growth Rate

Example

Year    Revenues ($Billion)
1997    1.28
1998    1.57
1999    1.78
2000    2.01
2001    2.22
2002    2.43
2003    2.64
2004    2.84
2005    3.04

Compound Annual Growth Rate (CAGR)

Compound Annual Growth Rate is the most common measure of growth in most industries. CAGR removes the compounding portion to show a more accurate picture of growth (removes “the interest on interest”).

Example: Let’s calculate the average annual growth 1997-2005.

Using two methods:

1) Arithmetic (wrong) way

2) CAGR

Page 29: Statistics Review

A Common mistake when calculating the Growth Rate

Your first instinct might be to just take the total growth and divide by the number of time periods, but that will yield a wrong answer!

Avg. growth rate = ((YRn - YR1) / YR1) / n
                 = ((3.04 - 1.28) / 1.28) / 8 years
                 = 17% avg. growth rate

Where:
• YR1 = Year 1 (first year) revenues
• YRn = Year n revenues (the last year of the forecast period)
• n = number of years used for growth rate (count forward from YR1 to YRn, beginning year to ending year)

You end up with an answer that is too large because you have not removed the compounding portion (“the interest on interest”).
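A quick sketch of why the arithmetic rate is too large: compounding it over the 8 years overshoots the actual 2005 revenue. The variable names are made up for the example; the figures are the 1997 and 2005 revenues from the table.

```python
first, last = 1.28, 3.04   # 1997 and 2005 revenues ($Billion)
n_years = 2005 - 1997      # 8 years of growth

# Arithmetic (wrong) way: total growth divided evenly across the years.
arithmetic_rate = (last - first) / first / n_years
print(f"{arithmetic_rate:.1%}")  # 17.2%

# Growing 1.28 by that rate every year lands well above the real 3.04.
overshoot = first * (1 + arithmetic_rate) ** n_years
print(round(overshoot, 2))
```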

Page 30: Statistics Review

Solution

– The correct way to calculate CAGR in our example:

CAGR = ((YRn/YR1)^(1/n)) - 1

Where:
• YR1 = Year 1 (first year) revenues
• YRn = Year n revenues (the last year of the forecast period)
• n = number of years used for growth rate (count forward from YR1 to YRn, beginning year to ending year)

CAGR = ((3.04/1.28)^(1/8)) - 1 = 1.114 - 1
CAGR = 0.114186 = 11.4186%

The correct answer in our example is that revenues grew 11.4% per year.
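The CAGR formula as a small Python function; the name `cagr` and its parameter names are chosen for this sketch.

```python
def cagr(first_value, last_value, n_years):
    """Compound annual growth rate: (last / first) ** (1 / n) - 1."""
    return (last_value / first_value) ** (1 / n_years) - 1

rate = cagr(1.28, 3.04, 8)
print(f"{rate:.1%}")  # 11.4%
```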

Page 31: Statistics Review

Proof that CAGR is right

Year    Growth          Revenues ($Billion)
1997    start           1.280
1998    add 11.4186%    1.426
1999    add 11.4186%    1.589
2000    add 11.4186%    1.770
2001    add 11.4186%    1.973
2002    add 11.4186%    2.198
2003    add 11.4186%    2.449
2004    add 11.4186%    2.728
2005    add 11.4186%    3.040

We can check our answer by adding 11.4186% growth every year; you’ll arrive at the correct final year revenue of $3.04B, exactly right!
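The check can be reproduced with a short loop. This sketch recomputes the rate from the 1997 and 2005 revenues and compounds it year by year; the list name `projected` is made up for the example.

```python
rate = (3.04 / 1.28) ** (1 / 8) - 1  # CAGR from the 1997 and 2005 revenues

# Start from 1997 revenue and add the same percentage growth each year.
projected = [1.28]
for _ in range(8):
    projected.append(projected[-1] * (1 + rate))

for year, revenue in zip(range(1997, 2006), projected):
    print(year, f"{revenue:.3f}")
```

The intermediate years differ from the actual revenues in the earlier table, which is expected: CAGR describes a smooth path between the first and last year only.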

Page 32: Statistics Review

Plotting These Average Growth Rates

[Figure: plot of the average growth rates, with the 17% rate labeled]

Page 33: Statistics Review

Reference

Churchill, Gilbert A., Marketing Research: Methodological Foundations, 7th Ed., Dryden Press. Chapters 10 and 11.

(Gilbert teaches at U of Wisconsin, Madison)

Page 34: Statistics Review

The

End