Statistics Review

Statistics for Market Research

A Brief Refresher Course

by Brian Neill


Contents

Quick Review of Basic Concepts in Statistics: Mode, Median, Mean, Sampling, Normal Curve

Intermediate Concepts: Confidence Intervals, Probability Sampling

Formulas & Calculations

CAGR Explained


Purpose

To provide a brief refresher on the basics of statistics as they pertain to Market Research. Geared towards those with previous training in statistics.


Statistics Basic Concepts


Definitions & Basic Concepts

Mean - the average of all the data points.

Mode - the most common data point (e.g. $4.99 might be the most common price at Wal-Mart).

Median - the middle number in a ranking of data points (e.g. with 7 students ranked from tallest to shortest, the 4th one’s height is the median).
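As a quick illustration of the three definitions above, here is a minimal Python sketch. The price list is made up for the example (it is not data from this course), and the standard-library `statistics` module does the work.

```python
import statistics

# Hypothetical shelf prices, invented purely for illustration
prices = [4.99, 2.49, 4.99, 9.99, 4.99, 1.99, 7.49]

mean_price = statistics.mean(prices)      # average of all the data points
median_price = statistics.median(prices)  # middle value once the prices are ranked
mode_price = statistics.mode(prices)      # most common data point

print(f"mean={mean_price:.2f}  median={median_price:.2f}  mode={mode_price:.2f}")
```

With 7 prices, the median is the 4th one in the ranking, just like the 4th student in the height example.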

Parent Population - the totality of cases being studied. (e.g. All the Volvo owners in the world). This is the group of people or things we are trying to find out about.

It’s not practical to measure the entire parent population, so Sampling is used to reflect the Parent Population. (e.g. We interview a sample of Volvo owners.)


Sampling used to measure Parent Population

[Figure: a small sample drawn from a larger parent population]

A sample is taken from the parent population. Measurements are taken on the sample. If sampling was done correctly, the measurements are representative of the Parent Population.



Sampling Distribution - If we sample correctly, we can generalize our findings to the “real world” population we are studying.

e.g. We take a sample of Volvo owners and ask them how many miles they drove in 1999. (Their answers are a Statistic: a measurement that we perform on the sample.)

If we did the sampling correctly, we can make inferences about ALL Volvo drivers (the Parent Population).



Probability samples – each population element has a known chance of being included in the sample. We can then make inferences about the Parent Population.

Non-probability samples – samples chosen using personal judgement. Since selection is non-random, there is no way of estimating the probability that any element is sampled.

– You cannot estimate the adequacy of the sample result, so you cannot be sure your data reflects the entire population.
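To make "known chance of being included" concrete, here is a small sketch of a simple random sample, which is one kind of probability sample. The owner list and the helper name `simple_random_sample` are hypothetical, chosen only for this illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements, each with the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame of 1,000 Volvo owners
owners = [f"owner_{i}" for i in range(1000)]
sample = simple_random_sample(owners, 50, seed=1)

# In a simple random sample every element has the same known chance: n / N
inclusion_probability = 50 / len(owners)
print(f"{len(sample)} owners drawn, each with a {inclusion_probability:.0%} chance")
```

A judgement sample (say, the 50 owners who happen to visit one dealership) has no such known probability, which is why its results cannot be generalized with any stated confidence.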


Definitions & Basic Concepts: the Normal Distribution

Central Limit Theorem: Take a big enough random sample (probability sample) and the distribution of its sample mean will look like a normal curve.

– You can calculate the mean, the variance, and confidence limits.
– You can make inferences about the Parent Population.

To make inferences using statistics, your sample must be “large enough” and it must be random.
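The Central Limit Theorem can be checked with a short simulation. This is a sketch under assumed settings, not a proof: the skewed (exponential) population, the sample size of 100, and the 2,000 repetitions are all choices made for this example.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
SAMPLE_SIZE = 100        # assumed "large enough" sample
TRIALS = 2000            # number of repeated samples

def one_sample_mean():
    # Skewed population: exponential with mean 1 and standard deviation 1
    draws = [rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return sum(draws) / SAMPLE_SIZE

sample_means = [one_sample_mean() for _ in range(TRIALS)]

# Standard deviation of the sample mean: sigma / sqrt(n) = 1 / 10
standard_error = 1.0 / SAMPLE_SIZE ** 0.5
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in sample_means)
share_within_95 = within / TRIALS

print(f"average of sample means: {statistics.mean(sample_means):.3f}")
print(f"share within 1.96 standard errors: {share_within_95:.1%}")
```

Even though the population itself is far from bell-shaped, the share of sample means landing within 1.96 standard errors of the true mean should come out close to 95 percent.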



How large a random sample do you need?

Answer: It depends on how much variation you see in the group you are studying.

– If there is a lot of variation, you will need a larger sample size.
– If there is little variation, a smaller sample size will do.
– Larger populations usually have larger variations, though, so population size can have an indirect effect at times.

THE SIZE OF THE PARENT POPULATION HAS NO DIRECT EFFECT ON SIZE OF SAMPLE NEEDED!
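The claim about population size can be tried out in a simulation. The sketch below assumes two hypothetical populations of heights with the same variation (mean 67 inches, standard deviation 4) but very different sizes, and compares how much the average of a 100-person sample bounces around in each.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

def spread_of_sample_means(population, n=100, trials=1000):
    """Standard deviation of the sample mean across repeated random samples."""
    means = []
    for _ in range(trials):
        sample = rng.sample(population, n)
        means.append(sum(sample) / n)
    return statistics.stdev(means)

# Hypothetical heights in inches: same variation, very different population sizes
small_population = [rng.gauss(67, 4) for _ in range(5_000)]
large_population = [rng.gauss(67, 4) for _ in range(500_000)]

spread_small = spread_of_sample_means(small_population)
spread_large = spread_of_sample_means(large_population)

print(f"population of   5,000: spread of sample means = {spread_small:.3f}")
print(f"population of 500,000: spread of sample means = {spread_large:.3f}")
```

Both spreads should land near 4 / sqrt(100) = 0.4, so the same sample size buys about the same precision whether the parent population is 5,000 or 500,000.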



Example of Little Variation: Average diameter of U.S. dime coins. These coins are made to exacting standards, so even though there are many millions of dimes (i.e. a large parent population), you would need only a small sample to test diameter.

Example of Wide Variation: Average height of the City of Dallas’ 8,200 employees. Much greater variance than the size of U.S. dimes, so a bigger random sample is needed to test.

Example of Wider Variation: Average height of City of Dallas employees and their children. Even greater variance, so an even bigger sample is needed.


Practical Example

Pop Quiz: In a Concept test we talk to 10 Companies. We find that they would spend an average of $1 million on product x this year.

Q: Can we conclude that Companies in the U.S. would, on average, spend $1 million on product x?

Nein! (NO!)

1) The sample is not large enough (10 Companies is not enough, unless there is little variation among Company attitudes, which seems unlikely).

2) Plus, your sample of respondents was not chosen at random.


Intermediate Concepts

Onward and Upward


Confidence Intervals

If the sample is large and randomly selected, the distribution of its mean looks like a normal curve. Normal curves have helpful properties...

• In any Normal curve, 95 percent of the values are within 1.96 Standard Deviations (σ) of the mean.

• 68.26 percent of the values are within 1.0 Standard Deviation (σ) of the mean.

Note: S is our estimate of the standard deviation of the parent population; σ is the actual standard deviation.
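The two percentages above can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is just for this example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Share of a normal curve lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

share_196 = share_within(1.96)
share_1 = share_within(1.0)

print(f"within 1.96 std deviations: {share_196:.2%}")
print(f"within 1.00 std deviation:  {share_1:.2%}")

# Going the other way: how many std deviations cover the middle 95 percent?
z_for_95 = std_normal.inv_cdf(0.975)
print(f"z for 95% confidence: {z_for_95:.2f}")
```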

[Figure: The Normal Curve, with bands marking 1, 2, and 3 std deviations from the mean. 1.96 std deviations S (σ) = 95% of all observations.]


Example of Confidence Intervals

Let’s say we wanted to find the average number of miles a Volvo is driven in a year (to plan our new warranty).

We randomly sample a large group of Volvo drivers and find that the average number of miles driven is 17,000 miles.

• (But there was considerable variance.)

– We choose a range of 1.96 S = 3,000 miles. This means we can be 95% sure that the TRUE average number of miles is within plus or minus 3,000 miles of 17,000.

– So our 95% confidence interval is 14,000 to 20,000 miles per year.

[Figure: The Normal Curve. Est. Mean = 17,000, with the band from 14,000 to 20,000 (1.96 std deviations S (σ) = 95% of all observations) marked: 95% chance the true mean falls within here.]



Suppose we decide this is too big a spread, so we will accept less certainty in order to narrow our range.

We decide to go with 68.2% confidence (1 std deviation away from the average) = 1,500 miles.

Then we can say that there is a 68.2% chance that the true average falls between 15,500 and 18,500 miles per year.
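The two intervals in the Volvo example are simple arithmetic around the estimated mean. A minimal sketch, using the figures from the slides (the helper name `interval` is only for illustration):

```python
def interval(mean, half_width):
    """Return the (lower, upper) limits of a symmetric interval around the mean."""
    return (mean - half_width, mean + half_width)

estimated_mean = 17_000   # average miles driven, from the sample
half_width_95 = 3_000     # slide value: 1.96 S = 3,000 miles

# One standard deviation S implied by the 95% range (about 1,531 miles;
# the slides round this to 1,500)
standard_error = half_width_95 / 1.96

ci_95 = interval(estimated_mean, half_width_95)
ci_68 = interval(estimated_mean, 1_500)

print(f"95% interval: {ci_95[0]:,} to {ci_95[1]:,} miles")
print(f"68% interval: {ci_68[0]:,} to {ci_68[1]:,} miles")
print(f"implied S: {standard_error:,.0f} miles")
```

The narrower range costs confidence: the same data supports 14,000 to 20,000 at 95 percent, but only about 68 percent for 15,500 to 18,500.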

[Figure: The Normal Curve. Mean = 17,000, with the band from 15,500 to 18,500 (1 std deviation S (σ) = 68.2% of all observations) marked: 68% chance the true mean falls within here.]



Alternative: If the Marketing department at Volvo did not want to sacrifice precision and confidence, they could go out and re-test using a much larger sample of respondents.

Remember, where there is a lot of variance, you need a bigger random sample to make up for it.


Formulas & Calculations

“oooh me brain hurts”

Mr. Gumby, character on “Monty Python’s Flying Circus”


Finding Confidence Intervals

When the population variance is unknown, first find the sample variance Š² by

Š² = Σ(X - Avg.)² / (n - 1)

Then find the standard error of the estimate of the mean:

S = Š / √n

For 95% confidence, use 1.96 standard errors S, so the limits equal Average - 1.96 S and Average + 1.96 S.
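The steps above can be sketched in a few lines of Python. The mileage figures are hypothetical, and five values is far too small a sample in practice; they are kept short only so the arithmetic is easy to follow.

```python
import math

def confidence_limits(values, z=1.96):
    """Return (lower, upper) limits for the mean when the variance is unknown."""
    n = len(values)
    average = sum(values) / n
    # Sample variance: sum of squared deviations divided by (n - 1)
    sample_variance = sum((x - average) ** 2 for x in values) / (n - 1)
    # Standard error of the mean: sample std deviation divided by sqrt(n)
    standard_error = math.sqrt(sample_variance) / math.sqrt(n)
    return average - z * standard_error, average + z * standard_error

# Hypothetical annual mileages; a real study needs a much larger random sample
miles = [12_000, 15_000, 17_000, 19_000, 22_000]
lower, upper = confidence_limits(miles)

print(f"95% limits: {lower:,.0f} to {upper:,.0f} miles")
```

Passing `z=1.0` instead of the default gives the narrower 68 percent limits.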


Sample Size - When finding an average or mean

When variance is unknown, use

n = (z²/r²) * C²

where
z = confidence level expressed in std deviations (e.g. 1.96 for 95%)
r = precision desired (plus or minus; expressed in the same units as C)
C = your estimate of the variation of the sample (one std deviation away from the mean equals how many miles)


Sample Size - When finding an average or mean

Example: we want to find the average miles driven by Volvo owners to plus/minus 500 miles, we want 95% confidence in the results, and we think 1 std deviation will equal 3,000 miles.

n = (2²/500²) * 3000² (rounding z = 1.96 up to 2)
n = 144

We need a sample of 144 to get this level of accuracy in estimating the average miles.
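The Volvo calculation is easy to wrap in a small function. This is a sketch of the formula n = (z²/r²) * C² as given on the slide; the function name and the rounding guard are choices made for this example.

```python
import math

def sample_size_for_mean(z, r, c):
    """Sample size needed to estimate a mean to plus/minus r.

    z: confidence level in std deviations, r: precision desired,
    c: estimated std deviation (same units as r).
    """
    raw = (z * c / r) ** 2            # same as (z**2 / r**2) * c**2
    # Round first to absorb floating-point noise, then round up to whole people
    return math.ceil(round(raw, 6))

n_slide = sample_size_for_mean(z=2, r=500, c=3000)       # z rounded to 2, as on the slide
n_exact_z = sample_size_for_mean(z=1.96, r=500, c=3000)  # using z = 1.96

print(n_slide, n_exact_z)
```

Using z = 1.96 instead of the rounded 2 trims the answer slightly, from 144 to 139. Halving r to 250 miles would quadruple the sample, since r enters the formula squared.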


Sample Size - When finding a proportion

When variance is unknown, use

n = (z²/r²) * Π(1-Π)

where
z = confidence level expressed in std deviations
r = precision desired
Π = this is not “pi”. It’s your estimate of the proportion of the population that has the characteristic you are looking for (e.g. percent of Companies in our sales area who are interested in a product).

That’s right, you must estimate the very quantity you are trying to find, in order to determine sample size!


Sample Size - When finding a proportion

For example, we want to find how many Companies in our Sales area will want to buy product Q. We want 95% confidence and a precision of plus/minus 5 percent. We guess that 20% will be interested.

n = (z²/r²) * Π(1-Π)
n = (2²/0.05²) * (0.2)(1-0.2)
n = 256

We must survey a sample of at least 256 Companies to get this level of accuracy in estimating the proportion that want product Q.
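The product Q calculation can be reproduced the same way. This is a sketch of n = (z²/r²) * Π(1-Π); the function name and the parameter name `p` (standing in for Π) are assumptions for this example.

```python
import math

def sample_size_for_proportion(z, r, p):
    """Sample size needed to estimate a proportion to plus/minus r.

    z: confidence level in std deviations, r: precision desired (0.05 = 5 points),
    p: your advance guess at the proportion.
    """
    raw = (z / r) ** 2 * p * (1 - p)
    # Round first to absorb floating-point noise, then round up to whole respondents
    return math.ceil(round(raw, 6))

n_slide = sample_size_for_proportion(z=2, r=0.05, p=0.2)       # z rounded to 2, as on the slide
n_exact_z = sample_size_for_proportion(z=1.96, r=0.05, p=0.2)  # using z = 1.96
n_worst_case = sample_size_for_proportion(z=2, r=0.05, p=0.5)  # most cautious guess

print(n_slide, n_exact_z, n_worst_case)
```

If you have no idea what proportion to guess, 0.5 is the cautious choice: p * (1 - p) is largest there, so it gives the biggest sample size the formula can ask for at that precision.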


CAGR Explained: Measuring Growth Rate


Measuring Growth Rate

Example

Year    Revenues ($Billion)
1997    1.28
1998    1.57
1999    1.78
2000    2.01
2001    2.22
2002    2.43
2003    2.64
2004    2.84
2005    3.04

Compound Annual Growth Rate (CAGR)

Compound Annual Growth Rate is the most common measure of growth in most industries. CAGR removes the compounding portion to show a more accurate picture of growth (removes “the interest on interest”).

Example: Let’s calculate the average annual growth 1997-2005.

Using two methods:

1) Arithmetic (wrong) way

2) CAGR


A Common mistake when calculating the Growth Rate

Your first instinct might be to just take the total growth and divide by the number of time periods, but that will yield a wrong answer!

Avg. growth = ((YRn - YR1) / YR1) / n
= ((3.04 - 1.28) / 1.28) / 8 years
= 17% avg. growth rate

Where:
• YR1 = Year 1 (first year) revenues
• YRn = Year n revenues (the last year of the forecast period)
• n = number of years used for growth rate (count forward from YR1 to YRn, beginning year to ending year)

You end up with an answer that is too large because you have not removed the compounding portion (“the interest on interest”).
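A quick way to see why the arithmetic answer is too large is to compound it forward and compare against the actual final-year revenue. This sketch uses the 1997 and 2005 revenues from the example; the variable names are just for illustration.

```python
first_year = 1.28   # 1997 revenues, $Billion
last_year = 3.04    # 2005 revenues, $Billion
years = 8           # count forward from 1997 to 2005

# The arithmetic (wrong) way: total growth divided by number of years
arithmetic_rate = (last_year - first_year) / first_year / years

# Compounding that rate for 8 years overshoots the real 2005 figure
projected = first_year * (1 + arithmetic_rate) ** years

print(f"arithmetic rate: {arithmetic_rate:.1%}")
print(f"projected 2005 revenue at that rate: ${projected:.2f}B (actual: ${last_year}B)")
```

Growing 1.28 at roughly 17 percent a year lands well above the actual 3.04, which is the overshoot the slide warns about.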


Solution

– The correct way to calculate CAGR in our example:

CAGR = (YRn / YR1)^(1/n) - 1

Where:
• YR1 = Year 1 (first year) revenues
• YRn = Year n revenues (the last year of the forecast period)
• n = number of years used for growth rate (count forward from YR1 to YRn, beginning year to ending year)

CAGR = (3.04 / 1.28)^(1/8) - 1 = 1.114186 - 1
CAGR = 0.114186 = 11.4186%

The correct answer in our example is that revenues grew 11.4% per year.
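The CAGR formula translates directly into one line of Python. A minimal sketch, with a function name chosen for this example:

```python
def cagr(first_value, last_value, years):
    """Compound annual growth rate between a first-year and a last-year value."""
    return (last_value / first_value) ** (1 / years) - 1

# 1997 revenues of 1.28 and 2005 revenues of 3.04 ($Billion), 8 years apart
rate = cagr(1.28, 3.04, 8)

print(f"CAGR: {rate:.1%}")
```

Note that `years` is the number of growth periods (8), not the number of data points in the table (9).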


Proof that CAGR is right

Year    Growth          Revenues ($Billion)
1997    start           1.280
1998    add 11.4186%    1.426
1999    add 11.4186%    1.589
2000    add 11.4186%    1.770
2001    add 11.4186%    1.973
2002    add 11.4186%    2.198
2003    add 11.4186%    2.449
2004    add 11.4186%    2.728
2005    add 11.4186%    3.040

We can check our answer by adding 11.4186% growth every year: we arrive at the correct final-year revenue of $3.04B, exactly right!
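The same check can be run as a short loop that adds 11.4186% each year, starting from the 1997 revenue. This is a sketch of the proof above; the variable names are chosen for the example.

```python
rate = 0.114186   # CAGR from the example
revenue = 1.28    # 1997 revenues, $Billion

table = [(1997, round(revenue, 3))]
for year in range(1998, 2006):
    revenue *= 1 + rate               # add one year of compound growth
    table.append((year, round(revenue, 3)))

for year, value in table:
    print(f"{year}  {value:.3f}")
```

The last row comes back to 3.040 for 2005, matching the actual revenue.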


Plotting These Average Growth Rates

[Figure: plot of these average growth rates, with the 17% arithmetic rate labeled]


Reference

Churchill, Gilbert A., Marketing Research: Methodological Foundations, 7th Ed., Dryden Press. Chapters 10 and 11.

(Churchill teaches at the University of Wisconsin, Madison)


The End
