Page 1: Statistics for qc 2

ValWkPHL1012S2 1

Basic Statistics for Quality Control and Validation Studies: Session 2

•  Steven S. Kuwahara, Ph.D.

•  GXP BioTechnology, LLC

•  PMB 506, 1669-2 Hollenbeck Ave.

•  Sunnyvale, CA 94087-5402

•  Tel. & FAX (408) 530-9338

•  E-Mail: [email protected]

•  Website: www.gxpbiotech.org

Page 2: Statistics for qc 2

Sample Number Determination 1.

•  One of the major difficulties with setting the number of samples to take lies in determining the levels of risk that are acceptable. It is in this area that managerial inaction is often found, leaving a QC supervisor or senior analyst to make the decision on the level of risk the company will accept. If this happens, management has failed its responsibility.

Page 3: Statistics for qc 2

Sample Number Determination 2.

•  The problem is that all sampling plans, being statistical in nature, will possess some risk. For instance, if we randomly draw a new sample from a population we could assume or predict that a test result from that sample will fall within ±3σ of the true average 99.7% of the time, but there is still 0.3% (3 parts-per-thousand) of the time when the result will be outside the range for no reason other than random error. Thus a good lot could be rejected. This is known as a false positive or a Type I error.

•  This is the type of error that is most commonly considered, but there is also a Type II error.
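The ±3σ figure quoted above can be checked numerically. The following is a minimal sketch, assuming normally distributed results; the function name `two_sided_tail` is only illustrative.

```python
from statistics import NormalDist

def two_sided_tail(k_sigma: float) -> float:
    """Probability that a normal result falls outside +/- k_sigma standard deviations."""
    z = NormalDist()  # standard normal: mean 0, sigma 1
    return 2.0 * (1.0 - z.cdf(k_sigma))

alpha = two_sided_tail(3.0)
# Roughly 0.0027, i.e. about 3 parts per thousand: the Type I error risk at +/-3 sigma.
print(f"P(outside +/-3 sigma) = {alpha:.4f}")
```

Changing the argument (for example to 2.0) shows how quickly the false-positive risk grows as the limits are tightened.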

Page 4: Statistics for qc 2

Sample Number Determination 3.

•  False positives occur when you declare that there is a difference when one does not really exist (example given in the previous slide). Sometimes called producer’s risk, because the producer will dump a lot that was okay.

•  False negatives occur when you declare that a difference does not exist when, in fact, the difference does exist. Sometimes called customer’s risk, because the customer ends up with a defective product. It is also known as a Type II error.

Page 5: Statistics for qc 2

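SIMPLIFIED FORM OF n CALCULATION: n for an x̄ to compare with a µ

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \qquad \bar{x} - \mu = \frac{t\,s}{\sqrt{n}}, \qquad (\bar{x} - \mu)^2 = \frac{t^2 s^2}{n} = \Delta^2, \qquad n = \frac{t^2 s^2}{\Delta^2}$$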

Page 6: Statistics for qc 2

EXAMPLE OF SIMPLIFIED METHOD WITH ITERATION

•  Δ = 51 − 50 = 1, s = ±2, Z0.025 = 1.96

•  n = (1.96)² (2)² / 1 = 3.8416 × 4 = 15.4 ~ 16

•  t0.025,15 = 2.131, (2.131)² = 4.541161

•  n = 4.541161 × 4 = 18.16 ~ 19

•  t0.025,18 = 2.101, (2.101)² = 4.414201

•  n = 4.414201 × 4 = 17.66 ~ 18

•  t0.025,17 = 2.110, (2.110)² = 4.4521

•  n = 4.4521 × 4 = 17.81 ~ 18
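The iteration above can be sketched in a few lines. This is only an illustration: the critical values are the three t entries quoted on the slide, held in a small lookup table, so the sketch works only for this example (a full t table or a statistics library would be needed in general). The names `T_TABLE`, `n_required` and `iterate_n` are hypothetical.

```python
import math

# Two-sided 95% critical values copied from the slide (t with n - 1 df).
T_TABLE = {15: 2.131, 17: 2.110, 18: 2.101}
Z_START = 1.96  # normal deviate used for the first pass

def n_required(crit, s, delta):
    """n = crit^2 * s^2 / delta^2, rounded up to a whole sample."""
    return math.ceil(crit**2 * s**2 / delta**2)

def iterate_n(s, delta, max_rounds=10):
    """Repeat the calculation with t at n - 1 df until n stops changing."""
    n = n_required(Z_START, s, delta)
    history = [n]
    for _ in range(max_rounds):
        n_next = n_required(T_TABLE[n - 1], s, delta)
        history.append(n_next)
        if n_next == n:
            break
        n = n_next
    return n, history

n_final, steps = iterate_n(s=2.0, delta=1.0)
print(n_final, steps)  # 18 [16, 19, 18, 18]
```

The history list mirrors the slide: 16 from the normal deviate, then 19, 18 and 18 once the t values are substituted.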

Page 7: Statistics for qc 2

Sample Number Determination 6.

•  Because of the need to define risk and consider the level of variation that is present, sampling plans that do not allow for these factors are not valid.

•  Examples of these are: Take 10% of the lot below N = 200 and then 5% thereafter. The more famous one is to take $\sqrt{N} + 1$ samples.

Page 8: Statistics for qc 2

DEVELOPMENT OF A SAMPLING PLAN

•  Consider a situation where a product must contain at least 42 mg/mL of a drug. At 41 mg/mL the product fails. Because we want to allow for the test and product variability, we decide that we want a 95% probability of accepting a lot that is at 42 mg/mL, but we want only a 1% chance of accepting a lot that is at 41 mg/mL.

•  For the sampling plan we need to know the number (n) of test results to take and average.

•  We will accept the lot if the average (x̄) exceeds k mg/mL.

Page 9: Statistics for qc 2

SAMPLING PLAN CALCULATIONS A. You will need the table of the normal distribution for this.

•  Suppose we have a lot that is at 42.0 mg/mL.

•  x̄ would be normally distributed with µ = 42.0

–  And the SEM = s/√n. We want x̄ > k

From a “normal” table (or “x” with ν = ∞) we want a probability of 0.95 that “x” will be greater than the “k” expression.

$$x = \frac{\bar{x} - 42.0}{s/\sqrt{n}} > \frac{k - 42.0}{s/\sqrt{n}}, \qquad x = \text{standard normal deviate}$$

Page 10: Statistics for qc 2

SAMPLING PLAN CALCULATIONS A1. You will need a normal distribution table for this

•  x0.95,∞ = 1.645 (cumulative probability of 0.95)

•  We know that this must be greater than the “k” expression.

•  We also know that k must be less than 42.0 since the smallest acceptable x̄ will be 42.0.

•  Therefore:

$$\frac{k - 42.0}{s/\sqrt{n}} = -1.645 \quad \text{since } k < \bar{x}$$

Page 11: Statistics for qc 2

SAMPLING PLAN CALCULATIONS B.

•  Now suppose that the correct value for the lot is 41.0 mg/mL. So now µ = 41.0 and we want a probability of 0.01 that x̄ > k. Now:

$$x = \frac{\bar{x} - 41.0}{s/\sqrt{n}} > \frac{k - 41.0}{s/\sqrt{n}} = 2.326$$

$$\frac{k - 42.0}{k - 41.0} = \frac{-1.645}{2.326} = -0.707, \qquad k = 41.59$$

Page 12: Statistics for qc 2

SAMPLING PLAN CALCULATIONS C.

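•  Going back to the original equation for a passing result and knowing that s = ± 0.45 (from our assay validation studies?)

$$\frac{k - 42.0}{s/\sqrt{n}} = \frac{41.59 - 42.0}{s/\sqrt{n}} = \frac{-0.41}{s/\sqrt{n}} = -1.64$$

$$-0.41 = [-1.64]\frac{s}{\sqrt{n}} \quad \text{or} \quad \sqrt{n} = \frac{(1.64)(0.45)}{0.41}, \qquad n = \frac{(1.64)^2 (0.45)^2}{(0.41)^2} = \frac{0.544644}{0.1681} = 3.24$$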

Page 13: Statistics for qc 2

SAMPLING PLAN

•  The sampling plan now says: To have a 95% probability of accepting a lot at 42.0 mg/mL or better and a 1% probability of accepting a lot at 41.0 mg/mL or worse, given a standard deviation of ± 0.45 mg/mL for the test method; run four samples and average them. Accept the lot if the mean is 41.59 mg/mL or better.

•  Note that the calculated value of n is close enough to 3 that some would argue for 3 samples.
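The acceptance limit k and the sample number n can also be obtained in one pass from the two risk statements. The sketch below assumes normally distributed averages and uses unrounded normal deviates, so it gives n of about 3.2 where the slides, working with k rounded to 41.59 and a deviate of 1.64, obtain 3.24; both round up to four samples. The function name `acceptance_plan` and its parameters are illustrative.

```python
import math
from statistics import NormalDist

def acceptance_plan(good, bad, s, p_accept_good=0.95, p_accept_bad=0.01):
    """Return (k, n_exact): accept the lot when the mean of n results exceeds k."""
    z = NormalDist()
    z_good = z.inv_cdf(p_accept_good)      # about 1.645 for 0.95
    z_bad = z.inv_cdf(1.0 - p_accept_bad)  # about 2.326 for 0.01
    # Two conditions: (k - good)/(s/sqrt(n)) = -z_good and (k - bad)/(s/sqrt(n)) = +z_bad.
    k = (good * z_bad + bad * z_good) / (z_good + z_bad)
    n_exact = ((z_good + z_bad) * s / (good - bad)) ** 2
    return k, n_exact

k, n_exact = acceptance_plan(good=42.0, bad=41.0, s=0.45)
n = math.ceil(n_exact)
print(f"k = {k:.2f} mg/mL, n = {n_exact:.2f}, run {n} samples")
```

Rounding n up rather than down keeps both stated risks at or below their targets, which is why the plan calls for four samples rather than three.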

Page 14: Statistics for qc 2

SAMPLE SIZES FOR MEANS

•  Suppose we want to determine µ using a test where we know the standard deviation (s) of the population.

•  How many replicates will we need in the sample?

•  The length of a confidence interval = L

$$L = \frac{2ts}{\sqrt{n}}, \qquad L^2 = \frac{4t^2s^2}{n}, \qquad n = \frac{4t^2s^2}{L^2}, \qquad L = 2\Delta$$

Page 15: Statistics for qc 2

Recalculation of Earlier Problem.

L = 2, s = ±2, t0.95,∞ = 1.960 (two sided)

$$n = \frac{4t^2s^2}{L^2} = \frac{4(1.96)^2(2)^2}{(2)^2} = \frac{61.4656}{4} = 15.4 \text{ or } 16$$

Iterate: t0.95,15 = 2.131, n = 18.16; t0.95,18 = 2.101, n = 17.66; t0.95,17 = 2.110, n = 17.81, so n = 18.

Page 16: Statistics for qc 2

Sample size for estimating µ

•  Note the statement: We are determining the % of drug present and we wish to bracket the true amount (µ%) by ± 0.5% and do this with 95% confidence, so L = 2 × 0.5 = 1.0

•  We have 22 previous estimates for which s = 0.45

•  Now at the 95% confidence level (α = 1 – 0.95 = 0.05, two sided), t0.975,21 = 2.080.

$$n = \frac{4(2.080)^2(0.45)^2}{(1.0)^2} = 3.5$$
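The relation n = 4t²s²/L² is easy to wrap in a small helper. A minimal sketch, using only the t, s and L values quoted on the slides; the function name `n_for_interval_length` is hypothetical.

```python
import math

def n_for_interval_length(t, s, length):
    """Replicates needed for a confidence interval of total length L: n = 4 t^2 s^2 / L^2."""
    return 4.0 * t**2 * s**2 / length**2

n_drug = n_for_interval_length(t=2.080, s=0.45, length=1.0)     # % drug example
n_earlier = n_for_interval_length(t=1.960, s=2.0, length=2.0)   # earlier problem, first pass
print(round(n_drug, 1), math.ceil(n_drug))        # 3.5 4
print(round(n_earlier, 1), math.ceil(n_earlier))  # 15.4 16
```

Because L = 2Δ, this is the same calculation as the simplified form n = t²s²/Δ², just expressed in terms of the full interval length.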

Page 17: Statistics for qc 2

POOLED VARIANCE

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Page 18: Statistics for qc 2

Calculating the Confidence Interval, Sp

•  The results of the four determinations are: 42.37%, 42.18%, 42.71%, 42.41%.

•  x̄ = 42.42% and s = 0.22%, (n2 – 1) = 3

•  Using the extra 3 df and s = 0.22% we have:

$$S_p = \sqrt{\frac{21(0.45)^2 + 3(0.22)^2}{21 + 3}} = 0.43$$

Page 19: Statistics for qc 2

Calculating the Confidence Interval, L

•  Sp = s, the new estimate of the standard deviation, so a new confidence interval can be calculated with 24 df. t(0.975, 24) = 2.064.

$$L = \frac{2(2.064)(0.43)}{\sqrt{4}} = 0.88752, \text{ rather than } 1.0$$

$$\text{C.I.} = \pm\frac{L}{2} = \pm 0.44376 \text{ or } \pm 0.45$$

$$95\%\ \text{C.I.} = 42.42 \pm 0.45 \text{ or } 41.97 - 42.87$$

Note that n = 4, not 25, for calculating L.
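The pooling step and the resulting interval can be reproduced from the four results. This is a sketch under the slide's own inputs (prior s = 0.45 with 21 df, tabulated t = 2.064); it carries unrounded intermediate values, so the half-width comes out near 0.44 rather than the slide's 0.44376, which starts from Sp rounded to 0.43.

```python
import math
from statistics import mean, stdev

prior_s, prior_df = 0.45, 21           # from the 22 earlier estimates
results = [42.37, 42.18, 42.71, 42.41]
t_975_24 = 2.064                        # tabulated t(0.975, 24) quoted on the slide

x_bar = mean(results)
s_new = stdev(results)                  # sample standard deviation with n - 1 = 3 df
new_df = len(results) - 1

# Pool the two variance estimates, weighting each by its degrees of freedom.
s_pooled = math.sqrt((prior_df * prior_s**2 + new_df * s_new**2) / (prior_df + new_df))

# The half-width uses n = 4 (the results being averaged), not the 25 pooled observations.
half_width = t_975_24 * s_pooled / math.sqrt(len(results))
print(f"mean={x_bar:.4f} s={s_new:.2f} sp={s_pooled:.2f} half-width={half_width:.2f}")
```

The pooled estimate supplies the degrees of freedom for t, while the divisor under the square root stays at the number of results actually averaged.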

Page 20: Statistics for qc 2

Sample Sizes for Estimating Standard Deviations. I.

•  The problem is to choose n so that s at n – 1 will be within a given ratio of s/σ.

•  Examples are found in reproducibility, repeatability, and intermediate precision measurements.

•  s = standard deviation experimentally determined. σ = population or true standard deviation. s² and σ² are corresponding variances.

•  You will use n to derive s.

Page 21: Statistics for qc 2

Sample Sizes for Estimating Standard Deviations. χ²

•  This is the asymmetric distribution for σ².

•  Now as an example, assume n – 1 = 12. At 12 df, χ² will exceed 21.0261 5% of the time and it will exceed 5.2260 95% of the time. Therefore 90% of the time, χ² will lie between 5.2260 and 21.0261 for 12 df.

•  Check your tables to confirm this.

$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}, \qquad \frac{s^2}{\sigma^2} = \frac{\chi^2_{n-1}}{n-1}, \qquad \sigma^2 = s^2\left(\frac{n-1}{\chi^2_{n-1}}\right)$$

Page 22: Statistics for qc 2

Confidence interval for the standard deviation.

•  Given the data in the previous slide, we know that (s²/σ²) will lie between (5.2260/12) and (21.0261/12), or between 0.4355 and 1.7522.

•  Thus the ratio of s/σ will lie between the square roots of these numbers or between 0.66 and 1.32 or 0.66 < s/σ < 1.32. This gives:

•  s/1.32 < σ < s/0.66. If you know s this gives you a 90% confidence interval for the standard deviation.

•  Now let’s reverse our thinking.
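The 90% interval for σ at 12 df can be checked with a short calculation. A minimal sketch using the two χ² points quoted on the slides; the input s = 0.45 is only an illustrative value, and the function name is hypothetical.

```python
import math

# Chi-square points for 12 df quoted on the slide.
CHI2_LOWER_12 = 5.2260    # exceeded 95% of the time
CHI2_UPPER_12 = 21.0261   # exceeded 5% of the time
DF = 12

def sigma_interval_90(s):
    """90% confidence interval for sigma, given s estimated with 12 df."""
    ratio_low = math.sqrt(CHI2_LOWER_12 / DF)    # about 0.66
    ratio_high = math.sqrt(CHI2_UPPER_12 / DF)   # about 1.32
    return s / ratio_high, s / ratio_low

low, high = sigma_interval_90(0.45)   # illustrative s, not a result from the slides
print(f"{low:.2f} < sigma < {high:.2f}")
```

The interval is not symmetric about s, which reflects the skew of the χ² distribution at low degrees of freedom.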

Page 23: Statistics for qc 2

Sample Sizes for Estimating Standard Deviations. Continued. I.

•  Instead of the confidence interval, suppose we say that we want to determine s to be within ± 20% of σ with 90% confidence. So:

•  1 – 0.2 < s/σ < 1 + 0.2 or 0.8 < s/σ < 1.2

•  This is the same as: 0.64 < (s/σ)² < 1.44

•  Since we want 90% confidence we use levels of significance at 0.05 and 0.95.

•  Now go to the χ² table under the 0.95 column and look for a combination where χ²/df is not < 0.64, but df is as large as possible.

Page 24: Statistics for qc 2

Sample Sizes for Estimating Standard Deviations. Continued. II.

•  Trial and error shows this number to be about 50.

•  Next we go to the column under 0.05 and look for a ratio that does not exceed 1.44, but df is as small as possible.

•  Trial and error will show this number to be between 30 and 40.

•  You must take the larger of the two numbers and since df = n – 1, n = 51 replicates.

Page 25: Statistics for qc 2

Do Not Panic. Consider This!

•  Instead of the confidence interval, suppose we say that we want to determine s to be within ± 50% of σ with 95% confidence. So:

•  1 – 0.5 < s/σ < 1 + 0.5 or 0.5 < s/σ < 1.5

•  This is the same as: 0.25 < (s/σ)² < 2.25

•  Since we want 95% confidence we use levels of significance at 0.025 and 0.975.

•  Now go to the χ² table under the 0.975 column and look for a combination where χ²/df is not < 0.25, but df is as large as possible.

Page 26: Statistics for qc 2

Greater Confidence, But Lesser Certainty

•  Trial and error shows this number to be 8.

•  Next we go to the column under 0.025 and look for a ratio that does not exceed 2.25, but df is as small as possible.

•  Trial and error will show this number to be 8. The same as the other df.

•  You must take the larger of the two numbers, but in this case both give df = 8, so n = 9.

•  You have a higher confidence level, but a wider interval, for a smaller n.
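The trial-and-error table search for the ±50%, 95% case can be sketched as a scan over df. The χ² entries below are standard published table values for 5 to 10 df, not figures from the slides, and the sketch takes the smallest df at which each ratio condition is met, which reproduces the slide's df = 8. The names `CHI2_0975`, `CHI2_0025` and `smallest_df` are hypothetical.

```python
# Chi-square table entries (assumed: copied from a standard published table).
# "0.975" column: value exceeded 97.5% of the time; "0.025" column: exceeded 2.5% of the time.
CHI2_0975 = {5: 0.831, 6: 1.237, 7: 1.690, 8: 2.180, 9: 2.700, 10: 3.247}
CHI2_0025 = {5: 12.833, 6: 14.449, 7: 16.013, 8: 17.535, 9: 19.023, 10: 20.483}

def smallest_df(lower_ratio, upper_ratio):
    """Smallest df meeting each bound on chi2/df, returned as (lower-side df, upper-side df)."""
    low_df = min(df for df, chi2 in CHI2_0975.items() if chi2 / df >= lower_ratio)
    high_df = min(df for df, chi2 in CHI2_0025.items() if chi2 / df <= upper_ratio)
    return low_df, high_df

low_df, high_df = smallest_df(lower_ratio=0.25, upper_ratio=2.25)
n = max(low_df, high_df) + 1   # df = n - 1
print(low_df, high_df, n)      # 8 8 9
```

Tightening the target ratios (for example to 0.64 and 1.44) needs table entries at much higher df than this short excerpt carries.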

Page 27: Statistics for qc 2

n for Comparing Two Averages

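$$t_{\alpha,df} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}, \qquad \Delta = \bar{x}_1 - \bar{x}_2$$

For n_1 = n_2 = n:

$$t_{\alpha,df} = \frac{\Delta}{\sqrt{\dfrac{\sigma_1^2 + \sigma_2^2}{n}}}, \qquad t_{\alpha,df}^2 = \frac{n\,\Delta^2}{\sigma_1^2 + \sigma_2^2}, \qquad n = \frac{t_{\alpha,df}^2\left(\sigma_1^2 + \sigma_2^2\right)}{\Delta^2}$$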
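As a rough illustration of the two-average formula, the sketch below evaluates n for made-up inputs. The values (σ = 2 for both groups, Δ = 1, and the normal deviate 1.96 as a first-pass t) are hypothetical and chosen only to echo the earlier one-sample example; in practice the result would be iterated with t at the appropriate df.

```python
import math

def n_per_group(t, sigma1, sigma2, delta):
    """Replicates per group: n = t^2 (sigma1^2 + sigma2^2) / delta^2."""
    return t**2 * (sigma1**2 + sigma2**2) / delta**2

# Hypothetical inputs, not taken from the slides.
n0 = n_per_group(t=1.96, sigma1=2.0, sigma2=2.0, delta=1.0)
print(math.ceil(n0))  # 31
```

With equal variances the requirement is twice that of the one-sample case, since both averages contribute their own uncertainty to the difference.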

Page 28: Statistics for qc 2

Introduction to the Analysis of Variance (ANOVA) I.

This method was aimed at deciding whether or not differences among averages were due to experimental or natural variations or true differences among averages. R.A. Fisher developed a method based on comparing the variances of the treatment means and the variances of the individual measurements that generated the means. The technique has been extended into the field known as DOE or factorial experiments.

Page 29: Statistics for qc 2

Introduction to the Analysis of Variance (ANOVA) II.

•  The method is based on the use of the F-test and the F-distribution (named after Fisher).

–  The F-distribution, and all distributions related to errors, is a skewed, unsymmetrical distribution.

–  s²y represents the variance among the treatments and s²pooled is the variance of the individual results (system noise).

$$F = \frac{n\,s_y^2}{s_{pooled}^2}$$

Page 30: Statistics for qc 2

Introduction to the Analysis of Variance (ANOVA) III.

•  F increases as the number of replicates increases.

–  In simple ANOVA systems n is the same for all treatments.

–  By increasing n you amplify small differences between the variances of the treatment means and the system noise.

–  An F value of 1.0 or less says that the system noise is greater than the variance of the means. This suggests that the differences among the means are due to experimental or environmental variations.

Page 31: Statistics for qc 2

Introduction to the Analysis of Variance (ANOVA) IV.

•  Because of the importance of system noise, before doing an ANOVA or factorial experiment, you should reduce variation in the system to a minimum.

–  You should remove all special cause variation and minimize common cause variation.

–  Methods such as Statistical Process Control (SPC) should be used to reduce variations.

•  Note: A system where special cause variation has been eliminated and only common cause variation is left is known as a system under statistical control.

Page 32: Statistics for qc 2

Introduction to the Analysis of Variance (ANOVA) V.

•  The F-distribution depends on the number of degrees of freedom of the numerator and denominator and the level of type 1 error that you will accept.

–  For each level of type 1 error there are different distribution tables. The exact value of F then depends on the number of degrees of freedom of the numerator and denominator.

•  If the calculated F exceeds the tabular F, it is then significant at the 1-α level, where α is the level of type 1 error that you are willing to accept.

•  α is the threshold against which the p value is compared. Most statistical software programs will calculate the p value. Normally, you want α = 0.05 or 0.01.

•  Type-1 error is where you falsely conclude that there is a difference. AKA: False positive, producer’s risk.

Page 33: Statistics for qc 2

Fairness of 4 sets of dice. (Taken from Anderson, MJ and Whitcomb, PJ, DOE Simplified, CRC Press, Boca Raton, FL, 2007.)

•  Frequency distribution for 56 rolls of dice.

•  Grand average = Total of all dots/56 dice (4 × 14)

Dots        White        Blue       Green      Purple
6           6+6          6+6        6+6        6
5           5            5          5          5
4           4            4+4        4+4        4
3           3+3+3+3+3    3+3+3+3    3+3+3+3    3+3+3+3+3
2           2+2+2        2+2+2+2    2+2+2+2    2+2+2+2+2
1           1+1          1          1          1
Mean (y)    3.14         3.29       3.29       2.93
Var. (s²)   2.59         2.37       2.37       1.76

n = 14, Grand Ave. = 3.1625

Page 34: Statistics for qc 2

Fairness of 4 sets of dice. Calculation of F. Note differences in denominator.

$$s_y^2 = \frac{(3.14 - 3.1625)^2 + (3.29 - 3.1625)^2 + (3.29 - 3.1625)^2 + (2.93 - 3.1625)^2}{4 - 1} = 0.029$$

$$s_{pooled}^2 = \frac{2.59 + 2.37 + 2.37 + 1.76}{4} = 2.28$$

$$F = \frac{n \cdot s_y^2}{s_{pooled}^2} = \frac{14 \times 0.029}{2.28} = 0.18$$

Since F is much less than 1.0 we can assume that there is no significant difference among the colors even without looking at an F table.
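The F calculation for the fair dice can be rebuilt from the frequency table. This is a sketch, assuming the counts below are read correctly from the table (14 dice per color); the helper names `expand` and `f_ratio` are hypothetical.

```python
from statistics import mean, variance

# Frequency table from the slide: face value -> number of dice, for each color.
COUNTS = {
    "white":  {6: 2, 5: 1, 4: 1, 3: 5, 2: 3, 1: 2},
    "blue":   {6: 2, 5: 1, 4: 2, 3: 4, 2: 4, 1: 1},
    "green":  {6: 2, 5: 1, 4: 2, 3: 4, 2: 4, 1: 1},
    "purple": {6: 1, 5: 1, 4: 1, 3: 5, 2: 5, 1: 1},
}

def expand(counts):
    """Turn a face -> count mapping into a flat list of individual rolls."""
    return [face for face, k in counts.items() for _ in range(k)]

def f_ratio(groups):
    """F = n * (variance of the group means) / (mean of the group variances)."""
    n = len(groups[0])
    s2_y = variance([mean(g) for g in groups])        # divides by (groups - 1)
    s2_pooled = mean([variance(g) for g in groups])   # divides by the number of groups
    return n * s2_y / s2_pooled

rolls = [expand(c) for c in COUNTS.values()]
F = f_ratio(rolls)
print(f"F = {F:.2f}")  # 0.18
```

The two different divisors in `f_ratio` correspond to the "differences in denominator" noted in the slide title: 4 − 1 for the variance of the means, 4 for the average of the variances.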

Page 35: Statistics for qc 2

Fairness of 4 sets of dice. How about a loaded set?

Dots        White     Blue      Green     Purple
6           1         3         6         1
5           1         2         5         2
4           1         3         1         3
3           2         4         1         1
2           5         1         0         2
1           4         1         1         5
Mean (y)    2.50      3.93      4.93      2.86
Var. (s²)   2.42      2.38      2.07      3.21
δ           -1.055    0.375     1.375     -0.695
δ²          1.1130    0.1406    1.8906    0.4830

n = 14, Grand Ave. = 3.555, Σδ² = 3.6245, Σδ²/3 = s²y = 1.2082

Page 36: Statistics for qc 2

Fairness of 4 sets of dice. How about a loaded set? ANOVA

$$s_{pooled}^2 = \frac{2.42 + 2.38 + 2.07 + 3.21}{4} = 2.52, \qquad df = 4(14 - 1) = 52$$

$$s_y^2 = 1.21, \qquad df = 3\ (4 - 1)$$

$$F_{3,52} = \frac{14 \times 1.21}{2.52} = 6.71$$

Tabular F (range is for F3,40 to F3,60): 2.839 – 2.758 at 5% (p = 0.05), 4.313 – 4.126 at 1%, and 6.595 – 6.171 at 0.1%. F3,52 = 6.71 is significant at p = 0.001.
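The loaded-dice F ratio and its comparison with the tabulated values can be sketched from the summary statistics alone. The sketch uses the rounded means and variances from the table, so it gives F of about 6.72 rather than 6.71; comparing against the F3,40 entry at each level is a deliberately conservative choice, since the exact F3,52 critical value lies between the two bracketing entries.

```python
from statistics import mean, variance

MEANS = [2.50, 3.93, 4.93, 2.86]        # white, blue, green, purple (slide values)
VARIANCES = [2.42, 2.38, 2.07, 3.21]
N = 14

# Tabulated critical F values quoted on the slide: (F for 3,40 df, F for 3,60 df).
F_CRIT = {0.05: (2.839, 2.758), 0.01: (4.313, 4.126), 0.001: (6.595, 6.171)}

s2_y = variance(MEANS)         # variance among the treatment means, 3 df
s2_pooled = mean(VARIANCES)    # system noise, 52 df
F = N * s2_y / s2_pooled

# Conservative check: require F to exceed the larger (3,40 df) entry at each level.
significant = {alpha: F > bounds[0] for alpha, bounds in F_CRIT.items()}
print(f"F = {F:.2f}", significant)
```

Here F clears even the 0.1% entry, in contrast to the fair dice, where F fell well below 1.0.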

Page 37: Statistics for qc 2

Least Significant Difference Lucy in the Sky with Diamonds (LSD)

•  DO NOT EVER USE THIS METHOD WITHOUT THE PROTECTION OF A SIGNIFICANT ANOVA RESULT ! ! !

•  There are 45 combinations of 10 results taken in pairs. If you focus mainly on the high and low results, you are almost guaranteed to encounter a type-1 error.

–  This is why you need to use the ANOVA coupled with an LSD determination.

•  The LSD is based on the equations for confidence intervals.

$$LSD = \pm\, t_{1-\alpha/2,\ df_{pooled}} \times s_{pooled}\sqrt{\frac{2}{n}}, \qquad s_{pooled} = \sqrt{\frac{\sum_{i=1}^{k} s_i^2}{k}}$$

Here n is the number of replicates per treatment and k is the number of treatments.

Page 38: Statistics for qc 2

LSD for the Current Problem

•  The (1-α) level of the t determines the level of significance for the LSD.

•  n = 14 for replicates, but s²pooled had 4 × (14 − 1) = 52 df.

$$s_{pooled} = \sqrt{\frac{2.42 + 2.38 + 2.07 + 3.21}{4}} = \sqrt{2.52} = 1.59$$

$$LSD = \pm 2.01 \times 1.59 \times \sqrt{\frac{2}{14}} = \pm 1.21$$

At 99%, LSD = ±1.333 for t0.99, df=52 ≅ 2.68.
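The 95% LSD and the pairwise comparisons it supports can be sketched as follows. The t value 2.01 is the tabulated figure quoted on the slide; the sketch covers only the 95% level, and the variable names are illustrative.

```python
import math
from itertools import combinations

MEANS = {"white": 2.50, "blue": 3.93, "green": 4.93, "purple": 2.86}
VARIANCES = [2.42, 2.38, 2.07, 3.21]
N = 14        # replicates per color
T_95 = 2.01   # two-sided 95% t for 52 df, as quoted on the slide

s_pooled = math.sqrt(sum(VARIANCES) / len(VARIANCES))
lsd_95 = T_95 * s_pooled * math.sqrt(2 / N)

# A pair of colors differs when the gap between their means exceeds the LSD.
different = sorted(
    (a, b) for a, b in combinations(MEANS, 2)
    if abs(MEANS[a] - MEANS[b]) > lsd_95
)
print(f"LSD(95%) = {lsd_95:.2f}")
print(different)
```

At this level white differs from blue and green, and green differs from purple, while blue sits within one LSD of both green and purple.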

Page 39: Statistics for qc 2

So where are the bad dice?

•  Given the LSD = ±1.333, the result can be displayed in different ways.

•  Plot the result as the mean of the average count of the treatments (colors) ± ½ LSD.

–  Then look for overlaps. A significant difference will not have an overlap.

•  Or take the difference between means and compare them to the LSD.

–  In the present case, the white and purple dice are similar, but the green dice are definitely higher, with the blue dice different from the white, but not from the green and only marginally different from the purple.

Page 40: Statistics for qc 2

For 95% confidence, the LSD is ± 1.21 and for 99%, the LSD is ± 1.33. So blue and green are different from white, and green is different from purple and white at the 99% level. White and purple are the same, as are blue and green. Purple is also similar to blue, but not to green. All of this holds at the 99% level, thus at p = 0.01 we conclude that blue and green dice run to higher numbers than white and purple.

Differences between means:

                 White = 2.50   Blue = 3.93   Green = 4.93   Purple = 2.86
White = 2.50                    1.43          2.43           0.36
Blue = 3.93      1.43                         1.00           1.07
Green = 4.93     2.43           1.00                         2.07
Purple = 2.86    0.36           1.07          2.07