Monitoring High-yield processes
MONITORING HIGH-YIELD PROCESSES
Cesar Acosta-Mejia
June 2011
EDUCATION
– B.S. Catholic University of Peru
– M.A. Monterrey Tech, Mexico
– Ph.D. Texas A&M University
RESEARCH
– Quality Engineering - SPC, Process monitoring
– Applied Probability and Statistics – Sequential analysis
– Probability modeling – Change point detection, process surveillance
MOTIVATION
– High-yield processes
– Monitor the fraction of nonconforming units p
– Very small p (ppm)
– To detect increases or decreases in p
– A very sensitive procedure
ASSUMPTIONS
• Process is observed continuously
• Process can be characterized by Bernoulli trials
• Fraction of nonconforming units p is constant, but
may change at an unknown point of time
Hypothesis Testing
For two-sided tests at level α,
the rejection region R is made up of two subregions R1 and R2
with limits L and U such that
P[X ≤ L] = α/2
P[X ≥ U] = α/2
Hypothesis Testing
Consider testing the proportion p
Hypothesis Testing
The test may be based on different random variables
• Binomial (n, p)
• Geometric (p)
• Negative Binomial (r, p)
• Binomial – order k (n, p)
• Geometric – order k (p)
• Negative Binomial – order k (r, p)
Binomial tests
when p is very small
Test 1
• proportion p0 = 0.025 (25000 ppm)
• test H0 : p = 0.025 against H1 : p ≠ 0.025
• X = number of nonconforming units in 500 items
• α = 0.0027
Test 1
Let X ~ Binomial(500, p)
To test the hypothesis
H0 : p = 0.025 against H1 : p ≠ 0.025
the rejection region is
R = {X ≤ 2} ∪ {X ≥ 25}
since
P[X ≤ 2] = 0.000300 < 0.00135 = α/2
P[X ≥ 25] = 0.001018 < 0.00135 = α/2
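The limits of Test 1 can be checked numerically. The following is a minimal sketch, assuming the limits are chosen as the largest L and the smallest U whose tail probabilities do not exceed α/2; the function names are illustrative and not part of the slides.

```python
from math import exp, lgamma, log, log1p

def binom_pmf(k, n, p):
    """P[X = k] for X ~ Binomial(n, p), computed on the log scale."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))

def two_sided_limits(n, p0, alpha):
    """Return (L, U, P[X <= L], P[X >= U]); a limit is None if it does not exist."""
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    # Largest L with P[X <= L] <= alpha/2.
    lower, cum, tail_low = None, 0.0, 0.0
    for k in range(n + 1):
        cum += pmf[k]
        if cum > alpha / 2:
            break
        lower, tail_low = k, cum
    # Smallest U with P[X >= U] <= alpha/2.
    upper, cum, tail_high = None, 0.0, 0.0
    for k in range(n, -1, -1):
        cum += pmf[k]
        if cum > alpha / 2:
            break
        upper, tail_high = k, cum
    return lower, upper, tail_low, tail_high

L, U, p_low, p_high = two_sided_limits(500, 0.025, 0.0027)
print(L, U, p_low, p_high)
```

Both tail probabilities stay below α/2 = 0.00135 because X is discrete, so the attained level of the test is somewhat smaller than α.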
Test 1
Plot of P[rejecting H0] vs. p
[Figure: probability of rejecting H0 against p in parts per million (5000 to 45000 ppm), with the level 0.0027 marked]
Hypothesis Testing
Now consider testing
p0 = 0.0001 (100 ppm)
Test 1
Let X ~ Binomial(n = 500, p)
To test the hypothesis
H0 : p = 0.0001 against H1 : p ≠ 0.0001
the rejection region is
R = {X ≥ 2}
since
P[X ≥ 2] = 0.0012 < 0.00135 = α/2
For n = 500 there is no two-sided test for p0 = 0.0001: P[X = 0] already exceeds α/2, so no lower rejection limit exists.
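A short numerical check of this case, as a sketch only (the helper name is hypothetical): with p0 = 0.0001 and n = 500 almost all of the probability sits at X = 0, so only an upper limit can be placed.

```python
def binomial_small_tails(n, p0):
    """Return P[X = 0], P[X >= 1] and P[X >= 2] for X ~ Binomial(n, p0)."""
    q0 = 1.0 - p0
    p_zero = q0 ** n                   # P[X = 0]
    p_one = n * p0 * q0 ** (n - 1)     # P[X = 1]
    return p_zero, 1.0 - p_zero, 1.0 - p_zero - p_one

half_alpha = 0.0027 / 2
p_zero, p_ge_1, p_ge_2 = binomial_small_tails(500, 0.0001)
# P[X = 0] > alpha/2: even the smallest outcome cannot be a lower rejection region.
# P[X >= 1] > alpha/2 but P[X >= 2] <= alpha/2: the upper limit is U = 2.
print(p_zero > half_alpha, p_ge_1 > half_alpha, p_ge_2 <= half_alpha)
```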
Test 1
[Figure panels: Binomial (n = 500, p = 0.025) and Binomial (n = 500, p = 0.0001)]
Test 1
For this test a plot of P[rejecting H0] vs. p is
[Figure: P[rejecting H0] against p in parts per million (20 to 260 ppm), with the level 0.0027 marked]
Consider a geometric test for p
when p is very small
Test 2
Let X ~ Geo(p)
To test the hypothesis (α = 0.0027)
H0 : p = 0.0001 against H1 : p ≠ 0.0001
the rejection region is
R = {X ≤ 13} ∪ {X ≥ 66075}
since
P[X ≤ 13] = 0.0013
P[X ≥ 66075] = 0.00135
An observation in {X ≤ 13} leads to the conclusion that p > 0.0001
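The limits of the geometric test follow from the closed-form distribution function. The sketch below assumes X takes the values 1, 2, … with P[X ≤ x] = 1 - (1 - p)^x, and picks the largest lower limit and smallest upper limit whose tail probabilities do not exceed α/2; the function names are illustrative.

```python
from math import ceil, exp, expm1, floor, log, log1p

def geometric_cdf(x, p):
    """P[X <= x] = 1 - (1 - p)^x for X ~ Geo(p) on 1, 2, ..."""
    return -expm1(x * log1p(-p))

def geometric_limits(p0, alpha):
    """Return (L, U) with P[X <= L] <= alpha/2 and P[X >= U] <= alpha/2."""
    log_q = log1p(-p0)
    lower = floor(log1p(-alpha / 2) / log_q)   # largest x with 1 - q^x <= alpha/2
    upper = ceil(log(alpha / 2) / log_q) + 1   # smallest x with q^(x-1) <= alpha/2
    return lower, upper

p0, alpha = 0.0001, 0.0027
lower, upper = geometric_limits(p0, alpha)
lower_tail = geometric_cdf(lower, p0)              # P[X <= L]
upper_tail = exp((upper - 1) * log1p(-p0))         # P[X >= U] = q^(U-1)
print(lower, upper, lower_tail, upper_tail)
```

A lower limit of 0 returned by this sketch would mean that no lower rejection region exists at the chosen α.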
Test 2
For this test a plot of P[rejecting H0] vs. p is
[Figure: P[rejecting H0] against p (axis from 50 to 300), with the level 0.0027 marked]
Another performance measure
of a sequential testing procedure
Hypothesis Testing
Let X1, X2, … ~ Geo(p) iid
Let T = number of observations until H0 is rejected
Consider the random variables, for j = 1, 2, …
A_j = 1 if X_j ∈ R
A_j = 0 otherwise
with P[A_j = 1] = PR
then the probability function of T is
P[T = t] = P[A_1 = 0] P[A_2 = 0] … P[A_{t-1} = 0] P[A_t = 1]
= PR [1 - PR]^{t-1}
Hypothesis Testing
therefore
T ~ Geo(PR)
Let us consider E[T] = 1/PR, the mean number of tests until H0 is rejected, as a performance measure.
When p = p0, PR = α and
E[T] = 1/α
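The identity E[T] = 1/PR can be checked by summing the probability function of T directly. This is only an illustrative sketch; the truncation point of the series is an arbitrary choice made here.

```python
def run_length_pmf(t, pr):
    """P[T = t] = PR * (1 - PR)^(t - 1)."""
    return pr * (1.0 - pr) ** (t - 1)

def mean_run_length(pr, terms=20_000):
    """Truncated series for E[T]; the omitted tail is negligible for pr = 0.0027."""
    return sum(t * run_length_pmf(t, pr) for t in range(1, terms + 1))

alpha = 0.0027
series_mean = mean_run_length(alpha)
print(series_mean, 1.0 / alpha)
```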
Test 2
Let X ~ Geo(p), q = 1 - p
P[X ≤ x] = 1 - q^x
Let the rejection region be R = {X < L} ∪ {X > U}
then
PA = P[not rejecting H0]
= P[L ≤ X ≤ U]
= (1 - q^U) - (1 - q^{L-1})
= q^{L-1} - q^U
PR = 1 - PA = 1 - (1 - p)^{L-1} + (1 - p)^U
Test 2
Let X ~ Geo(p)
To test the hypothesis (α = 0.0027)
H0 : p = 0.0001 against H1 : p ≠ 0.0001
the rejection region is
R = {X < 14} ∪ {X > 66074}
then P[rejecting H0] is
PR = 1 - (1 - p)^13 + (1 - p)^66074
E[T] = 1/PR
when p = p0, E[T] ≈ 1/α = 370.4
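The expression for PR can be evaluated over a range of p to see how E[T] behaves. A minimal sketch, with illustrative function names and the limits 14 and 66074 taken from the rejection region of Test 2:

```python
def rejection_probability(p, lower=14, upper=66074):
    """PR = 1 - (1 - p)^(L-1) + (1 - p)^U for R = {X < L} or {X > U}."""
    q = 1.0 - p
    return 1.0 - q ** (lower - 1) + q ** upper

def average_run_length(p, lower=14, upper=66074):
    """E[T] = 1 / PR, the mean number of tests until H0 is rejected."""
    return 1.0 / rejection_probability(p, lower, upper)

for ppm in (50, 100, 150, 200, 300):
    print(ppm, round(average_run_length(ppm * 1e-6), 1))
```

Because X is discrete, the value at p0 is close to, but not exactly, 1/α. In this sketch E[T] at 150 ppm comes out larger than at 100 ppm, which is the weakness of the single geometric test when p increases.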
Test 2
How can we improve upon this test?
We want E[T] < 370.4 when p > 0.0001
run sum procedure
Geometric chart
A sequence of tests of hypotheses
THE RUN SUM – for the mean
THE GEOMETRIC RUN SUM
THE GEOMETRIC RUN SUM - DEFINITION
• Let us denote the following cumulative sums
SU_t = SU_{t-1} + q_t if X_t falls above the center line
= 0 otherwise
SL_t = SL_{t-1} - q_t if X_t falls below the center line
= 0 otherwise
where q_t is the score assigned to the region in which X_t falls
THE GEOMETRIC RUN SUM - DEFINITION
• The run sum statistic is defined, for t = 1, 2, …, by
S_t = max {SU_t, -SL_t}
with SU_0 = 0, SL_0 = 0
and limit sum L
THE GEOMETRIC RUN SUM - DESIGN
• Need to define
region limits (l1, l2, l3 and l5, l6, l7)
region scores (q1, q2, q3 and q4)
limit sum L
THE GEOMETRIC RUN SUM - DESIGN
• Region limits above and below the center line are not symmetric around the center line.
• To define the region limits we use the cumulative probabilities of the distribution of X ~ Geo(p0)
• Such probabilities were chosen to be the same as those of a run sum for the mean with the same scores
THE GEOMETRIC RUN SUM - EXAMPLE
• If X ~ Geo(p0 = 0.0001)
the region limits are given by
0.00123 = P [X ≤ l1 ]
0.02175 = P [X ≤ l2 ]
0.15638 = P [X ≤ l3 ]
0.50000 = P [X ≤ l4 ]
0.84362 = P [X ≤ l5 ]
0.97825 = P [X ≤ l6 ]
0.99877 = P [X ≤ l7 ]
THE GEOMETRIC RUN SUM - EXAMPLE
• If X ~ Geo(p0 = 0.0001)
the region limits are given by
0.00123 ≈ P[X ≤ 13]
0.02175 ≈ P[X ≤ 220]
0.15638 ≈ P[X ≤ 1701]
0.50000 ≈ P[X ≤ 6932]
0.84362 ≈ P[X ≤ 18554]
0.97825 ≈ P[X ≤ 38280]
0.99877 ≈ P[X ≤ 67007]
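The region limits can be checked against their target cumulative probabilities with the geometric distribution function. This sketch assumes P[X ≤ x] = 1 - (1 - p0)^x and uses 38280 for l6, the value at which the computed probability agrees with 0.97825; the agreement is only approximate for every limit because X is discrete.

```python
from math import expm1, log1p

def geometric_cdf(x, p):
    """P[X <= x] = 1 - (1 - p)^x for X ~ Geo(p) on 1, 2, ..."""
    return -expm1(x * log1p(-p))

p0 = 0.0001
limits = (13, 220, 1701, 6932, 18554, 38280, 67007)            # l1, ..., l7
targets = (0.00123, 0.02175, 0.15638, 0.50000, 0.84362, 0.97825, 0.99877)

for limit, target in zip(limits, targets):
    print(limit, target, round(geometric_cdf(limit, p0), 5))
```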
THE GEOMETRIC RUN SUM - EXAMPLE
• Conclude H1: p ≠ p0 when S_t ≥ L
• Let T = number of samples until H0 is rejected
• What is the distribution of T?
• What are its mean and standard deviation?
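The run sum rule can be written out as a small function. The sketch below rests on several assumptions that the slides do not state explicitly: the eight regions carry scores (3, 2, 1, 0, 0, 1, 2, 3) from the lowest region to the highest, an observation equal to the center line l4 counts as below it, the limit sum is 5, and l6 is taken as 38280 (the value consistent with the cumulative probability 0.97825). All names are illustrative.

```python
from bisect import bisect_left

LIMITS = (13, 220, 1701, 6932, 18554, 38280, 67007)   # l1, ..., l7 for p0 = 0.0001
SCORES = (3, 2, 1, 0, 0, 1, 2, 3)                     # assumed score of each region

def run_sum_signal_time(observations, limit_sum=5, limits=LIMITS, scores=SCORES):
    """Return the first t with S_t >= limit_sum, or None if there is no signal."""
    center = limits[len(limits) // 2]                 # center line l4
    su, sl = 0, 0                                     # SU_0 = SL_0 = 0
    for t, x in enumerate(observations, start=1):
        score = scores[bisect_left(limits, x)]        # region index 0 means x <= l1
        if x > center:                                # above the center line
            su, sl = su + score, 0
        else:                                         # below (or on) the center line
            su, sl = 0, sl - score
        if max(su, -sl) >= limit_sum:
            return t
    return None

# Two very short runs between nonconforming units: scores 3 and 2 accumulate to 5.
print(run_sum_signal_time([5, 100]))
```

An observation on the other side of the center line resets the accumulated sum, so isolated extreme values do not signal on their own.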
RUN SUM (0,1,2,3) L = 5 - MODELING
• Markov chain
• States defined by the values that S_t can assume
• State space
{-4,-3,-2,-1,0,1,2,3,4,C}
where
C = {n ∈ Z | n = …,-6,-5,5,6,…}
is an absorbing state
• Transition probabilities
RUN SUM (0,1,2,3) L = 5 - MODELING
• Let p1 = P[X ≤ l1]
p2 = P[l1 < X ≤ l2]
p3 = P[l2 < X ≤ l3]
p4 = P[l3 < X ≤ l4]
p5 = P[l4 < X ≤ l5]
p6 = P[l5 < X ≤ l6]
p7 = P[l6 < X ≤ l7]
p8 = P[X > l7]
where X ~ Geo(p0)
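The eight region probabilities are differences of the geometric distribution function at consecutive limits. A minimal sketch, assuming P[X ≤ x] = 1 - (1 - p)^x and the limits of the example with l6 taken as 38280 (the value consistent with the cumulative probability 0.97825); the function name is illustrative.

```python
from math import expm1, log1p

LIMITS = (13, 220, 1701, 6932, 18554, 38280, 67007)   # l1, ..., l7

def region_probabilities(p, limits=LIMITS):
    """Return [p1, ..., p8] for X ~ Geo(p) and the given region limits."""
    cdf = [-expm1(limit * log1p(-p)) for limit in limits]   # P[X <= l_i]
    edges = [0.0] + cdf + [1.0]
    return [high - low for low, high in zip(edges, edges[1:])]

probs = region_probabilities(0.0001)
print([round(value, 5) for value in probs])
```

Passing a value of p other than p0 gives the region probabilities after a shift in the process.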
RUN SUM (0,1,2,3) L = 5 - MODELING
Transitions from S_t = 0, S_t = 1 and S_t = 2
[Transition diagrams not reproduced]
RUN SUM (0,1,2,3) L = 5 - MODELING
• Let T be the first passage time to state C,
the number of observations until the run sum rejects H0
• Let Q be the submatrix of transient states, then
P[T ≤ t] = e (I - Q^t) J
G(s) = s e (I - sQ)^{-1} (I - Q) J
E[T] = e (I - Q)^{-1} J
where e is a row vector defining the initial state S_0, J is a column vector of ones, and G(s) is the probability generating function of T
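The formula E[T] = e (I - Q)^{-1} J can be evaluated by solving a 9 × 9 linear system. The sketch below makes several assumptions not spelled out in the slides: states are the signed run sum (positive for SU, negative for SL), the region scores are (3, 2, 1, 0, 0, 1, 2, 3), an observation on the opposite side of the center line restarts the sum at its own score, and the chart starts at S_0 = 0. The function names are illustrative.

```python
SCORES = (3, 2, 1, 0, 0, 1, 2, 3)   # assumed score of regions 1..8
LIMIT_SUM = 5

def transient_matrix(probs, scores=SCORES, limit_sum=LIMIT_SUM):
    """Build Q over the transient states -(L-1), ..., L-1 from region probabilities."""
    states = list(range(-(limit_sum - 1), limit_sum))
    index = {state: i for i, state in enumerate(states)}
    half = len(probs) // 2          # regions half..end lie above the center line
    q = [[0.0] * len(states) for _ in states]
    for state in states:
        for region, (prob, score) in enumerate(zip(probs, scores)):
            if region >= half:
                new = max(state, 0) + score     # upper sum, reset if state < 0
            else:
                new = min(state, 0) - score     # lower sum, reset if state > 0
            if abs(new) < limit_sum:            # otherwise absorbed in C
                q[index[state]][index[new]] += prob
    return q, index

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def average_run_length(probs):
    """E[T] = e (I - Q)^{-1} J with e selecting the initial state S_0 = 0."""
    q, index = transient_matrix(probs)
    n = len(q)
    i_minus_q = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
                 for i in range(n)]
    mean_times = solve(i_minus_q, [1.0] * n)    # (I - Q)^{-1} J
    return mean_times[index[0]]

# Degenerate check: if every observation falls in the top region (score 3),
# the sum goes 3, then 6, so the chart signals at the second observation.
print(average_run_length([0, 0, 0, 0, 0, 0, 0, 1]))
```

Supplying the region probabilities p1, …, p8 evaluated at different values of p gives the curve of E[T] against p.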
Geometric Run sum
For this chart a plot of E[T] vs. p is
[Figure: average run length (0 to 600) against p in ppm (20 to 180 ppm)]
Geometric Run sum
A comparison with Test 2
[Figure: average run length (0 to 600) against p in ppm (20 to 180 ppm) for the geometric run sum and Test 2, with the value 370.47 marked]
RUN SUM – FURTHER IMPROVEMENT
• Consider a geometric run sum
– No regions
– Center line equal to l4
– Scores are equal to X
– Design – limit sum L
NEW GEOMETRIC RUN SUM - DEFINITION
• Let us denote the following cumulative sums
SU_t = SU_{t-1} + X_t if X_t falls above the center line
= 0 otherwise
SL_t = SL_{t-1} - X_t if X_t falls below the center line
= 0 otherwise
NEW GEOMETRIC RUN SUM - DEFINITION
• The run sum statistic is defined, for t = 1, 2, …, by
S_t = max {SU_t, -SL_t}
with SU_0 = 0, SL_0 = 0
and limit sum L
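The definition translates directly into a short function. This is a literal sketch of the recursion as written on the slide; the slides do not give a value for the limit sum, so the value 50000 used in the example is purely hypothetical, as is the treatment of an observation equal to the center line as falling below it.

```python
def new_run_sum_signal_time(observations, center_line, limit_sum):
    """Return the first t with S_t >= limit_sum, or None if there is no signal."""
    su, sl = 0, 0                          # SU_0 = SL_0 = 0
    for t, x in enumerate(observations, start=1):
        if x > center_line:                # above the center line: add X_t
            su, sl = su + x, 0
        else:                              # below (or on) the center line: subtract X_t
            su, sl = 0, sl - x
        if max(su, -sl) >= limit_sum:
            return t
    return None

# Center line l4 = 6932 from the example; the limit sum is a hypothetical choice.
print(new_run_sum_signal_time([20000, 20000, 20000], center_line=6932, limit_sum=50000))
```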
NEW GEOMETRIC RUN SUM - MODELING
• Markov chain – not possible
– huge number of states
• Need to derive the distribution of T
• Can show that
CONCLUSIONS
• The run sum is an effective procedure
for two-sided monitoring
• For monitoring very small p,
it is more effective than
a sequence of geometric tests
• With a limited number of regions,
it can be modeled by a Markov chain
TOPICS OF INTEREST
• Estimate the time at which p changes (the change point)
• Bayesian tests
• Lack of independence (chain dependent BT)
• Run sum can be applied to other instances
- monitoring - arrival process