Upload
lucinda-edwards
View
215
Download
2
Embed Size (px)
Citation preview
Matlab: Statistics
1. Probability distributions
2. Hypothesis tests
3. Response surface modeling
4. Design of experiments
Statistics Toolbox Capabilities
Descriptive statistics
Statistical visualization
Probability distributions
Hypothesis tests
Linear models
Nonlinear models
Multivariate statistics
Statistical process control
Design of experiments
Hidden Markov models
Probability Distributions
21 continuous distributions for data analysis » Includes normal distribution
6 continuous distributions for statistics» Includes chi-square and t distributions
8 discrete distributions» Includes binomial and Poisson distributions
Each distribution has functions for:» pdf — Probability density function» cdf — Cumulative distribution function» inv — Inverse cumulative distribution» functionsstat — Distribution statistics function» fit — Distribution fitting function» like — Negative log-likelihood function» rnd — Random number generator
Normal Distribution Functions
normpdf – probability distribution function normcdf – cumulative distribution function norminv – inverse cumulative distribution
function normstat – mean and variance normfit – parameter estimates and confidence
intervals for normally distributed data normlike – negative log-likelihood for
maximum likelihood estimation normrnd – random numbers from normal
distribution
Hypothesis Tests
17 hypothesis tests available chi2gof – chi-square goodness-of-fit test. Tests if a
sample comes from a specified distribution, against the alternative that it does not come from that distribution.
ttest – one-sample or paired-sample t-test. Tests if a sample comes from a normal distribution with unknown variance and a specified mean, against the alternative that it does not have that mean.
vartest – one-sample chi-square variance test. Tests if a sample comes from a normal distribution with specified variance, against the alternative that it comes from a normal distribution with a different variance.
Mean Hypothesis Test Example
>> h = ttest(data,m,alpha,tail) data: vector or matrix of data m: expected mean alpha: significance level Tail = ‘left’ (left handed alternative), ‘right’ (right
handed alternative) or ‘both’ (two-sided alternative) h = 1 (reject hypothesis) or 0 (accept hypothesis) Measurements of polymer molecular weight
Hypothesis: 0 = 1.3 instead of 1 < 0
>> h = ttest(x,1.3,0.1,'left')
h = 1
26.131.127.127.112.133.119.122.136.125.1
0049.0258.1 2 sx
Variance Hypothesis Test Example
>> h = vartest(data,v,alpha,tail) data: vector or matrix of data v: expected variance alpha: significance level Tail = ‘left’ (left handed alternative), ‘right’ (right
handed alternative) or ‘both’ (two-sided alternative)
h = 1 (reject hypothesis) or 0 (accept hypothesis) Hypothesis: 2 = 0.0049 and not a different
variance
>> h = vartest(x,0.0049,0.1,'both')
h = 0
Goodness of Fit
Perform hypothesis test to determine if data comes from a normal distribution
Usage: [h,p,stats]=chi2gof(x,’edges’,edges) » x: data vector» edges: data divided into intervals with the
specified edges» h = 1, reject hypothesis at 5% significance» h = 0, accept hypothesis at 5% significance» p: probability of observing the given statistic» stats: includes chi-square statistic and degrees of
freedom
Goodness of Fit Example
Find maximum likelihood estimates for µ and σ of a normal distribution
>> data=[320 … 360];
>> phat = mle(data)
phat = 364.7 26.7 Test if data comes from a normal distribution
>> [h,p,stats]=chi2gof(data,’edges’,[-inf,325:10:405,inf]);
>> h = 0
>> p = 0.8990
>> chi2stat = 2.8440
>> df = 7
Response Surface Modeling
Develop linear and quadratic regression models from data
Commonly termed response surface modeling Usage: rstool(x,y,model)
» x: vector or matrix of input values» y: vector or matrix of output values» model: ‘linear’ (constant and linear terms), ‘interaction’
(linear model plus interaction terms), ‘quadratic’ (interaction model plus quadratic terms), ‘pure quadratic’ (quadratic model minus interaction terms)
» Creates graphical user interface for model analysis
effects Quadratic
effectsn interactioBinary effectsMain Bias2333
2222
2111
322311321123322110
xxx
xxxxxxxxxy
Response Surface Model Example
VLE data – liquid composition held constant
>> x = [300 1; 275 1; 250 1; 300 0.75; 275 0.75; 250 0.75; 300 1.25; 275 1.25; 250 1.25];
>> y = [0.75; 0.77; 0.73; 0.81; 0.80; 0.76; 0.72; 0.74; 0.71];
Experiment 1 2 3 4 5 6 7 8 9
Temperature 300 275 250 300 275 250 300 275 250
Pressure 1.0 1.0 1.0 0.75 0.75 0.75 1.25 1.25 1.25
Vapor Composition
0.75 0.77 0.73 0.81 0.80 0.76 0.72 0.74 0.71
Response Surface Model Example cont.
>> rstool(x,y,'linear')
>> beta = 0.7411 (bias)
0.0005 (T)
-0.1333 (P)
>> rstool(x,y,'interaction')
>> beta2 = 0.3011 (bias)
0.0021 (T)
0.3067 (P)
-0.0016 (T*P)
>> rstool(x,y,'quadratic')
>> beta3 = -2.4044 (bias)
0.0227 (T)
0.0933 (P)
-0.0016 (T*P)
-0.0000 (T*T)
0.1067 (P*P)
Design of Experiments
Full factorial designs
Fractional factorial designs
Response surface designs» Central composite designs
» Box-Behnken designs
D-optimal designs – minimize the volume of the confidence ellipsoid of the regression estimates of the linear model parameters
Full Factorial Designs
>> d = fullfact(L1,…,Lk)
L1: number of levels for first factor
Lk: number of levels for last (kth) factor
d: design matrix
>> d = ff2n(k)
k: number of factors
d: design matrix for two levels
>> d = ff2n(3)
d =0 0 00 0 10 1 00 1 11 0 01 0 11 1 01 1 1
Fractional Factorial Designs
>> [d,conf] = fracfact(gen)
gen: generator string for the design
d: design matrix
conf: cell array that describes the confounding pattern
>> [x,conf] = fracfact('a b c abc')
x =-1 -1 -1 -1-1 -1 1 1-1 1 -1 1-1 1 1 -1 1 -1 -1 1 1 -1 1 -1 1 1 -1 -1 1 1 1 1
Fractional Factorial Designs cont.
>> gen = fracfactgen(model,K,res) model: string containing terms that must be estimable
in the design K: 2K total experiments in the design res: resolution of the design gen: generator string for use in fracfact
>> gen = fracfactgen('a b c d e f g',4,4)
gen = 'a''b''c''d''bcd''acd''abd'