2DS00 Statistics 1 for Chemical Engineering

k 2DS00 Statistics 1 for Chemical Engineering /k Lecturers Dr. A. Di Bucchianico – Department of Mathematics, – Statistics group –HG 9.24 – phone (040)


2DS00 Statistics 1 for Chemical Engineering

Lecturers

• Dr. A. Di Bucchianico

– Department of Mathematics,

– Statistics group

– HG 9.24

– phone (040) 247 2902

[email protected]

• Ir. G.D. Mooiweer,

– Department of Mathematics

– ICTOO

– HG 9.12

– phone 040 247 4277

(Thursdays)

[email protected]

• Dr. R.W. van der Hofstad

– Department of Mathematics,

– Statistics group

– HG 9.04

– phone (040) 247 2910

[email protected]

Goals of this course

• to prepare students for (first-year) laboratory assignments

• to teach students how to perform basic statistical analyses of experiments

• to teach students how to use software for data analysis

• to teach students how to avoid pitfalls in analysing measurements

Important to remember

• Web site for this course: www.win.tue.nl/~sandro/2DS00/

• No textbook, but handouts (Word) + PowerPoint sheets through the web site

• Bring notebook to both lectures and self-study

• (Optional) buy lecture notes 2256 “Statgraphics voor regulier onderwijs”

• (Optional) buy lecture notes 2218 “Statistisch Compendium”

How to study

• read lecture notes briefly before lecture

• ask questions during lecture

• study lecture notes carefully after lecture

• do exercises during guided self-study

• reread lecture notes after guided self-study

• try out previous examinations shortly before the examination

N.B. Lecture notes (pdf documents) ≠ PowerPoint files

Week schedule

Week 1: Measurement and statistics

Week 2: Error propagation

Week 3: Simple linear regression analysis

Week 4: Multiple linear regression analysis

Week 5: Nonlinear regression analysis

Detailed contents of week 1

• measurement errors

• graphical displays of data

• summary statistics

• normal distribution

• confidence intervals

• hypothesis testing

Measurements and statistics

• perfect measurements do not exist

• possible sources of measurement errors:

– reading

– environment

    • temperature

    • humidity

    • ...

– impurities

– ...

Necessity of good measurement system

Three experiments

[Figure: three plots labelled Experiment 1, Experiment 2 and Experiment 3; horizontal axis ticks between 4 and 7]

Types of measurement errors

• Random errors

– always present

– reduce influence by averaging repeated measurements

• Systematic errors

– require adjustment/repair of measuring devices

• Outliers

– recording errors

– mistakes in applying procedures
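The effect of averaging on random errors can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value (5.0), the error standard deviation (0.1) and the normally distributed error are hypothetical choices, not values from the course material.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the sketch is reproducible

TRUE_VALUE = 5.0  # hypothetical true value
SIGMA = 0.1       # hypothetical standard deviation of the random error


def measure():
    """One simulated measurement: true value plus a normally distributed random error."""
    return random.gauss(TRUE_VALUE, SIGMA)


def spread_of_averages(n, repeats=2000):
    """Standard deviation of the average of n repeated measurements."""
    averages = [statistics.mean(measure() for _ in range(n)) for _ in range(repeats)]
    return statistics.stdev(averages)


spread_single = spread_of_averages(1)   # single measurements
spread_mean10 = spread_of_averages(10)  # averages of 10 measurements

print(f"spread of single measurements: {spread_single:.4f}")
print(f"spread of averages of 10:      {spread_mean10:.4f}")
```

The spread of the averages should be roughly a factor √10 smaller than the spread of single measurements. A systematic error would shift both by the same amount and is not reduced by averaging.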

Illustration of measurement concepts

Accuracy

difference between average of measured values and true value

Accuracy

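• relates to systematic errors

• absolute error: $e_i = x_i - x_t$

• relative error: $e_{\mathrm{rel},i} = \dfrac{x_i - x_t}{x_t}$

Here $x_i$ is the $i$-th measured value and $x_t$ the true value.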

Location statistics

• mean

• median

• trimmed means
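A minimal sketch of the three location statistics, using the example dataset that appears later in these slides. The helper `trimmed_mean` and the "recording error" variant of the data (5.34 typed as 53.4) are hypothetical, added only to show how each statistic reacts to an outlier.

```python
import statistics

DATA = [3.45, 1.98, 2.92, 4.67, 2.41, 1.07, 5.34, 3.24, 3.93]
# Hypothetical recording error: 5.34 entered as 53.4
DATA_WITH_OUTLIER = [53.4 if x == 5.34 else x for x in DATA]


def trimmed_mean(data, k=1):
    """Mean after dropping the k smallest and the k largest observations."""
    ordered = sorted(data)
    return statistics.mean(ordered[k:len(ordered) - k])


for label, values in (("original", DATA), ("with outlier", DATA_WITH_OUTLIER)):
    print(label,
          "mean =", round(statistics.mean(values), 4),
          "median =", statistics.median(values),
          "trimmed mean =", round(trimmed_mean(values), 4))
```

In this sketch the mean changes strongly when the outlier is introduced, while the median and the trimmed mean stay the same.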

Precision

the degree to which consistent results are obtained

Accurate and precise

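Statistics for precision: standard deviation & co

• standard deviation: $s_x = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} = \sqrt{\dfrac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}}$

• standard error: $s_{\bar{x}} = s_x / \sqrt{n}$

• variation coefficient: $CV = s_x / \bar{x}$

• variance: $v_x = s_x^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \dfrac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right)$

• range: $R = x_{\max} - x_{\min}$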
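The precision statistics can be computed directly from their definitions. The sketch below uses the example dataset that appears later in these slides; the variable names are illustrative only. It also checks that the shortcut form of the variance (sum of squares minus n times the squared mean) agrees with the definition.

```python
import math
import statistics

DATA = [3.45, 1.98, 2.92, 4.67, 2.41, 1.07, 5.34, 3.24, 3.93]

n = len(DATA)
mean = statistics.mean(DATA)

# Variance from the definition (divisor n - 1) and from the shortcut formula
variance = sum((x - mean) ** 2 for x in DATA) / (n - 1)
variance_shortcut = (sum(x ** 2 for x in DATA) - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(variance)       # standard deviation s_x
std_error = std_dev / math.sqrt(n)  # standard error of the mean
cv = std_dev / mean                 # variation coefficient
data_range = max(DATA) - min(DATA)  # range R

print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")
print(f"standard error = {std_error:.4f}, CV = {cv:.4f}, range = {data_range:.2f}")
```

The same relation can be checked against the Statgraphics summary statistics shown further on: a standard deviation of 0,122402 with n = 10 gives a standard error of 0,122402/√10 ≈ 0,0387.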

Robust statistics for precision

• robust statistics

– less sensitive to outliers

– difficult mathematical theory

– requires use of statistical software

• interquartile range

– IQR = 75% quantile – 25% quantile = 3rd quartile – 1st quartile

• mean absolute deviation

– $MAD = \dfrac{1}{n-1} \sum_{i=1}^{n} |x_i - \bar{x}|$

Graphical displays

• always make graphical displays for first impression

• “one picture says more than 1000 words”

[Figure: plot of calcium vs time; x-axis: time (0 to 15), y-axis: calcium (-0,1 to 5,9)]

2 3.1 4 1.9 2.8

Basic graphical displays

• scatter plot

– watch out for scale (automatic resizing)

• time sequence plot

– for detecting time effects like warming up

• Box-and-Whisker plot

– outliers

– quartiles

– skewness

Time sequence plot

[Figure: Time Sequence Plot; x-axis: observation number (1 to 11), y-axis: measurement (4,7 to 5,2)]

Box-and-Whisker plot

[Figure: Box-and-Whisker Plot; axis from 4,7 to 5,7]

Probability theory

(cumulative) distribution function: $F(t) = P(X \le t)$

density: $f(t) = \dfrac{d}{dt} F(t)$

density to distribution function: $F(t) = \int_{-\infty}^{t} f(x)\,dx$

The concept of probability density

[Figure: density function; the area under the curve between a and b denotes the probability that an observation falls between a and b]
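The link between density and probability can be checked numerically. The sketch below is an illustration only: it assumes a standard normal density and hypothetical limits a = -1 and b = 2, integrates the density with the trapezoidal rule, and compares the area with F(b) - F(a).

```python
from statistics import NormalDist

X = NormalDist(mu=0.0, sigma=1.0)  # assumed example distribution


def area_under_density(dist, a, b, steps=10_000):
    """Trapezoidal approximation of the integral of the density from a to b."""
    h = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for k in range(1, steps):
        total += dist.pdf(a + k * h)
    return total * h


a, b = -1.0, 2.0  # hypothetical interval
area = area_under_density(X, a, b)
probability = X.cdf(b) - X.cdf(a)  # F(b) - F(a)

print(f"area under density: {area:.6f}")
print(f"F(b) - F(a):        {probability:.6f}")
```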

Normal distribution

Normal distribution

• bell shaped curve

• important because of Central Limit Theorem

• symmetric around µ (location of centre)

• spread parametrised by σ²

– http://www.win.tue.nl/~marko/statApplets/functionPlots.html

– http://www-stat.stanford.edu/~naras/jsm/NormalDensity/NormalDensity.html

• µ=0 and σ²=1: standard normal distribution Z

• density: $f(t) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left( -\dfrac{(t-\mu)^2}{2\sigma^2} \right)$

More on normal distribution

Area under the standard normal density between –z and +z:

±0,67 is 0,500

±1,00 is 0,683

±1,645 is 0,900

±1,96 is 0,950

±2,00 is 0,954

±2,33 is 0,980

±2,58 is 0,990

±3,00 is 0,997
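The central areas of the standard normal distribution can be recomputed with Python's standard library. This is a small sketch; the function name `central_area` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def central_area(z):
    """Area under the standard normal density between -z and +z."""
    return Z.cdf(z) - Z.cdf(-z)


for z in (0.67, 1.00, 1.645, 1.96, 2.00, 2.33, 2.58, 3.00):
    print(f"+/-{z:<5} -> {central_area(z):.3f}")
```

The values 1,645, 1,96 and 2,58 are the ones used most often, since they correspond to central areas of 90%, 95% and 99%.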

Standardisation

If X is normally distributed with parameters µ and σ², then (X – µ)/σ is standard normal.

Suppose µ = 3 and σ² = 4. Then

$P(X \le 6{,}2) = P\left( \dfrac{X-3}{2} \le \dfrac{6{,}2-3}{2} \right) = P(Z \le 1{,}6) = 0{,}9452.$
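The standardisation example (µ = 3, σ² = 4, so σ = 2) can be verified in two ways: directly from the distribution of X, and via the standard normal Z. A minimal sketch using the standard library:

```python
from statistics import NormalDist

mu = 3.0
sigma = 2.0  # standard deviation, since the variance is 4
x = 6.2

z = (x - mu) / sigma                      # standardised value, about 1.6
p_direct = NormalDist(mu, sigma).cdf(x)   # P(X <= 6.2) computed directly
p_standard = NormalDist().cdf(z)          # P(Z <= 1.6) after standardisation

print(f"z = {z:.2f}")
print(f"P(X <= 6.2) = {p_direct:.4f}")
print(f"P(Z <= z)   = {p_standard:.4f}")
```

Both routes give the same probability, which is why tables of the standard normal distribution suffice for any µ and σ.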

Testing normality

• many statistical procedures implicitly assume normality

• if data are not normally distributed, then outcome of procedure may be

completely wrong

• user is always responsible for checking assumptions of statistical procedures

• Graphical checks:

– normal probability plot

– density trace

• Formal check

– Shapiro-Wilks test

Estimation of density function: histogram

[Figure: histogram for width; x-axis: width (265.3 to 266.1), y-axis: frequency (0 to 25). Curve: normal distribution with sample mean and variance as parameters]

Drawbacks of the histogram

• misused for investigating normality

• time ordering of data is lost

• shape depends heavily on bin width + bin location:

[Figure: two histograms for strength, drawn from the same data set; x-axis: strength (24 to 54), y-axis: frequency (0 to 5)]

• shape is stable for data sets of size 75 or larger

• optimal number of bins: $\sqrt{n}$
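The dependence on bin location can be made concrete with a small sketch. The strength data behind the two histograms are not given in the slides, so the example dataset from the density trace slide is used instead; the helper `histogram_counts` and the bin choices are hypothetical.

```python
DATA = [3.45, 1.98, 2.92, 4.67, 2.41, 1.07, 5.34, 3.24, 3.93]


def histogram_counts(data, start, width, n_bins):
    """Count observations in n_bins consecutive bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for x in data:
        k = int((x - start) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts


# Same data, same bin width, bins shifted by half a bin width
counts_a = histogram_counts(DATA, start=1.0, width=1.0, n_bins=5)
counts_b = histogram_counts(DATA, start=0.5, width=1.0, n_bins=5)

print("bins starting at 1.0:", counts_a)
print("bins starting at 0.5:", counts_b)
```

With only nine observations the effect is exaggerated, but it shows why the shape of a histogram for a small data set should not be over-interpreted.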

Alternative to histogram: Density Trace

Density Trace (also called naive density estimator):

• use moving bins instead of fixed bins

• choose bin width (automatically in Statgraphics)

• count number of observations in bin at each point

• divide by length of bin

Density Trace

Example dataset: 3.45 1.98 2.92 4.67 2.41 1.07 5.34 3.24 3.93

[Figure: density trace of the example dataset; x-axis from 1 to 6, y-axis ticks at 1/9, 2/9, 3/9 and 4/9]
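The density trace steps (moving bin, count, divide by bin length) can be sketched for the example dataset. Two points are assumptions rather than statements from the slides: the count is also divided by the number of observations n so that the result is a density (this matches the y-axis ticks in ninths), and a bin width of 1 is chosen by hand, whereas Statgraphics chooses it automatically.

```python
DATA = [3.45, 1.98, 2.92, 4.67, 2.41, 1.07, 5.34, 3.24, 3.93]


def density_trace(data, x, width):
    """Naive density estimate at x: fraction of observations in a bin of the
    given width centred at x, divided by the bin width."""
    half = width / 2
    count = sum(1 for value in data if x - half <= value <= x + half)
    return count / (len(data) * width)


WIDTH = 1.0  # hypothetical bin width
grid = [1.0 + 0.5 * k for k in range(11)]  # evaluation points 1.0, 1.5, ..., 6.0

for x in grid:
    print(f"x = {x:.1f}  density = {density_trace(DATA, x, WIDTH):.3f}")
```

Rerunning the sketch with a much smaller or much larger `WIDTH` shows the trade-off between a fluctuating and an overly smooth curve.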

Choice of bin widths in density trace

• a bin width that is too small yields a curve that fluctuates too much

• a bin width that is too large yields a curve that is too smooth

Patterns in distribution – normal curve

• Depicted by a bell-shaped curve

• Indicates that measurement process is running normally

Patterns in distribution – bi-modal curve

• Distribution appears to have two peaks

• May indicate that data from more than one process are mixed together

Patterns in distribution – saw-toothed

• Also commonly referred to as a comb distribution; appears as an alternating jagged pattern

• Often indicates a measuring problem

– improper gauge readings

– gauge not sensitive enough for readings

Testing normality

[Figure: Normal Distribution with mean 0 and standard deviation 1; x-axis: x (-5 to 5), vertical axis from 0 to 1]

Normal Probability Plot

[Figure: normal probability plot; x-axis: width (265.3 to 265.9), y-axis: percentage (0.1 to 99.9)]

Normally distributed?

[Figure: curve plotted for x from -8 to 8; vertical axis from 0 to 0.4]

Normal Probability Plot of not normally distributed data

[Figure: Normal Probability Plot; x-axis from -10 to 5, y-axis: percentage (0.1 to 99.9)]

Test for Normality: Shapiro-Wilks

• statistical test for Normality: Shapiro-Wilks

• idea: sophisticated regression analysis in the spirit of normal probability plot

• makes Normal Probability Plot objective

• check outliers (measurement error?; normality sometimes disturbed by single observation)

• analyse if not normally distributed

Statgraphics: Shapiro Wilks

Tests for Normality for width

Computed Chi-Square goodness-of-fit statistic = 254.667
P-Value = 0.0

Shapiro-Wilks W statistic = 0.921395
P-Value = 0.000722338

Interpretation:

• the value of the statistic itself need not be interpreted

• P-value indicates how plausible the normal distribution is in view of the data (small P-value: reject normality)

• use α = 0.01 as critical value in order to avoid too strict rejections of normality

Dixon’s test

• Box-and-Whisker plot: graphical test for outliers

• if data are normally distributed, then formal test may be used:

Dixon's Test (assumes normality)
------------------------------------------------------------------
                           Statistic   5% Test       1% Test
1 outlier on right         0,612903    Significant   Significant
1 outlier on left          0,314286    Not sig.      Not sig.
2 outliers on right        0,66129     Significant   Not sig.
2 outliers on left         0,342857    Not sig.      Not sig.
1 outlier on either side   0,520548    Significant   Not sig.

Disadvantages of point estimators

Confidence intervals

• 95% confidence interval for µ: probability 0.95 that interval contains true value µ

• more observations → narrower interval (effect in particular for n < 20)

• higher confidence → wider interval

• interval: $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$

• example: α = 0,05, so $z_{\alpha/2} = 1{,}96$
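The interval based on the standard normal quantile can be sketched as follows. The function name is illustrative, and the inputs (mean 4,994, standard deviation 0,122402, n = 10) are taken from the Statgraphics summary statistics for "meting"; treating that sample standard deviation as a known σ is an assumption made only for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean, assuming sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width


lower95, upper95 = z_confidence_interval(4.994, 0.122402, 10, confidence=0.95)
lower99, upper99 = z_confidence_interval(4.994, 0.122402, 10, confidence=0.99)

print(f"95%: [{lower95:.5f}; {upper95:.5f}]")
print(f"99%: [{lower99:.5f}; {upper99:.5f}]")
```

The Statgraphics interval for the same data is somewhat wider: its half-width 0,0875612 divided by the standard error 0,0387069 gives a multiplier of about 2,26 rather than 1,96, which is consistent with a t-distribution with 9 degrees of freedom being used because σ is estimated from only 10 observations.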

Confidence intervals: example

Confidence Intervals for meting
-------------------------------
95,0% confidence interval for mean: 4,994 +/- 0,0875612 [4,90644;5,08156]
95,0% confidence interval for standard deviation: [0,0841923;0,223458]

Confidence Intervals for meting
-------------------------------
99,0% confidence interval for mean: 4,994 +/- 0,125791 [4,86821;5,11979]
99,0% confidence interval for standard deviation: [0,0756051;0,278784]

Summary Statistics for meting

Count = 10
Average = 4,994
Median = 5,01
Variance = 0,0149822
Standard deviation = 0,122402
Standard error = 0,0387069
Minimum = 4,78
Maximum = 5,15
Range = 0,37
Interquartile range = 0,2

Hypothesis testing

• example: test whether there is a systematic error

Hypothesis Tests for meting
Sample mean = 4.994
Sample median = 5.01

t-test
------
Null hypothesis: mean = 5.0
Alternative: not equal
Computed t statistic = -0.155011
P-Value = 0.880233
Do not reject the null hypothesis for alpha = 0.05.
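The t statistic in this output can be reproduced from the summary statistics for "meting" (n = 10, mean 4,994, standard deviation 0,122402). The sketch below is an illustration, not the Statgraphics procedure itself. Python's standard library has no t-distribution, so the 5% critical value is recovered from the reported 95% confidence interval (half-width divided by standard error); that shortcut is an assumption of this sketch.

```python
from math import sqrt

n = 10
sample_mean = 4.994
sample_sd = 0.122402
mu_0 = 5.0  # value under the null hypothesis (no systematic error)

standard_error = sample_sd / sqrt(n)
t_statistic = (sample_mean - mu_0) / standard_error

# Critical value recovered from the reported 95% interval: 4,994 +/- 0,0875612
t_critical = 0.0875612 / standard_error

reject_null = abs(t_statistic) > t_critical

print(f"t statistic = {t_statistic:.6f}")
print(f"critical value (5%, two-sided) = {t_critical:.4f}")
print("reject null hypothesis:", reject_null)
```

Since |t| is far below the critical value, the data give no indication of a systematic error, in line with the reported P-value of 0.880233.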