12
Statistics made easy Statistics made easy Basic principles Basic principles

Basic principles - University of Sussexusers.sussex.ac.uk/~khs20/mtcs/mtcs/statistics.pdfStatistics made easy Assume we are interested in getting some information about the average

  • Upload
    others

  • View
    14

  • Download
    7

Embed Size (px)

Citation preview

Statistics made easyStatistics made easyBasic principlesBasic principles

Statistics made easyStatistics made easy

Assume we are interested in getting someAssume we are interested in getting someinformation about the average height ofinformation about the average height ofpeople in this roompeople in this room

One basic information would be providedOne basic information would be providedby the by the mean mean : adding individual heights: adding individual heightsand divide it by the number of peopleand divide it by the number of people

The mean gives us some idea about theThe mean gives us some idea about theaverage height, but how much does itaverage height, but how much does itreally say?really say?

MeanMean

Which information can be derived by theWhich information can be derived by themean of the class?mean of the class?

Can we generalize this information?Can we generalize this information? What sort of generalization can we reallyWhat sort of generalization can we really

made?made? This group is not truly a random sample.This group is not truly a random sample. It is predominant male but there is aIt is predominant male but there is a

percentage of women.percentage of women. It is multi-cultural.It is multi-cultural.

How statistics can go wrongHow statistics can go wrong

Statistics are often used in Statistics are often used in ““pop sciencepop science””.. The idea is that we select aThe idea is that we select arepresentativerepresentative sample of a population, sample of a population,perform some tests and then convert theperform some tests and then convert theresults into numbers and performresults into numbers and performstatistics, i.e. calculating the mean.statistics, i.e. calculating the mean.

How can we be sure that the sample doesHow can we be sure that the sample doesrepresent the behavior of the population?represent the behavior of the population?

How do we interpret the mean?How do we interpret the mean?

Second order StatisticsSecond order Statistics

Even if the sample is representative ofEven if the sample is representative ofthe population, is the mean (first orderthe population, is the mean (first orderstatistics) enough to provide information?statistics) enough to provide information?

Think of i.e. certain countries where thereThink of i.e. certain countries where thereis a social polarization, i.e. the rich areis a social polarization, i.e. the rich arevery rich, the poor are very poor andvery rich, the poor are very poor andthere is no middle class.there is no middle class.

This is why we introduce the meaning ofThis is why we introduce the meaning ofvariance.variance.

Variance and Standard DeviationVariance and Standard Deviation

Variance:Variance:

When variance is computed over a When variance is computed over asample: denominator N-1sample: denominator N-1

Standard Deviation: the square root ofStandard Deviation: the square root ofvariancevariance

( )N

i

N

i

x! "µ2

The so-called The so-called ““normalnormal”” distribution distribution

-5 -4 -3 -2 -1 0 1 2 3 4 50

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

!=1!=0.5

CorrelationCorrelation

Second order statisticsSecond order statistics Assume that we have two observedAssume that we have two observed

variables. Correlation reveals the degreevariables. Correlation reveals the degreeof of linear linear relationship between them.relationship between them.

The correlation coefficient may take onThe correlation coefficient may take onany value between plus and minus one.any value between plus and minus one.

MatlabMatlab functions functions

Mean : Mean : mean(xmean(x), x: a vector or matrix), x: a vector or matrix Standard deviation: Standard deviation: std(xstd(x)) Correlation coefficients: Correlation coefficients: corrcoef(x,ycorrcoef(x,y),),

where x and y two column vectorswhere x and y two column vectorsrepresenting the observationsrepresenting the observations

MatlabMatlab examples (1) examples (1) X=[0:10]; X=[0:10]; mean(Xmean(X))ansans = = 5 5 std(Xstd(X))ansans = = 2.8872 2.8872 Y=5*X; Y=5*X; corrcoef(x,ycorrcoef(x,y))ans =ans = 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000

MatlabMatlab examples (2) examples (2)

Y=10*x; Y=10*x; corrcoef(X,Ycorrcoef(X,Y))ans =ans = 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 Y=10*(x.^5); corrcoef(X,Y) Y=10*(x.^5); corrcoef(X,Y)ans =ans = 1.0000 0.8207 1.0000 0.8207 0.8207 1.0000 0.8207 1.0000

MatlabMatlab examples (3) examples (3)

Y=10*(X.^10); Y=10*(X.^10); corrcoef(X,Ycorrcoef(X,Y))ans =ans = 1.0000 -0.0000 1.0000 -0.0000 -0.0000 1.0000 -0.0000 1.0000 Y=-10*X; Y=-10*X; corrcoef(X,Ycorrcoef(X,Y))ans =ans = 1.0000 -1.0000 1.0000 -1.0000 -1.0000 1.0000 -1.0000 1.0000