
Page 1: Multivariate Methods - Uppsala Universitythulin/mm/intro.pdf · Multivariate Methods M ans Thulin Department of Mathematics, Uppsala University thulin@math.uu.se Multivariate Methods

Multivariate Methods

Måns Thulin

Department of Mathematics, Uppsala University

thulin@math.uu.se

Multivariate Methods • 22/3 2011


Basic information

• 10 credit points
• 10 lectures, 3 computer exercises and 2 problem solving sessions
• The course book is Johnson & Wichern: Applied Multivariate Statistical Analysis, 6th ed, Pearson
• Some reference literature that might be of interest is listed on the course information hand-out
• The course is (informally) divided into four blocks


Block 1: Multivariate data

• Can we visualize 16-dimensional data? How?
• How can we handle multivariate random variables?
• What is the multivariate analogue to the normal distribution?

[Figures: scatter plot of Mortality against SO2; Andrews' Curves with legend setosa, versicolor, virginica; plot of dnorm(x) against x]


Block 1: Multivariate data

Course goals: In order to pass the course (grade 3) the student should...

• have a knowledge of methods of visualizing multivariate datasets
• be familiar with the multivariate normal distribution

We look at ways to describe multivariate data (graphically and numerically) and study the properties of multivariate distributions in general and the multivariate normal distribution in particular.
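As a rough illustration of what describing multivariate data numerically and graphically can look like in R, the sketch below uses the built-in iris data (an assumption made here for convenience, not necessarily a course dataset). The helper dmvnorm_sketch is a hypothetical name introduced for this example; it evaluates the standard density formula of the p-variate normal distribution.

```r
# Built-in iris data: four numeric variables plus a species label.
X <- as.matrix(iris[, 1:4])

# Numerical summaries: mean vector and sample covariance matrix.
xbar <- colMeans(X)
S <- cov(X)

# Graphical summary: scatter plot matrix, coloured by species.
pairs(X, col = as.integer(iris$Species))

# Density of a p-variate normal distribution N(mu, Sigma) at the point x.
# (dmvnorm_sketch is a hypothetical helper, not a function from any package.)
dmvnorm_sketch <- function(x, mu, Sigma) {
  p <- length(mu)
  d2 <- mahalanobis(x, center = mu, cov = Sigma)  # squared Mahalanobis distance
  exp(-d2 / 2) / sqrt((2 * pi)^p * det(Sigma))
}

# For p = 1 the formula reduces to the univariate density dnorm().
dmvnorm_sketch(0.5, mu = 0, Sigma = matrix(1))
dnorm(0.5)

# Density at the sample mean, using the estimated parameters.
dmvnorm_sketch(xbar, mu = xbar, Sigma = S)
```

The two calls for p = 1 should agree, which is a quick sanity check that the multivariate formula generalises the familiar univariate one.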


Block 2: Inference under MND

• Assuming a multivariate normal distribution, how can we test hypotheses?
• Can we perform t-tests and ANOVA?
• How do we know that the data is normal?

[Figure: scatter plot of x[,2] against x[,1]]


Block 2: Inference under MND

Course goals: In order to pass the course (grade 3) the student should...

• know how to perform statistical tests of the mean value vector of a multivariate normal distribution
• know how to perform statistical tests comparing two or several multivariate normal populations
• know methods and techniques for validating the assumption of a multivariate normal distribution

We learn how to estimate the parameters of the MND and how to perform multivariate analogues of the t-test, ANOVA and more.
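One common multivariate analogue of the one-sample t-test is Hotelling's T² test. The sketch below is a minimal base R version under the assumption of multivariate normal data; the function name hotelling_one_sample and the simulated data are hypothetical and only meant to illustrate the idea.

```r
# One-sample Hotelling T^2 test of H0: mu = mu0, assuming multivariate normal data.
# (hotelling_one_sample is a hypothetical helper written for this illustration.)
hotelling_one_sample <- function(X, mu0) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  xbar <- colMeans(X)  # estimate of the mean vector
  S <- cov(X)          # estimate of the covariance matrix
  # T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0)
  T2 <- n * mahalanobis(xbar, center = mu0, cov = S)
  # Under H0, (n - p) / (p (n - 1)) * T^2 follows an F(p, n - p) distribution.
  F_stat <- (n - p) / (p * (n - 1)) * T2
  p_value <- pf(F_stat, df1 = p, df2 = n - p, lower.tail = FALSE)
  list(T2 = T2, F = F_stat, p.value = p_value)
}

set.seed(1)
X <- matrix(rnorm(60), ncol = 3)  # simulated data: n = 20, p = 3, true mean zero
hotelling_one_sample(X, mu0 = c(0, 0, 0))
```

With a single variable (p = 1), T² equals the square of the usual t statistic, so the test reduces to the two-sided one-sample t-test.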


Block 3: PCA, FA and CCA

• Can we find dependencies within sets of points? Between sets of points?
• Can we use these dependencies to reduce the dimensionality of the data?

[Figure: scatter plot matrix of the variables Neg.Temp, Manuf, Pop, Wind, Precip and Days]


Block 3: PCA, FA and CCA

Course goals: In order to pass the course (grade 3) the student should...

• be able to use principal component and factor analysis for typical problems
• be able to use canonical correlation analysis

We learn techniques for reducing problems to lower dimensions and for studying dependencies between sets of observations.
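To give a feel for dimension reduction, here is a minimal principal component analysis sketch using prcomp from base R. The iris data is an assumption chosen for convenience, and standardising the variables is one possible choice rather than a course recommendation.

```r
# PCA on the four numeric iris variables, standardised to unit variance.
X <- iris[, 1:4]
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Proportion of the total variance explained by each component.
explained <- pca$sdev^2 / sum(pca$sdev^2)
round(cumsum(explained), 3)

# The component variances are the eigenvalues of the correlation matrix.
eig <- eigen(cor(X))
all.equal(eig$values, pca$sdev^2)

# Scores on the first two components give a two-dimensional view of the data.
plot(pca$x[, 1:2], col = as.integer(iris$Species))
```

The cumulative proportions indicate how many components are needed to retain most of the variability, which is the usual basis for deciding how far the dimension can be reduced.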


Block 4: Classification & cluster analysis

• Using information about different categories, can we classify new observations as belonging to one of the categories?
• Can we identify clusters of points in the dataset (observations with similar properties)?

[Figures: scatter plot of x[,2] against x[,1]; CLUSPLOT( votes.diss ), Component 2 against Component 1. These two components explain 18.87 % of the point variability.]


Block 4: Classification & cluster analysis

Course goals: In order to pass the course (grade 3) the student should...

• be able to use classification techniques
• be familiar with methods for multivariate cluster analysis

We study old-fashioned and modern classification techniques and look at different methods for clustering.
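The difference between classification and clustering can be sketched in a few lines of R. The example below assumes the iris data and the MASS package (normally shipped with R). Linear discriminant analysis, k-means and average-linkage hierarchical clustering are chosen here only as familiar examples; the slides do not say which methods the course covers.

```r
library(MASS)  # assumed to be installed; provides lda()

# Classification: the categories are known for the training data.
set.seed(1)
train <- sample(nrow(iris), 100)
fit <- lda(Species ~ ., data = iris[train, ])
pred <- predict(fit, newdata = iris[-train, ])$class
mean(pred == iris$Species[-train])  # proportion of test observations classified correctly

# Clustering: the species labels are ignored and groups are found from the data alone.
X <- scale(iris[, 1:4])
km <- kmeans(X, centers = 3, nstart = 20)
hc <- hclust(dist(X), method = "average")
table(kmeans = km$cluster, hclust = cutree(hc, k = 3))
```

The final table cross-tabulates the two clusterings, which shows how far different clustering methods agree on the same data.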


Examination

• Course goal: be able to present mathematical statistical arguments to others
• Four mandatory homeworks
  • Bonus problems can give higher grades
  • Feedback – possible to hand in more than once
• Oral presentations of clustering methods
• Take-home exam
  • Homeworks and oral presentation must be OK
  • Date for exam?


Previous course evaluations

• "The book was not very up-to-date on some topics."
  • We're still using the same book, since we haven't found a suitable replacement. More recent developments will be discussed during the lectures.
• "Some homework problems and a computer exercise about classification and discrimination would be good."
  • This has been added!
• "The take-home exam was a great idea for a course like this."
  • We'll have a take-home exam this time as well.


Computer exercises

• Three scheduled computer exercises
• Physical presence at the exercises is not mandatory, but strongly recommended
• Software: R
• Download it for free from www.r-project.org
• If you're not familiar with R, take a look at the file r-intro.pdf in the student portal!
• Remember: you can always contact me if you have questions about R or other parts of the course!


Course homepage

• Information, files and more can be found at:
  • studentportalen.uu.se
