tom.h.wilson
Department of Geology and Geography, West Virginia University, Morgantown, WV

Eigenvalues/Eigenvectors & Discriminant Analysis


Geology 659 - Quantitative Methods


Page 2: Eigenvalues/Eigenvectors & Discriminant Analysis

The eigenvalue/eigenvector problems of concern to us in statistical analysis are associated with matrices of correlation coefficients.

Consider the 4 x 4 matrix on page 147.

$$\begin{bmatrix} 1 & 0.28 \\ 0.28 & 1 \end{bmatrix}$$

The matrix is symmetrical. The diagonal elements with value 1 represent the correlation of a sample with itself, while the remaining elements represent the correlation of one sample with another.
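As a minimal sketch, the eigenvalues and eigenvectors of a symmetric correlation matrix can be computed with NumPy. The example uses only the two values displayed above (1 and 0.28), not the full 4 x 4 matrix from the text, and the variable names are illustrative.

```python
import numpy as np

# 2 x 2 correlation matrix built from the values shown above (r = 0.28).
corr = np.array([[1.0, 0.28],
                 [0.28, 1.0]])

# eigh is intended for symmetric matrices; it returns eigenvalues in
# ascending order and the matching unit eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(corr)

# Each pair satisfies corr @ v = eigenvalue * v.
for value, vector in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(corr @ vector, value * vector)

print(eigenvalues)  # approximately [0.72 1.28]
```

The symmetry of the matrix is what guarantees real eigenvalues and mutually perpendicular eigenvectors.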

Page 3: Eigenvalues/Eigenvectors & Discriminant Analysis

The plots represent different states of correlation between two variables. The eigenvectors define the directions of maximum and minimum variance.

[Figure: two scatter plots, labeled "High correlation" and "Low correlation"]
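To make the link between correlation and variance concrete: for two standardized variables with correlation r, the correlation matrix has eigenvalues 1 + r and 1 - r. The sketch below uses r = 0.9 and r = 0.1, values chosen here only as stand-ins for the high- and low-correlation cases; they are not taken from the plots.

```python
import numpy as np

def principal_variances(r):
    """Return the eigenvalues of [[1, r], [r, 1]], largest first.

    For standardized variables these are the variances along the
    directions of maximum and minimum spread.
    """
    corr = np.array([[1.0, r], [r, 1.0]])
    return np.linalg.eigvalsh(corr)[::-1]

high = principal_variances(0.9)  # assumed "high correlation" value
low = principal_variances(0.1)   # assumed "low correlation" value

print(high, high[0] / high[1])   # strongly elongated cloud
print(low, low[0] / low[1])      # nearly circular cloud
```

The ratio of the two eigenvalues gives a rough sense of how elongated the point cloud is along the direction of maximum variance.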

Page 4: Eigenvalues/Eigenvectors & Discriminant Analysis

The example in the text illustrates grain size and sorting variations associated with two samples. One sample is taken from an offshore environment and the other from a beach environment. The plot suggests that neither grain size nor sorting uniquely differentiates the offshore sands from the beach sands.

Page 5: Eigenvalues/Eigenvectors & Discriminant Analysis

Discriminant analysis finds a linear combination of the sorting and grain size characteristics that helps differentiate between the clusters of offshore sands and beach sands appearing in the scatter plot.

Page 6: Eigenvalues/Eigenvectors & Discriminant Analysis

As discussed by Davis, it is possible to derive a discriminant function along which there is maximum difference in the locations of beach and offshore sand clusters as defined by sorting and grain size.

The discriminant function yields a “score” for each observation or sorting/grain size pair. The score is calculated as

$$R_i = \lambda_1 X_{1i} + \lambda_2 X_{2i}$$

where $\lambda_1$ and $\lambda_2$ are the discriminant function coefficients, i is the observation, and j (1 or 2) is the variable (grain size or sorting) in $X_{ji}$.

[Figure: Beach Sands and Offshore Sands]

Page 7: Eigenvalues/Eigenvectors & Discriminant Analysis

The cutoff score is the score calculated using the average of the group averages for sorting and grain size:

$$R_o = \lambda_1 \bar{X}_1 + \lambda_2 \bar{X}_2$$

where $\bar{X}_1$ is the average of the group mean grain sizes and $\bar{X}_2$ is the average of the group mean sorting values.
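A minimal sketch of how the scores and the cutoff might be computed. It assumes the standard two-group linear discriminant, in which the coefficients are found by solving the pooled covariance matrix against the difference in group means; the slides do not spell this step out. The data values are invented for illustration and are not taken from Davis.

```python
import numpy as np

# Hypothetical observations: columns are (grain size, sorting).
beach = np.array([[0.33, 1.08], [0.34, 1.10], [0.33, 1.12], [0.35, 1.15]])
offshore = np.array([[0.32, 1.22], [0.33, 1.25], [0.31, 1.20], [0.34, 1.30]])

def pooled_covariance(a, b):
    """Pooled within-group covariance matrix of two groups."""
    dev_a = a - a.mean(axis=0)
    dev_b = b - b.mean(axis=0)
    return (dev_a.T @ dev_a + dev_b.T @ dev_b) / (len(a) + len(b) - 2)

mean_beach = beach.mean(axis=0)
mean_offshore = offshore.mean(axis=0)

# Assumed coefficient rule: solve S_pooled @ lam = mean_beach - mean_offshore.
lam = np.linalg.solve(pooled_covariance(beach, offshore),
                      mean_beach - mean_offshore)

# Score for every observation: weighted sum of its two measurements.
scores_beach = beach @ lam
scores_offshore = offshore @ lam

# Cutoff: the score of the average of the two group averages.
cutoff = lam @ ((mean_beach + mean_offshore) / 2)
```

With this sign convention the beach group mean scores above the cutoff and the offshore group mean scores below it; swapping the order of the mean difference flips the sign of every score.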

Page 8: Eigenvalues/Eigenvectors & Discriminant Analysis

The cutoff score Ro helps differentiate the two groups and may suggest that some observations classified as offshore sands, for example, are actually beach sands. As you can see below, the discriminant scores suggest that three observations classified as offshore sands have characteristics similar to those of beach sands. There are also three observations classified as beach sands that have characteristics more like those of the offshore sands.
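The reclassification described above amounts to comparing each score with the cutoff. A small sketch, using made-up scores and a made-up cutoff (not the values from Davis) and assuming beach sands fall on the high side of the cutoff:

```python
def flag_misfits(scores, cutoff, high_side_group, low_side_group):
    """Return (index, original, suggested) for observations whose score
    falls on the other group's side of the cutoff."""
    flagged = []
    for index, (group, score) in enumerate(scores):
        suggested = high_side_group if score > cutoff else low_side_group
        if suggested != group:
            flagged.append((index, group, suggested))
    return flagged

# Hypothetical (original group, discriminant score) pairs.
scores = [("beach", 12.4), ("beach", 9.1), ("offshore", 8.7),
          ("offshore", 11.3), ("beach", 13.0), ("offshore", 7.9)]
cutoff = 10.0  # hypothetical cutoff score

misfits = flag_misfits(scores, cutoff, "beach", "offshore")
print(misfits)  # [(1, 'beach', 'offshore'), (3, 'offshore', 'beach')]
```

A flagged observation is only a suggestion that its measurements resemble the other group; it does not by itself show that the original classification was wrong.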

Page 9: Eigenvalues/Eigenvectors & Discriminant Analysis

Discriminant analysis allows you to maximize the difference between clusters in the multidimensional space defined by the measured variables.

Discriminant analysis provides a one-dimensional measure of cluster separation along the discriminant score axis.
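One way to read this one-dimensional measure, sketched here from the definition of the score rather than quoted from the text: the separation between the two clusters is the difference between the scores of the two group means. Writing $\lambda_1, \lambda_2$ for the discriminant function coefficients and $\bar{X}_{jA}$, $\bar{X}_{jB}$ (notation introduced here) for the mean of variable j in groups A and B:

```latex
R_A - R_B = \lambda_1\left(\bar{X}_{1A} - \bar{X}_{1B}\right)
          + \lambda_2\left(\bar{X}_{2A} - \bar{X}_{2B}\right)
```

The larger this difference is relative to the scatter of scores within each group, the more cleanly the two clusters separate along the score axis.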

Page 10: Eigenvalues/Eigenvectors & Discriminant Analysis

The data provided by Davis consists of three columns: 1) Group (A (beach) or B (offshore)), 2) median grain size, 3) sorting coefficient.
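As a rough illustration of that three-column layout, the sketch below reads a few rows and splits them by group. The column names and the numbers are hypothetical placeholders, not the values from Davis.

```python
import csv
import io

# Hypothetical rows in the three-column layout described above:
# group (A = beach, B = offshore), median grain size, sorting coefficient.
raw = """group,grain_size,sorting
A,0.33,1.08
A,0.34,1.10
B,0.32,1.22
B,0.33,1.25
"""

groups = {"A": [], "B": []}
for row in csv.DictReader(io.StringIO(raw)):
    groups[row["group"]].append((float(row["grain_size"]),
                                 float(row["sorting"])))

def column_means(rows):
    """Mean of each column for a list of (grain size, sorting) pairs."""
    return tuple(sum(col) / len(col) for col in zip(*rows))

means = {name: column_means(rows) for name, rows in groups.items()}
print(means)
```

Keeping the group label alongside the measurements makes it straightforward to compute the per-group averages that the cutoff score depends on.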

Using the original classifications, the two groups plot as shown below.

[Figure: Beach Sands vs. Offshore Sands. Horizontal axis: Porosity (0.31 to 0.36); vertical axis: Sorting (1 to 1.35).]

Page 11: Eigenvalues/Eigenvectors & Discriminant Analysis

In today’s lab we’ll show you how to use the statistical analysis package Minitab and work through examples in multiple linear regression and discriminant analysis using data sets from Davis (2002).

Note that Minitab has abundant help files that will provide answers to many of your questions.

Page 12: Eigenvalues/Eigenvectors & Discriminant Analysis

Help files for Multiple Regression and Discriminant Analysis are included in today’s handout.

The simple exercises begun in today’s class should be handed in on Thursday.