Introduction to Data Science GIRI NARASIMHAN, SCIS, FIU


  • PCA and Matrices
    From Johnson & Wichern, Applied Multivariate Statistical Analysis, 6th ed.

  • CAP 5510 / CGS 5166

  • PCA: Principal Component Analysis

    ! Tool for Dimensionality Reduction
      ❑ Reduces the impact of the curse of dimensionality
    ! Tool for finding the subspace in which the data lies
    ! Summarizes the data to find the important variables
    ! Compares the relative importance of variables
    ! Explains the largest amount of variation in the data

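As a minimal numpy sketch of the idea (the data and all names below are illustrative, not from the slides): the principal components are the eigenvectors of the sample covariance matrix, ordered by eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # 200 observations on 5 variables (illustrative)

Xc = X - X.mean(axis=0)                 # center each variable
S = np.cov(Xc, rowvar=False)            # 5 x 5 sample covariance matrix

lam, P = np.linalg.eigh(S)              # eigh: for symmetric matrices, eigenvalues ascending
order = np.argsort(lam)[::-1]           # reorder so the largest eigenvalue comes first
lam, P = lam[order], P[:, order]

scores = Xc @ P[:, :2]                  # project the data onto the top 2 principal components
explained = lam / lam.sum()             # fraction of total variance each component explains
```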

  • Principal Components


  • PCA Animation
    https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues

  • Points and Vectors
    ! Every point can be thought of as a vector from the origin to that point
    ! p = (1, 3, 2)

  • Scalar Multiplication and Vector Addition

  • Dot Product, Angles, Projections
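A minimal numpy sketch of the vector operations from the last three slides, using the point p = (1, 3, 2) from above (the second vector q is illustrative, not from the slides):

```python
import numpy as np

p = np.array([1.0, 3.0, 2.0])           # the point p = (1, 3, 2) viewed as a vector
q = np.array([2.0, 0.0, 1.0])           # a second, illustrative vector

s = 2.5 * p                             # scalar multiplication stretches p
v = p + q                               # vector addition, componentwise

dot = p @ q                             # dot product
cos_theta = dot / (np.linalg.norm(p) * np.linalg.norm(q))
theta = np.arccos(cos_theta)            # angle between p and q (radians)

proj = (dot / (q @ q)) * q              # projection of p onto the line spanned by q
```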

  • Matrices & Transformations
    ! Arrays of Values, A
    ! Linear Transformations
      ❑ Ax = y
    ! Matrix Product
      ❑ Composing transforms
    ! Matrix Inverse: AB = I → B = A⁻¹
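A short numpy sketch of these four ideas (the matrices and vector are illustrative):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])              # a 2 x 2 array of values: a linear transformation
x = np.array([1.0, 1.0])

y = A @ x                               # the transformation: Ax = y

B = np.array([[0.0, -1.0],
              [1.0,  0.0]])             # a second transform (a 90-degree rotation)
C = B @ A                               # matrix product: C applies A, then B

A_inv = np.linalg.inv(A)                # matrix inverse (A is nonsingular here)
print(np.allclose(C @ x, B @ (A @ x)))  # True: products compose transforms
print(np.allclose(A_inv @ A, np.eye(2)))  # True: A_inv @ A = I
```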

  • Data as Matrices
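Presumably this slide showed the usual convention: an n × p data matrix with one row per observation and one column per variable. A tiny illustrative sketch (values are made up):

```python
import numpy as np

# n x p data matrix: n = 4 observations (rows), p = 3 variables (columns)
X = np.array([[5.1, 3.5, 1.4],
              [4.9, 3.0, 1.4],
              [6.2, 3.4, 5.4],
              [5.9, 3.0, 5.1]])

n, p = X.shape
x_bar = X.mean(axis=0)                  # sample mean vector, one entry per variable
```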

  • Eigenvalues and Eigenvectors
    ! Under a transform A, eigenvectors change in magnitude only, not in direction
    ! Ax = λx; (A − λI)x = 0
    ! Characteristic equation: |A − λI| = 0
    ! Eigenvalues: λ
    ! Eigenvectors: x, e
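A quick numpy check of the defining property (illustrative matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, E = np.linalg.eig(A)               # eigenvalues lam, eigenvectors in the columns of E

e = E[:, 0]
print(np.allclose(A @ e, lam[0] * e))   # True: Ae = λe, so only the magnitude changes
```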


  • Spectral Decomposition
    ! If A is a symmetric k × k matrix, then the following decomposition holds:
      A = λ1 e1 e1' + λ2 e2 e2' + … + λk ek ek' = P Λ P'
      where the λi are the eigenvalues of A, the ei are the corresponding orthonormal eigenvectors, P = [e1, e2, …, ek], and Λ = diag(λ1, …, λk)
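A numpy verification of the decomposition on a small illustrative symmetric matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])              # symmetric

lam, P = np.linalg.eigh(A)              # real eigenvalues, orthonormal eigenvectors
Lam = np.diag(lam)

print(np.allclose(A, P @ Lam @ P.T))    # True: A = P Λ P'
print(np.allclose(P.T @ P, np.eye(2)))  # True: the columns of P are orthonormal
```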

  • Quadratic Form
    ! The scalar x'Ax is called a quadratic form
    ! A is positive definite
      ❑ if x'Ax > 0 whenever x is a nonzero vector
    ! Equivalently, a symmetric A is positive definite
      ❑ if all its eigenvalues are positive
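A small sketch of both criteria (the matrix, vector, and helper name are illustrative):

```python
import numpy as np

def is_positive_definite(A):
    """Eigenvalue criterion, for a symmetric matrix A."""
    return bool(np.all(np.linalg.eigvalsh(A) > 0))

A = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])
x = np.array([3.0, 1.0])

quad = x @ A @ x                        # the quadratic form x'Ax, a scalar
print(quad > 0)                         # True for this (nonzero) x
print(is_positive_definite(A))          # True: the eigenvalues are 1 and 3
```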

  • Matrix Inverse & Square Root
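Presumably this slide gave the standard spectral constructions A⁻¹ = P Λ⁻¹ P' and A^(1/2) = P Λ^(1/2) P'; a sketch assuming a symmetric positive definite A:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])              # symmetric positive definite

lam, P = np.linalg.eigh(A)

A_sqrt = P @ np.diag(np.sqrt(lam)) @ P.T    # A^(1/2) = P Λ^(1/2) P'
A_inv  = P @ np.diag(1.0 / lam) @ P.T       # A⁻¹ = P Λ⁻¹ P'

print(np.allclose(A_sqrt @ A_sqrt, A))      # True
print(np.allclose(A_inv @ A, np.eye(2)))    # True
```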

  • Dimension Reduction Revisited
    ! If we take the top r eigenvectors, then
      ❑ Pr = [e1, e2, …, er], and
      ❑ A can be approximated as A ≈ Pr Λr Pr', with Pr of size k × r, Λr = diag(λ1, …, λr) of size r × r, and Pr' of size r × k
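A numpy sketch of the rank-r approximation with the dimensions noted above (here k = 3, r = 2; the matrix is illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 1.0]])         # symmetric, k = 3

lam, P = np.linalg.eigh(A)
order = np.argsort(lam)[::-1]
lam, P = lam[order], P[:, order]

r = 2
Pr = P[:, :r]                           # k x r: the top r eigenvectors
A_r = Pr @ np.diag(lam[:r]) @ Pr.T      # (k x r)(r x r)(r x k): rank-r approximation

print(np.linalg.norm(A - A_r))          # small when the discarded eigenvalues are small
```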

  • Random Matrices

  • Covariance Matrix

  • Correlation Matrix, ⍴
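A minimal numpy sketch covering both of the last two slides, assuming rows are observations (data is illustrative), together with the standard relationship between the two matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))           # 100 observations on 3 variables

S = np.cov(X, rowvar=False)             # 3 x 3 sample covariance matrix
R = np.corrcoef(X, rowvar=False)        # 3 x 3 sample correlation matrix ⍴

# R rescales S by the standard deviations: R = D^(-1/2) S D^(-1/2)
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(S)))
print(np.allclose(R, D_inv_sqrt @ S @ D_inv_sqrt))   # True
```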

  • Singular Value Decomposition
    ! Spectral decomposition applies to square symmetric matrices
    ! What about non-square or asymmetric matrices?
      ❑ Use the square roots of the eigenvalues of AA'
      ❑ These are the singular values of A
    ! A = U Σ V', with U of size k × r, Σ of size r × r, and V' of size r × k
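A numpy check that the singular values are the square roots of the eigenvalues of AA' (illustrative non-square matrix):

```python
import numpy as np

A = np.array([[ 3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])        # 2 x 3: neither square nor symmetric

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values = square roots of the eigenvalues of AA'
eigs = np.linalg.eigvalsh(A @ A.T)[::-1]    # descending, to match s
print(np.allclose(s, np.sqrt(eigs)))        # True

print(np.allclose(A, U @ np.diag(s) @ Vt))  # True: A = U Σ V'
```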

  • Dimensionality Reduction
    ! Given an m × k matrix A, we can approximate it by a matrix B of rank s, with s < k = rank(A)
    ! Here we pick the s largest singular values from the SVD: B = Us Σs Vs'
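A sketch of picking s singular values (illustrative data); by the Eckart–Young theorem, the spectral-norm error of the best rank-s approximation equals the first discarded singular value:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(6, 4))             # an m x k matrix, m = 6, k = 4

U, sv, Vt = np.linalg.svd(A, full_matrices=False)

s = 2                                   # keep the s largest singular values
B = U[:, :s] @ np.diag(sv[:s]) @ Vt[:s, :]   # rank-s approximation of A

# Spectral-norm error equals the first discarded singular value
print(np.isclose(np.linalg.norm(A - B, 2), sv[s]))   # True
```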