Unsupervised Learning: Principal Component Analysis
CMSC 422
MARINE CARPUAT
marine@cs.umd.edu
Slides credit: Maria-Florina Balcan
Unsupervised Learning
• Discovering hidden structure in data
• Last time: K-Means Clustering
– What objective does it optimize?
– How can we improve initialization?
– What is the right value of K?
• Today: how can we learn better
representations of our data points?
Dimensionality Reduction
• Goal: extract hidden lower-dimensional
structure from high dimensional datasets
• Why?
– To visualize data more easily
– To remove noise in data
– To lower resource requirements for
storing/processing data
– To improve classification/clustering
Examples of data points in D-dimensional
space that can be effectively represented in
a d-dimensional subspace (d < D)
Principal Component Analysis
• Goal: Find a projection of the data onto
directions that maximize variance of the
original data set
– Intuition: those are directions in which most
information is encoded
• Definition: Principal Components are
orthogonal directions that capture most of
the variance in the data
PCA: finding principal components
• 1st PC
– The direction along which the projected data
points vary more than along any other single
direction
• 2nd PC
– next orthogonal direction of greatest
variability
• And so on…
PCA: notation
• Data points
– Represented by matrix X of size D × N (one data point per column)
– Let's assume the data is centered (zero mean)
• Principal components are d vectors $v_1, v_2, \ldots, v_d$
– $v_i \cdot v_j = 0$ for $i \neq j$, and $v_i \cdot v_i = 1$
• The sample variance of the data projected on vector $v$ is
$\frac{1}{n} \sum_{i=1}^{n} (v^T x_i)^2 = \frac{1}{n} v^T X X^T v$
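A tiny NumPy check of this identity, as a sketch (the data X and direction v here are random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 5, 200
X = rng.normal(size=(D, N))
X = X - X.mean(axis=1, keepdims=True)   # center the data (one point per column)
v = rng.normal(size=D)
v = v / np.linalg.norm(v)               # unit-length direction

proj_var = np.mean((v @ X) ** 2)        # (1/n) * sum_i (v^T x_i)^2
quad_form = v @ (X @ X.T / N) @ v       # (1/n) * v^T X X^T v
assert np.isclose(proj_var, quad_form)
```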
PCA formally
• Finding vector that maximizes sample
variance of projected data:
$\operatorname{argmax}_{v} \; v^T X X^T v$ such that $v^T v = 1$
• A constrained optimization problem
The Lagrangian folds the constraint into the objective:
$\operatorname{argmax}_{v} \; v^T X X^T v - \lambda (v^T v - 1)$
Solutions are vectors $v$ such that $X X^T v = \lambda v$,
i.e. eigenvectors of $X X^T$ (proportional to the sample
covariance matrix, since the data is centered)
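A short NumPy sketch of this conclusion: the direction maximizing projected variance is the top eigenvector of $X X^T$, which we can verify against the eigen-equation (random illustrative data again):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))
X = X - X.mean(axis=1, keepdims=True)    # centered data, D x N

C = X @ X.T                               # D x D, proportional to the covariance
eigvals, eigvecs = np.linalg.eigh(C)      # symmetric eig; eigenvalues ascending
v1, lam1 = eigvecs[:, -1], eigvals[-1]    # eigenpair with the largest eigenvalue
assert np.allclose(C @ v1, lam1 * v1)     # the eigen-equation X X^T v = lambda v
```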
PCA formally
• The eigenvalue $\lambda$ is the amount of variability
captured along the direction $v$
– Sample variance of the projection: $v^T X X^T v = \lambda v^T v = \lambda$
• If we rank the eigenvalues from largest to smallest
– The 1st PC is the eigenvector of $X X^T$ associated
with the largest eigenvalue
– The 2nd PC is the eigenvector of $X X^T$ associated
with the 2nd largest eigenvalue
– …
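A sketch checking this claim numerically: the sample variance of the data projected on each principal component equals the corresponding eigenvalue of the sample covariance matrix (synthetic data, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 5, 200
X = rng.normal(size=(D, N))
X = X - X.mean(axis=1, keepdims=True)     # centered data

C = X @ X.T / N                            # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]          # rank eigenvalues from large to small
for k in order:
    v = eigvecs[:, k]
    # sample variance of the projections equals the eigenvalue
    assert np.isclose(np.var(v @ X), eigvals[k])
```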
Alternative interpretation of PCA
• PCA finds vectors v such that projecting the
data onto these vectors minimizes
reconstruction error
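To see why the two views agree, expand the reconstruction error for a unit vector $v$ on the centered data (a one-line derivation):

```latex
\sum_{i=1}^{n} \left\| x_i - (v^{T} x_i)\, v \right\|^2
  = \sum_{i=1}^{n} \| x_i \|^2 - \sum_{i=1}^{n} (v^{T} x_i)^2
  \quad \text{(using } v^{T} v = 1 \text{)}
```

The first term does not depend on $v$, so minimizing the reconstruction error over unit vectors is exactly the same as maximizing the projected variance $\sum_i (v^T x_i)^2 = v^T X X^T v$.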
Resulting PCA algorithm
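A minimal NumPy sketch of the resulting algorithm, under the slides' conventions (X is D × N with one data point per column; the function and variable names are illustrative, not from the slides):

```python
import numpy as np

def pca(X, d):
    """PCA sketch. X: D x N data matrix (one point per column); d: #components."""
    # 1. Center the data
    mean = X.mean(axis=1, keepdims=True)
    Xc = X - mean
    # 2. Eigendecompose X X^T (symmetric, so eigh applies)
    eigvals, eigvecs = np.linalg.eigh(Xc @ Xc.T)
    # 3. Keep the d eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:d]
    V = eigvecs[:, order]        # D x d: the principal components
    # 4. Project the centered data onto the principal components
    Z = V.T @ Xc                 # d x N: the low-dimensional representation
    return V, Z, mean
```

A reconstruction of the original data is then mean + V @ Z, tying this back to the reconstruction-error view above.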
How to choose the hyperparameter d?
• i.e. the number of dimensions to keep
• We can ignore the components of smaller
significance, i.e. those with the smallest eigenvalues
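One common heuristic (an assumption here, not stated on the slide): keep the smallest d whose components explain a fixed fraction of the total variance, e.g. 95%. A sketch:

```python
import numpy as np

def choose_d(eigvals, threshold=0.95):
    """Smallest d such that the top-d components explain `threshold` of the
    total variance. eigvals must be sorted in descending order."""
    explained = np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(explained, threshold) + 1)
```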
An example: Eigenfaces
• PCA applied to face images: each principal component is itself a
face-like image (an "eigenface"), and each face is approximated by a
weighted combination of the top eigenfaces
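A hedged sketch of the eigenfaces idea using scikit-learn (assumes scikit-learn is installed; fetch_olivetti_faces downloads a small face dataset on first use, and n_components=100 is an arbitrary choice; note scikit-learn stores one sample per row, the transpose of the slides' convention):

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA

faces = fetch_olivetti_faces()               # 400 images, 64 x 64 = 4096 pixels
pca = PCA(n_components=100).fit(faces.data)  # faces.data: 400 x 4096
eigenfaces = pca.components_.reshape(-1, 64, 64)  # each component is face-shaped
print(pca.explained_variance_ratio_[:10])    # variance captured by the top PCs
```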
PCA pros and cons
• Pros
– Eigenvector method
– No parameters to tune
– No local optima
• Cons
– Only based on covariance (2nd order statistics)
– Limited to linear projections
What you should know
• Formulate K-Means clustering as an optimization
problem
• Choose initialization strategies for K-Means
• Understand the impact of K on the optimization
objective
• Why and how to perform Principal Component
Analysis