Multivariate Analysis PHYSTAT 2003 Harrison B. Prosper 1
Multivariate Analysis: Past, Present and Future
Harrison B. Prosper, Florida State University
PHYSTAT 2003, 10 September 2003
Outline
  Introduction
  Historical Note
  Current Practice
  Issues
  Summary
Introduction
Data are invariably multivariate
  Particle physics (η, φ, E, f)
  Astrophysics (θ, φ, E, t)
Introduction – II: A Textbook Example
Objects
  Jet 1 (b)   3
  Jet 2       3
  Jet 3       3
  Jet 4 (b)   3
  Positron    3
  Neutrino    2
  Total       17
Introduction – III
Astrophysics/Particle physics: Similarities
  Events
  Interesting events occur at random
  Poisson processes
  Backgrounds are important
  Experimental response functions
  Huge datasets
Introduction – IV
Differences
  In particle physics we control when events occur and under what conditions
  We have detailed predictions of the relative frequency of various outcomes
Introduction – V: All we do is Count!
Our experiments are ideal Bernoulli trials
At Fermilab, each collision, that is, trial, is conducted the same way every 400 ns
de Finetti’s analysis of exchangeable trials is an accurate model of what we do

$$P(e_1, \ldots, e_n) = \int_0^1 \mathrm{Binomial}(k, n, p)\, f(p)\, dp$$

$$\mathrm{Binomial}(k, n, p) \to \mathrm{Poisson}(k, np) \quad \text{(large } n \text{, small } p\text{)}$$
Time →
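A quick numerical check of the counting picture, as a sketch only: if each collision is a Bernoulli trial with a small success probability p, the binomial distribution of the count k in n trials is close to a Poisson distribution with mean np. The values of n and p below are illustrative assumptions, not numbers from the talk, and the function names are hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k, mean):
    """Probability of k counts for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Assumed, illustrative numbers: one million trials, success probability 3e-6.
n, p = 1_000_000, 3.0e-6

# Largest pointwise difference between the two distributions for k = 0..14.
max_diff = max(
    abs(binomial_pmf(k, n, p) - poisson_pmf(k, n * p)) for k in range(15)
)
print(f"largest |Binomial - Poisson| for k < 15: {max_diff:.2e}")
```

With p this small the two distributions agree to well below the percent level, which is why counting experiments are usually modeled as Poisson processes.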
Introduction – VI
Typical analysis tasks
  Data Compression
  Clustering and cluster characterization
  Classification/Discrimination
  Estimation
  Model selection/Hypothesis testing
  Optimization
Historical Note
Karl Pearson (1857 – 1936)
P.C. Mahalanobis (1893 – 1972)
R.A. Fisher (1890 – 1962)
Historical Note – Iris Data
Iris setosa
Iris versicolor
R.A. Fisher, The Use of Multiple Measurements in Taxonomic Problems, Annals of Eugenics, v. 7, p. 179-188 (1936)
Iris Data
Variables
  X1 Sepal length
  X2 Sepal width
  X3 Petal length
  X4 Petal width
“What linear function of the four measurements will maximize the ratio of the difference between the specific means to the standard deviations within species?” R.A. Fisher
Fisher Linear Discriminant (1936)
$$y = (\mu_A - \mu_B)^{T}\, \Sigma^{-1} x$$

Solution:

$$y = x_1 + 5.9037\, x_2 - 7.1299\, x_3 - 10.1036\, x_4$$

Which is the same, within a constant, as

$$\log \frac{\mathrm{Gaussian}(x \mid \mu_A, \Sigma)}{\mathrm{Gaussian}(x \mid \mu_B, \Sigma)} = w \cdot x + b$$
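The Fisher linear discriminant can be sketched in a few lines for two variables. This is a minimal illustration under stated assumptions: the toy points, the function names and the use of the summed within-class scatter matrix as the covariance estimate are all choices made here, not taken from the talk or from Fisher's iris analysis.

```python
def mean_vector(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter_matrix(points, mu):
    """Sum of outer products of deviations from the class mean (2 x 2)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Return w proportional to S^-1 (mean_A - mean_B), S = within-class scatter."""
    mu_a, mu_b = mean_vector(class_a), mean_vector(class_b)
    sa, sb = scatter_matrix(class_a, mu_a), scatter_matrix(class_b, mu_b)
    s = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    d = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    return [inv[0][0] * d[0] + inv[0][1] * d[1],
            inv[1][0] * d[0] + inv[1][1] * d[1]]

def project(w, point):
    """Discriminant value y = w . x for one point."""
    return w[0] * point[0] + w[1] * point[1]

# Hypothetical toy data for two classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.5)]
class_b = [(4.0, 0.5), (5.0, 1.5), (6.0, 1.0), (5.0, 0.0)]

w = fisher_direction(class_a, class_b)
y_a = [project(w, p) for p in class_a]
y_b = [project(w, p) for p in class_b]
print("w =", w)
print("class A projections:", y_a)
print("class B projections:", y_b)
```

For these toy points every class A projection lies above every class B projection, so a single cut on y separates the two classes.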
Current Practice in Particle Physics
Reducing number of variables
  Principal Component Analysis (PCA)
Discrimination/Classification
  Fisher Linear Discriminant (FLD)
  Random Grid Search (RGS)
  Feedforward Neural Network (FNN)
  Kernel Density Estimation (KDE)
Current Practice – II
Parameter Estimation
  Maximum Likelihood (ML)
  Bayesian (KDE and analytical methods)
    e.g., see talk by Florencia Canelli (12A)
Weighting
  Usually 0, 1, referred to as “cuts”
  Sometimes use the R. Barlow method
Cuts (0, 1 weights)
Points that lie below the cuts are “cut out”
We refer to (x0, y0) as a cut-point
[Figure: signal (S) and background (B) events in the (x, y) plane with cuts x > x0 and y > y0; events are given weight 0 or 1.]
Grid Search
Apply cuts at each grid point: x > x_i, y > y_i
Compute some measure of their effectiveness and choose most effective cuts
Curse of dimensionality: number of cut-points ~ N_bin^N_dim
[Figure: signal (S) and background (B) events in the (x, y) plane with a regular grid of cut-points.]
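A grid search over cut-points can be sketched as follows. The talk only asks for "some measure" of effectiveness; the figure of merit s / sqrt(s + b) used here is an assumption chosen for illustration, as are the toy events, the grid edges and the function names.

```python
import math

def passes(event, cut):
    """An event (x, y) passes if it lies strictly above the cut in both variables."""
    return event[0] > cut[0] and event[1] > cut[1]

def significance(n_sig, n_bkg):
    """Assumed figure of merit: s / sqrt(s + b), zero if nothing passes."""
    total = n_sig + n_bkg
    return n_sig / math.sqrt(total) if total > 0 else 0.0

def grid_search(signal, background, x_edges, y_edges):
    """Try every grid point as a cut-point and keep the best-scoring one."""
    best_cut, best_score = None, -1.0
    for x0 in x_edges:
        for y0 in y_edges:
            cut = (x0, y0)
            s = sum(passes(e, cut) for e in signal)
            b = sum(passes(e, cut) for e in background)
            score = significance(s, b)
            if score > best_score:  # ties keep the first cut found
                best_cut, best_score = cut, score
    return best_cut, best_score

# Hypothetical toy events.
signal = [(3.0, 3.0), (4.0, 3.5), (3.5, 4.0), (4.5, 4.5)]
background = [(1.0, 1.0), (1.5, 3.2), (3.2, 1.2), (2.0, 2.0)]
edges = [0.0, 1.0, 2.0, 3.0, 4.0]

best_cut, best_score = grid_search(signal, background, edges, edges)
n_cut_points = len(edges) ** 2  # N_bin ** N_dim for two variables
print(best_cut, best_score, n_cut_points)
```

The number of cut-points tried grows as N_bin to the power N_dim, which is the curse of dimensionality noted on the slide: five bins in two variables give 25 cut-points, but five bins in ten variables would give nearly ten million.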
Random Grid Search
Take each point of the signal class as a cut-point: x > x_i, y > y_i
n = # events in sample
k = # events after cuts
fraction = k/n
[Figure: signal fraction versus background fraction, each axis running from 0 to 1.]
H.B.P. et al, Proceedings, CHEP 1995
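The random grid search replaces the regular grid with the signal events themselves. The sketch below is an illustration under stated assumptions: the toy events and function names are hypothetical, and an event sitting exactly on a cut is counted as passing, a convention chosen here for simplicity.

```python
def passes(event, cut):
    """An event passes if every variable is at or above the cut value."""
    return all(v >= c for v, c in zip(event, cut))

def random_grid_search(signal, background):
    """Use each signal event as a cut-point.

    Returns a list of (background fraction, signal fraction, cut-point),
    where each fraction is k/n: events after cuts over events in the sample.
    """
    results = []
    for cut in signal:
        k_sig = sum(passes(e, cut) for e in signal)
        k_bkg = sum(passes(e, cut) for e in background)
        results.append((k_bkg / len(background), k_sig / len(signal), cut))
    return results

# Hypothetical toy events.
signal = [(3.0, 3.0), (4.0, 3.5), (3.5, 4.0), (4.5, 4.5)]
background = [(1.0, 1.0), (1.5, 3.2), (3.2, 1.2), (2.0, 2.0)]

results = random_grid_search(signal, background)
for bkg_frac, sig_frac, cut in results:
    print(f"cut {cut}: signal fraction {sig_frac:.2f}, background fraction {bkg_frac:.2f}")
```

Each result is one point in the plane of signal fraction versus background fraction. The number of cut-points equals the number of signal events, whatever the number of variables, so the search concentrates on the region where signal actually lies.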
Example: DØ Top Discovery (1995)
Optimal Discrimination
Bayes Discriminant

$$r(x, y) = \frac{p(x, y \mid s)\, p(s)}{p(x, y \mid b)\, p(b)}$$

where s denotes signal and b background
r(x,y) = constant defines the optimal decision boundary
[Figure: decision boundary in the (x, y) plane.]
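As a sketch of the Bayes discriminant, suppose the signal and background densities are known. Here they are assumed, purely for illustration, to be unit-width Gaussians centred at (2, 2) and (0, 0); the function names and the default prior are also assumptions.

```python
import math

def gaussian_2d(x, y, mean, sigma):
    """Density of two independent Gaussians sharing a common width sigma."""
    dx, dy = x - mean[0], y - mean[1]
    norm = 2.0 * math.pi * sigma**2
    return math.exp(-0.5 * (dx * dx + dy * dy) / sigma**2) / norm

def bayes_discriminant(x, y, prior_s=0.5):
    """r(x, y) = p(x, y | s) p(s) / [p(x, y | b) p(b)] for the assumed densities."""
    p_s = gaussian_2d(x, y, (2.0, 2.0), 1.0) * prior_s          # assumed signal model
    p_b = gaussian_2d(x, y, (0.0, 0.0), 1.0) * (1.0 - prior_s)  # assumed background model
    return p_s / p_b

def signal_probability(x, y, prior_s=0.5):
    """Posterior p(s | x, y) = r / (1 + r)."""
    r = bayes_discriminant(x, y, prior_s)
    return r / (1.0 + r)

for point in [(0.0, 0.0), (1.0, 1.0), (0.5, 1.5), (2.0, 2.0)]:
    print(point, bayes_discriminant(*point), signal_probability(*point))
```

With equal priors, r = 1 along the line x + y = 2, which is the decision boundary for this toy model; choosing a different constant moves the boundary parallel to itself.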
FeedForward Neural Networks
Applications
  Discrimination
  Parameter estimation
  Function and density estimation
Basic Idea
  Encode mapping (Kolmogorov, 1950s), $f: U^N \to U^K$, $f(x) = [F_1, \ldots, F_K]$, using a set of 1-D functions.
Example: DØ Search for LeptoQuarks
[Figure: Feynman diagram with lines labeled q, g, LQ and l.]
Issues
Method choice
  Life is short and data finite; so how should one choose a method?
Model complexity
  How to reduce dimensionality of data, while minimizing loss of “information”?
  How many model parameters?
  How should one avoid over-fitting?
Issues – II
Model robustness
  Is a cut on a multivariate discriminant necessarily more sensitive to modeling errors than a cut on each of its input variables?
  What is a practical, but useful, way to assess sensitivity to modeling errors and robustness with respect to assumptions?
Issues – III
Accuracy of predictions
  How should one place “error bars” on multivariate-based results?
  Is a Bayesian approach useful?
Goodness of fit
  How can this be done in multiple dimensions?
Summary
After ~ 80 years of effort we have many powerful methods of analysis
A few of these are now used routinely in physics analyses
The most pressing need is to understand some issues better so that when the data tsunami strikes we can respond sensibly
FNN – Probabilistic Interpretation
Minimize the empirical risk function with respect to ω

$$R(\omega) = \frac{1}{N} \sum_i \left[ t_i - n(x_i, \omega) \right]^2$$

Solution (for large N)

$$n(x, \omega) = \int t(x)\, p(t \mid x)\, dt$$

If $t(x) = \delta[1 - I(x)]$, where I(x) = 1 if x is of class k, 0 otherwise

$$n(x, \omega) = p(k \mid x) = \frac{p(x \mid k)\, p(k)}{\sum_{k'} p(x \mid k')\, p(k')}$$

D.W. Ruck et al., IEEE Trans. Neural Networks 1(4), 296-298 (1990)
E.A. Wan, IEEE Trans. Neural Networks 1(4), 303-305 (1990)
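The step from the risk function to the posterior probability can be sketched as follows. The symbol ω for the network parameters is an assumption here, and the argument assumes the network n(x, ω) is flexible enough to reach the minimum at every x.

```latex
% Large-N limit: the sample average becomes an expectation over p(t, x).
R(\omega) \;\longrightarrow\; \int\!\!\int \left[ t - n(x,\omega) \right]^2 p(t \mid x)\, p(x)\; dt\, dx

% Minimizing pointwise in x (set the derivative with respect to n to zero):
\int \left[ t - n(x,\omega) \right] p(t \mid x)\, dt = 0
\quad\Longrightarrow\quad
n(x,\omega) = \int t\, p(t \mid x)\, dt

% With targets t = 1 for class k and t = 0 otherwise, the mean of t is
% the probability that t = 1:
n(x,\omega) = 1 \cdot p(t = 1 \mid x) + 0 \cdot p(t = 0 \mid x) = p(k \mid x)
```

In words, least-squares training drives the network output toward the conditional mean of the target, and for 0/1 targets that conditional mean is the class probability.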
Self Organizing Map
Basic Idea (Kohonen, 1988)
  Map each of K feature vectors X = (x1,..,xN)^T into one of M regions of interest defined by the vector w_m so that all X mapped to a given w_m are closer to it than to all remaining w_m.
  Basically, perform a coarse-graining of the feature space.
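The coarse-graining step, and one training update, can be sketched as below. This is a simplified illustration, not Kohonen's full algorithm: the prototypes sit on a 1-D chain, the neighbourhood is a fixed radius with neighbours moving half as far as the winner, and the function names, learning rate and toy vectors are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_region(x, prototypes):
    """Index m of the prototype w_m closest to the feature vector x."""
    return min(range(len(prototypes)),
               key=lambda m: squared_distance(x, prototypes[m]))

def som_update(x, prototypes, rate=0.5, radius=1):
    """One Kohonen-style step on a 1-D chain of prototypes (modifies in place)."""
    winner = nearest_region(x, prototypes)
    for m, w in enumerate(prototypes):
        if abs(m - winner) <= radius:
            # The winner moves by `rate`; its chain neighbours move half as far.
            step = rate if m == winner else 0.5 * rate
            prototypes[m] = [wi + step * (xi - wi) for wi, xi in zip(w, x)]
    return winner

# Hypothetical prototypes w_0, w_1, w_2 and one feature vector.
prototypes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
x = [0.9, 1.2]

region_before = nearest_region(x, prototypes)
winner = som_update(x, prototypes)
print(region_before, winner, prototypes)
```

After training, `nearest_region` alone performs the coarse-graining: every feature vector is replaced by the index of its closest prototype.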
Support Vector Machines
Basic Idea
  Data that are non-separable in N dimensions have a higher chance of being separable if mapped into a space of higher dimension

  $$\phi: \mathbb{R}^N \to \mathbb{R}^{\mathrm{Huge}}$$

  Use a linear discriminant to partition the high dimensional feature space.

  $$D(x) = w \cdot \phi(x) + b$$
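The mapping idea can be illustrated with a toy example in one dimension. This sketch shows only why a higher-dimensional feature space helps; it does not train a support vector machine or maximize a margin. The feature map x -> (x, x^2), the hand-picked weights and the data are all assumptions made for illustration.

```python
def phi(x):
    """Assumed feature map from 1 dimension to 2: x -> (x, x^2)."""
    return (x, x * x)

def discriminant(x, w=(0.0, 1.0), b=-2.5):
    """Linear discriminant in feature space; w and b are chosen by hand here."""
    return sum(wi * fi for wi, fi in zip(w, phi(x))) + b

def separable_by_threshold(positives, negatives):
    """True if a single cut on x alone puts the classes on opposite sides."""
    return max(positives) < min(negatives) or max(negatives) < min(positives)

# Hypothetical toy data: one class surrounds the other on the x axis.
positives = [-3.0, -2.0, 2.0, 3.0]
negatives = [-1.0, 0.0, 1.0]

print("separable in 1-D:", separable_by_threshold(positives, negatives))
print("D(x) for positives:", [discriminant(x) for x in positives])
print("D(x) for negatives:", [discriminant(x) for x in negatives])
```

No cut on x separates the two classes, but in the (x, x^2) plane the line x^2 = 2.5 does, so the sign of the linear discriminant classifies every point correctly.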
Independent Component Analysis
Basic Idea
  Assume X = (x1,..,xN)^T is a linear sum X = AS of independent sources S = (s1,..,sN)^T. Both A, the mixing matrix, and S are unknown.
  Find a de-mixing matrix T such that the components of U = TX are statistically independent