A Plot for Visualizing Multivariate Data


Rida E. A. Moustafa

George Mason University; ADM Group, AAL

rmoustaf@galaxy.gmu.edu, rmustafa@aalcpas.com

Talk Outline

The Theory of MV-Plot. Detecting Linear Structures with MV-plot. Detecting Non-Linear Structures with MV-plot. Comparisons with other methods and application on real data.

MV-Plot Theory

Given an observation x = (x_1, x_2, …, x_d), we define m and v as follows:

$$m = f(x) = \frac{1}{d}\sum_{j=1}^{d} |x_j|$$

$$v = g(x, f(x)) = \sqrt{\frac{1}{d}\sum_{j=1}^{d} |x_j - f(x)|^2}$$

Computing m and v for every observation produces a vector of m values and a vector of v values.
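The per-observation computation can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes m is the mean of the absolute coordinate values and v is the root-mean-square deviation of the coordinates from m. The function names `mv_point` and `mv_transform` and the toy data are made up for the example.

```python
import math

def mv_point(x):
    """Return (m, v) for one observation x, a sequence of d numbers."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d                   # mean of absolute coordinate values
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)  # RMS deviation of coordinates from m
    return m, v

def mv_transform(data):
    """Map every row of `data` to (m, v) and return the two vectors."""
    pairs = [mv_point(row) for row in data]
    return [m for m, _ in pairs], [v for _, v in pairs]

# Toy data, made up for illustration: 3 observations in 3 dimensions.
data = [[0.2, 0.4, 0.6], [0.5, 0.5, 0.5], [0.1, 0.9, 0.5]]
m_vec, v_vec = mv_transform(data)
for m, v in zip(m_vec, v_vec):
    print(f"m = {m:.4f}, v = {v:.4f}")
```

Plotting `v_vec` against `m_vec` as a scatter plot would then give the MV-plot for the data set.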

What is the relationship between m and v?

MV-Relationship in 2-d

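$$m_i = \frac{1}{2}\sum_{j=1}^{2} |x_{ij}| = \frac{1}{2}\left(|x_{i1}| + |x_{i2}|\right)$$

$$v_i = \sqrt{\frac{1}{2}\sum_{j=1}^{2} |x_{ij} - m_i|^2} = \frac{1}{2}\,|x_{i1} - x_{i2}|$$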

• Normalizing the data to the range (0,1) avoids the absolute value in computing m.
• Close to the PC in 2-d.
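A quick numerical check of the 2-d case, as a sketch under stated assumptions: the data are min-max normalized so the absolute value in m can be dropped, and v is taken as the root-mean-square deviation from m. Under those assumptions v reduces to half the absolute difference of the two coordinates. The helper `minmax_normalize` and the random test data are hypothetical.

```python
import math
import random

def minmax_normalize(column):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(column), max(column)
    return [(c - lo) / (hi - lo) for c in column]

random.seed(0)
# Made-up 2-d data, normalized column by column.
x1 = minmax_normalize([random.gauss(0, 1) for _ in range(200)])
x2 = minmax_normalize([random.gauss(0, 1) for _ in range(200)])

max_gap = 0.0
for a, b in zip(x1, x2):
    m = (a + b) / 2                                   # no abs needed: data are non-negative
    v = math.sqrt(((a - m) ** 2 + (b - m) ** 2) / 2)  # RMS deviation from m
    max_gap = max(max_gap, abs(v - abs(a - b) / 2))   # compare with |x1 - x2| / 2

print(max_gap)  # expected to be at floating-point rounding level
```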

MV- detects linear structure(s)

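For 2-d data lying on the line

$$x_{i2} = w_1 x_{i1} + w_0$$

the definitions give, for non-negative data with x_{i1} ≥ x_{i2},

$$m_i = \frac{1}{2}\left[(1 + w_1)\,x_{i1} + w_0\right]; \quad v_i = \frac{1}{2}\left[(1 - w_1)\,x_{i1} - w_0\right]$$

so if

$$a_1 = \frac{1 + w_1}{2},\; a_0 = \frac{w_0}{2}; \quad a'_1 = \frac{1 - w_1}{2},\; a'_0 = -\frac{w_0}{2}$$

then

$$m_i = a_1 x_{i1} + a_0; \quad v_i = a'_1 x_{i1} + a'_0$$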

If the data are linear in the original space, they will be linear in the MV-space.
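The 2-d claim can be checked numerically. The sketch below assumes m is the mean of absolute coordinates and v the root-mean-square deviation from m, and uses an arbitrary line x2 = w1·x1 + w0 restricted to the region x1 ≥ x2 ≥ 0. Eliminating x1 from the expressions for m and v gives v = [(1 − w1)·m − w0] / (1 + w1); that closed form is derived here for the example and is not quoted from the slides. The values of `w1` and `w0` are made up.

```python
import math

def mv_point(x):
    """Return (m, v): mean absolute coordinate and RMS deviation from it."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

w1, w0 = 0.5, 0.1                          # hypothetical line: x2 = 0.5 * x1 + 0.1
xs = [0.3 + 0.05 * k for k in range(15)]   # x1 from 0.3 to 1.0, where x1 >= x2
points = [(x1, w1 * x1 + w0) for x1 in xs]
mv = [mv_point(p) for p in points]

# Line predicted in MV-space (derived for this sketch by eliminating x1).
slope = (1 - w1) / (1 + w1)
intercept = -w0 / (1 + w1)
residuals = [abs(v - (slope * m + intercept)) for m, v in mv]
print(max(residuals))  # expected to be at floating-point rounding level
```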

MV- detects linear structure(s)

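For d-dimensional data lying on the hyperplane

$$x_{id} = w_0 + \sum_{j=1}^{d-1} w_j x_{ij}$$

the mean and the deviation of the last coordinate from it are

$$m_i = \frac{1}{d}\left[w_0 + \sum_{j=1}^{d-1} (1 + w_j)\,x_{ij}\right]$$

$$x_{id} - m_i = \frac{1}{d}\left[(d-1)\,w_0 + \sum_{j=1}^{d-1} \left((d-1)\,w_j - 1\right) x_{ij}\right]$$

and m and v are written in the linear forms

$$m_i = a_0 + \sum_{j=1}^{d-1} a_j x_{ij}; \quad v_i = a'_0 + \sum_{j=1}^{d-1} a'_j x_{ij}$$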
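As a sanity check on the d-dimensional case, the following sketch verifies two identities numerically for points on a hyperplane x_d = w_0 + Σ w_j·x_j with non-negative coordinates: m = (1/d)[w_0 + Σ (1 + w_j)·x_j], and x_d − m = (1/d)[(d − 1)·w_0 + Σ ((d − 1)·w_j − 1)·x_j]. Both follow from substituting the hyperplane into the mean; the weights, dimension, and sample points are made up, and the check covers only m and the last-coordinate deviation, not v itself.

```python
import random

random.seed(1)
d = 5
# w[0] is the intercept w_0; w[1..d-1] are the slopes w_j (hypothetical values).
w = [0.1] + [random.uniform(0.0, 0.2) for _ in range(d - 1)]

max_err_m = 0.0
max_err_dev = 0.0
for _ in range(100):
    x = [random.random() for _ in range(d - 1)]               # free coordinates in [0, 1)
    x_d = w[0] + sum(w[j + 1] * x[j] for j in range(d - 1))   # last coordinate on the hyperplane
    full = x + [x_d]
    m = sum(full) / d                                         # data are non-negative, so no abs
    m_formula = (w[0] + sum((1 + w[j + 1]) * x[j] for j in range(d - 1))) / d
    dev_formula = ((d - 1) * w[0]
                   + sum(((d - 1) * w[j + 1] - 1) * x[j] for j in range(d - 1))) / d
    max_err_m = max(max_err_m, abs(m - m_formula))
    max_err_dev = max(max_err_dev, abs((x_d - m) - dev_formula))

print(max_err_m, max_err_dev)  # both expected to be at rounding level
```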

Detecting Linear structure(s) Example I

Detecting Linear structure(s) Example II

Detecting Linear structure(s) Example III

Detecting nonlinear data with MV-plot

The MV-plot can detect nonlinear structure in the data set without any changes to the equations.

Detecting nonlinear structure

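$$(x, \cos x): \quad m = \tfrac{1}{2}\left(x + \cos x\right), \quad v = \tfrac{1}{2}\,|x - \cos x|$$

$$(x, \sin x): \quad m = \tfrac{1}{2}\left(x + \sin x\right), \quad v = \tfrac{1}{2}\,|x - \sin x|$$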
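The sine example can be reproduced with the same general m and v computation, with no change for the nonlinear case. This sketch assumes m is the mean absolute coordinate and v the root-mean-square deviation from m, and samples x on [0, π], where both x and sin x are non-negative; it then compares the result with (x + sin x)/2 and |x − sin x|/2. The sampling grid is an arbitrary choice.

```python
import math

def mv_point(x):
    """Return (m, v): mean absolute coordinate and RMS deviation from it."""
    d = len(x)
    m = sum(abs(xj) for xj in x) / d
    v = math.sqrt(sum((xj - m) ** 2 for xj in x) / d)
    return m, v

xs = [math.pi * k / 50 for k in range(51)]            # 51 samples on [0, pi]
curve = [mv_point((x, math.sin(x))) for x in xs]      # MV image of the curve (x, sin x)

max_gap = 0.0
for x, (m, v) in zip(xs, curve):
    max_gap = max(max_gap,
                  abs(m - (x + math.sin(x)) / 2),
                  abs(v - abs(x - math.sin(x)) / 2))
print(max_gap)  # expected to be at floating-point rounding level
```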

Detecting Sphere(s)

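Case I:
• The sphere has radius R.
• The sphere center is the origin.

With m_i computed without the absolute value, m_i = (1/d) Σ_j x_{ij}:

$$v_i^2 = \frac{1}{d}\sum_{j=1}^{d} (x_{ij} - m_i)^2 = \frac{1}{d}\sum_{j=1}^{d} x_{ij}^2 - m_i^2$$

$$v_i^2 + m_i^2 = \frac{R^2}{d}.$$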
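For a sphere of radius R centered at the origin, the relation v² + m² = R²/d can be checked on random points. This is a sketch with an explicit assumption: m is taken as the plain (signed) coordinate mean, i.e. the absolute value is dropped, because with the absolute-value form the identity holds only for points whose coordinates are all non-negative. The dimension, radius, and the helper `random_sphere_point` are made up for the example.

```python
import math
import random

random.seed(2)
d, R = 4, 3.0   # hypothetical dimension and radius

def random_sphere_point(d, R):
    """Draw a point on the sphere of radius R by normalizing a Gaussian vector."""
    g = [random.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in g))
    return [R * t / norm for t in g]

max_gap = 0.0
for _ in range(500):
    x = random_sphere_point(d, R)
    m = sum(x) / d                             # signed mean (assumption: no abs)
    v2 = sum((t - m) ** 2 for t in x) / d      # v squared
    max_gap = max(max_gap, abs(v2 + m * m - R * R / d))

print(max_gap)  # expected to be at floating-point rounding level
```

In the MV-plot such points therefore fall on a circular arc of radius R/√d around the origin of the (m, v) plane.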

Detecting Sphere(s)

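Case II:
• The sphere has radius R.
• The sphere center x^c = (x_1^c, …, x_d^c) is not the origin.

$$\sum_{j=1}^{d} (x_{ij} - x_j^c)^2 = R^2$$

$$v_i^2 = \frac{1}{d}\sum_{j=1}^{d} x_{ij}^2 - m_i^2 = \frac{R^2}{d} + \frac{1}{d}\sum_{j=1}^{d} x_j^c\,(2x_{ij} - x_j^c) - m_i^2$$

$$v_i^2 + m_i^2 = \frac{R^2}{d} + \frac{1}{d}\sum_{j=1}^{d} x_j^c\,(2x_{ij} - x_j^c).$$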
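The shifted-sphere case can be checked the same way. Expanding Σ (x_j − c_j)² = R² gives Σ x_j² = R² + Σ c_j·(2x_j − c_j), so v² + m² = R²/d + (1/d) Σ c_j·(2x_j − c_j) when m is the plain coordinate mean. That expansion is worked out for this sketch rather than taken from the slides, and the center `c`, radius, and dimension below are made up; the center is chosen so that all coordinates stay positive.

```python
import math
import random

random.seed(3)
d, R = 4, 3.0
c = [5.0, 6.0, 7.0, 8.0]   # hypothetical center, far enough from 0 that all coordinates are positive

def random_sphere_point(center, R):
    """Draw a point on the sphere of radius R around `center`."""
    g = [random.gauss(0, 1) for _ in center]
    norm = math.sqrt(sum(t * t for t in g))
    return [cj + R * t / norm for cj, t in zip(center, g)]

max_gap = 0.0
for _ in range(500):
    x = random_sphere_point(c, R)
    m = sum(x) / d                                   # coordinates are positive, so no abs needed
    v2 = sum((t - m) ** 2 for t in x) / d            # v squared
    offset = sum(cj * (2 * xj - cj) for cj, xj in zip(c, x)) / d
    max_gap = max(max_gap, abs(v2 + m * m - (R * R / d + offset)))

print(max_gap)  # expected to be at floating-point rounding level
```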

Detecting Sphere(s)

Application on Real data

• Fisher’s IRIS data (150×4): 3 classes (50 points each)
• Process control data (600×60): 6 classes (100 points each)
• Pollen data (3,848×5) (Wegman’s data): 2 classes (linear and nonlinear)

Related Dimensional Reduction Methods

• Multidimensional Scaling
• Fisher Discriminant Analysis
• Principal Components

IRIS (R. A. Fisher) Dataset: 150 cases in 4 dimensions

Time Series Dataset: 600 cases in 60 dimensions

Pollen dataset: 3,848 points in 5 dimensions

Other methods: require more storage and more computing time. Even if they work, we expect poor results on this particular data set.

(Wegman 2002)

Pollen dataset

Linear and Nonlinear mixed structures.

The linear structure in the Pollen data set

17+16+18+17+14+16=98 Linear, 3750 nonlinear

Summary

• The MV-algorithm can discover linear and nonlinear patterns at the same time.
• The MV-algorithm can discover symmetric data.
• The MV-algorithm deals with large multivariate data.
