ENV 2006 4.1 Envisioning Information Lecture 4 – Multivariate Data Exploration Glyphs and other methods Hierarchical approaches Ken Brodlie


ENV 2006 4.1

Envisioning Information

Lecture 4 – Multivariate Data Exploration

Glyphs and other methods

Hierarchical approaches

Ken Brodlie


ENV 2006 4.2

Glyph Techniques


ENV 2006 4.3

Glyph Techniques

• Map data values to geometric and colour attributes of a glyph – or marker symbol

• Very many types of glyph have been suggested:

– Star glyphs
– Faces
– Arrows
– Sticks
– Shape coding


ENV 2006 4.4

Glyph Layouts

• How do we place the glyphs on a chart?

• Sometimes there will be a natural location – for example?

• If not… two of the variates can be allocated to spatial position, and the remainder to the attributes of the glyph
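The split between position and glyph attributes can be sketched in a few lines. This is only an illustration: the function name `split_for_glyph` and the choice of which two variates drive position are assumptions, and the record uses iris-style variable names with illustrative values.

```python
def split_for_glyph(record, x_var, y_var):
    """Allocate two variates to position; the rest become glyph attributes."""
    position = (record[x_var], record[y_var])
    attributes = {k: v for k, v in record.items() if k not in (x_var, y_var)}
    return position, attributes

# Illustrative observation (variable names and values are assumptions).
flower = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}
position, attributes = split_for_glyph(flower, "sepal_length", "sepal_width")
```

Here the sepal measurements place the glyph on the chart, and the two petal measurements are left over to drive its shape or colour.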


ENV 2006 4.5

Glyph Techniques – Star Plots

• Each observation represented as a ‘star’

• Each spike represents a variable

• Length of spike indicates the value
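As a minimal sketch of the geometry, assuming spikes are spaced evenly around the centre and each variable is scaled to [0, 1] by its own minimum and maximum (the function name `star_glyph` and its parameters are illustrative, not taken from any particular tool):

```python
import math

def star_glyph(values, minima, maxima, radius=1.0):
    """Return the (x, y) tip of each spike for one observation."""
    n = len(values)
    tips = []
    for i, (v, lo, hi) in enumerate(zip(values, minima, maxima)):
        # Scale each variable to [0, 1] so spike lengths are comparable.
        t = 0.0 if hi == lo else (v - lo) / (hi - lo)
        # Assumption: one spike per variable, evenly spaced around the centre.
        angle = 2.0 * math.pi * i / n
        length = radius * t
        tips.append((length * math.cos(angle), length * math.sin(angle)))
    return tips

# One observation with four variables, each ranging over 0..10.
tips = star_glyph([5.0, 0.0, 10.0, 2.5], [0.0] * 4, [10.0] * 4)
```

Joining consecutive tips, or drawing a line from the centre to each tip, gives the star for that observation.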


ENV 2006 4.6

Glyph Techniques – Star Plots


Crime in Detroit


ENV 2006 4.7

Star Glyphs – Iris Data Set


ENV 2006 4.8

Chernoff Faces

• Chernoff suggested use of faces to encode a variety of variables

– Can map to size, shape, colour of facial features
– Human brain rapidly recognises faces


ENV 2006 4.9

Chernoff Faces

• Here are some of the facial features you can use

http://www.bradandkathy.com/software/faces.html


ENV 2006 4.10

Chernoff Faces

• Demonstration applet at:

– http://www.hesketh.com/schampeo/projects/Faces/


ENV 2006 4.11

Chernoff’s Face

• ... and here is Chernoff’s face

http://www.fas.harvard.edu/~stats/People/Faculty/Herman_Chernoff/Herman_Chernoff_Index.html


ENV 2006 4.12

Stick Figures

• Glyph is a matchstick figure, with variables mapped to angle and length of limbs

• As with Chernoff faces, two variables are mapped to display axes

• Stick figures useful for very large data sets

• Texture patterns emerge

• Idea due to RM Pickett & G Grinstein

Figure: the different angles that may be varied are shown
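A rough sketch of how such a glyph could be built from angles and lengths follows. The topology here is an assumption made for illustration (a body with two limbs at its top end and two at its bottom end), and the names `segment` and `stick_figure` are hypothetical; the actual family of figures used by Pickett and Grinstein may differ.

```python
import math

def segment(start, angle_deg, length):
    """Return a line segment from start, at the given angle and length."""
    a = math.radians(angle_deg)
    end = (start[0] + length * math.cos(a), start[1] + length * math.sin(a))
    return (start, end)

def stick_figure(origin, body, limbs_top, limbs_bottom):
    """body and each limb are (angle_in_degrees, length) pairs.

    Assumed topology: top limbs hang off the far end of the body,
    bottom limbs off its start.
    """
    body_seg = segment(origin, *body)
    top = [segment(body_seg[1], *limb) for limb in limbs_top]
    bottom = [segment(origin, *limb) for limb in limbs_bottom]
    return [body_seg] + top + bottom

# Five data values could set the five angles; lengths are fixed here.
figure = stick_figure(
    (0.0, 0.0),
    (90, 1.0),
    [(45, 0.5), (135, 0.5)],
    [(225, 0.5), (315, 0.5)],
)
```

With the glyph origin set by two further variables on the display axes, many thousands of such figures drawn close together produce the texture effect described above.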


ENV 2006 4.13

Stick Figures

5D image data from Great Lakes region


ENV 2006 4.14

Shape Coding

• Suitable where a variable has a Boolean value, i.e. on/off

• A data item is represented as an array of elements, each element corresponding to a variable

Figure: array of six boxes, numbered 1 to 6; shade in box if value of corresponding variable is ‘on’

Arrays laid out in a line, or plane, as with other icon-based methods
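The idea can be sketched with a text rendering, assuming each data item is a list of six Boolean variables and using `#` for a shaded box and `.` for an empty one (the function name `shape_code` and the sample items are illustrative):

```python
def shape_code(flags, on="#", off="."):
    """Render one data item as a row of boxes, shaded where the variable is on."""
    return "".join(on if flag else off for flag in flags)

# Three hypothetical data items, six Boolean variables each.
items = [
    [True, False, True, True, False, False],
    [False, False, True, False, False, True],
    [True, True, True, True, True, True],
]

# Lay the arrays out in a line, one glyph per data item.
line = " ".join(shape_code(item) for item in items)
```

A graphical version would draw a small grid of filled or empty rectangles per item, but the mapping from variables to boxes is the same.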


ENV 2006 4.15

Shape Coding

Time series of NASA earth observation data


ENV 2006 4.16

Daisy Charts

• Variables and their values placed around circle

• Lines connect the values for one observation

Figure: daisy chart with the values Dry, Wet, Showery; Saturday, Sunday; Leeds, Sahara, Amazon placed around the circle. This item is { wet, Saturday, Amazon }

http://www.daisy.co.uk
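A minimal sketch of the layout, using the weather example from the slide. It assumes values are spaced evenly around the circle and that an observation's values are joined in variable order; the Daisy tool itself may place and connect them differently, and the function names are hypothetical.

```python
import math

def daisy_positions(domains, radius=1.0):
    """Place every (variable, value) pair evenly around a circle."""
    slots = [(var, val) for var, vals in domains.items() for val in vals]
    positions = {}
    for i, slot in enumerate(slots):
        angle = 2.0 * math.pi * i / len(slots)
        positions[slot] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions

def daisy_lines(observation, positions):
    """Return the line segments linking the values of one observation."""
    # Assumption: consecutive variables are joined, in dictionary order.
    points = [positions[(var, val)] for var, val in observation.items()]
    return list(zip(points, points[1:]))

domains = {
    "weather": ["Dry", "Wet", "Showery"],
    "day": ["Saturday", "Sunday"],
    "place": ["Leeds", "Sahara", "Amazon"],
}
positions = daisy_positions(domains)
item = {"weather": "Wet", "day": "Saturday", "place": "Amazon"}
segments = daisy_lines(item, positions)
```

Drawing the segments for every observation on the same circle shows which combinations of values occur together most often.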


ENV 2006 4.17

Daisy Charts - Underground Problems


ENV 2006 4.18

Daisy Charts – News Analysis

• Four variates: day, source, search terms, keywords


ENV 2006 4.19

Reducing Complexity in Multivariate Data Exploration


ENV 2006 4.20

Clustering as a Solution

• Success has been achieved through clustering of observations

• Hierarchical parallel co-ordinates

– Cluster by similarity
– Display using translucency and proximity-based colour

http://davis.wpi.edu/~xmdv/docs/vis99_HPC.pdf
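To make "cluster by similarity" concrete, here is a small agglomerative clustering sketch. It is a generic centroid-based method chosen for brevity, not necessarily the algorithm used in the hierarchical parallel co-ordinates work, and the sample data are invented.

```python
import math

def centroid(cluster):
    """Mean point of a cluster of equal-length tuples."""
    n = len(cluster)
    return [sum(p[d] for p in cluster) / n for d in range(len(cluster[0]))]

def agglomerate(points, k):
    """Repeatedly merge the two closest clusters until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 2D observations forming three loose groups.
data = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (9.0, 0.0)]
groups = agglomerate(data, 3)
```

Recording each merge rather than stopping at `k` would give the full hierarchy, so the display can show a cluster as a single band and open it up on demand.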


ENV 2006 4.21

Comparison

One of 3 clusters


ENV 2006 4.22

Hierarchical Parallel Co-ordinates


ENV 2006 4.23

Reduction of Dimensionality of Variable Space

• Reduce number of variables, preserve information

• Principal Component Analysis

– Transform to new co-ordinate system
– Hard to interpret

• Hierarchical reduction of variable space

– Cluster variables where distance between observations is typically small

– Choose representative for each cluster

• Subgroup has then been identified – showing what?
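For the Principal Component Analysis option, the sketch below finds only the first principal component, using power iteration on the covariance matrix in plain Python. It assumes the data are not constant (otherwise the normalisation divides by zero); the function name and the sample data are illustrative.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the leading principal axis and each row's score along it."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix of the variables.
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(m)]
           for a in range(m)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centred observation onto the new axis.
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    return v, scores

# Hypothetical observations lying exactly on the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
axis, scores = first_principal_component(rows)
```

The returned axis is a weighted mixture of the original variables, which is why the transformed co-ordinates are hard to interpret.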

http://davis.wpi.edu/%7Exmdv/docs/vhdr_vissym.pdf

42 dimensions, 200 observations
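The hierarchical reduction idea (cluster the variables, then choose a representative for each cluster) can be sketched as follows. This is a simple greedy grouping chosen for illustration, not the method of the cited paper: it assumes all variables are already scaled to [0, 1], measures distance as the mean absolute difference over observations, and takes the first member of each cluster as its representative.

```python
def variable_distance(rows, a, b):
    """Mean absolute difference between two variables over all observations."""
    return sum(abs(r[a] - r[b]) for r in rows) / len(rows)

def reduce_variables(rows, threshold):
    """Group similar variables and pick one representative per group."""
    m = len(rows[0])
    clusters = []  # each cluster is a list of column indices
    for j in range(m):
        for cluster in clusters:
            # Join the first cluster whose representative is close enough.
            if variable_distance(rows, cluster[0], j) <= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    representatives = [c[0] for c in clusters]
    return clusters, representatives

# Hypothetical data: variables 0 and 1 behave alike, as do 2 and 3.
rows = [
    (0.10, 0.12, 0.90, 0.88),
    (0.50, 0.48, 0.20, 0.25),
    (0.80, 0.83, 0.40, 0.38),
]
clusters, reps = reduce_variables(rows, threshold=0.1)
```

Only the representative columns then need to be displayed, which keeps the original variables and so stays easier to interpret than a transformed co-ordinate system.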