Visualisation 2012 - 2013, Lecture 4: Visualising Comparisons. Brian Mac Namee, Dublin Institute of Technology, Applied Intelligence Research Centre


Page 1: Visualisation 2012 - 2013 Lecture  4

Visualisation 2012 - 2013

Lecture 4

Brian Mac Namee
Dublin Institute of Technology

Applied Intelligence Research Centre

Visualising Comparisons

Page 2: Visualisation 2012 - 2013 Lecture  4

Origins

This course is based heavily on a course developed by Colman McMahon (www.colmanmcmahon.com).

Material from multiple other online and published sources is also used; when this is the case, full citations will be given.

Page 3: Visualisation 2012 - 2013 Lecture  4

Visualization of the Week

www.pinterest.com/brianmacnamee/great-visualisation-examples/

Page 4: Visualisation 2012 - 2013 Lecture  4

(Un)Visualization of the Week

www.pinterest.com/brianmacnamee/terrible-visualisation-examples/

Page 5: Visualisation 2012 - 2013 Lecture  4

Agenda

This week we are going to look at ways in which we can visualise comparisons between variable values:

- Single variable exploration
- Simple comparisons
- Multi distribution comparisons

Page 6: Visualisation 2012 - 2013 Lecture  4

SINGLE VARIABLE EXPLORATION

Page 7: Visualisation 2012 - 2013 Lecture  4

Histogram

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 8: Visualisation 2012 - 2013 Lecture  4

Histogram

A histogram gives us an in-depth view of a single numeric variable.

To construct a histogram:

- Divide the data range into bins
- Count how many data values fall into each bin
- Normalise the frequency counts
- Plot a bar graph to show the normalised count for each bin

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
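The four construction steps can be sketched in a few lines of Python. This is only an illustration: the function names, the sample data, the equal-width bins and the choice to normalise to relative frequencies (so the bars sum to 1) are all assumptions, not something the slides prescribe.

```python
def histogram(values, num_bins):
    """Return (bin_edges, normalised_counts) for a list of numbers."""
    lo, hi = min(values), max(values)
    # Step 1: divide the data range into equal-width bins (an assumption).
    width = (hi - lo) / num_bins or 1.0  # fall back to 1.0 if all values are equal
    edges = [lo + i * width for i in range(num_bins + 1)]

    # Step 2: count how many values fall into each bin.
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1

    # Step 3: normalise the counts to relative frequencies (assumed scheme).
    total = len(values)
    normalised = [c / total for c in counts]
    return edges, normalised


def plot_text_bars(edges, normalised, scale=40):
    """Step 4: draw one text bar per bin, with length proportional to its share."""
    lines = []
    for left, right, share in zip(edges, edges[1:], normalised):
        lines.append(f"{left:6.1f} to {right:6.1f} | " + "#" * round(share * scale))
    return "\n".join(lines)


data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # made-up sample values
edges, shares = histogram(data, num_bins=4)
print(plot_text_bars(edges, shares))
```

Changing `num_bins` is a quick way to see how strongly the apparent shape of a histogram depends on the bin width.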

Page 9: Visualisation 2012 - 2013 Lecture  4

Histogram

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 10: Visualisation 2012 - 2013 Lecture  4

Histogram Shapes

Page 11: Visualisation 2012 - 2013 Lecture  4

Density Plot

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 12: Visualisation 2012 - 2013 Lecture  4

Density Plot

Note that constructing a density plot requires estimating the probability density function underlying the data shown in the histogram. This takes a bit of work!

Common approaches include:

- Parzen windows
- Clustering
- Mixture models
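As a rough sketch of the first approach, a Parzen window estimate places a small kernel on every data point and averages them. The Gaussian kernel, the bandwidth of 0.5, the sample values and the function names below are assumptions chosen for illustration only.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the window function (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(samples, x, bandwidth):
    """Estimate the density at x as the average of kernels centred on each sample."""
    n = len(samples)
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (n * bandwidth)


samples = [1.0, 1.5, 2.0, 2.2, 3.0, 5.0]       # made-up data
step = 0.1
grid = [i * step for i in range(-40, 101)]      # evaluate from -4.0 to 10.0
density = [parzen_density(samples, x, bandwidth=0.5) for x in grid]

# A density should integrate to roughly 1; check with a simple Riemann sum.
area = sum(d * step for d in density)
print(f"approximate area under the curve: {area:.3f}")
```

A larger bandwidth gives a smoother curve and a smaller one a spikier curve, which mirrors the effect of bin width in a histogram.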

Page 13: Visualisation 2012 - 2013 Lecture  4

Density Plot

From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window

Page 14: Visualisation 2012 - 2013 Lecture  4

Density Plot

From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window

Page 15: Visualisation 2012 - 2013 Lecture  4

Density Plot

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 16: Visualisation 2012 - 2013 Lecture  4

Histogram & Density Plot Combined

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 17: Visualisation 2012 - 2013 Lecture  4

Histogram

The histogram is quite possibly your most important visual data exploration tool!

Page 18: Visualisation 2012 - 2013 Lecture  4

Box Plot


Page 19: Visualisation 2012 - 2013 Lecture  4

Box Plot

VARIABLE VALUES: Values displayed for a single variable

MEDIAN: The median value for the variable

3rd QUARTILE: The value for the 3rd quartile of the variable values

1st QUARTILE: The value for the 1st quartile of the variable values

OUTLIERS: Values that fall below 1st Q - 1.5*IQR or above 3rd Q + 1.5*IQR, where IQR (the interquartile range) is 3rd Q minus 1st Q

MAX: Max value below 3rd Q + 1.5*IQR

MIN: Min value above 1st Q - 1.5*IQR

Page 20: Visualisation 2012 - 2013 Lecture  4

Box Plot

The components of a box plot are:

- A thick dark line at the median
- A horizontal line at the 1st quartile
- A horizontal line at the 3rd quartile
- A whisker down to the low value
  • Multiply the IQR by 1.5 to calculate the step
  • The low value is the lowest data value that is still above (1st quartile minus the step)
- A whisker up to the high value
  • The high value is the highest data value that is still below (3rd quartile plus the step)
- Any values outside low and high are marked as outliers
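The components listed above can be computed with the standard library, as in the following sketch. The sample data and the function name are made up, and the quartile convention (`method="inclusive"`) is an assumption: different tools compute quartiles slightly differently, so whisker positions can vary between packages.

```python
import statistics


def box_plot_stats(values):
    """Compute the quantities needed to draw a box plot for one variable."""
    # Quartiles: one of several common conventions (an assumption).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    step = 1.5 * (q3 - q1)                  # 1.5 times the interquartile range
    low_fence, high_fence = q1 - step, q3 + step

    # Whiskers end at the most extreme data values inside the fences
    # (values exactly on a fence are treated as inside here).
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "low": min(inside),
        "high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


stats = box_plot_stats([2, 4, 4, 5, 6, 7, 8, 9, 30])  # made-up data
print(stats)
```

For this sample the box spans 4 to 8, the step is 6, the whiskers reach 2 and 9, and the value 30 is flagged as an outlier.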

Page 21: Visualisation 2012 - 2013 Lecture  4

Box Plot

Some important points about a box plot:

- 50% of the data occurs between the lower and upper edges of the box

- The lower 50% of the data occurs below the median line in the box

- The upper 50% of the data occurs above the median line in the box

- The lower 25% of the data occurs between the bottom edge of the box and the bottom edge of the lower whisker (or beyond it, as outliers)

- The upper 25% of the data occurs between the top edge of the box and the top edge of the upper whisker (or beyond it, as outliers)

Page 22: Visualisation 2012 - 2013 Lecture  4

Bar Chart

Page 23: Visualisation 2012 - 2013 Lecture  4

Bar Chart

Page 24: Visualisation 2012 - 2013 Lecture  4

Bar Chart

Page 25: Visualisation 2012 - 2013 Lecture  4

Box Plots & Density Functions

From Wikipedia! http://en.wikipedia.org/wiki/Probability_density_function

Page 26: Visualisation 2012 - 2013 Lecture  4

SIMPLE COMPARISONS

Page 27: Visualisation 2012 - 2013 Lecture  4

Simple Bar Graph

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

A B C D E F

CATEGORY AXIS: A value is displayed for each category

Categories

Page 28: Visualisation 2012 - 2013 Lecture  4

Simple Bar Graph

Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007

Axis labels: Average Score, Rating

Page 29: Visualisation 2012 - 2013 Lecture  4

Simple Bar Graph

Edward Tufte, “The Quantitative Display of Information”, 2009

Page 30: Visualisation 2012 - 2013 Lecture  4

Simple Bar Graph

Page 31: Visualisation 2012 - 2013 Lecture  4

Simple Bar Graph

Page 32: Visualisation 2012 - 2013 Lecture  4

Pie Chart

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 33: Visualisation 2012 - 2013 Lecture  4

Pie Charts

http://www.uh.edu/engines/epi1712.htm

Google Analytics http://analytics.google.com

Page 34: Visualisation 2012 - 2013 Lecture  4

Pie Charts

http://www.uh.edu/engines/epi1712.htm

Google Analytics http://analytics.google.com

Page 35: Visualisation 2012 - 2013 Lecture  4

Pie Charts

Google Analytics http://analytics.google.com

Page 36: Visualisation 2012 - 2013 Lecture  4

Pie Charts

http://www.uh.edu/engines/epi1712.htm

Google Analytics http://analytics.google.com

Page 37: Visualisation 2012 - 2013 Lecture  4

Pie Charts

http://www.uh.edu/engines/epi1712.htm

Google Analytics http://analytics.google.com

Page 38: Visualisation 2012 - 2013 Lecture  4

Pie Chart

Page 39: Visualisation 2012 - 2013 Lecture  4

Pie Chart

William Playfair's “Statistical Breviary”, 1801, via The New York Times http://www.nytimes.com/2012/04/22/magazine/who-made-that-pie-chart.html?_r=0

Page 40: Visualisation 2012 - 2013 Lecture  4

Pie Chart

Florence Nightingale's Crimean War Death Charts via: http://www.uh.edu/engines/epi1712.htm

Page 41: Visualisation 2012 - 2013 Lecture  4

Pie Charts

Pie charts are the subject of a lot of negative comment:

- http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00018S

- http://www.juiceanalytics.com/writing/the-problem-with-pie-charts/

- The main reason is that their descriptive power is based on our ability to interpret differences in angle

Pie charts are useful when:

- We have a small number of categories (< 8)
- The values sum to a meaningful whole
- The differences are coarse
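The three conditions can be turned into a rough pre-flight check, sketched below. Everything here is hypothetical: the function name, the 5 percentage point threshold used to decide whether differences are "coarse", and the sample values. Whether values sum to a meaningful whole cannot really be decided from the numbers alone, so the code only rejects obviously unsuitable input (negative values or a non-positive total).

```python
def pie_chart_concerns(values, max_categories=7, min_share_gap=0.05):
    """Return a list of reasons why a pie chart may be a poor choice."""
    concerns = []

    # Condition 1: a small number of categories (fewer than 8).
    if len(values) > max_categories:
        concerns.append("too many categories")

    # Condition 2: only a crude numeric stand-in for "a meaningful whole".
    if any(v < 0 for v in values) or sum(values) <= 0:
        concerns.append("values do not form a meaningful whole")
        return concerns

    # Condition 3: slices of similar size are hard to compare by angle.
    total = sum(values)
    shares = sorted(v / total for v in values)
    gaps = [b - a for a, b in zip(shares, shares[1:])]
    if any(gap < min_share_gap for gap in gaps):
        concerns.append("some slices are too similar in size to tell apart")
    return concerns


print(pie_chart_concerns([50, 30, 20]))        # coarse differences
print(pie_chart_concerns([25, 24, 26, 25]))    # near-identical slices
```

When the second call reports a concern, a simple bar graph is usually the safer choice because it encodes the values as lengths rather than angles.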

Page 42: Visualisation 2012 - 2013 Lecture  4

Doughnut Chart

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 43: Visualisation 2012 - 2013 Lecture  4

Doughnut Chart

Page 44: Visualisation 2012 - 2013 Lecture  4

Doughnut Chart

Page 45: Visualisation 2012 - 2013 Lecture  4

Tree Map

“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do

Page 46: Visualisation 2012 - 2013 Lecture  4

Billion-Dollar-O-Gram

www.informationisbeautiful.net/2009/the-billion-dollar-gram/

Page 47: Visualisation 2012 - 2013 Lecture  4

Treemaps

Treemaps were originally designed to handle hierarchical structures, such as disk drives, but can also be used for non-hierarchical data.

Treemaps rely on a tiling algorithm to figure out how to position the rectangles

- We will come back to this!

TreeMap page by Ben Shneiderman (TreeMap pioneer): http://www.cs.umd.edu/hcil/treemap-history/index.shtml

Early paper on TreeMaps: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isNumber=4467&arNumber=175815&isnumber=4467&arnumber=175815
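To give a feel for what a tiling algorithm does, here is a minimal sketch of one simple scheme, usually called slice-and-dice: split the rectangle among the children in proportion to their size, alternating between horizontal and vertical splits at each level of the hierarchy. It is not necessarily the algorithm the course returns to later, and the data structure (nested dicts with numeric leaves), the function names and the sample "disk" are assumptions made for illustration.

```python
def size_of(node):
    """A node is either a number (leaf size) or a dict of named children."""
    if isinstance(node, dict):
        return sum(size_of(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, width, height, horizontal=True, name="root"):
    """Return a list of (name, x, y, width, height) rectangles for the leaves."""
    if not isinstance(node, dict):
        return [(name, x, y, width, height)]

    total = size_of(node)  # assumed to be positive
    rectangles = []
    offset = 0.0
    for child_name, child in node.items():
        share = size_of(child) / total
        if horizontal:
            # Split left to right; children are then split top to bottom.
            rectangles += slice_and_dice(
                child, x + offset, y, width * share, height, False, child_name)
            offset += width * share
        else:
            # Split top to bottom; children are then split left to right.
            rectangles += slice_and_dice(
                child, x, y + offset, width, height * share, True, child_name)
            offset += height * share
    return rectangles


# Hypothetical disk contents: folder and file sizes in arbitrary units.
disk = {"docs": {"a.txt": 10, "b.txt": 30}, "music": 40, "video": 20}
rects = slice_and_dice(disk, 0, 0, 100, 50)
for rect in rects:
    print(rect)
```

Each leaf ends up with an area proportional to its size. A known weakness of this simple scheme is that it can produce long, thin rectangles, which is one reason other tiling algorithms exist.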

Page 48: Visualisation 2012 - 2013 Lecture  4

MULTI DISTRIBUTION COMPARISONS

Page 49: Visualisation 2012 - 2013 Lecture  4

Achtung!

Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007

Axis labels: Average Score, Rating

Page 50: Visualisation 2012 - 2013 Lecture  4

Achtung!

Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007

Axis labels: Average Score, Rating

Watch out for bar charts that show an average or other aggregate: these can hide a multitude of detail
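A tiny example shows how much an average can hide. The two sets of ratings below are invented for illustration (they are not the survey data from the slide): both would produce identical bars on a chart of average scores, yet the underlying distributions could hardly be more different.

```python
import statistics
from collections import Counter

# Two invented sets of 1-to-5 course ratings.
ratings_a = [3, 3, 3, 3, 3, 3, 3, 3]   # everyone is lukewarm
ratings_b = [1, 1, 1, 1, 5, 5, 5, 5]   # half love it, half hate it

mean_a = statistics.mean(ratings_a)
mean_b = statistics.mean(ratings_b)
spread_a = statistics.pstdev(ratings_a)
spread_b = statistics.pstdev(ratings_b)
dist_a = Counter(ratings_a)
dist_b = Counter(ratings_b)

for name, mean, spread, dist in [("A", mean_a, spread_a, dist_a),
                                 ("B", mean_b, spread_b, dist_b)]:
    counts = ", ".join(f"{rating}: {dist[rating]}" for rating in range(1, 6))
    print(f"course {name}: mean={mean}, std dev={spread:.1f}, counts=({counts})")
```

Showing the full distribution for each group, for example with a histogram or a box plot, makes this kind of difference visible.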

Page 51: Visualisation 2012 - 2013 Lecture  4

Six Nation Points

Page 52: Visualisation 2012 - 2013 Lecture  4

Six Nation Points

Page 53: Visualisation 2012 - 2013 Lecture  4

Six Nation Points

Multiple box plots are a great way to show multiple distributions

Page 54: Visualisation 2012 - 2013 Lecture  4

Overlaid Histograms


http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/

Page 55: Visualisation 2012 - 2013 Lecture  4

Back-to-Back Histograms


http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/

Page 56: Visualisation 2012 - 2013 Lecture  4

Back-to-Back Histograms


http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/
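Overlaid and back-to-back histograms only compare fairly when both groups are counted against the same bin edges. The sketch below is a text-only illustration of that idea, not the SAS approach from the linked post; the function names, the equal-width bins and the sample data are assumptions.

```python
def shared_bin_counts(group_a, group_b, num_bins):
    """Count both groups against one common set of equal-width bins."""
    lo = min(min(group_a), min(group_b))
    hi = max(max(group_a), max(group_b))
    width = (hi - lo) / num_bins or 1.0  # fall back to 1.0 if all values are equal

    def count(values):
        counts = [0] * num_bins
        for v in values:
            # The overall maximum is placed in the last bin.
            counts[min(int((v - lo) / width), num_bins - 1)] += 1
        return counts

    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, count(group_a), count(group_b)


def back_to_back(edges, counts_a, counts_b):
    """Draw group A growing to the left and group B growing to the right."""
    pad = max(counts_a, default=0)
    lines = []
    for left_edge, a, b in zip(edges, counts_a, counts_b):
        lines.append(("#" * a).rjust(pad) + f" |{left_edge:5.1f}| " + "#" * b)
    return "\n".join(lines)


group_a = [1, 2, 2, 3, 3, 3, 4]        # made-up data
group_b = [3, 4, 4, 5, 5, 5, 6, 7]     # made-up data
edges, counts_a, counts_b = shared_bin_counts(group_a, group_b, num_bins=3)
print(back_to_back(edges, counts_a, counts_b))
```

The middle column shows the left edge of each shared bin. If the groups differ a lot in size, normalising each group's counts before plotting keeps the comparison about shape rather than volume.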

Page 57: Visualisation 2012 - 2013 Lecture  4

Beware of Stacked Histograms

Page 58: Visualisation 2012 - 2013 Lecture  4

Don’t Forget Small Multiples

Page 59: Visualisation 2012 - 2013 Lecture  4

Conclusions

We often need to create visualisations to compare values.

There is a range of ways to do this.

Key things to keep in mind are:

- Are you comparing values or proportions?

- Are you comparing single values or distributions?

- Are you comparing across one or many dimensions?