Visualisation 2012 - 2013
Lecture 4
Brian Mac Namee
Dublin Institute of Technology
Applied Intelligence Research Centre
Visualising Comparisons
Origins
This course is based heavily on a course developed by Colman McMahon (www.colmanmcmahon.com)
Material from multiple other online and published sources is also used and when this is the case full citations will be given
Visualization of the Week
www.pinterest.com/brianmacnamee/great-visualisation-examples/
(Un)Visualization of the Week
www.pinterest.com/brianmacnamee/terrible-visualisation-examples/
Agenda
This week we are going to look at means through which we can visualise comparisons between variable values
- Single variable exploration
- Simple comparisons
- Multi distribution comparisons
SINGLE VARIABLE EXPLORATION
Histogram
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Histogram
A histogram gives us an in-depth view of a single numeric variable
To construct a histogram:
- Divide the data range into bins
- Count the occurrence frequency of each bin within the data
- Normalise the frequency counts
- Plot a bar graph to show the normalised count for each bin
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
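The four construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce the figures in these slides: it assumes equal-width bins, normalises by dividing each count by the total number of values, and draws the bars as text. The function names and the sample data are made up for the example.

```python
def histogram(values, num_bins):
    """Return (bin_edges, normalised_counts) for a list of numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so there is no range to bin")
    width = (hi - lo) / num_bins

    # Step 1: divide the data range into equal-width bins
    edges = [lo + i * width for i in range(num_bins + 1)]

    # Step 2: count how many values fall into each bin
    counts = [0] * num_bins
    for v in values:
        # the maximum value is placed in the last bin
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1

    # Step 3: normalise the counts so that they sum to 1
    freqs = [c / len(values) for c in counts]
    return edges, freqs


def text_bars(edges, freqs, scale=40):
    """Step 4: draw one bar per bin (as text, to avoid a plotting library)."""
    return [
        f"[{edges[i]:6.1f}, {edges[i + 1]:6.1f}) " + "#" * round(f * scale)
        for i, f in enumerate(freqs)
    ]


# Hypothetical sample data, for illustration only
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]
bin_edges, bin_freqs = histogram(sample, num_bins=3)
for line in text_bars(bin_edges, bin_freqs):
    print(line)
```

The choice of the number of bins is left to the caller here; changing it can noticeably change the shape of the histogram.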
Histogram
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Histogram Shapes
Density Plot
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Density Plot
Note that constructing a density plot requires estimating the probability density function underlying the data shown in the histogram – this takes a bit of work!
Common approaches include:
- Parzen windows
- Clustering
- Mixture models
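To make the first of these approaches concrete, here is a minimal sketch of a Parzen window (kernel density) estimate. It assumes a Gaussian kernel and a bandwidth chosen by hand; both are choices made for this example rather than anything prescribed by the slides, and the function names and sample data are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the window function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, bandwidth):
    """Estimate the density at x as the average of kernels centred on each sample."""
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)


def density_curve(samples, bandwidth, num_points=200):
    """Evaluate the estimate on an evenly spaced grid covering the data."""
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (num_points - 1)
    xs = [lo + i * step for i in range(num_points)]
    return xs, [parzen_density(x, samples, bandwidth) for x in xs]


# Hypothetical sample data, for illustration only
samples = [1, 2, 2, 3, 7]
xs, ys = density_curve(samples, bandwidth=1.0)
```

A smaller bandwidth gives a spikier curve that follows individual points; a larger one gives a smoother curve that can hide structure.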
Density Plot
From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window
Density Plot
From Wikipedia! http://en.wikipedia.org/wiki/Parzen_window
Density Plot
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Histogram & Density Plot Combined
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Histogram
The histogram is quite possibly your most important visual data exploration tool!
Box Plot
[Figure: annotated box plot; vertical axis from 0 to 50]
VARIABLE VALUES: Values displayed for a single variable
MEDIAN: The median value for the variable
3rd QUARTILE: The value for the 3rd quartile of the variable values
1st QUARTILE: The value for the 1st quartile of the variable values
OUTLIERS: Values that fall below 1st Q - 1.5*IQR or above 3rd Q + 1.5*IQR
MAX: Max value below 3rd Q + 1.5*IQR
MIN: Min value above 1st Q - 1.5*IQR
Box Plot
The components of a box plot are:
- A thick dark line at the median
- A horizontal line at the 1st quartile
- A horizontal line at the 3rd quartile
- A whisker down to the low value
• Multiply the IQR by 1.5 to calculate the step
• The low value is the lowest value above the 1st quartile minus the step
- A whisker up to the high value
• The high value is the highest value below the 3rd quartile plus the step
- Any values outside low and high are marked as outliers
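The numbers behind a box plot can be computed directly. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) with its "inclusive" method; there are several conventions for computing quartiles, so other tools may give slightly different box edges. The function name, the dictionary keys and the sample data are made up for this example.

```python
import statistics


def box_plot_stats(values):
    """Compute the values needed to draw a box plot for one variable."""
    # Quartiles: the "inclusive" method is one of several common conventions
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    step = 1.5 * iqr

    # Whiskers end at the most extreme data values that lie within one step of the box
    low = min(v for v in values if v >= q1 - step)
    high = max(v for v in values if v <= q3 + step)

    # Anything beyond the whisker ends is drawn as an individual outlier
    outliers = sorted(v for v in values if v < low or v > high)

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "low": low,
        "high": high,
        "outliers": outliers,
    }


# Hypothetical sample data, for illustration only
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this sample the value 30 lies more than one step above the 3rd quartile, so the upper whisker stops at 9 and 30 is reported as an outlier.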
Box Plot
Some important points about a box plot:
- 50% of the data occurs between the lower and upper edges of the box
- The lower 50% of the data occurs below the median line in the box
- The upper 50% of the data occurs above the median line in the box
- The lower 25% of the data occurs below the bottom edge of the box (down to the end of the lower whisker, plus any low outliers)
- The upper 25% of the data occurs above the top edge of the box (up to the end of the upper whisker, plus any high outliers)
Bar Chart
Bar Chart
Bar Chart
Box Plots & Density Functions
From Wikipedia! http://en.wikipedia.org/wiki/Probability_density_function
SIMPLE COMPARISONS
Simple Bar Graph
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
[Figure: simple bar graph; category axis labelled "Categories" with categories A to F]
CATEGORY AXIS: A value is displayed for each category
Simple Bar Graph
Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007
[Figure: bar graph; axis labels: Average Score, Rating]
Simple Bar Graph
Edward Tufte, “The Visual Display of Quantitative Information”, 2009
Simple Bar Graph
Simple Bar Graph
Pie Chart
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Pie Charts
http://www.uh.edu/engines/epi1712.htm
Google Analytics http://analytics.google.com
Pie Chart
Pie Chart
William Playfair's “Statistical Breviary”, 1801 via The New York Times http://www.nytimes.com/2012/04/22/magazine/who-made-that-pie-chart.html?_r=0
Pie Chart
Florence Nightingale’s Crimean War Death Charts via: http://www.uh.edu/engines/epi1712.htm
Pie Charts
Pie charts are the subject of a lot of negative comment
- http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00018S
- http://www.juiceanalytics.com/writing/the-problem-with-pie-charts/
- The main reason is that their descriptive power is based on our ability to interpret differences in angle
Pie charts are useful when:
- We have a small number of categories (< 8)
- The values sum to a meaningful whole
- The differences are coarse
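A small worked example shows why fine differences are hard to read from a pie chart. Each slice's angle is its share of the total multiplied by 360 degrees, so values that differ by a percentage point or two differ by only a few degrees. The function name and the category values below are made up for this illustration.

```python
def slice_angles(values):
    """Map each category to the angle (in degrees) of its pie slice."""
    total = sum(values.values())
    return {name: 360.0 * value / total for name, value in values.items()}


# Hypothetical category values that are close together, for illustration only
shares = {"A": 22, "B": 20, "C": 21, "D": 19, "E": 18}
angles = slice_angles(shares)
for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

With these values, neighbouring categories differ by roughly 4 degrees, which is difficult to judge by eye; the same values would be easy to rank on a bar graph.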
Doughnut Chart
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Doughnut Chart
Doughnut Chart
Tree Map
“Visualize This”, N. Yau, Wiley, 2011 http://shop.oreilly.com/product/0636920022060.do
Billion-Dollar-O-Gram
www.informationisbeautiful.net/2009/the-billion-dollar-gram/
Treemaps
Treemaps were originally designed to handle hierarchical structures – such as disk drives – but can be used for non-hierarchical data
Treemaps rely on a tiling algorithm to figure out how to position the rectangles
- We will come back to this!
TreeMap page by Ben Shneiderman (TreeMap Pioneer): http://www.cs.umd.edu/hcil/treemap-history/index.shtml
Early paper on TreeMaps: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isNumber=4467&arNumber=175815&isnumber=4467&arnumber=175815
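As a preview of what a tiling algorithm does, here is a minimal sketch of the simple "slice-and-dice" approach: each node's rectangle is divided among its children in proportion to their sizes, and the split direction alternates at each level of the hierarchy. The slides do not say which tiling algorithm will be covered later, so this is only one possibility. The dictionary layout (`name`, `size`, `children`), the function names and the sample hierarchy are assumptions made for this example, and all sizes are assumed to be positive.

```python
def node_size(node):
    """Size of a leaf, or the summed size of all leaves under an inner node."""
    if "children" in node:
        return sum(node_size(child) for child in node["children"])
    return node["size"]


def slice_and_dice(node, x, y, width, height, depth=0):
    """Return a list of (name, x, y, width, height) rectangles, one per leaf."""
    if "children" not in node:
        return [(node["name"], x, y, width, height)]

    total = node_size(node)
    rects = []
    offset = 0.0  # fraction of the parent rectangle already used
    for child in node["children"]:
        share = node_size(child) / total
        if depth % 2 == 0:
            # even depth: split the rectangle from left to right
            rects += slice_and_dice(child, x + offset * width, y,
                                    width * share, height, depth + 1)
        else:
            # odd depth: split the rectangle from top to bottom
            rects += slice_and_dice(child, x, y + offset * height,
                                    width, height * share, depth + 1)
        offset += share
    return rects


# Hypothetical disk-usage hierarchy, for illustration only
disk = {
    "name": "root",
    "children": [
        {"name": "docs", "size": 50},
        {"name": "media", "children": [
            {"name": "photos", "size": 30},
            {"name": "video", "size": 20},
        ]},
    ],
}
for rect in slice_and_dice(disk, 0, 0, 100, 100):
    print(rect)
```

Slice-and-dice is easy to implement but tends to produce long thin rectangles, which is one reason other tiling algorithms exist.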
MULTI DISTRIBUTION COMPARISONS
Achtung!
Survey Research & Design in Psychology Course Evaluation http://ucspace.canberra.edu.au/display/7126/Evaluation+2007
[Figure: bar graph; axis labels: Average Score, Rating]
Watch out for bar charts that show an average or other aggregate – these can hide a multitude of detail
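A tiny example makes the warning concrete. The two sets of ratings below are invented for this illustration (they are not the course evaluation data shown in the figure): both have the same average, so a bar chart of averages would show two identical bars, yet the underlying distributions are very different.

```python
import statistics


def summarise(values):
    """Return the average alongside simple measures of spread."""
    return {
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
    }


# Hypothetical ratings on a 1 to 9 scale, for illustration only
group_a = [5, 5, 5, 5, 5, 5]   # everyone gives a middling score
group_b = [1, 1, 1, 9, 9, 9]   # opinion is split between the extremes

print("A:", summarise(group_a))
print("B:", summarise(group_b))
```

Only the spread (minimum, maximum, standard deviation) reveals the difference, which is why plotting the whole distribution is often more informative than plotting the average alone.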
Six Nation Points
Six Nation Points
Six Nation Points
Multiple box plots are a great way to show multiple distributions
Overlaid Histograms
http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/
Back-to-Back Histograms
http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/
Back-to-Back Histograms
http://blogs.sas.com/content/graphicallyspeaking/2012/02/06/comparative-densities/
Beware of Stacked Histograms
Don’t Forget Small Multiples
Conclusions
We often need to create visualisations to compare values
There are a range of ways to do this
Key things to keep in mind are:
- Are you comparing values or proportions?
- Are you comparing single values or distributions?
- Are you comparing across one or many dimensions?