Effective Data Visualization (David Giard)

DESCRIPTION

We spend much of our time collecting and analyzing data. That data is only useful if it can be displayed in a meaningful, understandable way. Yale professor Edward Tufte presented many ideas on how to effectively present data to an audience or end user. In this session, I will explain some of Tufte's most important guidelines about data visualization and how you can apply those guidelines to your own data. You will learn what to include, what to remove, and what to avoid in your charts, graphs, maps, and other images that represent data.

David Giard

Microsoft Technical Evangelist

blog: DavidGiard.com

tv: TechnologyAndFriends.com

twitter: @DavidGiard

Data Visualization: The Ideas of Edward Tufte

@DavidGiard

This presentation is dedicated to

Dave Bost

     I             II            III            IV
   x      y      x      y      x      y      x      y
10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
 8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
 9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
 6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
 4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.59
 7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
 5.0   5.68    5.0   4.74    5.0   5.72    8.0   6.89
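
The four data sets in the table above are known as Anscombe's quartet. The following is a minimal Python sketch, using only the standard library and the values exactly as they appear in the table, of why the numbers alone can mislead: the summary statistics of the four sets come out nearly identical. The names `QUARTET`, `pearson`, and `summarize` are illustrative choices and are not part of the presentation.

```python
from math import sqrt
from statistics import mean

# Values copied from the table above: (x values, y values) per data set.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.72]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.59, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def summarize(xs, ys):
    """Return (mean of x, mean of y, correlation), rounded for display."""
    return round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2)


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        print(name, summarize(xs, ys))
```

Each set has a mean x of 9, a mean y of about 7.5, and a correlation of roughly 0.82, so the differences between the sets only become visible when the points are plotted.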

[Figure: four scatter plots, one each for data sets I, II, III, and IV; each panel has an x-axis from 0 to 20 and a y-axis from 0 to 10]

Dr. Edward Tufte

Graphical Excellence

[Figure: legend values 500,000, 100,000, and 10,000]

Graphical Integrity

Blatant Lies

Source: Fox News, Dec 2011. Reprinted by Washington Post.

[Figure: chart labels $(11,014), $0, $(11,014)]

Lie

Lie Factor

$$\text{Lie Factor} = \frac{\text{Size of Effect Shown in Graphic}}{\text{Size of Effect in Data}}$$

Lie

Data Increase = 53%

Graphical Increase = 783%

Lie Factor = 14.8
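
The lie factor for this slide is the graphical increase divided by the data increase. A minimal Python sketch of that arithmetic follows; the function name `lie_factor` and its parameter names are illustrative assumptions, not something defined in the presentation.

```python
def lie_factor(graphic_change_pct: float, data_change_pct: float) -> float:
    """Size of effect shown in the graphic divided by size of effect in the data.

    Both arguments are percentage changes, for example 783 for 783%.
    """
    if data_change_pct == 0:
        raise ValueError("the data must change for a lie factor to be defined")
    return graphic_change_pct / data_change_pct


if __name__ == "__main__":
    # Figures from the slide: data increase 53%, graphical increase 783%.
    print(round(lie_factor(783, 53), 1))  # 14.8
```

A lie factor of 1 means the graphic changes by exactly as much as the data; the further the value is from 1, the more the graphic distorts the effect.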

Truth

[Figure: Required Fuel Economy Standards: New cars built from 1978 to 1985. x-axis: 1978 to 1985; y-axis: 0 to 30]

Data Change = 125%

Graphical Change = 406%

Lie Factor = 3.8

Data Change = 554%

Graphical Change = 27,000%

Lie Factor = 48.8

Context

[Figure: Connecticut Traffic Deaths, Before (1955) and After (1956) Stricter Enforcement by the Police Against Cars Exceeding Speed Limit. x-axis: 1955 (before stricter enforcement) and 1956 (after stricter enforcement); y-axis: 275 to 325]

[Figure: Connecticut Traffic Deaths, 1951-1959. x-axis: 1951 to 1959; y-axis: 225 to 325]

[Figure: Traffic Deaths per 100,000 Persons in Connecticut, Massachusetts, Rhode Island, and New York, 1951-1959. Lines: NY, MA, CT, RI; x-axis: 1951 to 1959; y-axis: 6 to 16]

Principles of Graphical Integrity

• Data representations proportional to data
• Number of dimensions in graph = number of dimensions in data
• Real (inflation-adjusted) dollars, instead of nominal dollars
• Provide context

Data-Ink

Data-Ink Ratio

$$\text{Data-Ink Ratio} = \frac{\text{Data Ink}}{\text{Total Ink}}$$

Redundant Data

[Figure: chart showing the value 35.9, repeated on two successive slides]

Metadata

[Figure, shown in six successive revisions: chart with an x-axis from 0 to 6 and a y-axis from 0 to 160. Revisions 1 to 3 label the y-axis every 20 and the x-axis every 1; revisions 4 to 6 label the y-axis every 40 and the x-axis every 2]

Principles

• Above all else, show the data
• Maximize the Data-Ink ratio, within reason
• Erase non-data-ink
• Erase redundant data-ink
• Revise and edit

Vibrations

Vibrations

[Figure: PERCENT CRITICAL ARTICLES (y-axis, 0 to 60) by ISSUE AREAS (x-axis), with legend entries INFLATION, UNEMPLOYMENT, SHORTAGES, RACE, CRIME, GOVT. POWER, CONFIDENCE, WATERGATE, COMPETENCE, and Linear (RACE)]

[Figure: PERCENT CRITICAL ARTICLES (y-axis, 0 to 60) by ISSUE AREAS (x-axis), with the issue areas INFLATION, UNEMPLOYMENT, SHORTAGES, RACE, CRIME, GOVT. POWER, CONFIDENCE, WATERGATE, and COMPETENCE labeled on the x-axis]

Chartjunk and Ducks

Worst. Graph. Ever.

Year   % Students < 25
1972   28.0
1973   29.2
1974   32.8
1975   33.6
1976   33.0

Multifunctioning Graphical Elements

Data Density

Data Density

$$\text{Data Density} = \frac{\text{Number of entries in data matrix}}{\text{Area of data graphic}}$$

Low Data Density

Low Data Density

Number of entries = 4

Graph Area = 26.5 square inches

$$\text{Data Density} = \frac{4 \text{ data entries}}{26.5 \text{ sq. in.}} = 0.15 \text{ data entries per sq. in.}$$
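
The same calculation can be sketched in a few lines of Python. The function name `data_density` and its parameters are illustrative assumptions; only the numbers 4 and 26.5 come from the slide.

```python
def data_density(entries: int, area_sq_in: float) -> float:
    """Number of entries in the data matrix per square inch of graphic."""
    if area_sq_in <= 0:
        raise ValueError("graphic area must be positive")
    return entries / area_sq_in


if __name__ == "__main__":
    # Figures from the slide: 4 entries shown in 26.5 square inches.
    print(round(data_density(4, 26.5), 2))  # 0.15
```

Measured this way, a graphic that carries more numbers in the same space scores higher, which is the comparison the following slides make.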

High Data Density

181 Numbers per square inch

High Data Density

1,000 Numbers per square inch

Small Multiples

Small Multiples

Small Multiples

Tufte's Graphs

• Sparkline
• Slope Graph

Sparklines

Sparklines
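
A sparkline is a small, word-sized graphic that sits inline with text. As a rough, text-only approximation (an assumption for illustration, not how the slides were drawn), the following Python sketch maps each value to one of eight Unicode block characters. The names `BLOCKS` and `sparkline` are hypothetical; the sample values are the percentages from the "Worst. Graph. Ever." table.

```python
# Eight block characters, from lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values) -> str:
    """Render a sequence of numbers as a one-line string of block characters."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are equal, so draw a flat line at the lowest level.
        return BLOCKS[0] * len(values)
    scale = (len(BLOCKS) - 1) / span
    return "".join(BLOCKS[round((v - lo) * scale)] for v in values)


if __name__ == "__main__":
    # Percentages for 1972 through 1976 from the table earlier in the deck.
    print("1972-1976:", sparkline([28.0, 29.2, 32.8, 33.6, 33.0]))
```

The output has no axes, gridlines, or legend, so nearly every mark on the line is data-ink.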

Slope Graph

Slope Graph

Source: The Atlantic, June 30, 2012

Takeaways

• Maintain Graphical Integrity
• Maximize Data-Ink Ratio, within reason
• Avoid Chartjunk and Ducks
• Use Multifunctioning Graphical Elements, if possible
• Keep Labels with data
• Maximize Data Density

Date    Temperature (C)   # Troops   Distance Traveled (km)
10/10                 0    100,000                        0
10/18                 0     96,000                       40
10/24                -5     55,000                       90
11/9                 -9     37,000                      145
11/14               -21     24,000                      180
11/20               -11     50,000                      250
11/28               -20     25,000                      275
12/1                -24     20,000                      300
12/6                -30     12,000                      320
12/7                -26     10,000                      365

[Figure, shown in six successive revisions: chart of # Troops (y-axis, 0 to 120,000) by Date (x-axis). Revisions 1 and 2: date labels every two days from 10/10 to 12/7, with the label "Troops" appearing twice. Revision 3: the label "Troops" appears once. Revision 4: the label "Troops" is removed. Revision 5: date labels weekly from 10/10 to 12/5. Revision 6: a second axis is added for Temperature (Celsius), from -35 to 0, with legend entries Troops and Temperature]

David Giard

Microsoft Technical Evangelist

blog: DavidGiard.com

tv: TechnologyAndFriends.com

twitter: @DavidGiard
