Outflow: Exploring Flow, Factors and Outcomes of Temporal Event Sequences


My presentation at IEEE VisWeek 2012 in Seattle, WA.

Abstract: Event sequence data is common in many domains, ranging from electronic medical records (EMRs) to sports events. Moreover, such sequences often result in measurable outcomes (e.g., life or death, win or loss). Collections of event sequences can be aggregated together to form event progression pathways. These pathways can then be connected with outcomes to model how alternative chains of events may lead to different results. This paper describes the Outflow visualization technique, designed to (1) aggregate multiple event sequences, (2) display the aggregate pathways through different event states with timing and cardinality, (3) summarize the pathways’ corresponding outcomes, and (4) allow users to explore external factors that correlate with specific pathway state transitions. Results from a user study with twelve participants show that users were able to learn how to use Outflow easily with limited training and perform a range of tasks both accurately and rapidly.


Outflow

Krist Wongsuphasawat HCIL, University of Maryland

David Gotz IBM Research

Exploring Flow, Factors and Outcomes of Temporal Event Sequences

InfoVis 2012 Seattle, WA

m

Events

m

Event | 12:15 p.m. Lunch

m

Event Sequences Event Event Event

m

Daily Activity

7:30 a.m. Wake Up

7:45 a.m. Exercise

8:15 a.m. Go to work

m

Soccer Game

90th minute Team A scores

25th minute Team B scores

10th minute Team A scores

m

Soccer Game

Game #1 (over time): 10th minute Goal, 25th minute Concede, 90th minute Goal

m

Many games

Each game is a sequence of events over time:

Game #1: Goal, Concede, Goal

Game #2: Goal, Goal, Concede

Game #3: Goal, Concede, Concede

Game #n: Concede, Goal, Goal, Goal

m

with outcome

Game #1: Goal, Concede, Goal. Outcome: Win (1)

Game #2: Goal, Goal, Concede. Outcome: Win (1)

Game #3: Goal, Concede, Concede. Outcome: Lose (0)

Game #n: Concede, Goal, Goal, Goal. Outcome: Win (1)

m

7 event types, 7 events per entity: 823543 combinations
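The count on this slide can be checked with a one-line calculation. This is a minimal sketch that assumes each of the 7 positions in a sequence may hold any of the 7 event types; the function name `count_sequences` is made up for illustration.

```python
def count_sequences(event_types: int, events_per_entity: int) -> int:
    """Number of distinct ordered sequences when every position
    may hold any of the event types (repeats allowed)."""
    return event_types ** events_per_entity


print(count_sequences(7, 7))  # 823543
```

Even a small vocabulary of event types therefore produces far too many raw sequences to inspect one by one, which is the motivation for a summary view.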

m

Enjoy!

m

consumable

m

Overview / Summary

Event Sequences with Outcome

m

7 Steps

m

Step 1 | Aggregation

m

Event Sequences (Entity #1 through Entity #7, up to Entity #n) → Outflow Graph

m

Assumption

•  Events are persistent.

Entity #1: e1, e2, e3

States: [e1], [e1, e2], [e1, e2, e3]
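The persistence assumption can be sketched in a few lines of code. This is an illustrative sketch, not the authors' implementation: it treats a state as the set of events seen so far, starts from the empty state, and assumes that a repeated event leaves the state unchanged. The function name `sequence_to_states` is hypothetical.

```python
from typing import FrozenSet, Iterable, List


def sequence_to_states(events: Iterable[str]) -> List[FrozenSet[str]]:
    """Turn one entity's event sequence into its sequence of states.

    Because events are assumed persistent, each state holds every
    event that has occurred so far.
    """
    states = [frozenset()]           # the empty state [ ]
    for event in events:
        next_state = states[-1] | {event}
        if next_state != states[-1]:  # assumption: repeats do not add a state
            states.append(next_state)
    return states


for state in sequence_to_states(["e1", "e2", "e3"]):
    print(sorted(state))
```

Using sets means that an entity experiencing e1 then e2 and another experiencing e2 then e1 both arrive at the same state [e1, e2].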

m

Select alignment point

Pick a state

•  What are the paths that led to this state?
•  What are the paths after this state?

Example (soccer): Goal, Concede, Goal

or just an empty state []
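Splitting one entity's path at the alignment point might look like the following. It is a sketch under assumptions: states are given as an ordered list of hashable values (soccer scores here), and an entity that never reaches the alignment state yields `None` so that a caller can skip it. The name `split_at_alignment` is hypothetical.

```python
from typing import Hashable, List, Optional, Tuple


def split_at_alignment(
    states: List[Hashable], alignment: Hashable
) -> Optional[Tuple[List[Hashable], List[Hashable]]]:
    """Return (past, future) paths around the alignment state.

    Both halves include the alignment state itself. Returns None if the
    entity never reaches that state.
    """
    if alignment not in states:
        return None
    index = states.index(alignment)
    return states[: index + 1], states[index:]


game = ["0-0", "1-0", "1-1", "2-1", "3-1"]
print(split_at_alignment(game, "2-1"))
```

Choosing the empty state as the alignment point makes the past trivial and puts the whole sequence in the future half.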

m

Outflow Graph

Alignment Point: [e1, e2, e3]

1 entity: [ ], [e1], [e1, e2], [e1, e2, e3], [e1, e2, e3, e5]

2 entities: adds [e1, e3]

3 entities: adds [e3], [e1, e2, e3, e4]

n entities: adds [e2], [e2, e3]

Each column of states is a layer.

Example summary: Average outcome = 0.4, Average time = 10 days, Number of entities = 10
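The aggregation step can be sketched as follows. This is a simplified illustration, not the published algorithm: it aligns every entity at the empty state, treats a state as the set of events seen so far, and records a time gap of 0 for an entity's first transition. The function name `build_outflow_edges` and the dictionary keys are assumptions.

```python
from collections import defaultdict


def build_outflow_edges(entities):
    """Aggregate entities into per-transition statistics.

    entities: iterable of (events, outcome), where events is a list of
    (time, event_type) pairs and outcome is a number such as 0 or 1.
    Returns {(from_state, to_state): {"entities", "avg_outcome", "avg_time"}}.
    """
    sums = defaultdict(lambda: {"entities": 0, "outcome": 0.0, "time": 0.0})
    for events, outcome in entities:
        state = frozenset()           # every entity starts at the empty state
        previous_time = None
        for time, event in sorted(events):
            next_state = state | {event}
            if next_state == state:   # repeated event: no state change
                continue
            gap = 0.0 if previous_time is None else time - previous_time
            edge = sums[(state, next_state)]
            edge["entities"] += 1
            edge["outcome"] += outcome
            edge["time"] += gap
            state, previous_time = next_state, time
    return {
        key: {
            "entities": s["entities"],
            "avg_outcome": s["outcome"] / s["entities"],
            "avg_time": s["time"] / s["entities"],
        }
        for key, s in sums.items()
    }


sample = [([(0, "e1"), (10, "e2")], 1), ([(0, "e1"), (20, "e2")], 0)]
for (src, dst), stats in build_outflow_edges(sample).items():
    print(sorted(src), "->", sorted(dst), stats)
```

Each transition ends up with the three summary values shown on the slide: number of entities, average outcome, and average time.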

m

Soccer Results

Scores shown: 2-1, 2-0, 1-1, 0-2, 2-2, 3-1, 1-0, 0-1, 0-0

Alignment Point

m

Step 2 | Visual Encoding

m

Past | Alignment | Future

Example states: [e1], [e2], [e1, e2], [e1, e2, e3], [e1, e2, e4]

Color is outcome measure.

Node’s height is number of entities.

Time edge’s width is duration of transition.

Node’s horizontal position shows sequence of states.

Legend: time edge, link edge, end of path

m

Step 3 | Graph Drawing

m


3.1 Sugiyama’s heuristics

•  Directed Acyclic Graph (DAG) layout

–  Sugiyama, K., Tagawa, S. & Toda, M., 1981. Methods for Visual Understanding of Hierarchical System Structures. IEEE Transactions on Systems, Man, and Cybernetics, 11(2), pp. 109-125.

•  Reduce edge crossings

m

41 crossings

m

12 crossings
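Any crossing-reduction heuristic needs a way to count crossings for a given node ordering. The sketch below shows only that counting step for two adjacent layers, not Sugiyama's reordering heuristics themselves; the function name and the example node names are made up for illustration.

```python
from itertools import combinations


def count_crossings(left_order, right_order, edges):
    """Count crossing edge pairs between two adjacent layers.

    left_order / right_order: node names listed top to bottom.
    edges: (left_node, right_node) pairs.
    Two edges cross when their endpoints appear in opposite
    vertical order on the two sides.
    """
    left_pos = {node: i for i, node in enumerate(left_order)}
    right_pos = {node: i for i, node in enumerate(right_order)}
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if (left_pos[a] - left_pos[c]) * (right_pos[b] - right_pos[d]) < 0:
            crossings += 1
    return crossings


edges = [("a", "y"), ("b", "x")]
print(count_crossings(["a", "b"], ["x", "y"], edges))  # 1
print(count_crossings(["a", "b"], ["y", "x"], edges))  # 0
```

Reordering the right-hand layer removes the crossing in this toy case, which is the kind of improvement the before and after counts on the slides reflect.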

m


3.2 Force-directed layout

•  Spring simulation

Each node is a particle.

Total force = Force from edges - Repulsion between nodes
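A spring simulation of this kind can be sketched in one dimension. This is a toy illustration under several assumptions: nodes move only vertically, edges act as linear springs, nodes in the same layer repel each other with an inverse-square force, and the constants are arbitrary. The function name `relax_layout` and all parameter names are hypothetical.

```python
def relax_layout(y, edges, layers, spring_k=0.1, repulse_k=1.0,
                 steps=100, step_size=0.1):
    """Iteratively move nodes along one axis under spring and repulsion forces.

    y: {node: vertical position}; edges: (node, node) pairs;
    layers: lists of nodes that share a layer and repel each other.
    """
    y = dict(y)
    for _ in range(steps):
        force = {node: 0.0 for node in y}
        # Edges pull their two endpoints toward each other.
        for a, b in edges:
            pull = spring_k * (y[b] - y[a])
            force[a] += pull
            force[b] -= pull
        # Nodes in the same layer push each other apart.
        for layer in layers:
            for i, a in enumerate(layer):
                for b in layer[i + 1:]:
                    gap = y[a] - y[b]
                    direction = 1.0 if gap >= 0 else -1.0
                    distance = max(abs(gap), 0.1)  # cap the force at short range
                    push = direction * repulse_k / distance ** 2
                    force[a] += push
                    force[b] -= push
        for node in y:
            y[node] += step_size * force[node]
    return y


print(relax_layout({"a": 0.0, "b": 10.0}, [("a", "b")], [["a"], ["b"]]))
```

Connected nodes drift toward similar heights while nodes in the same layer keep their distance from each other.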

m


3.3 Edge Routing

•  Avoid unnecessary crossings (reroute)

•  After routing

m

Step 4 | Interactions

m

Interactions

•  Panning
•  Zooming
•  Brushing
•  Pinning
•  Tooltip
•  Event type selection

m

Demo

m

Step 5 | Simplification

m

Node Clustering

•  Cluster nodes in each layer
•  Similarity measure: Outcome, etc.
•  Threshold (0-1)
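One way to picture threshold-based clustering within a layer is the simple one-pass grouping below. It is a stand-in for illustration only, since the slides do not specify the clustering algorithm: nodes are sorted by average outcome, and a node joins the current cluster when its outcome is within the threshold of the previous node's. The function name `cluster_layer` is hypothetical.

```python
def cluster_layer(outcomes, threshold):
    """Group the nodes of one layer by similarity of average outcome.

    outcomes: {node: average outcome in the range 0-1}.
    threshold: largest outcome difference (0-1) allowed between
    neighbouring nodes of the same cluster.
    """
    clusters = []
    previous_value = None
    for node, value in sorted(outcomes.items(), key=lambda item: item[1]):
        if previous_value is not None and value - previous_value <= threshold:
            clusters[-1].append(node)
        else:
            clusters.append([node])
        previous_value = value
    return clusters


print(cluster_layer({"a": 0.1, "b": 0.15, "c": 0.9}, threshold=0.1))
```

A threshold of 0 keeps nodes with distinct outcomes separate, while a threshold of 1 merges the whole layer into a single cluster.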

m


Step 6 | Factors

m

Entity #1 (over time)

States: [e1], [e1, e2], [e1, e2, e3]

Factors: Factor 1, Factor 2, Factor 3, Factor 4

m

Patient #1 (over time)

Which factors are correlated to each state?

States: [e1], [e1, e2], [e1, e2, e3]

Factors: Yellow, Injury, Red, Substitution

m

Information Retrieval

Which keywords are correlated to each document? (Doc#1, Doc#2, Doc#3)

Which factors (Factor xxx) are correlated to each state? (State 1, State 2, State 3)

m

Present factors

Factor 1, shown on the Outflow graph (Alignment Point marked)

m

Absent factors

Factor 2, shown on the same Outflow graph (Alignment Point marked)

m

tf-idf

•  Term frequency

$$\mathrm{tf} = \frac{\text{Number of times a term } t \text{ appears in the document}}{\text{Number of terms in the document}}$$

•  Inverse document frequency

$$\mathrm{idf} = \log\left(\frac{\text{Number of documents}}{(\text{Number of documents that contain the term } t) + 1}\right)$$
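The two quantities are straightforward to compute. The sketch below follows the definitions on this slide, including the + 1 in the idf denominator; documents are assumed to be lists of tokens, and the function names are chosen for illustration.

```python
import math


def tf(term, document):
    """Term frequency: occurrences of the term divided by document length."""
    return document.count(term) / len(document)


def idf(term, documents):
    """Inverse document frequency with + 1 in the denominator."""
    containing = sum(1 for document in documents if term in document)
    return math.log(len(documents) / (containing + 1))


docs = [["a", "b", "a"], ["b", "c"], ["c", "d"], ["d"]]
print(tf("a", docs[0]))   # 2 of the 3 terms are "a"
print(idf("a", docs))     # log(4 / (1 + 1))
```

A term scores highly for a document when it is frequent there and rare elsewhere.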

m

Score based on tf-idf

•  Ratio (presence)

$$R_p = \frac{\text{Number of entities with factor } f \text{ before state}}{\text{Number of entities in the state}}$$

•  Inverse state ratio (presence)

$$R^{-1}_{sp} = \log\left(\frac{\text{Number of states}}{(\text{Number of states preceded by factor } f) + 1}\right)$$
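The presence ratio and inverse state ratio can be computed in the same way as tf and idf. The sketch below makes two assumptions that the slide does not state: each state is represented as a list with one set of preceding factors per entity, and the two quantities are multiplied into a single score by analogy with tf × idf. All function names are hypothetical.

```python
import math


def presence_ratio(factor, entities_in_state):
    """Share of the state's entities that had the factor before the state.

    entities_in_state: list of sets, one set of preceding factors per entity.
    """
    with_factor = sum(1 for factors in entities_in_state if factor in factors)
    return with_factor / len(entities_in_state)


def inverse_state_ratio(factor, states):
    """log(number of states / (states preceded by the factor + 1))."""
    preceded = sum(
        1 for entities in states.values()
        if any(factor in factors for factors in entities)
    )
    return math.log(len(states) / (preceded + 1))


def presence_score(factor, state, states):
    """Assumed combination: product of the two ratios, as in tf × idf."""
    return presence_ratio(factor, states[state]) * inverse_state_ratio(factor, states)


states = {
    "s1": [{"yellow"}, {"yellow", "injury"}],
    "s2": [set(), {"injury"}],
    "s3": [set()],
    "s4": [set()],
}
print(presence_score("yellow", "s1", states))
```

Under this reading, a factor ranks highly for a state when most entities in that state had it beforehand and few other states are preceded by it.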

m


Step 7 | User Study

m

User Study

•  Goal: Evaluate Outflow’s ability to support event sequence analysis tasks
•  12 participants
•  60 minutes each
•  9 tasks + 7 training tasks
•  Questionnaire

m

Results

•  Accurate: 3 mistakes out of 108 tasks (12 participants × 9 tasks)
•  Fast: Average 5-60 seconds
•  Findings:
–  From video
–  Different outcomes for each incoming path
–  Etc.

m

Future Work

•  Integration with prediction algorithm
•  Additional layout techniques
•  Advanced factor analysis
•  Deeper evaluations with domain experts

m

Conclusions

•  Event sequences with outcome
•  Outflow
–  Interactive visual summary
–  Explore flow & outcome
–  Factors
–  Multi-step layout process

•  Not specific to sports

Contact: @kristwongz kristw@twitter.com dgotz@us.ibm.com

m

Heart failure (CHF) patient

Patient #1 (over time): Aug 1998 Ankle Edema, Oct 1998 Cardiomegaly, Jan 1999 Weight Loss. Outcome: Die (0)

m

Event Sequences

Medical, Transportation, Education, Web logs, Sports, Logistics, and more…

m

Acknowledgement

•  Charalambos (Harry) Stavropoulos
•  Robert Sorrentino
•  Jimeng Sun
•  Comments from HCIL colleagues

m

Conclusions

•  Event sequences with outcome
•  Outflow
–  Interactive visual summary
–  Explore flow & outcome
–  Factors
–  Multi-step layout process

•  Not specific to medical or sports

Contact: @kristwongz kristw@twitter.com dgotz@us.ibm.com

m

THANK YOU ขอบคุณครับ