Lecture 15 Text and Sets
Oct. 14, 2014
Agenda
• Lab 02 Grades!
• Lab 03 due in 1 week
Lab 2 Summary
Preferences on x-axis label separation
[Bar chart of responses by separation value: 5, 7, 10, 20]
10 could simply indicate laziness…
Preferences on Rotated vs. Normal Text for the y-axis label
[Bar chart of responses: Rotated, Normal, Both!]
Choice for the “Best” Summary View (Note: these OFTEN did not agree with the answer to Task 2)
[Bar chart of responses: Dots, Connected Dots, Lines, Area, Multiple, User Selects]
Task 2 Answers for “Best”
[Bar chart of responses: Dots, Connected Dots, Lines, Area, Not Specified]
“Not Specified” means the student argued for multiple choices or forgot to answer
Grade Distribution: Mean 8.84
StdDev 0.93
Great Job Overall!
How Did They Look?
Good Colors Effective Legend
Good Colors Interactive Legend
Nice Legend Interactive Mouseover
Different x-axis separation
Interactive Measuring Bar Less Saturated Colors
Effective Transparency Interactive Legend
Informative Mouseover
Different Aspect Ratio Reinforced Visual Encodings
Task 1: Minimal Ink Histogram of y-values
Combined Line Plots with Area Plots Interactive Legend + Mode Selection
Both Normal and Rotated Text for y-axis
Last Time: Trees and Graphs
Design Choices for Trees and Graphs
Drawing graphs – node link
Node-Link Layouts
• Strengths:
  • Understandable visual mapping
  • Can show overall structure, clusters, paths
  • Flexible, many variations
• Limitations:
  • All but the most trivial algorithms are > O(N^2)
  • Not good for dense graphs: hairball problem!
  • Small changes in the graph can cause dramatic changes to the layout
  • See Frishman and Tal. Online Dynamic Graph Drawing. Proc. EuroVis 2007
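Matrix Layouts
• Instead of node-link diagrams, use the adjacency matrix to represent the graph
[Figure: node-link diagram of a five-node graph (A, B, C, D, E) next to its adjacency matrix, with rows and columns labeled A through E]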
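A minimal sketch of the idea, assuming an undirected graph. The edge list below is hypothetical (the slide's actual edges are not recoverable), and `adjacency_matrix` is an illustrative helper name, not part of any library.

```python
from typing import List, Sequence, Tuple


def adjacency_matrix(nodes: Sequence[str],
                     edges: Sequence[Tuple[str, str]]) -> List[List[int]]:
    """Build a symmetric 0/1 matrix: cell [i][j] is 1 if nodes i and j are linked."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        i, j = index[source], index[target]
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected graph: mirror the entry
    return matrix


nodes = ["A", "B", "C", "D", "E"]
# Hypothetical edges, chosen only for illustration.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")]
matrix = adjacency_matrix(nodes, edges)

# Print the matrix with row and column labels, as in a matrix layout.
print("  " + " ".join(nodes))
for name, row in zip(nodes, matrix):
    print(name + " " + " ".join(str(value) for value in row))
```

Each filled cell is one edge, so the drawing never has crossing lines, however dense the graph gets.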
Matrix Representations
• Strengths:
  • Great for dense graphs
  • Visually scalable
  • Can spot clusters
• Limitations:
  • Abstract visualization
  • Hard to follow paths
Spotting Patterns in Matrices
Henry 2006
Adjacency Matrices
Attribute-Driven Layout
• Large node-link diagrams get messy!
• Are there additional structures we can exploit?
• Idea: use data attributes to perform layout
• e.g., scatterplot based on node values
• Dynamic queries and/or brushing can be used to enhance perception of connectivity
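A rough sketch of the scatterplot idea, assuming each node carries numeric attributes. The attribute names (`age`, `degree`) and the helper `attribute_layout` are hypothetical. Positions come straight from attribute values, so no layout algorithm is run.

```python
from typing import Dict, List, Tuple


def attribute_layout(nodes: Dict[str, Dict[str, float]],
                     x_attr: str, y_attr: str) -> Dict[str, Tuple[float, float]]:
    """Place each node in the unit square using two of its numeric attributes."""

    def scale(values: List[float]) -> List[float]:
        lo, hi = min(values), max(values)
        span = hi - lo
        # If every value is identical, center the nodes on that axis.
        return [0.5 if span == 0 else (v - lo) / span for v in values]

    names = list(nodes)
    xs = scale([nodes[n][x_attr] for n in names])
    ys = scale([nodes[n][y_attr] for n in names])
    return {n: (x, y) for n, x, y in zip(names, xs, ys)}


# Hypothetical node table, for illustration only.
nodes = {
    "a": {"age": 20, "degree": 1},
    "b": {"age": 30, "degree": 3},
    "c": {"age": 40, "degree": 2},
}
positions = attribute_layout(nodes, "age", "degree")
print(positions)
```

Edges can then be drawn between the computed positions, or shown only on demand through brushing.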
Cerebral
Barsky 2008
Visual Exploration of Multivariate Graphs
Martin Wattenberg
Visual Communication Lab, IBM Research, 1 Rogers St., Cambridge MA 02142
Figure 1. A PivotGraph visualization of a large graph rolled up onto two categorical dimensions
ABSTRACT
This paper introduces PivotGraph, a software tool that uses a new technique for visualizing and analyzing graph structures. The technique is designed specifically for graphs that are “multivariate,” i.e., where each node is associated with several attributes. Unlike visualizations which emphasize global graph topology, PivotGraph uses a simple grid-based approach to focus on the relationship between node attributes and connections. The interaction technique is derived from an analogy with methods seen in spreadsheet pivot tables and in online analytical processing (OLAP). Finally, several examples are presented in which PivotGraph was applied to real-world data sets.
Author Keywords
information visualization, graph drawing
ACM Classification Keywords
H.5.2. [Information Interfaces]: User Interfaces. I.3.6 [Methodology and Techniques]: Interaction Techniques
INTRODUCTION
A common goal in exploratory data analysis is to form hypotheses about a graph. In contexts ranging from social networks to Markov chains, it can be important to understand the properties of a graph and make inferences about why it looks the way it does. Because of the complex structure and large size of many graphs, such an analysis can be difficult.
Visualization is one solution to this problem. In one popular approach, nodes of a graph are represented by a dot, and edges by lines or curves linking them. Such “node-and-link diagrams” have been explored intensively throughout this century (see [3, 10, 14, 15]). A second common approach is a matrix view, or density table, that displays a graph’s adjacency matrix overlaid on a grid [4].
Both node-and-link diagrams and matrix views emphasize link structure. In many cases, however, data about a graph
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CHI 2006, April 22-28, 2006, Montréal, Québec, Canada. Copyright 2006 ACM 1-59593-178-3/06/0004...$5.00.
CHI 2006 Proceedings • Visualization 1 April 22-27, 2006 • Montréal, Québec, Canada
Pivot Graph
• Task abstraction
• Show relationship between node attributes and connections in a multi-attribute graph
• Data abstraction
• Relational dataset
• Nodes (and edges) have multiple discrete attributes
• Rollup and selection transformations
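The two transformations can be sketched as follows. This is an illustration under stated assumptions, not PivotGraph's actual implementation: the function names `rollup` and `select`, and the people/department data, are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str]


def rollup(node_attrs: Dict[str, Dict[str, str]], edges: Sequence[Edge],
           keys: Sequence[str]) -> Tuple[Counter, Counter]:
    """Aggregate nodes that share the same values for `keys`; count merged edges."""

    def group(node: str) -> Tuple[str, ...]:
        return tuple(node_attrs[node][k] for k in keys)

    node_counts = Counter(group(n) for n in node_attrs)
    edge_counts = Counter((group(s), group(t)) for s, t in edges)
    return node_counts, edge_counts


def select(node_attrs: Dict[str, Dict[str, str]], edges: Sequence[Edge],
           key: str, value: str) -> Tuple[Dict[str, Dict[str, str]], List[Edge]]:
    """Keep only nodes whose attribute `key` equals `value`, and edges among them."""
    kept = {n: a for n, a in node_attrs.items() if a[key] == value}
    kept_edges = [(s, t) for s, t in edges if s in kept and t in kept]
    return kept, kept_edges


# Hypothetical multivariate graph, for illustration only.
node_attrs = {
    "ann": {"gender": "F", "dept": "eng"},
    "bob": {"gender": "M", "dept": "eng"},
    "cat": {"gender": "F", "dept": "ops"},
    "dan": {"gender": "M", "dept": "ops"},
}
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan")]

node_counts, edge_counts = rollup(node_attrs, edges, ["gender"])
eng_nodes, eng_edges = select(node_attrs, edges, "dept", "eng")
print(node_counts, edge_counts, eng_edges)
```

Rolling up on `gender` leaves two aggregate nodes regardless of how many people the original graph contains.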
Visual Encoding
• Line (1D) or grid (2D) layout
• Area subdivided by number of values for an attribute
• Number of nodes based on attribute count, not original graph node count
• Size of nodes and edges related to number of aggregated original nodes and edges
• Scalability through abstraction, not layout algorithm
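A small sketch of the 2D grid case, assuming the graph has already been rolled up into counts keyed by a pair of attribute values. The slides say only that size is "related to" the aggregated count; the square-root scaling (area proportional to count) is an assumption here, and `grid_layout` is a hypothetical helper.

```python
import math
from typing import Dict, Tuple


def grid_layout(node_counts: Dict[Tuple[str, str], int]) -> Dict[Tuple[str, str], dict]:
    """Assign each aggregate node a grid cell and a radius derived from its count."""
    x_values = sorted({key[0] for key in node_counts})
    y_values = sorted({key[1] for key in node_counts})
    layout = {}
    for (x_val, y_val), count in node_counts.items():
        layout[(x_val, y_val)] = {
            "col": x_values.index(x_val),  # one column per value of attribute 1
            "row": y_values.index(y_val),  # one row per value of attribute 2
            "radius": math.sqrt(count),    # assumed: area grows with count
        }
    return layout


# Hypothetical rolled-up counts keyed by (gender, dept).
counts = {("F", "eng"): 4, ("M", "eng"): 1, ("F", "ops"): 9}
layout = grid_layout(counts)
print(layout)
```

The grid has at most 2 x 2 cells here because each attribute has two values; the size of the original graph does not affect it.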
Visual Encoding
• Line for 1D rollup, or grid for 2D case
Wattenberg 2006
Interaction
• Changing rollup/selection choices
• Animated transitions between states
Pivot Graph
• In general, more compact than matrix representation
Critique: What Do You Think?
Tree and Graph Vis Summary
• Trees:
  • Indentation: simple, effective for small trees
  • Node-link and layered: look good but need exponential space
  • Enclosure (treemaps): great for size-related tasks but suffer in structure-related tasks
• Graphs:
  • Node-link: familiar, but problematic for dense graphs
  • Adjacency matrices: abstract, hard to follow paths
  • Attribute-driven: not always possible
Takeaway: No Best Solution …or Graph visualization is still a great research area!
Text
Text Data
• No Numbers (implicitly)
• Characters (ASCII)
• Strings
Text Data
• Words
• Sentences
• Paragraphs
• Chapters
• Lines
Example words: "I", "love", "visualization."
Example sentence: "I love visualization."
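A quick sketch of how the same string breaks into these units. The second sentence is made up for illustration, and the regular expressions are simplistic assumptions (real tokenizers handle abbreviations, Unicode, and so on).

```python
import re

# First sentence is from the slide; the second is a hypothetical addition.
text = "I love visualization.\nText is data too."

characters = list(text)                 # individual characters
lines = text.splitlines()               # split on line breaks
words = re.findall(r"[A-Za-z']+", text)  # runs of letters and apostrophes
# Naive sentence split: break on whitespace that follows ., ! or ?
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

print(len(characters), "characters")
print(words)
print(sentences)
print(lines)
```

Paragraphs and chapters follow the same pattern with coarser delimiters, such as blank lines or headings.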
Text Data
http://galaxyonfire.wikia.com/wiki/File:Small-ascii-art.jpg
Text Data
• Documents
  • Books
  • Papers
  • Webpages
  • Emails
  • Twitter posts
• Corpus: collection of documents
Text Visualization For Documents
Tag Clouds / Word Clouds
http://www.tagcrowd.com
http://www.wordle.com
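One way a tag cloud could be computed, as a rough sketch: count word frequencies and map them linearly to font sizes. This is not how TagCrowd or Wordle are implemented; the stop-word list, the point sizes, and the `tag_cloud_sizes` helper are all assumptions for illustration.

```python
import re
from collections import Counter
from typing import Dict, Sequence

# Tiny hypothetical stop-word list; real tools use much longer ones.
STOP_WORDS = {"the", "a", "and", "of", "is", "to", "in"}


def tag_cloud_sizes(documents: Sequence[str], min_pt: float = 10,
                    max_pt: float = 40, top_n: int = 5) -> Dict[str, float]:
    """Map the top_n most frequent words to font sizes between min_pt and max_pt."""
    counts = Counter()
    for doc in documents:
        words = re.findall(r"[a-z']+", doc.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)
    if not top:
        return {}
    hi, lo = top[0][1], top[-1][1]
    span = hi - lo
    return {
        word: max_pt if span == 0 else min_pt + (count - lo) * (max_pt - min_pt) / span
        for word, count in top
    }


# Hypothetical three-document corpus.
corpus = ["graphs and trees", "graphs of text", "text and graphs"]
sizes = tag_cloud_sizes(corpus, top_n=3)
print(sizes)
```

Placement of the words (packing, rotation, color) is a separate layout step that this sketch leaves out.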
Text Arc
Wattenberg, Viegas 2008
DocuBurst
Collins, Carpendale, Penn 2008
Arc Diagrams
Analysis of the Characters from Les Misérables: http://mbostock.github.io/protovis/ex/arc.html
Rule-Based: Poetry
Abdul-Rahman et al. 2008
Text Visualization For Document Collections
Document Cards (small multiples)
Showing Temporal Relationships: ThemeRiver (Stream Graph)
Havre, Hetzler, Nowell 2000
Jigsaw: Many Linked Views
Stasko et al. 2008
Lec16 Required Reading
Arrange Spatial Data
Chapter 8
8.1 The Big Picture
For datasets with spatial semantics, the usual choice for arrange is to use the given spatial information to guide the layout. In this case, the choices of express, separate, order, and align do not apply because the position channel is not available for directly encoding attributes. The two main spatial data types are geometry, where shape information is directly conveyed by spatial elements that do not necessarily have associated attributes, and spatial fields, where attributes are associated with each cell in the field. (See Figure 8.1.) For scalar fields with one attribute at each field cell, the two main visual encoding idiom families are isocontours and direct volume rendering. For both vector and tensor fields, with multiple attributes at each cell, there are four families of encoding idioms: flow glyphs that show local information, geometric approaches that compute derived geometry from a sparse set of seed points, texture approaches that use a dense set of seeds, and feature approaches where data is derived with global computations using information from the entire spatial field.
8.2 Why Use Given?
The common case with spatial data is that the given spatial position is the attribute of primary importance because the central tasks revolve around understanding spatial relationships. In these cases, the right visual encoding choice is to use the provided spatial position as the substrate for the visual layout, rather than to visually encode other attributes with marks using the spatial position channel. This choice may seem obvious from common sense alone. It also follows from the effectiveness principle, since the
8.1 - 8.3 ONLY!!