Dimensions / Depth

James Slack
CPSC 533C
February 10, 2003

Overview

• Linear data sources
• Information processing
• Aggregate visualization methods
• Embedding semantics of information
• Repetition and other patterns
• Examples in InfoVis

Linear Data Sources

• Univariate data arranged spatially or temporally
• Complexity issues:
  – Patterns in text are cognitively hard to find
  – Text input could be viewed spatially
  – Cognition from visual abstractions of text is becoming more relevant

Information Processing

• Why do we need information?
• Technical aspects
• Characterizing text by language semantics
• Browsing versus querying
• Interfacing with text visualization

Considering Visualization?

The technical considerations:

1. Define what needs to be visualized
2. Transform input; must be possible!
3. Analyze to suit the input
4. Technique & derivative data storage

Text Features

• 3 general types of features
  1. Frequency based
  2. Statistics on words or other tokens
  3. Semantic features

Text Features

• Frequency based text features:
  – Statistics on presence and count of unique words
  – Feature sets are word statistics
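As a minimal sketch of frequency based features, the snippet below records which unique words are present and how often each occurs. The function name `word_features` and the simple regular-expression tokenizer are assumptions made for illustration; the slides do not prescribe a particular tokenizer.

```python
import re
from collections import Counter


def word_features(text):
    """Return (presence, counts) for the unique words in `text`.

    Tokenization is an assumption: lowercase runs of letters and apostrophes.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)   # count of each unique word
    presence = set(counts)     # which unique words occur at all
    return presence, counts


presence, counts = word_features("The cat saw the other cat")
print(sorted(presence))
print(counts["cat"], counts["dog"])
```

A `Counter` returns 0 for words that never occur, so presence and count can be read from the same structure.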

Text Features

• Statistics on words or other tokens
  – Occurrence, frequency, and context of individual tokens define feature set
  – Sets can be explicitly specified or deterministically partitioned
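One way to picture token statistics over an explicitly specified set is sketched below: for each chosen token it records the positions where it occurs, its frequency, and the neighbouring tokens as context. The name `token_stats`, the `window` parameter, and the dictionary layout are hypothetical choices for illustration only.

```python
def token_stats(tokens, vocabulary, window=1):
    """Collect occurrence, frequency and context for tokens in `vocabulary`.

    `window` (an assumed parameter) is how many neighbours on each side
    count as context.
    """
    stats = {t: {"count": 0, "positions": [], "contexts": []} for t in vocabulary}
    for i, tok in enumerate(tokens):
        if tok in stats:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            stats[tok]["count"] += 1
            stats[tok]["positions"].append(i)
            stats[tok]["contexts"].append((left, right))
    return stats


tokens = "arc diagrams show repetition and dot plots show repetition".split()
stats = token_stats(tokens, {"show", "dot"})
print(stats["show"]["positions"])
```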

Text Features

• Semantic features
  – Natural groups of similar topics
  – Knowledge of language
  – Words have semantic meaning

Characterizing Text

• Feature sets of text
  – A shorthand description of the original
  – Reduction in length, not in meaning
  – Semantics are often important, although not always necessary
  – Represented for efficient computation

Browsing vs. Querying

• Querying is more precise
  – Specific results discarded or retained
  – The most specific features are important
  – Popularity of query is relative, closeness ratio compares potential matches
  – Similarity of results appears
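The slide does not define the closeness ratio, so the sketch below substitutes cosine similarity over word counts as one plausible stand-in. It shows the querying behaviour described above: each candidate is scored against the query and is either retained or discarded. The names `cosine` and `rank` and the `threshold` parameter are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank(query, documents, threshold=0.0):
    """Score every document against the query; keep those above threshold."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in documents]
    return sorted((s for s in scored if s[0] > threshold), reverse=True)


docs = [
    "arc diagrams show repetition",
    "calendar view of time series",
    "dot plots show repetition in strings",
]
results = rank("arc diagrams", docs)
print(results)
```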

Browsing vs. Querying

• Browsing is more general
  – Choose similarity over exactness
  – The most common features are important
  – Clustering is a natural partition
  – Similarity of clusters appears
  – Analytical information processing

Interfacing With Visualizations

• Spatial representations enhance cognition
• Clusters can be viewed with browsing
• A global overview of data is important
• Techniques to visit clusters
• Too many data points?
  – Display cluster centroids instead
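Displaying centroids instead of every point can be sketched as follows, assuming the points have already been assigned cluster labels by some earlier step. The function name `centroids` and the sample data are illustrative only.

```python
from collections import defaultdict


def centroids(points, labels):
    """Return one mean point per cluster label."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    # Average each coordinate over the members of a cluster.
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }


points = [(0, 0), (2, 0), (10, 10), (12, 14)]
labels = ["a", "a", "b", "b"]
centres = centroids(points, labels)
print(centres)
```

Plotting only `centres` reduces four marks to two here; on a large collection the reduction is what keeps the overview readable.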

Assisting Perception

• Interface should provide:
  1. Preconscious visual form for information
  2. Interactions to sustain, enrich process of knowledge building
  3. Fluid environment for reflective cognition
  4. Framework for temporal knowledge building

Aggregate Visualization

• Information overloads cognitive abilities
• Understanding global, not local contexts
• Visualize abstract representations of complex underlying structure
• What can we gain from global context?

Embedding Semantics

• Are some visualizations without meaning?
• Galaxies, ThemeScapes highlight semantic meaning with relevant labels
• Cluster viewer uses calendar to highlight temporal univariate patterns
• Dot plots, arc diagrams use connectivity of similar input strings independent of semantics

Repetition and Patterns

• How can you show something is repeated?
  – Place two occurrences close together
  – Colour two occurrences similarly
  – Connect two occurrences with a line
• Each method has merits
  – No method works in all cases
  – We want to keep spatial/temporal information

InfoVis Examples

• SPIRE
  – Galaxies and ThemeScapes
• Calendar Based Visualization
• Dot Plots
• Arc Diagrams

From SPIRE

• Spatial Paradigm for Information Retrieval and Exploration
• Galaxies cluster docupoints
• ThemeScapes model landscape

Galaxies

• Projection of clustering algorithms into 2D
• Galaxies are clusters of related data
• Proximity of galaxies is relevant
• Designed to add temporal patterns to clustering

[Figure: Galaxies]

ThemeScape

• Abstract 3D landscape of information
• Reduce cognitive load using terrain
• Elevation, colour encode theme strength redundantly
• Landscape metaphor translates well
  – Peaks are easy to recognize
  – Interesting characteristics include ridges and valleys

[Figures: ThemeScape (2 slides)]

Calendar Based Visualization

• Time is linear, monotonic, scalar
• Prediction is a useful side effect of visualizing the past
• Time series data is often univariate
• Periodic patterns emerge in time series data

Calendar Based Visualization

• How about using 3 dimensions?
  – X-axis: Time of day
  – Y-axis: Days of data period
  – Z-axis: Univariate data samples
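The three-axis layout amounts to folding a linear series into a grid: one row per day (Y), one column per time slot (X), with the sample value as height (Z). A minimal sketch follows, assuming evenly spaced samples and a hypothetical helper name `to_day_matrix`.

```python
def to_day_matrix(samples, samples_per_day):
    """Fold a univariate series into rows of days and columns of time slots.

    matrix[day][slot] is the Z value plotted at (X=slot, Y=day).
    Assumes evenly spaced samples covering whole days.
    """
    if samples_per_day <= 0 or len(samples) % samples_per_day:
        raise ValueError("series must cover a whole number of days")
    return [
        samples[start:start + samples_per_day]
        for start in range(0, len(samples), samples_per_day)
    ]


series = [3, 9, 4, 2, 8, 5]          # two days, three samples per day
matrix = to_day_matrix(series, 3)
print(matrix)
```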

[Figure: Calendar Based Visualization]

Calendar Based Visualization

• Weekly variation obscured by pretty graphics
• Where are the trends?
• Is colour necessary for this?
• Is colour sufficient for this?
• Can everything be shown without overload?

Calendar Based Visualization

• A more natural way: use a calendar
• Cluster data into meaningful groups
  – Decide what the groups mean later?

1. Simple formulae are sufficient for clustering
2. Use robust statistical techniques
3. Generate binary clustering trees
4. Select desired clusters to visualize
5. Show clusters on calendar layout, simple graphs coloured appropriately
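Steps 1, 3 and 4 can be sketched as a small bottom-up clustering of daily patterns. This is only an illustration under stated assumptions: Euclidean distance between cluster means is one simple formula among many, and the names `distance`, `mean_pattern` and `cluster_days` are hypothetical. Each cluster carries a nested tuple recording the binary merge tree.

```python
def distance(a, b):
    """Euclidean distance between two equally long daily patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def mean_pattern(patterns):
    """Point-wise average of several daily patterns."""
    return [sum(col) / len(col) for col in zip(*patterns)]


def cluster_days(days, k):
    """Merge the two closest clusters repeatedly until k clusters remain.

    days maps a day label to its pattern. Returns (tree, members) pairs,
    where tree is a nested tuple recording the binary merges.
    """
    clusters = [(label, [label]) for label in days]
    while len(clusters) > max(k, 1):
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = mean_pattern([days[m] for m in clusters[i][1]])
                cj = mean_pattern([days[m] for m in clusters[j][1]])
                d = distance(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return clusters


days = {"Mon": [1, 8, 8, 1], "Tue": [1, 9, 8, 1],
        "Sat": [1, 2, 1, 1], "Sun": [1, 1, 1, 1]}
result = cluster_days(days, 2)
print(result)
```

Each resulting cluster would then get one colour on the calendar and one averaged graph, which corresponds to step 5.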

[Figure: Calendar Based Visualization]

Visualizing Structure in Strings

• M. Wattenberg: Arc diagrams
• Summarize long strings, indicate repetition

Dot Plots

• Finds structure in string data
• Correlation matrix
• Diagonal symmetry
• Redundant information
• Interesting repetitions can be confusing
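A dot plot can be sketched as a square matrix that marks cell (i, j) whenever element i of the string equals element j. The helper names `dot_plot` and `render` are assumptions; the sketch makes the diagonal symmetry and the redundancy easy to see, since every match appears on both sides of the main diagonal.

```python
def dot_plot(seq):
    """Return an n x n boolean matrix: True where seq[i] == seq[j]."""
    n = len(seq)
    return [[seq[i] == seq[j] for j in range(n)] for i in range(n)]


def render(matrix):
    """Draw the matrix as text, '#' for a match and '.' otherwise."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in matrix
    )


m = dot_plot("abab")
print(render(m))
```

Repeated substrings show up as diagonal runs parallel to the main diagonal.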

[Figure: Dot Plots]

Arc Diagrams

• Finds structure in string data
• Cognitive improvement over dot plots
• Adaptable to reduce noise in data
• Applications are varied:
  – Music
  – Text
  – Compiled code
  – Nucleotide sequences
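The sketch below is a deliberately simplified illustration of the idea behind arc diagrams, not Wattenberg's full method: it only considers substrings of one fixed length `k` and links each occurrence to the next non-overlapping occurrence of the same substring. The function name `arcs` and the `(start, next_start, length)` tuple layout are assumptions.

```python
from collections import defaultdict


def arcs(seq, k):
    """Link consecutive, non-overlapping repeats of length-k substrings.

    Returns sorted (start, next_start, k) tuples; each would be drawn as
    one arc joining the two occurrences along the string's axis.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    result = []
    for starts in positions.values():
        prev = starts[0]
        for s in starts[1:]:
            if s >= prev + k:          # skip occurrences that overlap prev
                result.append((prev, s, k))
                prev = s
    return sorted(result)


links = arcs("abcXabcYabc", 3)
print(links)
```

Raising `k` is one crude way to suppress short, noisy repeats, at the cost of missing repeats shorter than `k`.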

Arc Diagrams

• Interactive demonstration:
  – http://www.turbulence.org/Works/song/mono.html

Alternate Ending

• Something went wrong with the demo, so here is a synopsis of arc diagrams

[Figures: Arc Diagrams (4 slides)]

Paper References

• Visualizing the non-visual: spatial analysis and interaction with information from text documents. Wise, J.A.; Thomas, J.J.; Pennock, K.; Lantrip, D.; Pottier, M.; Schur, A.; Crow, V., Proc InfoVis 1995.
• Cluster and Calendar based Visualization of Time Series Data. Jarke J. van Wijk, Edward R. van Selow, Proc InfoVis 1999.
• Arc Diagrams: Visualizing Structure in Strings. Martin Wattenberg, Proc InfoVis 2002.