Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents

Wise, Thomas, Pennock, Lantrip, Pottier, Schur, and Crow

Presented By: Cyntrica Eaton

Presentation Overview

Paper Description

Contributions

Current State

Critique

References

Paper Description

Motivation

Approach

Visualization Paradigms: Galaxies, Themescapes

MVAB: Multidimensional Visualization and Advanced Browsing Project

Researchers at the Pacific Northwest National Laboratory were interested in solving the problem of information overload for intelligence analysts.

Motivation

Modern information technologies have contributed to an increased availability of information.

As the quantity of available information increases, the time available to locate and absorb it decreases.

The ability to get an overview of large document corpora and extract information without the heavy cognitive processes involved in language processing will improve the search process.

Approach

The problem of processing large amounts of text can be solved if text is spatialized in a manner that takes advantage of human perceptual abilities.

Visual processing takes place in parallel at the retinal level and is:

Relatively effortless

Exceptionally fast

Not additive to cognitive workload

Approach

Transform text into visualizations that:

Communicate through images instead of prose.

Preserve information characteristics from documents.

Represent textual content and meaning without the need to read it in the normal manner.

Reveal thematic patterns and relationships between documents in ways similar to how the natural world is perceived.

SPIRE: Spatial Paradigm for Information Retrieval and Exploration

Developed to facilitate the browsing and selection of documents from large corpora.

Two major approaches: Galaxies, Themescapes

Galaxies and Themescapes

Display metaphor rationale:

Each paradigm offers a rich variety of cognitive spatial affordances that naturally address the problems of text visualization.

Spatial perceptual mechanisms that operate on the real world will respond analogously to synthetic cues.

Paradigm Overviews

Galaxies: Point clusters suggest patterns of interest.

Themescapes: Topographies of peaks and valleys that can easily be detected based on contour patterns.

Paradigm Overviews

Both allow for overview + detail without a change of view.

Each view offers a different perspective of the same information.

Galaxies

Two-dimensional scatterplot of ‘docupoints’ that appear like stars in the night sky.

Computes word similarities and patterns in documents and communicates similarity via proximity.

Provides a first cut at sifting through information and determining how the contents of a document base are related.
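
The idea of communicating similarity via proximity can be sketched in a few lines. This is a simplified illustration, not the projection method used in the paper: each document is placed at coordinates given by its word overlap with two anchor texts. The anchors, the sample documents, and the function names are all hypothetical.

```python
import math


def words(text):
    """Lowercase the text and return its set of whitespace-separated words."""
    return set(text.lower().split())


def overlap(a, b):
    """Jaccard overlap between two word sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def docupoint(text, anchor_x, anchor_y):
    """Place a document at (overlap with anchor_x, overlap with anchor_y)."""
    w = words(text)
    return (overlap(w, words(anchor_x)), overlap(w, words(anchor_y)))


# Hypothetical anchors and documents, invented for illustration only.
anchor_x = "drug treatment dosage trial"
anchor_y = "budget funding cost report"

docs = {
    "a": "treatment trial results",
    "b": "dosage trial treatment plan",
    "c": "funding and cost report",
}
points = {name: docupoint(text, anchor_x, anchor_y) for name, text in docs.items()}

# Documents with shared vocabulary end up closer together in the plane.
near = math.dist(points["a"], points["b"])
far = math.dist(points["a"], points["c"])
print(points, near, far)
```

Documents "a" and "b" share treatment vocabulary and land near each other, while "c" lands far from both, which is the clustering effect the Galaxies display relies on.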

[Figure: example display; labels shown: Types, Treatment, Case Studies]

Themescapes

Three-dimensional relief map of themes within the document corpus.

Complex surfaces convey information about topics or themes found within the corpus without the cognitive load of reading.

Terrain simultaneously communicates:

Primary themes of an arbitrarily large collection of documents.

Measure of relevance in the corpus.

Similarity of themes.

Themescapes

A glance provides a visual thematic summary of the entire corpus:

Elevation: Theme strength

Shapes: Information distribution

Proximity: Content similarity
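
The mapping from theme strength to elevation can be illustrated with a small height grid. This is only a sketch under stated assumptions, not the paper's terrain algorithm: each docupoint contributes a Gaussian bump scaled by a theme strength, and bumps from nearby documents add up into higher peaks. The `terrain` function, the grid size, the `spread` parameter, and the sample points are hypothetical.

```python
import math


def terrain(points, size=5, spread=1.0):
    """Return a size x size grid of elevations.

    Each (x, y, strength) point adds a Gaussian bump, so bumps from
    nearby documents pile up into a higher peak.
    """
    grid = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            for x, y, strength in points:
                d2 = (col - x) ** 2 + (row - y) ** 2
                grid[row][col] += strength * math.exp(-d2 / (2 * spread ** 2))
    return grid


# Hypothetical docupoints: (x, y, theme strength).
points = [(1, 1, 1.0), (1, 2, 1.0), (4, 4, 0.5)]
heights = terrain(points)

# Print the grid as a crude relief map (higher numbers = stronger theme).
for row in heights:
    print(" ".join(f"{h:4.2f}" for h in row))
```

The two strong documents near (1, 1) and (1, 2) form one tall peak, while the single weaker document at (4, 4) forms a lower, isolated hill.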

Themescapes

Utilizes human abilities for pattern recognition and spatial reasoning.

Employs communicative invariance across levels of textual scale:

Entire document corpus

Cluster of documents

Individual documents

Summarization

Reading is a slow, serial process of mentally encoding a document.

Text visualizations can overcome many of the user limitations that result from accessing and trying to read large document bases.

Summarization

Visual cues can offer readers a way to employ their primarily preattentive, parallel processing powers of visual perception.

Galaxy and landscape metaphors allow the cognitive and visual processes that enable our spatial interactions with the natural world to be applied to the search process.

Contributions

Prior visualization approaches offered methods for visualization of structured, hierarchical text.

Free text visualization was relatively unexamined.

MVAB Project produced novel methods for interaction with large amounts of text.

Current Project Status

Correlation Tool

WebTheme

ThemeRiver

Rainbow

[Figure: example display; labels shown: Romeo, Tybalt, Love, Caesar]

Critique

The visualization paradigms were discussed in a straightforward manner.

There was, however, too little explanation of the example figures.

My Favorite Sentence

[The] perceptual processes involved are the results of millions of years of selective mammalian and primate evolution, and have become biologically tuned to seeing in the natural world.

References

Information Retrieval

Information Visualization

Questions?

Technical Considerations

Clear definition of text.

Way to transform text into a different visual form that retains high-dimensional invariants of natural language.

Suitable mathematical procedures and analytical measures must be defined as the foundation of the visualizations.

Database management system must be designed to store and manage text.

Technical Considerations

Way to transform text into a different visual form that retains high-dimensional invariants of natural language.

Text has statistical and semantic attributes such as frequency, context, and the combination of words in themes and topics.

Differences between texts' statistical and semantic compositions provide much of the opportunity for the text visualizations described in this paper.
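
As a rough illustration of the statistical attributes mentioned above, the sketch below counts word frequencies and adjacent word pairs, using the pairs as a crude stand-in for "context". The function name and the sample sentence are hypothetical, and real systems would use richer features than whitespace tokens.

```python
from collections import Counter


def text_attributes(text):
    """Return (word frequencies, adjacent word-pair frequencies)."""
    tokens = text.lower().split()
    freq = Counter(tokens)                    # how often each word occurs
    pairs = Counter(zip(tokens, tokens[1:]))  # which words occur side by side
    return freq, pairs


# Hypothetical sample sentence, invented for illustration.
freq, pairs = text_attributes("text visualization maps text to space")
print(freq.most_common(2))
print(pairs.most_common(2))
```

Two documents whose frequency and pair counts differ markedly have different statistical compositions, and that difference is the signal a visualization can spatialize.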

Approach

A set of measures that characterize the text in meaningful ways provides multiple perspectives on documents and their relationships to one another.

One measure is similarity:

Based on occurrences and context of key words or other extracted features, a measure of similarity can be computed that reflects relatedness between documents.

In a visualization, similarity can be shown as proximity or congruity of form.
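
One common way to compute such a measure from key-word occurrences is cosine similarity over word-count vectors. The sketch below assumes this measure for illustration; the slides do not state which similarity measure the paper uses, and the sample documents and function names are hypothetical.

```python
import math
from collections import Counter


def word_counts(text):
    """Lowercase the text and count whitespace-separated words."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents, invented for illustration only.
docs = {
    "d1": "satellite imagery analysis report",
    "d2": "analysis of satellite imagery",
    "d3": "quarterly budget summary",
}
vectors = {name: word_counts(text) for name, text in docs.items()}

sim_12 = cosine_similarity(vectors["d1"], vectors["d2"])  # shared vocabulary
sim_13 = cosine_similarity(vectors["d1"], vectors["d3"])  # no shared words
print(sim_12, sim_13)
```

A higher score means the documents would be drawn closer together; a score of zero means they share no words under this simple tokenization.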
