Visualizing textual data CPSC 601.28 A. Butt / Feb. 26 '09

Page 1: Visualizing textual data

Visualizing textual data

CPSC 601.28

A. Butt / Feb. 26 '09

Page 2: Visualizing textual data

Overview

• Project implications
• Summarize "TileBars"
  – Hearst / PARC (Xerox)
• Summarize "Visualizing the Non-Visual"
  – Wise et al / Pacific Northwest Lab (Battelle)
• Key Issues
• Summary
• References

Page 3: Visualizing textual data

Project Implications

• Research area is partly based on text-based environmental reports
  – textual reporting feeds into textual (quasi-judicial) regulatory framework
  – rooms of binders (e.g. >20,000 pages for Mackenzie Pipeline Project)
• Vocabulary specialized / semantically complete
  – "no significant adverse environmental impacts"

Page 4: Visualizing textual data

TileBars

• goals are to simultaneously view:
  – length of a document
  – relative frequency of specific words
  – distribution of words with respect to each other
• benefits include:
  – enhanced relevancy of search response
  – patterns of frequency by document / author
  – compactness of information

Page 5: Visualizing textual data

TileBars

• Visual representation via
  – rectangular block: size equates to document length
  – three bars within the block: each corresponds to a query
  – in each bar tiles indicate location, saturation of tile indicates frequency

• 5 articles, 3 search queries
• 1st, 2nd, 5th appear compact / relevant
• 1st and 2nd appear to have better concurrency
• 3rd and 4th potentially less relevant, greater time investment to read
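
The block / bar / tile encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the original TileBars implementation: the function names `tilebar_grid` and `render`, the fixed `tile_words` tile size, and the four-level character ramp standing in for colour saturation are all assumptions made for this sketch.

```python
import re


def tilebar_grid(text, term_sets, tile_words=50):
    """Return one row per query term set; each row holds per-tile hit counts.

    Fixed-size tiles are an assumption of this sketch. The number of tiles
    grows with the document, so row length reflects document length.
    """
    words = re.findall(r"[a-z']+", text.lower())
    n_tiles = max(1, -(-len(words) // tile_words))  # ceiling division
    grid = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = [0] * n_tiles
        for i, word in enumerate(words):
            if word in wanted:
                row[i // tile_words] += 1
        grid.append(row)
    return grid


def render(grid, shades=" .:#"):
    """Map hit counts to darker characters, standing in for tile saturation."""
    peak = max((c for row in grid for c in row), default=0) or 1
    top = len(shades) - 1
    # Ceiling scaling: zero hits stay blank, any hit gets a visible shade.
    return ["".join(shades[(c * top + peak - 1) // peak] for c in row)
            for row in grid]


if __name__ == "__main__":
    doc = ("pipeline impact study here filler filler filler filler "
           "caribou impact near pipeline")
    queries = [["pipeline"], ["impact"], ["caribou"]]
    for line in render(tilebar_grid(doc, queries, tile_words=4)):
        print("|" + line + "|")
```

Reading the rows together shows where query terms overlap: columns that are dark in every row mark passages where all queries hit at once.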

Page 6: Visualizing textual data

Visualizing the Non-Visual

• goals are to:
  – overcome time constraints in processing textual information
  – overcome attention constraints; avoid becoming overwhelmed by volume of textual information
• benefits include:
  – escape limitations of traditional text
  – increase throughput and comprehension of information processing
  – feedback on text structure to enhance visualization

Page 7: Visualizing textual data

Visualizing the Non-Visual

• Employ a "natural landscape" metaphor
  – leverage evolutionary psychological adaptations via natural landscapes for representation
  – galaxy or star-fields ("night sky")
  – themescapes ("cartographic" or "landscape")
  – although statistical measures are used for clustering, they are not used as directly as in TileBars
  – self-organizing maps
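
The slide does not say which statistical measures drive the clustering. As one common stand-in, the sketch below compares documents by the cosine similarity of their raw term-count vectors; documents with high similarity would sit close together in a galaxy or landscape view. The names `term_vector` and `cosine_similarity` are hypothetical, and this is not claimed to be the measure used by Wise et al.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (assumed representation for this sketch)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors, in [0, 1]."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    docs = {
        "report_a": "pipeline impact on caribou habitat",
        "report_b": "caribou habitat and pipeline routing",
        "report_c": "cancer literature review",
    }
    vectors = {name: term_vector(text) for name, text in docs.items()}
    print(cosine_similarity(vectors["report_a"], vectors["report_b"]))
    print(cosine_similarity(vectors["report_a"], vectors["report_c"]))
```

Unlike TileBars, the reader never sees these numbers directly; they only determine which documents end up near each other on the display.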

Page 8: Visualizing textual data

Galaxies

• PNL software development (DOE)
• Display is a review of cancer literature
• Branched to SPIRE / In-SPIRE for government documents

Page 9: Visualizing textual data

Themescapes

• PNL software development (DOE)
• Branched to SPIRE / In-SPIRE for government documents (renamed "Themeview")
• Branched into NVAC (National Visualization and Analytics Center) - part of the Homeland Security infrastructure

Page 10: Visualizing textual data

Themescapes (2.0?)

• Branched progeny of themescapes
• Used in searching IP / Patents
• Subscription service

• Failed metaphors??

Page 11: Visualizing textual data

Key Issues

• Vocabulary / semantics - how do you interpret meaning from text statistics?
  – earlier failures of natural language processing
  – contingent semantics
• Employing metaphors (Zhang 2008)
  – rely on unusual linkages (versus analogy) to highlight
  – degree of "unusual-ness" is critical: too much or too little leads to confusion

Page 12: Visualizing textual data

Summary

www.wordle.net

Page 13: Visualizing textual data

References

Marti A. Hearst. TileBars: Visualization of Term Distribution Information in Full Text Information Access. CHI 1995, pp. 59-66.

James A. Wise, James J. Thomas, Kelly Pennock, David Lantrip, Marc Pottier, Anne Schur, and Vern Crow. Visualizing the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents. Proc. IEEE Symp. Information Visualization (InfoVis), pp. 51-58, IEEE Computer Soc. Press, 30-31 October 1995. (in text pages 442-450)

Jin Zhang. The Implication of Metaphors in Information Retrieval. Visualization in Information Retrieval, Elsevier, 2008. (pages 215-237)