REPRESENTING LINGUISTIC DATA

Maha Shouman

TEXTARC

Data target: raw text, medium-sized

Traditional techniques:
  • Structured word lists (indices, concordances)
  • Automatic summary generation

These techniques exclude the original linearity of the text!

Index: http://www.i75online.com/FLAIndexPage1.html

Concordance: http://www.opensourceshakespeare.com
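
To make the two traditional word lists concrete, here is a minimal sketch of how an index and a concordance could be derived from raw text. The function names (`tokenize`, `build_index`, `concordance`), the tokenization rule, and the sample sentence are assumptions made for illustration; they are not taken from TextArc or from the sites linked above.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplistic rule)."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(tokens):
    """Index: map each word to the token positions where it occurs."""
    index = defaultdict(list)
    for pos, word in enumerate(tokens):
        index[word].append(pos)
    return dict(index)

def concordance(tokens, keyword, width=3):
    """Concordance: every occurrence of keyword with `width` words of context."""
    rows = []
    for pos, word in enumerate(tokens):
        if word == keyword:
            left = " ".join(tokens[max(0, pos - width):pos])
            right = " ".join(tokens[pos + 1:pos + 1 + width])
            rows.append((left, word, right))
    return rows

# Illustrative sample text, not data from the slides.
sample = "To be, or not to be, that is the question"
tokens = tokenize(sample)
index = build_index(tokens)

for left, kw, right in concordance(tokens, "be", width=2):
    print(f"{left:>12} [{kw}] {right}")
```

Both structures regroup the words by type rather than by reading order, which is why they lose the linearity of the original text.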

THEMERIVER

Data target:
  • Large text collections
  • Temporal patterns
  • Thematic changes

Traditional techniques: Histogram

Other visualizations focus on documents
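
As a rough illustration of the data behind a histogram or a ThemeRiver-style view, the sketch below counts theme words per time slice and stacks the counts into layers. The documents, theme words, and function names (`theme_counts`, `stacked_layers`) are invented for this example, and the stacking is only an assumption about how such counts might be prepared for drawing, not the ThemeRiver algorithm itself.

```python
from collections import Counter, defaultdict

def theme_counts(documents, themes):
    """Count occurrences of each theme word per time slice.

    documents: iterable of (time_slice, text) pairs.
    """
    wanted = set(themes)
    counts = defaultdict(Counter)
    for time_slice, text in documents:
        for word in text.lower().split():
            if word in wanted:
                counts[time_slice][word] += 1
    return counts

def stacked_layers(counts, themes):
    """For each time slice, return (theme, bottom, top) bounds stacked in theme order."""
    layers = {}
    for time_slice in sorted(counts):
        bottom = 0
        bounds = []
        for theme in themes:
            top = bottom + counts[time_slice][theme]
            bounds.append((theme, bottom, top))
            bottom = top
        layers[time_slice] = bounds
    return layers

# Made-up documents, purely for illustration.
docs = [
    (1990, "oil prices rise as war looms"),
    (1990, "war talks stall"),
    (1991, "oil supply recovers after war ends and oil prices fall"),
]
themes = ["oil", "war"]
counts = theme_counts(docs, themes)
layers = stacked_layers(counts, themes)

for year, bounds in layers.items():
    print(year, bounds)
```

A histogram would draw each time slice as separate bars; a river-style view would connect the layer bounds of consecutive slices so that each theme reads as one continuous band.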

3D THEMERIVER?

www.cs.sunysb.edu/~vislab/papers/3DThemeriver.pdf

THE WORD TREE

Visualization + information retrieval

Graphical Key Word In Context (KWIC)
  • KWIC: a format for concordances
  • Word tree = KWIC + suffix tree
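
The combination of KWIC and a suffix tree can be sketched as follows: collect the words that follow each occurrence of a keyword, then merge the contexts that share a prefix into a tree. This is a simplified approximation under stated assumptions (a plain nested dictionary instead of a full suffix tree, a fixed context depth, hypothetical function names), not the implementation behind the Word Tree.

```python
def right_contexts(tokens, keyword, depth=3):
    """KWIC step: the `depth` words following each occurrence of keyword."""
    return [tokens[i + 1:i + 1 + depth]
            for i, tok in enumerate(tokens) if tok == keyword]

def build_word_tree(contexts):
    """Merge contexts that share a prefix into nested {word: {count, children}}."""
    root = {}
    for context in contexts:
        node = root
        for word in context:
            entry = node.setdefault(word, {"count": 0, "children": {}})
            entry["count"] += 1
            node = entry["children"]
    return root

def render(tree, indent=0):
    """Return indented text lines, most frequent branches first."""
    lines = []
    for word, entry in sorted(tree.items(), key=lambda kv: -kv[1]["count"]):
        lines.append("  " * indent + f"{word} ({entry['count']})")
        lines.extend(render(entry["children"], indent + 1))
    return lines

# Illustrative sample text.
text = ("if love be rough with you be rough with love "
        "if love be blind love cannot hit the mark")
tokens = text.split()
tree = build_word_tree(right_contexts(tokens, "be", depth=3))

print("be")
print("\n".join(render(tree, indent=1)))
```

The counts stored on each node are what a graphical version could map to font size, so that frequent continuations stand out.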

THE WORD TREE

Click

Shift-Click