Kirrkirr A Dictionary Visualization Tool Conrad Wai Andrei Pop

Kirrkirr: A Dictionary Visualization Tool

Conrad Wai, Andrei Pop

Presentation Overview

What is Kirrkirr?

– General Purpose

– Technical Introduction

Generalizing Kirrkirr

– Allowing heterogeneous dictionaries

– Nahuatl: a case study

Redesigning Kirrkirr’s Network Visualization

What is Kirrkirr?

Going beyond: unlike paper dictionaries, electronic dictionaries can provide an interactive educational tool customizable to various audiences.

Taking advantage: contrary to the blandness of typical electronic dictionaries, Kirrkirr presents the contents of a dictionary in flexible, interactive, customizable, and (especially) fun ways.

Audience: Kirrkirr has diverse target users with varying levels of literacy, including professional linguists, elementary school children, teachers, and native speakers.

Technical Details

The dictionary is stored in XML

– Rather than load the large (10 MB) XML file into memory, each headword’s XML entry is loaded individually as needed

Formatted entries are rendered using XSLT

Dictionary is accessed via XPath

The program is written in Java

– Java 1.1.8 + Swing for backward compatibility
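
The XPath lookup and XSLT rendering described on this slide can be sketched with the standard JAXP classes. This is only an illustration under stated assumptions: the element names (DICTIONARY, ENTRY, HW, GL), the sample entries, the stylesheet, and the class name EntryRenderer are all hypothetical, and the code targets a modern JDK rather than Java 1.1.8. It also parses a tiny in-memory document, so it does not reproduce the on-demand loading of individual entries from the large file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EntryRenderer {
    // Hypothetical dictionary fragment; element names and contents are placeholders.
    static final String DICT =
        "<DICTIONARY>"
        + "<ENTRY><HW>alpha</HW><GL>first</GL></ENTRY>"
        + "<ENTRY><HW>beta</HW><GL>second</GL></ENTRY>"
        + "</DICTIONARY>";

    // Hypothetical stylesheet that formats one entry as an HTML paragraph.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='ENTRY'>"
        + "<p><b><xsl:value-of select='HW'/></b>: <xsl:value-of select='GL'/></p>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Selects the entry for one headword with XPath and formats it with XSLT; null if absent. */
    static String render(String headword) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(DICT)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $hw to the headword instead of pasting it into the expression.
            xpath.setXPathVariableResolver(name -> headword);
            Node entry = (Node) xpath.evaluate(
                "/DICTIONARY/ENTRY[HW=$hw]", doc, XPathConstants.NODE);
            if (entry == null) {
                return null;
            }
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(entry), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not render entry", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("alpha"));
    }
}
```

Swapping in a different stylesheet string changes the formatting without touching the lookup code.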

Original Application

Originally, Kirrkirr was used with the Australian Aboriginal language Warlpiri, spoken by about 3,000 people in northern Australia.

Kirrkirr used a Warlpiri-English dictionary developed by linguists in Australia, with detailed information about each word, including glosses, definitions, dialects, grammatical comments and cross-references between words for synonyms, antonyms, “see also” and other relationships.

Generalizing Kirrkirr

Want to incorporate disparate sources

– Generalize to broaden allowable dictionary formats and, consequently, the number of accessible languages

Two ways to generalize dictionary access (issues similar to DB schema integration)

– Specify an overarching format to be adhered to

• But this gets unwieldy as complexity grows, and

• there is no single “best” schema for all purposes

– Allow a generic format and require a conversion specification

• Provide just enough information for the program to get out what it needs (not a full translation of the data)

Generalizing Kirrkirr II

Kirrkirr does the latter (generic format plus conversion specification)

– Allow heterogeneous dictionary formats
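
One way to picture the "generic format plus conversion specification" option is a small object that tells the program where to find the few things it needs in each dictionary. The sketch below is an assumption-laden illustration, not Kirrkirr's actual specification format: the class name ConversionSpec, its three fields, and both sample dictionary layouts are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConversionSpec {
    final String entryPath;     // XPath selecting every entry element
    final String headwordPath;  // XPath relative to an entry
    final String glossPath;     // XPath relative to an entry

    ConversionSpec(String entryPath, String headwordPath, String glossPath) {
        this.entryPath = entryPath;
        this.headwordPath = headwordPath;
        this.glossPath = glossPath;
    }

    /** Pulls "headword = gloss" pairs out of a dictionary without converting the rest of it. */
    List<String> extract(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList entries =
                (NodeList) xpath.evaluate(entryPath, doc, XPathConstants.NODESET);
            List<String> pairs = new ArrayList<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Node entry = entries.item(i);
                pairs.add(xpath.evaluate(headwordPath, entry)
                    + " = " + xpath.evaluate(glossPath, entry));
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalStateException("could not apply conversion spec", e);
        }
    }

    public static void main(String[] args) {
        // Two differently structured (hypothetical) dictionaries, one spec each.
        String a = "<dict><entry><hw>alpha</hw><gloss>first</gloss></entry></dict>";
        String b = "<lexicon><word form='beta'><sense><en>second</en></sense></word></lexicon>";
        System.out.println(new ConversionSpec("/dict/entry", "hw", "gloss").extract(a));
        System.out.println(new ConversionSpec("/lexicon/word", "@form", "sense/en").extract(b));
    }
}
```

Neither dictionary has to be rewritten into a shared schema; only the three paths differ between them.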

Challenges in Conversion

Preprocessing dictionary data

– Converting to XML

– Detecting duplicate entries (homophones), and adding uniquifier

– Linking up pictures and sounds

– Alphabetizing/ordering of entries
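
The duplicate-detection step above can be sketched as a two-pass scan over the headword list. The "#n" suffix format and the class name Uniquifier are assumptions made for illustration; the slides do not say what form Kirrkirr's uniquifier takes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Uniquifier {
    /** Appends "#n" to every headword that occurs more than once, keeping the input order. */
    static List<String> uniquify(List<String> headwords) {
        // First pass: count how often each headword occurs.
        Map<String, Integer> total = new HashMap<>();
        for (String hw : headwords) {
            total.merge(hw, 1, Integer::sum);
        }
        // Second pass: number only the headwords that are duplicated.
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String hw : headwords) {
            if (total.get(hw) == 1) {
                result.add(hw);
            } else {
                int n = seen.merge(hw, 1, Integer::sum);
                result.add(hw + "#" + n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(uniquify(Arrays.asList("pa", "ki", "pa")));
    }
}
```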

Challenges in Conversion II

Writing the XML conversion specification

– Cross-referencing links between words

– Fuzzy spelling rules (for regexps)

Runtime formatting of dictionary entries

– Designing custom XSLTs for HTML display

• Different XSLTs for different audiences: basic stylesheet for schoolchildren and novices to the language, and more complex views for linguists and teachers
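
For the fuzzy spelling rules mentioned on this slide, one plausible reading is that each rule names a group of interchangeable spellings, and a search string is expanded into a regular expression with one alternation per group. The rules below (rr/r, k/g, u/oo) and the class name FuzzySpelling are invented for illustration and are not taken from any Kirrkirr conversion specification.

```java
import java.util.regex.Pattern;

class FuzzySpelling {
    // Hypothetical rules: spellings inside a group are treated as interchangeable.
    // Variants are plain letters here, so they need no regex escaping.
    static final String[][] RULES = { {"rr", "r"}, {"k", "g"}, {"u", "oo"} };

    /** Expands a query into a regex that accepts every spelling the rules allow. */
    static Pattern toPattern(String query) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < query.length()) {
            String[] group = null;
            String matched = null;
            // Find the longest rule variant that starts at position i.
            for (String[] rule : RULES) {
                for (String variant : rule) {
                    if (query.startsWith(variant, i)
                            && (matched == null || variant.length() > matched.length())) {
                        group = rule;
                        matched = variant;
                    }
                }
            }
            if (group == null) {
                // No rule applies: match this character literally.
                regex.append(Pattern.quote(String.valueOf(query.charAt(i))));
                i++;
            } else {
                regex.append("(?:").append(String.join("|", group)).append(")");
                i += matched.length();
            }
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    static boolean matches(String query, String headword) {
        return toPattern(query).matcher(headword).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("kiri", "girri"));
    }
}
```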

Nahuatl: A Case Study

Nahuatl: spoken throughout North and Central America (language of the Aztecs)

Dictionary contains parallel data for multiple dialects (in headword and gloss)

An attempt to apply the generalization to a real-world example

Unforeseen hurdles to conversion

Nahuatl: Some Issues

Special characters / character encoding

– During preprocessing (“equivalency” in XML)

• Solution: Temporarily change XML encoding

– During runtime display (A: unrecognized characters; B: truncated entries)

• Solution for A: Wrap in Java stream Readers and Writers

• Solution for B: The issue is multiple character entities under a single element in the DOM tree; collate the entities
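
Both runtime fixes can be illustrated with standard JDK classes. The sketch below is an assumption about how such fixes might look, not Kirrkirr's code: it wraps byte streams in a Reader and Writer with an explicit charset (UTF-8 is assumed here), and it concatenates all child nodes of an element so that a value split across several nodes is not cut off after the first piece. The class name, the HW element name, and the sample string are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class EncodingSketch {
    /** Writes and re-reads text through byte streams using an explicit charset on both sides. */
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
                writer.write(text);
            }
            StringBuilder result = new StringBuilder();
            try (Reader reader = new InputStreamReader(
                    new ByteArrayInputStream(bytes.toByteArray()), StandardCharsets.UTF_8)) {
                int c;
                while ((c = reader.read()) != -1) {
                    result.append((char) c);
                }
            }
            return result.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds an element whose text is split across several child nodes (test helper). */
    static Element elementWithParts(String... parts) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("HW");
            for (String part : parts) {
                element.appendChild(doc.createTextNode(part));
            }
            return element;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Concatenates every child node's text instead of reading only the first child. */
    static String collate(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            text.append(n.getTextContent());
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Element hw = elementWithParts("tl", "\u0101", "catl");
        System.out.println(hw.getFirstChild().getNodeValue()); // only the first piece
        System.out.println(roundTrip(collate(hw)));            // the whole value
    }
}
```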

Nahuatl: More Issues

File naming conflicts

– Implementation uses headword for HTML filenames

– Nahuatl’s colons not allowed on some platforms

• Short-term solution: substitute on the fly

• Long-term solution: name auto-generated, or based on MIME/Base64 encoding
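
The Base64 variant of the long-term solution might look like the following sketch. It assumes the URL-safe Base64 alphabet from java.util.Base64 (Java 8 and later), which avoids both ':' and '/' in the result; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class EntryFileNames {
    private static final String SUFFIX = ".html";

    /** Encodes a headword into a file name that contains no colons or path separators. */
    static String toFileName(String headword) {
        return Base64.getUrlEncoder().withoutPadding()
            .encodeToString(headword.getBytes(StandardCharsets.UTF_8)) + SUFFIX;
    }

    /** Recovers the headword from a file name produced by toFileName. */
    static String toHeadword(String fileName) {
        String stem = fileName.substring(0, fileName.length() - SUFFIX.length());
        return new String(Base64.getUrlDecoder().decode(stem), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = toFileName("a:b");
        System.out.println(name + " -> " + toHeadword(name));
    }
}
```

One caveat with this design: Base64 output is case-sensitive, so two distinct headwords could collide on a file system that ignores case. An auto-generated name, the other option on the slide, would avoid that.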

Dictionary anomalies

– Tags in fields, erroneously escaped by Shoebox

• Solution: Regexp replace

– Invalid or special headwords
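
The regexp replacement for escaped tags could be as small as the sketch below. The slides do not show what the escaped tags looked like, so the `&lt;name&gt;` form handled here is purely an assumption, as are the class and method names.

```java
import java.util.regex.Pattern;

class TagUnescaper {
    // Assumed escaped form: &lt;name&gt; or &lt;/name&gt; with a simple tag name.
    private static final Pattern ESCAPED_TAG =
        Pattern.compile("&lt;(/?[A-Za-z][A-Za-z0-9]*)&gt;");

    /** Restores escaped tags and leaves any other "&lt;" in the field untouched. */
    static String unescapeTags(String field) {
        return ESCAPED_TAG.matcher(field).replaceAll("<$1>");
    }

    public static void main(String[] args) {
        System.out.println(unescapeTags("see &lt;i&gt;x&lt;/i&gt; &lt; 3"));
    }
}
```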

Exploring New Issues

Redesigning network visualization panel

Usability Note / Design Consideration

– Screen size limitations

– Eyesight varies by target audience

– Together, these put an upper bound on what we can do in one panel

Basic Idea

Word links represented visually as nodes and edges.

Edge colors represent link types.

Former Kirrkirr Visualization

HCI issues

– Focus and attention

– Word islands

– Visual organization

• Random Positioning

• Misleading Distance

– Lack of History or Sequence

• No tangible sense of back / forward with a single panel

Former Kirrkirr Visualization II

Software Design Shortcomings

– Procedural paradigm: very un-Java / un-OO

– Lack of extensibility and readability

– One large file doing most of the work

– Flawed and unnecessarily complicated algorithm

– First piece of code written for Kirrkirr; it became “crufty” as Kirrkirr grew and evolved (also, it was written when Swing was a nascent technology)

Not necessarily evident to the user, but makes extension difficult

New Network Visualization

Basic premise remains the same (nodes and edges representing words and links)

Redesign to address HCI concerns, rewrite (vs. adapt) to improve software design

Addressing HCI Issues

Multiple panels

– Reduce visual clutter

– Related words together, unrelated separate

– Provide sense of sequence

– Background panels should perhaps better indicate nodes they contain

• but feedback somewhat limited by screen size

Improve visual organization

– Group links by type (vs. random)

– Make distance a relevant factor (vs. random)

– User has freedom to move nodes (vs. spring algorithm)
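
Grouping links by type, rather than positioning nodes at random, could be done by giving each link type its own angular sector around the focus word. The sketch below is only one possible scheme and is not Kirrkirr's placement algorithm; the class name SectorLayout, the link-type labels, and the choice of equal-sized sectors are all assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SectorLayout {
    /**
     * Assigns each linked word an angle (in radians) around the focus word.
     * Every link type gets one equal sector, and its words are spread evenly inside it.
     */
    static Map<String, Double> angles(Map<String, List<String>> wordsByType) {
        Map<String, Double> result = new LinkedHashMap<>();
        double sector = 2 * Math.PI / wordsByType.size();
        int typeIndex = 0;
        for (List<String> words : wordsByType.values()) {
            for (int i = 0; i < words.size(); i++) {
                // Offset (i + 1) / (n + 1) keeps words away from the sector borders.
                double fraction = (i + 1.0) / (words.size() + 1);
                result.put(words.get(i), sector * (typeIndex + fraction));
            }
            typeIndex++;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new LinkedHashMap<>();
        links.put("synonym", Arrays.asList("a", "b"));
        links.put("antonym", Arrays.asList("c"));
        System.out.println(angles(links));
    }
}
```

Each angle can be turned into screen coordinates with a radius and Math.cos / Math.sin; what the radius should encode (the "distance as a relevant factor" bullet) is left open here because the slides do not specify it.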

Improved Software Design

Modular, OO approach

– Split up into constituent components: panels, nodes, edges

• Each component handles its own characteristics (e.g., color) and events (e.g., mouse listening)

– Encapsulation: Layered pane contains network panels; each panel consists of nodes and edges

– Model / View separation

• Each object has model / view

• Views listen to models; models fire changes

• Little wrinkle with edges: one view (less overhead)

Easier to maintain and extend
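
The "views listen to models; models fire changes" arrangement can be sketched with java.beans.PropertyChangeSupport. The class names NodeModel and NodeView, the x/y properties, and the repaint counter are hypothetical; in a Swing component the listener would typically call repaint() instead of incrementing a counter.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

class NodeModel {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private int x;
    private int y;

    void addListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    /** Updates the position and notifies listeners about each coordinate that changed. */
    void moveTo(int newX, int newY) {
        int oldX = x;
        int oldY = y;
        x = newX;
        y = newY;
        // firePropertyChange sends no event when old and new values are equal.
        changes.firePropertyChange("x", oldX, newX);
        changes.firePropertyChange("y", oldY, newY);
    }

    int getX() { return x; }
    int getY() { return y; }
}

class NodeView implements PropertyChangeListener {
    int repaintRequests = 0; // stands in for a Swing repaint() call

    NodeView(NodeModel model) {
        model.addListener(this);
    }

    @Override
    public void propertyChange(PropertyChangeEvent event) {
        repaintRequests++;
    }
}
```

The model knows nothing about how it is drawn, so a node's appearance can change without touching its data.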

Future Visual Enhancements

Network visualization

– Further improve algorithm for placing nodes and links

– Improve feedback for panel switching (animate?)

– Incorporate semantic domains

Wordlist sidebar

– Improving navigation and focus

– Perhaps a diamond-shaped “dial” of sorts, with words in the central area larger

• Idea is to provide both context and focus, overview and detail

Future Visual Enhancements II

Semantic domain exploration

– JTree alternatives

– Want ability to browse entire dictionary, not simply a history