Udhig0613

DESCRIPTION

text analysis tools

New Tools in Digital Humanities

UDHIG, June 13, 2006

Zoe Borovsky

New tools

Text:

Juxta

TAPoR, HyperPo

WordHoard

Images:

Image Markup Tool

Why digitize text?

Text analysis: discovering new knowledge by linking information together in interesting ways, not just showing overall trends.

“I think discovering new knowledge vs. showing trends is like the difference between a detective following clues to find the criminal vs. analysts looking at crime statistics to assess overall trends in car theft.” (Marti Hearst, 2003)

The verb “look” occurs more often near words & names of giantesses than giants.

Three volumes of sagas:

Hundreds of giants and giantesses

Types of tools

Concordance, comparison, corpus, critical editions (Juxta)

Search (TAPoR, HyperPo, WordHoard)

Key words in context (KWIC)

Collocates (associations)

Markup: lemma, parts of speech, speaker

Juxta

Produces critical editions, comparing and collating multiple witnesses of a single work

http://www.patacriticism.org/juxta/
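To make "collating witnesses" concrete, here is a minimal sketch that compares a base text with one witness, word by word, and lists the places where they differ. It uses Python's standard `difflib` as a stand-in and is not Juxta's own collation algorithm; the function name `collate` and the sample strings are illustrative assumptions.

```python
import difflib

def collate(base, witness):
    """List the word-level variants between a base text and one witness.

    Illustrative only: difflib is a generic sequence matcher,
    not the collation method Juxta itself uses.
    """
    a, b = base.split(), witness.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    variants = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only replace / delete / insert spans
            variants.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return variants

# Made-up sample readings, not taken from any edition.
print(collate("the quick brown fox", "the quick red fox"))
```

Each tuple records the kind of variant, the base reading, and the witness reading, which is roughly the information a critical apparatus entry carries.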

Juxta

Desktop Application: Mac, Windows and Unix/Linux (open source)

Input: plain text (UTF-8), or XML

Output: HTML critical apparatus

The darker the color, the more variants differ
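One way to picture the shading: for each word of the base text, count how many witnesses disagree with it, and shade darker as the count rises. The sketch below is an assumption about how such a count could be computed with `difflib`; it is not Juxta's implementation, and `difference_counts` is a hypothetical name. Words that a witness inserts have no position in the base text, so this simple version does not count them.

```python
import difflib

def difference_counts(base, witnesses):
    """For each base word, count the witnesses that replace or omit it."""
    base_tokens = base.split()
    counts = [0] * len(base_tokens)
    for witness in witnesses:
        matcher = difflib.SequenceMatcher(
            a=base_tokens, b=witness.split(), autojunk=False
        )
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                for i in range(i1, i2):
                    counts[i] += 1
    return counts

# Made-up sample readings.
base = "the quick brown fox"
witnesses = ["the quick red fox", "the slow red fox"]
for word, n in zip(base.split(), difference_counts(base, witnesses)):
    print(f"{word:6} differs in {n} witness(es)")
```

A display layer could then map each count to a color intensity.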

Toggle between texts

Generate HTML

TAPoR

Web-based text analysis portal

Search and display using online tools

http://test-tapor.mcmaster.ca/portal/portal

Input: XML, HTML, TEI, plain text

TAPoR

Mostly English, some western European languages

Word lists

KWIC (key word in context)

Collocates/co-occurrences: words that occur in proximity to the search term

Word list, HyperPo
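A word list is simply every distinct word in a text with its frequency. As a rough sketch of the idea (not HyperPo's or TAPoR's code; the tokenizing rule and the sample sentence are assumptions made for illustration):

```python
import re
from collections import Counter

def word_list(text, top=None):
    """Return (word, count) pairs, most frequent first."""
    # Simplistic tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top)

# Made-up sample sentence.
for word, count in word_list("The white whale and the white sea", top=3):
    print(word, count)
```

Real tools usually add options such as stop-word removal or sorting alphabetically.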

Key word in context, HyperPo
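A key-word-in-context display lines up every occurrence of a search word with a few words on either side. A minimal sketch, assuming a simple word tokenizer and a fixed window (the function name `kwic` and the sample sentence are illustrative, not part of HyperPo):

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = re.findall(r"\w+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

# Made-up sample sentence.
for left, key, right in kwic("A white whale swam in the white sea", "white", width=2):
    print(f"{left:>10} [{key}] {right}")
```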

Co-occurrences: “white”

Add secondary corpus
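Collocates can be sketched as a count of the words that fall within a small window around each occurrence of the search word. The window size, tokenizer, and sample sentence below are assumptions for illustration; the actual tools may weight or filter results differently.

```python
import re
from collections import Counter

def collocates(text, keyword, window=2):
    """Count the words that appear within `window` words of `keyword`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(nearby)
    return counts

# Made-up sample sentence.
print(collocates("A white whale swam in the white sea", "white", window=1))
```

Running the same count on a second corpus and comparing the two results is one simple way to see which associations are distinctive.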

WordHoard

Desktop application/server version

Texts are annotated or tagged by morphological, lexical, semantic, prosodic, and narratological criteria.

http://wordhoard.northwestern.edu/userman/index.html

The downloadable version comes with texts

Open source version can be installed on your own server with your texts

Sample WordHoard query

Shakespeare’s use of the word “love” over time
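The idea behind a query like this can be sketched as: for each work, compute how often the word occurs relative to the work's length, then order the works by date. The code below is only an illustration with placeholder data; it counts exact word forms, whereas WordHoard works from its own annotated texts, and the function name and the (title, year, text) layout are assumptions.

```python
import re

def relative_frequency_by_year(works, word):
    """Return (year, title, occurrences per 10,000 words), ordered by year.

    `works` is an iterable of (title, year, text) tuples (hypothetical layout).
    """
    rows = []
    for title, year, text in sorted(works, key=lambda w: w[1]):
        tokens = re.findall(r"[a-z']+", text.lower())
        hits = tokens.count(word)
        per_10k = 10000 * hits / len(tokens) if tokens else 0.0
        rows.append((year, title, per_10k))
    return rows

# Placeholder works and dates, not real texts or real datings.
works = [
    ("Work B", 1600, "love and love"),
    ("Work A", 1590, "no such word here"),
]
for year, title, rate in relative_frequency_by_year(works, "love"):
    print(year, title, round(rate, 1))
```

Normalizing by length matters because a long play will contain more occurrences of almost any word than a short one.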

Results….

Image Markup Tool

http://www.tapor.uvic.ca/~mholmes/image_markup/

Windows only

Image Markup Tool

Input: an image that you want to make available on a web page with annotations directly on the image

Example: Robert Watson’s Back to Nature

Image Markup Tool

Output: sample

A copy of your XML data file with an added XSL stylesheet declaration.

A copy of the image file you're marking up (usually reduced to a size suitable for a Web page; you can control this size in the Options / Web view preferences window).

An XSLT file (copied from the web_view folder in the program folder, with some variables modified to suit your data).

A JavaScript file (copied from the web_view folder in the program folder).

A CSS stylesheet file (copied from the web_view folder in the program folder).
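The "added XSL stylesheet declaration" is a standard `xml-stylesheet` processing instruction placed before the root element, which tells a browser which XSLT file to apply. As a sketch of what that step amounts to (this is not the Image Markup Tool's own code; the function name, the sample XML, and the file name `view.xsl` are hypothetical):

```python
from xml.dom import minidom

def add_stylesheet_declaration(xml_text, xsl_href):
    """Return the XML with an xml-stylesheet instruction before the root."""
    doc = minidom.parseString(xml_text)
    instruction = doc.createProcessingInstruction(
        "xml-stylesheet", f'type="text/xsl" href="{xsl_href}"'
    )
    doc.insertBefore(instruction, doc.documentElement)
    return doc.toxml()

# Hypothetical data file content and stylesheet name.
print(add_stylesheet_declaration("<root><note>annotation</note></root>", "view.xsl"))
```

With that line in place, opening the XML file in a browser renders it through the XSLT, which in turn pulls in the JavaScript and CSS files listed above.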