Cleveland, Ohio and Western Reserve Digital Text Collection Project. Suzhen Chen and Richard Wisneski, Kelvin Smith Library, Case Western Reserve University. May 22, 2010. 2010 CALA MW Annual Conference.

Page 1: Cleveland & Western Reserve Digital Text Collection Project - Suzhen Chen & Richard Wisneski

Cleveland, Ohio and Western Reserve Digital Text Collection Project

Suzhen Chen
Richard Wisneski

Kelvin Smith Library, Case Western Reserve University
May 22, 2010

2010 CALA MW Annual Conference

Page 2

Institution

Case Western Reserve University (CWRU)
  Founded in 1967 (federation of Case Institute of Technology, founded in 1881, and Western Reserve University, founded in 1826)
  A private research university in northeast Ohio
  ~10,000 students

Kelvin Smith Library
  Main library of CWRU
  ~1.7 million volumes
  ~60 library staff

Page 3

Project (What)

Cleveland, Ohio and Western Reserve Digital Text Collection project

A collection of digital resources on the history of Cleveland, Ohio and the Western Reserve, dating from the early 19th century to the early 20th century

The collection covers various topics, including women of Cleveland, religion, and housing

About 100 text files have been added to the collection; more will be added, including some manuscripts

Page 4

Project (Why)

Provide resources for historians, scholars, and others who are interested in Cleveland, OH and the Western Reserve

Online representation of the collection

Support learning and teaching

Promote scholarly communication

Long term preservation of regional history

Page 5

Metadata standard

Intended users

Types of materials

Preservation needs

Subjects, genre

Project needs

Page 6

TEI: Text Encoding Initiative

A consortium that develops a standard for representing texts in digital form

Maintains encoding guidelines for texts

Often applied in the humanities, social sciences, and linguistics

Page 7

Example: Projects from other institutions

Shakespeare Quartos Archive

Newton Manuscript Project (University of Sussex)

Early American Digital Archive (University of Maryland)

Page 8

Example of TEI Header

<titleStmt>, <editionStmt>, <publicationStmt>
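The header shown on the original slide is not preserved here, so the following is only a minimal sketch of what a TEI header with the three named statements might look like, and how it could be read programmatically. It assumes the TEI P5 namespace; the sample title, edition, publisher, and source text are placeholders, and `header_summary` is a hypothetical helper, not part of the project.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace (assumed; the slides do not say which TEI version was used).
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical header: every text value below is a placeholder.
SAMPLE_HEADER = """
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>Sample title: an electronic edition</title>
    </titleStmt>
    <editionStmt>
      <edition>First electronic edition</edition>
    </editionStmt>
    <publicationStmt>
      <publisher>Sample publisher</publisher>
    </publicationStmt>
    <sourceDesc>
      <bibl>Sample print source</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
"""


def header_summary(header_xml):
    """Return the text of the main fileDesc statements as a dict."""
    header = ET.fromstring(header_xml.strip())
    paths = {
        "title": "tei:fileDesc/tei:titleStmt/tei:title",
        "edition": "tei:fileDesc/tei:editionStmt/tei:edition",
        "publisher": "tei:fileDesc/tei:publicationStmt/tei:publisher",
        "source": "tei:fileDesc/tei:sourceDesc/tei:bibl",
    }
    summary = {}
    for field, path in paths.items():
        node = header.find(path, NS)
        # A missing statement is reported as None rather than raising.
        summary[field] = "".join(node.itertext()).strip() if node is not None else None
    return summary


print(header_summary(SAMPLE_HEADER))
```

The `<sourceDesc>` element is included because a TEI P5 file description requires it alongside the title and publication statements; `<editionStmt>` is optional.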

Page 9

Example of an encoded text

Page 10

TEI Metadata Standard

Mark up specific genres such as prose, verse, and drama

Mark up content structure such as paragraphs and divisions

Mark up features of a text such as quotations and footnotes

Mark up texts for literary and linguistic analysis

Page 11

Example

http://tbe.kantl.be/TBE/TBE.htm?page=examples

Page 12

Example of an encoded text

<lg>
  <l rend="font-size(110%) indent(-60)">"Fury said to</l>
  <l rend="font-size(100%) indent(-40px)">a mouse, That</l>
  <l rend="font-size(100%) indent(0px)">he met</l>
  <l rend="font-size(100%) indent(10px)">in the</l>
  <l rend="font-size(100%) indent(20px)">house,</l>
  <l rend="font-size(100%) indent(17px)">'Let us</l>
  <l rend="font-size(100%) indent(5px)">both go</l>
  <l rend="font-size(100%) indent(-7px)">to law:</l>
  <l rend="font-size(100%) indent(-23px)"><hi rend="italic">I</hi> will</l>
  <l rend="font-size(100%) indent(-26px)">prosecute</l>
  <l rend="font-size(90%) indent(-40px)"><hi rend="italic">you.</hi> —</l>
  <l rend="font-size(90%) indent(-30px)">Come, I'll</l>
  <l rend="font-size(90%) indent(-20px)">take no</l>
  <l rend="font-size(90%) indent(-7px)">denial;</l>
  …
</lg>

http://tbe.kantl.be/TBE/TBE.htm?page=examples
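To illustrate why this kind of markup is useful, here is a minimal sketch of reading the verse lines back out of the encoding, together with the indent recorded in each `rend` attribute. It uses a four-line excerpt of the example above. The function name `verse_lines` and the regular expression are illustrative assumptions, and the `rend` values are treated as free-form strings of the shape shown, not as a formally defined syntax.

```python
import re
import xml.etree.ElementTree as ET

# Excerpt of the line group shown above (no namespace declared, as on the slide).
SAMPLE_LG = """
<lg>
  <l rend="font-size(110%) indent(-60)">"Fury said to</l>
  <l rend="font-size(100%) indent(-40px)">a mouse, That</l>
  <l rend="font-size(100%) indent(0px)">he met</l>
  <l rend="font-size(100%) indent(-23px)"><hi rend="italic">I</hi> will</l>
</lg>
"""

# Matches indent(-40px) as well as indent(-60), where the unit is omitted.
INDENT = re.compile(r"indent\((-?\d+)(?:px)?\)")


def verse_lines(lg_xml):
    """Return (indent, text) pairs for each <l> element in a line group."""
    root = ET.fromstring(lg_xml.strip())
    lines = []
    for line in root.iter("l"):
        # itertext() also collects text nested in <hi> and the text after it.
        text = "".join(line.itertext())
        match = INDENT.search(line.get("rend", ""))
        indent = int(match.group(1)) if match else 0
        lines.append((indent, text))
    return lines


for indent, text in verse_lines(SAMPLE_LG):
    print(indent, text)
```

Because the layout is recorded as attributes and not as literal spacing, the same file can drive a visual rendering, a plain-text export, or an analysis of the poem's shape.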

Page 13

TEI Metadata Standard

Provides various manifestations of a text or audio

Independent of applications

TEI is extensible

Accommodates encoding methods for data processing and analysis needs

Supports better description, organization, and classification of information

Page 14

Implementation

Staff

Funding

Time management

…

Page 15

Cleveland, Ohio and Western Reserve Digital Text Collection project

Finalize the project

Establish workflows, policies, and procedures

Training is provided

Text files scanned

Run through optical character recognition software (ABBYY FineReader)

Page 16

Procedures

Spelling check for the texts

Create TEI headers
  Bibliographic description, revisions, source of text

Encode the text

Quality control
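The slides do not describe how quality control was carried out. As one possible illustration, part of that step could be automated with a basic structural check on each encoded file. The sketch below assumes TEI P5 documents with a `<TEI>` root element; `check_tei` is a hypothetical helper, and it tests only well-formedness and the presence of a few required elements, which is much weaker than validating against a TEI schema.

```python
import xml.etree.ElementTree as ET

# Clark-notation prefix for the TEI P5 namespace (an assumption about the files).
TEI = "{http://www.tei-c.org/ns/1.0}"

# Elements that TEI P5 requires inside <fileDesc>.
REQUIRED_IN_FILEDESC = ("titleStmt", "publicationStmt", "sourceDesc")


def check_tei(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text.strip())
    except ET.ParseError as err:
        # A file that is not well-formed cannot be checked any further.
        return [f"not well-formed: {err}"]

    problems = []
    file_desc = root.find(f"{TEI}teiHeader/{TEI}fileDesc")
    if file_desc is None:
        problems.append("missing <teiHeader>/<fileDesc>")
    else:
        for name in REQUIRED_IN_FILEDESC:
            if file_desc.find(f"{TEI}{name}") is None:
                problems.append(f"fileDesc is missing <{name}>")
    if root.find(f"{TEI}text") is None:
        problems.append("missing <text>")
    return problems


sample = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>'
    "<titleStmt><title>Placeholder</title></titleStmt>"
    "<publicationStmt><p>Placeholder</p></publicationStmt>"
    "</fileDesc></teiHeader><text><body><p>Placeholder</p></body></text></TEI>"
)
print(check_tei(sample))
```

A check like this catches mechanical errors early; judgments about encoding choices still need a human reviewer.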

Page 17

Implementation

Workshops

In-house documentation of best practices

Standards

Online resources

Examples of completed work

Assistance from supervisor

Learn from each other

Page 18

Example of in-house documentation

Page 19

Cleveland, Ohio and Western Reserve Digital Text Collection project

Supports future metadata conversion and exchange, metadata harvesting, and federated search

Facilitates metadata sharing and cross-collection searching
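The slides do not name a target format for conversion or harvesting. Purely as an illustration, the sketch below maps a few TEI header fields to simple Dublin Core elements, a format often used for metadata harvesting. The crosswalk, the `record` wrapper element, the sample values, and the function name `tei_header_to_dc` are all assumptions for the example, not the project's actual mapping.

```python
import xml.etree.ElementTree as ET

# Clark-notation namespace prefixes.
TEI = "{http://www.tei-c.org/ns/1.0}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical crosswalk: Dublin Core element name -> path inside <teiHeader>.
CROSSWALK = {
    "title": f"{TEI}fileDesc/{TEI}titleStmt/{TEI}title",
    "creator": f"{TEI}fileDesc/{TEI}titleStmt/{TEI}author",
    "publisher": f"{TEI}fileDesc/{TEI}publicationStmt/{TEI}publisher",
    "date": f"{TEI}fileDesc/{TEI}publicationStmt/{TEI}date",
}

# Placeholder header; note that it has no <author>.
SAMPLE_HEADER = """
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt><title>Sample title</title></titleStmt>
    <publicationStmt>
      <publisher>Sample publisher</publisher>
      <date>2010</date>
    </publicationStmt>
    <sourceDesc><bibl>Sample print source</bibl></sourceDesc>
  </fileDesc>
</teiHeader>
"""


def tei_header_to_dc(header_xml):
    """Build a <record> element holding Dublin Core fields found in the header."""
    header = ET.fromstring(header_xml.strip())
    record = ET.Element("record")  # illustrative wrapper, not a standard element
    for dc_name, path in CROSSWALK.items():
        node = header.find(path)
        if node is None:
            continue  # skip fields the header does not supply
        value = "".join(node.itertext()).strip()
        if value:
            ET.SubElement(record, DC + dc_name).text = value
    return record


ET.register_namespace("dc", "http://purl.org/dc/elements/1.1/")
print(ET.tostring(tei_header_to_dc(SAMPLE_HEADER), encoding="unicode"))
```

Because the header is structured, a mapping like this can be revised or pointed at a different target schema without touching the encoded texts themselves.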

Page 20

Future Improvement

Make texts searchable on the web

Provide hyperlinked, referenced electronic resources

Page 21

Resources

WWP Guide to Scholarly Text Encoding: http://www.wwp.brown.edu/encoding/guide/index.html

Teach Yourself TEI: http://www.tei-c.org/Support/Learn/tutorials.xml

A Gentle Introduction to XML: http://www.tei-c.org/release/doc/tei-p4-doc/html/SG.html

A Companion to Digital Literary Studies: http://www.digitalhumanities.org/companion/DLS/

Page 22

References

TEI: Text Encoding Initiative, “TEI: Text Encoding Initiative,” 2010, http://www.tei-c.org/index.xml

International Federation of Library Associations and Institutions. Cataloging Section. “Functional Requirements for Bibliographic Records: Final Report,” 1998, http://www.ifla.org.proxy2.library.uiuc.edu/VII/s13/frbr/frbr.htm

TEI By Example Project, “TEI By Example Project,” 2010, http://tbe.kantl.be/TBE/TBE.htm?page=examples