From crowd-sourced collection to digital scholarly edition: The example of the Letters of 1916 project


Page 1: From crowd sourced collection to digital scholarly edition

From crowd-sourced collection to digital scholarly edition

The example of the Letters of 1916 project

Page 2: From crowd sourced collection to digital scholarly edition

Funding Bodies

Page 3: From crowd sourced collection to digital scholarly edition

Team

Susan Schreibman - Project Director and Editor in Chief

Karolina Badzmierowska - Researcher

Roman Bleier - Researcher

Emma Clarke - Researcher

Vinayak Das Gupta - Researcher

Richard Hadden - Researcher

Hannah Healy - Researcher

Shane McGarry - Software Engineer

Neale Rooney - Researcher

Linda Spinazzè - Researcher

An Foras Feasa Institute, Maynooth

Page 4: From crowd sourced collection to digital scholarly edition

Why 1916?

The Easter Rising

24-29 April 1916

“Allowing letters from personal collections to be read alongside official letters and letters contributed by institutions will add new perspectives to the events of the period and allow us to understand what it was like to live an ordinary life through what were extraordinary times”

Susan Schreibman

The Letters of 1916 - a year in the life

1 November 1915 - 31 October 1916

Page 5: From crowd sourced collection to digital scholarly edition

Letters of 1916 - some numbers (from 13 October)

Launched: 27 September 2013

Correspondence documents uploaded: 2209

Uploaded items from 42 private collections and 23 collaborating institutions

Registered users: 1159

Transcribed characters: 2308911

Page 6: From crowd sourced collection to digital scholarly edition

Diversity of Letters of 1916 correspondence data

Diversity of documents:

Single/Multi-page Letters

Postcards

Greeting cards

Telegrams

Envelopes

...

Variety of topics:

Love letters

Family life

Business

Crime

World War One

...

Page 7: From crowd sourced collection to digital scholarly edition

Crowdsourcing workflow - upload

Page 8: From crowd sourced collection to digital scholarly edition

Crowdsourcing workflow - transcription desk

Facsimile image

Bentham toolbar

Text Editor

Page 9: From crowd sourced collection to digital scholarly edition

Toolbar

Page 10: From crowd sourced collection to digital scholarly edition

About the Training

Training of transcribers - essential part of public outreach

Leads to better quality of transcriptions

Workshop

Seminars

Secondary school history teachers, students, and general public

Page 11: From crowd sourced collection to digital scholarly edition

Goal: Accuracy

Facing three main areas with quality issues:

1. Incorrect or incomplete metadata

2. Non-TEI markup (e.g. HTML tagging…)

3. TEI tag abuse - misunderstandings

Community engagement vs standards of excellence?

Page 12: From crowd sourced collection to digital scholarly edition

Incorrect, incomplete or incoherent metadata

The field corresponds to the <note type="summary"> tag inside the TEI header

Page 13: From crowd sourced collection to digital scholarly edition

non-TEI markup (HTML cases)

Page 14: From crowd sourced collection to digital scholarly edition

non-TEI mark-up (non XML)

Indication of location of a section of text:

NOTE IN LEFT MARGIN Give my regards to Dick when next you meet him

(front of post card)To Lady Clonbrock, Ahascragh, Co.Galway

[Handwritten notes at bottom : I Note annex II Await any application from Prof Collingwood; III Resubmit on 1st March]

Uncertainty and missing text:

James McCarthy & Family, Wm Perron. 1.50 Nick Welch, xxxxxxx Jxxx & Mrs Shields. 1.00 Alex xxx, Fred xxxx, M. Barry, L. x.

has told you ?Neeson? is in Sussex. Th? ????? ???? ?????letters? from him, but no

(samples from reliable transcriptions)
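
Conventions like the samples above could be flagged automatically for a human proofreader. The following is only a rough sketch: the pattern names, the regular expressions and the function `flag_non_tei` are illustrative assumptions, not the project's actual checks.

```python
import re

# Hypothetical patterns modelled on the sample transcriptions; not the project's rules.
PATTERNS = {
    # "?????" or "?Neeson?" used to mark uncertain readings
    "question-mark placeholder": re.compile(r"\?{2,}|\?\w+\?"),
    # "xxxxxxx" or "Jxxx" used to mark illegible text
    "x placeholder": re.compile(r"\b\w*x{3,}\b", re.IGNORECASE),
    # "NOTE IN LEFT MARGIN ..." written in capitals instead of markup
    "location note in capitals": re.compile(r"\bNOTE (?:IN|ON|AT) [A-Z ]+\b"),
    # "[Handwritten notes at bottom ...]" or "(front of post card)"
    "bracketed editorial note": re.compile(
        r"\[[^\]]*\]|\((?:front|back) of [^)]*\)", re.IGNORECASE
    ),
}


def flag_non_tei(text):
    """Return the sorted names of all patterns found in a transcription."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))
```

A flagged page is only a candidate for review; whether it should be re-encoded, for example with TEI's `<unclear>` or `<gap>` elements, remains an editorial decision.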

Page 15: From crowd sourced collection to digital scholarly edition

TEI tag abuse - misinterpretation of TEI

The transcriber uses the tags in an attempt to recreate the layout

The transcriber applies the tags without comprehending the functionality

Page 16: From crowd sourced collection to digital scholarly edition

Quest for Crowdsourcing Accuracy

Quality check:

● pre-selecting the contributors

● a self-regulating community

● professional staff hired to ensure the crowdsourced content is accurate

The Letters of 1916 project takes a different route and applies a hybrid, semi-automated approach to proofing

Page 17: From crowd sourced collection to digital scholarly edition

Borrowing a Unix Philosophy

“If you can get 90 percent of the desired effect for 10 percent of the work, use the simpler solution.”

- Bob Scheifler and Jim Gettys, early principles of the X Window System

Page 18: From crowd sourced collection to digital scholarly edition

Difficult Letters

Page 19: From crowd sourced collection to digital scholarly edition

Modularity in crowdsourced transcribing and editing

Crowdsourcing needs discrete tasks to be carried out; otherwise, chaos!

Page 20: From crowd sourced collection to digital scholarly edition

Post-Omeka Workflow

letters: {
  302: {
    title: "Letter from Patrick Pearse to his mother",
    pages: {
      27: {
        facs: "img27.jpg",
        transcription: "<p>Dear Mother</p> [...] <salute>Your loving son</salute> Padraic."
      },
      28: {...}
    },
    other-metadata: {...}
  },
  303: {...}
}
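
A nested structure of this shape can be walked letter by letter and page by page. The sketch below assumes a plain Python dict. The content of page 28, the helper names `iter_pages` and `full_text`, and the idea that numeric keys give the page order are all illustrative assumptions.

```python
# Illustrative data only: page 28 and its values are invented for the example.
letters = {
    302: {
        "title": "Letter from Patrick Pearse to his mother",
        "pages": {
            27: {"facs": "img27.jpg", "transcription": "<p>Dear Mother</p>"},
            28: {"facs": "img28.jpg",
                 "transcription": "<salute>Your loving son</salute> Padraic."},
        },
    },
}


def iter_pages(letters):
    """Yield (letter_id, page_id, page) in ascending key order (an assumption)."""
    for letter_id in sorted(letters):
        pages = letters[letter_id]["pages"]
        for page_id in sorted(pages):
            yield letter_id, page_id, pages[page_id]


def full_text(letter):
    """Join the page transcriptions of one letter into a single string."""
    pages = letter["pages"]
    return "\n".join(pages[page_id]["transcription"] for page_id in sorted(pages))
```

Keeping each page as its own record means later steps (tag fixes, envelope detection, date normalisation) can run as small, independent passes over the same data.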

Page 21: From crowd sourced collection to digital scholarly edition

Basic typos with tags

Some examples

Slashes in the wrong place:
</pb> → <pb/>
<address/> → </address>

Accidental angle brackets:
<<p> → <p>

Missing angle brackets:
<salute → <salute>

Number of ‘tag-typos’ per letter (grouped by number of errors)

Nearly half the letters have at least one tag-typo we can fix like this
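
The three kinds of typo listed above lend themselves to simple pattern-based repair. What follows is a minimal sketch with regular expressions. The element lists, the rule set and the function name `fix_tag_typos` are assumptions made for illustration, not the project's code.

```python
import re

# Assumed element lists for the sketch; a real list would come from the schema.
EMPTY = "pb|lb"                                   # elements without content
CONTAINERS = "p|salute|address|date|note|sic|hi"  # elements that wrap text

RULES = [
    # Closing tag on an empty element: </pb> -> <pb/>
    (re.compile(r"</(" + EMPTY + r")>"), r"<\1/>"),
    # Self-closing form of a container element: <address/> -> </address>
    (re.compile(r"<(" + CONTAINERS + r")/>"), r"</\1>"),
    # Doubled opening bracket: <<p> -> <p>
    (re.compile(r"<{2,}(?=/?[A-Za-z])"), "<"),
    # Tag name with no '>' before the next tag or the end: <salute -> <salute>
    (re.compile(r"<(/?(?:" + CONTAINERS + r"))(?=\s|$)(?![^<>]*>)"), r"<\1>"),
]


def fix_tag_typos(text):
    """Apply each rule in turn; return the repaired text and the number of fixes."""
    fixes = 0
    for pattern, replacement in RULES:
        text, count = pattern.subn(replacement, text)
        fixes += count
    return text, fixes
```

The returned count can be tallied per letter to reproduce a grouping by number of errors. Tags that lose their bracket after an attribute, such as `<hi rend="underline"`, are not handled correctly by this sketch and would still need manual review.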

Page 22: From crowd sourced collection to digital scholarly edition

Finding types of correspondence

“Letter from Patrick Langford Beazley to Piaras Béaslaí, 14 Feb 1916”

“Postcard from Herbert Pim to John Sweetman, 1 October 1916”

“Deportation Order from the Secretary of State to James Gough, 17 June 1916” ??
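
Titles of the form "X from A to B" suggest that the type of correspondence can be read from the words before "from". The sketch below assumes that pattern holds. The set `KNOWN_KINDS` and the function `correspondence_kind` are hypothetical; unrecognised types, such as the deportation order, are only flagged.

```python
import re

# Assumed list of expected kinds, loosely based on the document types named earlier.
KNOWN_KINDS = {"letter", "postcard", "greeting card", "telegram"}

# Everything before the first " from " is treated as the kind of document.
KIND = re.compile(r"^\s*(?P<kind>.+?)\s+from\s+", re.IGNORECASE)


def correspondence_kind(title):
    """Return (kind, is_known); kind is None when the title does not fit the pattern."""
    match = KIND.match(title)
    if match is None:
        return None, False
    kind = match.group("kind")
    return kind, kind.lower() in KNOWN_KINDS
```

For the three titles above this gives `("Letter", True)`, `("Postcard", True)` and `("Deportation Order", False)`, so the last one would be queued for an editor to decide.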

Page 23: From crowd sourced collection to digital scholarly edition

Envelopes

Page 4 - Envelope

DUBLIN 16 APRIL
<address>Diarmid Coffey <sic>Esqu</sic>,<lb/> Mount Trenchard,<lb/> Foynes,<lb/> Co. Limerick, Ireland</address>

Page 1

<note>you addressed <lb/><sic>yr</sic> letter to<lb/> Harcourt Terrace<sic>wh</sic> delayed it late <lb/>it came this <lb/>afternoon! <lb/>toolate to<lb/> <hi rend="underline">write</hi></note>
<address>Langridge,<lb/>Bath</address>
<date>16.10.16</date>
<salute>Dearest D.</salute>
<p> Phyllis &amp; Basil have <lb/>written that they come <lb/> out for weekend so [...]

Page 24: From crowd sourced collection to digital scholarly edition

Adding structural elements to letters

Transcription as submitted:

<address>3 Coast Hill <lb/> Queenstown </address> <date>June.19.1916 </date> <salute>My Own Dearest Jim </salute>

Wish of your loving <lb/> <salute>Mother A. Fitzgerald </salute> xxxxxxx</p>

With structural elements added:

<opener>
  <address>
    <addrLine>3 Coast Hill </addrLine>
    <addrLine>Queenstown </addrLine>
  </address>
  <dateline> <date>June.19.1916 </date> </dateline>
  <salute> My Own Dearest Jim </salute>
</opener>

<closer>
  <salute> Wish of your loving <lb/> Mother </salute>
  <signed> A. Fitzgerald </signed>
</closer>
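
The opener part of this transformation can be approximated with pattern matching, provided a page begins with an address, a date and a salutation in that order. The sketch below makes exactly that assumption. The function names are hypothetical, and the closer, which needs the signature separated from the salutation, is left out.

```python
import re

# Assumes the page starts with <address>, <date>, <salute> in this order.
OPENER = re.compile(
    r"^\s*(<address>.*?</address>)\s*(<date\b[^>]*>.*?</date>)\s*(<salute>.*?</salute>)",
    re.DOTALL,
)


def address_to_lines(match):
    """Turn '<lb/>'-separated address text into <addrLine> elements."""
    parts = re.split(r"\s*<lb\s*/>\s*", match.group(1).strip())
    lines = "".join(f"<addrLine>{part}</addrLine>" for part in parts if part)
    return f"<address>{lines}</address>"


def add_opener(text):
    """Wrap a leading address, date and salutation in <opener>; else return text as is."""
    match = OPENER.match(text)
    if match is None:
        return text
    address = re.sub(r"<address>(.*?)</address>", address_to_lines,
                     match.group(1), flags=re.DOTALL)
    opener = (f"<opener>{address}<dateline>{match.group(2)}</dateline>"
              f"{match.group(3)}</opener>")
    return opener + text[match.end():]
```

Pages that do not match are returned unchanged, so the step is safe to run over every transcription and leaves the unusual cases for manual editing.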

Page 25: From crowd sourced collection to digital scholarly edition

Adding the @when

<date>Tues oct 22 1916</date>

>>> import dateparser
>>> a = dateparser.parse('Tues oct 22 1916')
>>> a
datetime.datetime(1916, 10, 22, 0, 0)
>>> a.date().isoformat()
'1916-10-22'

<date when="1916-10-22">Tues oct 22 1916</date>
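
Applied to whole transcriptions, the same idea means rewriting every `<date>` element that can be parsed. The sketch below swaps `dateparser` for the standard library's `datetime.strptime` with a short list of formats, so that it runs without extra packages. The format list and the names `to_iso` and `add_when` are assumptions, and month names are assumed to be English.

```python
import re
from datetime import datetime

WEEKDAY = re.compile(r"^(mon|tue|wed|thu|fri|sat|sun)[a-z]*\s+", re.IGNORECASE)
FORMATS = ("%b %d %Y", "%B %d %Y", "%d %b %Y", "%d %B %Y")  # assumed, not exhaustive
DATE_TAG = re.compile(r"<date>(.*?)</date>", re.DOTALL)


def to_iso(text):
    """Return an ISO date for text such as 'Tues oct 22 1916', or None."""
    cleaned = " ".join(re.sub(r"[.,]", " ", text).split())  # 'June.19.1916' -> 'June 19 1916'
    cleaned = WEEKDAY.sub("", cleaned)                      # drop a leading weekday
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def add_when(markup):
    """Add a when attribute to each <date> that parses; leave the others untouched."""
    def replace(match):
        iso = to_iso(match.group(1))
        if iso is None:
            return match.group(0)
        return f'<date when="{iso}">{match.group(1)}</date>'
    return DATE_TAG.sub(replace, markup)
```

Ambiguous forms such as `16.10.16` are deliberately left alone here, since a two-digit year needs an explicit decision about the century.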

Page 26: From crowd sourced collection to digital scholarly edition

Postcards

Type 2

Type 1

Page 27: From crowd sourced collection to digital scholarly edition

Templating

Page 28: From crowd sourced collection to digital scholarly edition

LetEd.

Page 29: From crowd sourced collection to digital scholarly edition

Questions to conclude

Is it worth it?

Why the trouble of TEI encoding instead of plain text?

Page 30: From crowd sourced collection to digital scholarly edition

Roman Bleier

Richard Hadden | @oculardexterity

Linda Spinazzè

We welcome suggestions, comments, questions.