Crowdsourcing Historical Research Claudine Chionh Drupal Downunder 2012



DESCRIPTION

Presentation on the Founders and Survivors project for Drupal Downunder 2012.



Crowdsourcing Historical Research

Claudine Chionh
Drupal Downunder 2012


Founders and Survivors

• Study of the 73,000 convicts transported to Van Diemen's Land (Tasmania) between 1803 and 1853
• Records from the convict system and elsewhere
• Health, environment, lifestyle, wellbeing
• Effects on health and resilience of descendants

http://foundersandsurvivors.org/


Goals of the project

• Compile (health and demographic) data about this population from a range of sources
• Enable other researchers to use this data
• Explore quantitative and geographic tools and analyses that are not commonly used in historical research
• Combine professional expertise with the enthusiasm of volunteers


Some research projects

• Morbidity and mortality on the voyage to Australia
• Crime and convicts in Tasmania, 1853-1900
• Fertility decline in late C19 Tasmania
• Prostitution and female convicts
• Tracing convicts' descendants who served in WWI

http://foundersandsurvivors.org/research


Project staff

• Historians
• Demographers
• Epidemiologists
• Two part-time developers


Who are our users?

• Research team
• Other interested researchers
• Genealogists/family historians
• Local historians


Data sources

• Conduct records
• Surgeons' journals
• Newspaper reports
• Births, deaths, marriages
• Parish records
• Family histories, memories, legends


Official/formal sources

Records from the convict system

• Trial, conviction documents
• Conduct records
• Ship surgeons' journals
• Permissions to marry
• Ticket of leave

Outside the convict system

• Births, deaths, marriages
• Later convictions


Paper databases

Broader historical context:

• Mass transportation
• Modern record-keeping and statistics


Informal sources

• Newspaper reports
• Family history: primary sources, compiled genealogies, anecdote and legend


Our volunteers

• Amateur historians, genealogists
• Librarians
• IT specialists


How volunteers can contribute

• Individual convict biographies
• Tracing batches of convicts in ships


Solutions

• XML database
• Drupal
• Google Docs


The Founders and Survivors database

• XML (based on Text Encoding Initiative http://www.tei-c.org/)
• BaseX XML database engine http://basex.org/


Experimenting with Drupal

• Used an older version of Migrate to import some tabular data as nodes
• Problem of scale: 73,000 convicts
• XML approach proved to be more efficient


Getting data into our system

Formal sources

• Collected by archives and individual researchers
• CSV, Excel, Filemaker, Access ...
• Incorporated into BaseX database with Perl scripts

Informal sources

• Individual convicts' life histories are captured in a Drupal content type ('Community contributed content')

• Some sub-projects also capture summary data in Google spreadsheets
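The formal-source step above (tabular exports turned into XML for the master database) might look roughly like the following. This is only a sketch in Python rather than the project's Perl scripts, which the slides do not show. The column names, the sample rows, and the choice of TEI-style elements (`listPerson`, `person`, `persName`, `event`) are all assumptions made for illustration; the project's actual schema may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Clark-notation key that ElementTree serialises as the attribute xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical CSV export; column names and values are illustrative only.
SAMPLE_CSV = """convict_id,forename,surname,ship,arrival_year
c0001,Jane,Example,Sample Ship,1830
c0002,John,Placeholder,Sample Ship,1830
"""


def rows_to_tei(csv_text):
    """Convert tabular rows into a TEI-style <listPerson> element."""
    list_person = ET.Element("listPerson")
    for row in csv.DictReader(io.StringIO(csv_text)):
        person = ET.SubElement(list_person, "person", {XML_ID: row["convict_id"]})
        name = ET.SubElement(person, "persName")
        ET.SubElement(name, "forename").text = row["forename"]
        ET.SubElement(name, "surname").text = row["surname"]
        # Record the voyage as a dated event (an assumed modelling choice).
        event = ET.SubElement(
            person, "event", {"type": "arrival", "when": row["arrival_year"]}
        )
        ET.SubElement(event, "desc").text = "Arrived on " + row["ship"]
    return list_person


if __name__ == "__main__":
    print(ET.tostring(rows_to_tei(SAMPLE_CSV), encoding="unicode"))
```

The resulting XML could then be loaded into an XML database such as BaseX; that loading step is not shown here.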


Viewing data

• Master database in BaseX: presented in XSLT, different views for logged in researchers and others
• Community contributed content (CCC): Drupal
• Two-way link between master database and CCC
• Google spreadsheets prepopulated with links to corresponding records in master database
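The idea of serving different views to logged-in researchers and to everyone else can be illustrated with a small filter. The project does this in XSLT; the sketch below swaps in plain Python so that it runs with the standard library alone. The `audience="researcher"` attribute and the sample record are hypothetical markers invented for this example, not part of the project's schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the audience attribute is an assumption for this sketch.
RECORD = """<person>
  <persName>Jane Example</persName>
  <note audience="researcher">Unverified link to a later conviction.</note>
  <note>Arrived 1830.</note>
</person>"""


def view_for(record_xml, logged_in):
    """Return the record tree, hiding researcher-only elements from the public."""
    root = ET.fromstring(record_xml)
    if not logged_in:
        # Snapshot the elements first so removal does not disturb iteration.
        for parent in list(root.iter()):
            for child in list(parent):
                if child.get("audience") == "researcher":
                    parent.remove(child)
    return root


if __name__ == "__main__":
    print(ET.tostring(view_for(RECORD, logged_in=False), encoding="unicode"))
```

In an XSLT setting the same effect would typically come from passing a parameter to the stylesheet and matching on it, but the exact mechanism the project uses is not described in the slides.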


Data capture

• Convict biographies captured in Drupal – Community Contributed Content (CCC)
• Linked to entry in XML database
• Perl scripts to incorporate CCC records into master database
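As a rough illustration of folding CCC records back into the master database, the sketch below matches each contributed biography to a master record by a shared identifier and attaches it as a note. It is written in Python, not the project's Perl, and everything specific in it is assumed: the shape of the exported contributions, the `node/<id>` source reference, and the use of a `note` element with `type="ccc"`.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, hypothetical master data.
MASTER = """<listPerson>
  <person xml:id="c0001"><persName>Jane Example</persName></person>
  <person xml:id="c0002"><persName>John Placeholder</persName></person>
</listPerson>"""

# Hypothetical export of contributed biographies from the Drupal content type.
CONTRIBUTIONS = [
    {"convict_id": "c0001", "node_id": 101, "biography": "Family account of later life."},
    {"convict_id": "c9999", "node_id": 102, "biography": "No matching master record."},
]


def merge_contributions(master_xml, contributions):
    """Attach each contribution to its master record; report unmatched node ids."""
    root = ET.fromstring(master_xml)
    by_id = {person.get(XML_ID): person for person in root.iter("person")}
    unmatched = []
    for item in contributions:
        person = by_id.get(item["convict_id"])
        if person is None:
            # Keep the node id so a person can review the contribution later.
            unmatched.append(item["node_id"])
            continue
        note = ET.SubElement(
            person, "note", {"type": "ccc", "source": "node/%d" % item["node_id"]}
        )
        note.text = item["biography"]
    return root, unmatched


if __name__ == "__main__":
    merged, missing = merge_contributions(MASTER, CONTRIBUTIONS)
    print(ET.tostring(merged, encoding="unicode"))
    print("Unmatched nodes:", missing)
```

Returning the unmatched node ids instead of silently dropping them keeps volunteer contributions from being lost when an identifier is mistyped.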


XML entry for an individual convict


Prepopulated Drupal form


Community contributed content


Ships (batches of data)

• Tracing all convicts on a ship
• Summary data in Google Spreadsheets
• Spreadsheets are prepopulated from the master database
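One way to picture the prepopulation step is a script that pulls every convict on a given ship out of the master data and writes a CSV that could be imported into a spreadsheet, with a link back to each master record and empty columns for volunteers to fill in. This is a sketch only: the base URL, the `ship` attribute, the sample data, and the column names are hypothetical, and no Google API is involved.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
BASE_URL = "https://example.org/convict/"  # hypothetical record URL prefix

# Simplified, hypothetical master data; the ship attribute is an assumption.
MASTER = """<listPerson>
  <person xml:id="c0001" ship="Sample Ship"><persName>Jane Example</persName></person>
  <person xml:id="c0002" ship="Sample Ship"><persName>John Placeholder</persName></person>
  <person xml:id="c0003" ship="Other Ship"><persName>Ann Sample</persName></person>
</listPerson>"""


def ship_sheet(master_xml, ship):
    """Build CSV text listing every convict on one ship, with record links."""
    root = ET.fromstring(master_xml)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # The last two columns are left blank for volunteers to complete.
    writer.writerow(["convict_id", "name", "master_record", "death_year", "notes"])
    for person in root.iter("person"):
        if person.get("ship") != ship:
            continue
        convict_id = person.get(XML_ID)
        writer.writerow(
            [convict_id, person.findtext("persName"), BASE_URL + convict_id, "", ""]
        )
    return out.getvalue()


if __name__ == "__main__":
    print(ship_sheet(MASTER, "Sample Ship"))
```

Carrying the master record identifier in every row is what would let completed spreadsheet data be matched back to the right convict later.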


Ship summary data in Google Spreadsheets


Drupal can't do everything

• Scale
• Complexity
• Expertise


Where Drupal is appropriate for our project

• Web frontend
• Data capture
• Collaboration, forums


Summary

• Massive XML database with complex relations
• Drupal for capturing slightly complex data and facilitating collaboration
• Google Spreadsheets for capturing tabular data


Questions?

Founders and Survivors

http://foundersandsurvivors.org/

[email protected]

Claudine Chionh

http://www.onefewercar.net/

[email protected]