This slide deck gives an overview of Dutch e-humanities projects that build upon the datasets of the Koninklijke Bibliotheek (KB), the national library of the Netherlands. It focuses on 8 projects that reuse the KB's digitized historical newspapers (1618-1995). It was presented on 7-1-2014 at the Huygens Institute for the History of the Netherlands (Huygens ING for short), an institute of the Royal Netherlands Academy of Arts and Sciences (KNAW) and, with around 100 scholars, the largest humanities institute of the Netherlands. Keywords: biland, delpher, e-humanities, elite network shifts, hirods, historical newspapers, isher, koninklijke bibliotheek, national library of the netherlands, open data, polimedia, political mashup, reuse, sealincmedia, translantis, wahsp
Huygens ING, 07-01-2014
Olaf Janssen, Koninklijke Bibliotheek, National Library of the Netherlands
[email protected] - @ookgezellig - slideshare.net/OlafJanssenNL
Reusing historical newspapers of KB in e-humanities
Case studies and examples of research projects
http://germanics.washington.edu/sites/germanics/files/images/digital_humanities_wordle.p
http://www.kb.nl/sites/default/files/kranten.jpg
This slide deck is optimised for slideshare.net/OlafJanssenNL
What I hope you’ll get out of this talk
Improved understanding of
1. e-humanities research projects using KB historic newspapers
2. reuse potential of KB datasets in e-humanities
Delpher
Newest website of KB & partners
Front-end interface to 5 datasets
Downsides of the Delpher web interface
“Readers [..] are disempowered by machinery that allows them only to choose among options that have been pre-scripted” (Liu *)
Datasets & APIs offer more flexibility
* Marlene Manoff, ‘The Materiality of Digital Collections. Theoretical and Historical Perspectives’, Portal: Libraries and the Academy 6 (2006) 311-325, p 320. http://dspace.mit.edu/bitstream/handle/1721.1/35689/6.3manoff.pdf
Open datasets & APIs of KB
kb.nl/dataservices
These are only the (semi-)open sets
KB has more sets on offer!
KB datasets, Delpher and openness
KB dataset | Abbreviation | Set in Delpher? | Set described on kb.nl/dataservices? | Openness
Early Dutch Books Online | EDBO / DPO | Yes, Boeken Basiscollectie | Yes | Metadata: CC0; Objects: Public Domain
Historic Newspapers 1618-1995 | DDD | Yes, Delpher Kranten | No | Mixed (<1900 Public Domain); available on demand for scientific research
Radiobulletins from ANP | ANP | Yes, Delpher Radiobulletins | Yes | Metadata: CC0; Objects: CC-BY-NC-ND
Proceedings of Parliament 1814-1995 | SGD | No | Yes | Metadata: CC0; Objects: CC0
Dutch Magazines 19th + 20th c | DTS | Yes, Delpher Tijdschriften | No | Mixed (<1900 Public Domain); available on demand for scientific research
Watermarks in Incunabula in the Low Countries | WILC | No (images) | Yes | Metadata: CC0; Objects: CC0
Medieval Illuminated Manuscripts | MVH / Byvanck | No (images) | Yes | Metadata: CC0; Objects: Public Domain
Let’s now look at some e-humanities projects that build upon these 3 sets (with a focus on Historic Newspapers 1618-1995, DDD)
For now I’ll focus on 2 e-humanities projects
1. Polimedia
2. Political Mashup
You can self-study
3. Translantis
4. HiRoDs
5. Elite network shifts
6. ISHER
7. WAHSP | BILAND
8. SEALINCMedia
1. Polimedia - What
• Connects transcripts of Dutch Parliament with media coverage in newspapers and radio bulletins
• Improved analyses of radio & newspaper coverage of political debates
• KB = data supplier
- SGD (1945-1995)
- DDD (1945-1995)
- ANP (1945-1984)
• www.polimedia.nl
1. Polimedia - Who
Built by a team from
• Erasmus University Rotterdam
• VU University Amsterdam
• TU Delft
• Netherlands Institute for Sound and Vision
Search for “lubbers”
Sept 2013: Polimedia wins LinkedUp Challenge
http://www.ewi.tudelft.nl/nl/actueel/laatste-nieuws/artikel/detail/polimedia-wint-internationale-wedstrijd-linkedup-challenge/
http://www.flickr.com/photos/mariekeguy/9786827596/in/set-72157635540561993
Polimedia dataset
• Polimedia dataset is available for researchers to build upon
• data.polimedia.nl
2. Political Mashup - What
• Research programme to make Dutch political data (1814-present) more understandable and accessible
• Creation of rich semantically annotated dataset in XML
For every word ever spoken in Dutch parliament we know
- who said it
- what political party the speaker belonged to
- which role the speaker fulfilled
- when it was said
- to whom it was said
- in which context it was said
• Datasets of KB reused
- Proceedings of Parliament 1814-1995 (SGD)
- Historic newspapers (DDD)
• http://politicalmashup.nl/
• http://politicalmashup.nl/over-political-mashup/
• http://search.politicalmashup.nl/
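The six facts recorded for every utterance can be pictured as one record per speech. The sketch below is only an illustration of that idea in Python: the class name `Speech`, its field names and the helper `words_by_party` are hypothetical and do not reflect the actual Political Mashup XML schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Speech:
    """One utterance with the six annotations listed on the slide.

    All field names are hypothetical; the real dataset is XML with its own schema.
    """
    speaker: str        # who said it
    party: str          # political party the speaker belonged to
    role: str           # role the speaker fulfilled
    spoken_on: date     # when it was said
    addressed_to: str   # to whom it was said
    context: str        # in which context (e.g. a debate topic) it was said
    text: str           # the words themselves


def words_by_party(speeches):
    """Count whitespace-separated words spoken per party."""
    counts = {}
    for speech in speeches:
        counts[speech.party] = counts.get(speech.party, 0) + len(speech.text.split())
    return counts
```

With annotations like these attached to the text, questions such as "how much did each party speak in a given debate" become simple aggregations over the records.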
2. Political Mashup - Who
Team around Maarten Marx (Univ of Amsterdam)
• Historians (Groningen Univ)
• Language technologists (Univ of Tilburg, UoA)
• Computer scientists (UoA)
Datasets & suppliers
• Proceedings of Parliament 1814-1995: Koninklijke Bibliotheek
• Proceedings of Parliament 1995-present: officielebekendmakingen.nl
• Political party + election manifestos, websites of political parties: Centre for Documentation of Dutch Political Parties (DNPP)
• Biographical data political parties & politicians: Centre for Parliamentary Documentation (PDC)
http://ccct.uva.nl/content/project/politicalmashup
Political Mashup & SGD – Polidocs
Polidocs
• Proof-of-concept search engine for Proceedings of Parliament 1984-2008
• Part of Political Mashup programme, built by UoA students
• www.polidocs.nl
• www.polidocs.nl/about.html
Political Mashup & SGD – Ngram viewer
Political ngram viewer
• Frequency of words (or phrases) in Dutch parliament through time
• Inspired by Google ngram viewer
• http://ngram.politicalmashup.nl
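The idea behind an ngram viewer, how often a word or phrase occurs per year relative to all text from that year, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Political Mashup implementation: the function names, the simple `\w+` tokenizer and the `(year, text)` input format are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-word characters (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency_by_year(documents, phrase):
    """Map each year to the share of its n-grams that equal `phrase`.

    `documents` is an iterable of (year, text) pairs; this input format is assumed.
    """
    phrase_tokens = tokenize(phrase)
    n = len(phrase_tokens)
    if n == 0:
        raise ValueError("phrase must contain at least one word")
    target = " ".join(phrase_tokens)

    hits, totals = Counter(), Counter()
    for year, text in documents:
        grams = ngrams(tokenize(text), n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)

    # Skip years whose texts are too short to contain any n-gram of this length.
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}
```

Dividing by the yearly total matters because the amount of text differs strongly from year to year; raw counts would mostly show how much was said, not how prominent a phrase was.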
Oct 2012: Political Mashup wins Dutch Data Prize 2012
http://politicalmashup.nl/2012/10/politicalmashup-wint-nederlandse-dataprijs-2012
“Political Mashup enables every citizen to better understand the political process, both for the distant past and for yesterday, and to effectively hold their elected representatives to account. It thereby not only promotes the reliability (and reproducibility) of political-historical research, but also enables citizens to be truly democratic citizens.”
Political Mashup & SGD - Further reading
http://kb.nl/sites/default/files/docs/political-mashup-casestudy-hergebruik-open-data.pdf
Political Mashup & DDD – Newspaper stats
http://politicalmashup.nl/2013/03/de-omvang-van-het-kb-kranten-archief/
Political Mashup & DDD – Meats in newspapers
http://politicalmashup.nl/2013/03/vleesch-in-de-nederlandse-krant/
Political Mashup & DDD – Ngram viewer
KB historic newspapers ngram viewer
http://kbkranten.politicalmashup.nl
Political Mashup & DDD – Ngram viewer
Marx & Nusselder, 2013
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxkZXZlbG9wbWVudGR1dGNobGFuZ3VhZ2V8Z3g6NzlkMTJlZjE3MDRjZWFiNA
Political Mashup & DDD - Further reading
3. Translantis
• Emergence of the United States in public discourse in the Netherlands
• Analysis of how the USA served as a cultural model for the Netherlands in the period 1890-1990
• Utrecht University, University of Amsterdam, Huygens ING
• Texcavator text mining tool
• KB = provider of newspapers 1890-1990
• http://translantis.wp.hum.uu.nl
Texcavator text-mining tool
• Based on open-source text analytics platform xTAS (University of Amsterdam)
• Used in projects
- Translantis
- Political Mashup (see 2.)
- WAHSP & BILAND (see 7.)
- Infiniti
• http://dev.wahsp.nl/texcavator (login?)
• http://xtas.net/demonstrators ShoShin (KB newspapers)
4. HiRoDs
• HIstorical Roots of the Dutch Sustainability Challenge
• Tracing historical roots of current sustainability problems in NL by analyzing historical data on economic development, flows of materials, energy use etc. from 1850 onward
• KB = provider 19th c. newspaper data
• http://www.nwo.nl/onderzoek-en-resultaten/onderzoeksprojecten/48/2300167248.html
5. Elite network shifts
• Formation, circulation and relocation of elites during regime transitions of 1945–50 and 1998 in the Netherlands Indies and Indonesia, by analyzing digitized newspapers
• KITLV, NIOD, University of Amsterdam, Bandung Institute of Technology, Erasmus University, DANS
• KB = (potential) provider of Indonesian language newspapers 1945-1957
• http://www.kitlv.nl/home/Projects?id=25
• http://www.ehumanities.nl/computational-humanities/elite-network-shifts/
6. ISHER
• Integrated Social History Environment for Research
• Detect & associate events, trends, people and organisations related to social unrest (e.g. strikes) in historical newspapers
• KB = provider of newspaper data
• http://www.nactem.ac.uk/DID-ISHER
• http://www.diggingintodata.org/LinkClick.aspx?fileticket=3XQTLdzggoo%3D&tabid=196
7. WAHSP | BILAND
• WAHSP: web application for sentiment mining in historical public media (newspapers, magazines and radio bulletins)
• Example: researching public sentiment around drugs using Dutch newspapers 1900-1945
• Texcavator text mining tool
• BILAND: extends the tool for bilingual research by also including German newspapers
• KB = provider of newspaper data
• http://www.biland.nl
• http://dev.wahsp.nl/texcavator (login?)
8. SEALINCMedia
• Socially Enriched Access to LINked Cultural Media
• Modeling and evaluating (social, web2.0) human input for multimedia content access, with the aim to integrate it in automatic data analysis
• Public-private partnership: 3 universities, 1 scientific institute, 4 technology companies, 5 heritage institutions (data providers)
• KB = provider of newspaper data
• http://www.commit-nl.nl/projects/socially-enriched-access-to-linked-cultural-media
• http://sealincmedia.wordpress.com/
THANKS!
[email protected] @ookgezellig slideshare.net/OlafJanssenNL