The challenge of making digitised European newspaper content available online

Susan Reilly, LIBER

Twitter: @skreilly

IFLA Newspapers, Singapore, Aug 2013

The challenges of making Europe's newspapers available online

DESCRIPTION

Presentation from WLIC 2013. Reports on a survey, conducted by the Europeana Newspapers project, of digitised newspaper collections in LIBER (European research) libraries.

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp

Overview

Europeana Newspapers: making European newspapers available online

Assessing the state of our digitised newspapers collections

Where do we go from here

Europeana Newspapers: making European newspapers available online

• Content from 20 countries! (13+7 new countries)

• Aggregation of more than 18 million newspapers into Europeana

• Make newspapers more accessible by applying refinement methods for OCR, OLR (article segmentation), named entity recognition (NER) and class recognition

• Increase visibility via dedicated content browser

• Ensure sustainability by spreading best practice

Assessing the state of Europe's digitised newspaper collections

• Who’s digitising newspapers?

• What percentage of newspaper collections are digitised?

• How many pages?

• Quality of digitisation?

• How are images made available?

Findings: % of newspaper collections digitised

• Survey of LIBER members (400 European research libraries)

• 47 responses

  • Does this indicate the number of institutions digitising newspapers?

• Less than 10% of respondents’ collections digitised

  • Compared to an average of 20% of total collections digitised (Enumerate)

• 130 million pages and 24,000 titles

  • Not all libraries could provide exact figures because of the cursory nature of the catalogue
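
To put the 47 responses in context, the response rate follows directly from the two figures on this slide (400 members surveyed, 47 responses). A minimal sketch of that arithmetic; the variable names are illustrative only:

```python
# Figures taken from the slide: 400 LIBER member libraries, 47 responses.
members_surveyed = 400
responses = 47

# Share of surveyed libraries that answered the questionnaire.
response_rate = responses / members_surveyed

print(f"Response rate: {response_rate:.2%}")
```

A low response rate is one reason the slide asks whether the number of responses reflects the number of institutions actually digitising newspapers.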

Findings: 20th century content an issue

• Conservative approach to copyright terms

• ½ of respondents reported a cut-off date beyond which they do not make content available

  • Earliest: 1863

  • Latest: the last 70 years

• Special arrangements with publishers (23%)

• Collective rights agreements too complex

Findings: How accessible are the collections?

• 85% provide free access

  • Sometimes only at national level

• Some subscription fees/under licence

Findings: How rich is the content?

• 36% employ no OCR

• 50% of those who did use OCR are not confident enough in the results to expose OCR’d text via a search interface

• 36% zoning and segmentation

• Only 6% named entity recognition

• Huge variance in metadata

  • Dublin Core only

  • Own standards

Challenges

• Newspaper digitisation is behind

• Copyright issues more complex

• Lack of quality evaluation technologies for OCR

• Lack of standardised metadata suited to newspapers
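
As an illustration of what evaluating OCR quality can involve: where a manually corrected reference transcription exists, one widely used measure is the character error rate (CER), the edit distance between the OCR output and the reference divided by the length of the reference. The sketch below is not a method described in the survey or the project; the function names and sample strings are hypothetical and serve only to show the idea.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning reference into hypothesis."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: a typical OCR confusion of "h"/"b" and "m"/"rn".
ground_truth = "The Times, London"
ocr_output = "Tbe Tirnes, London"
cer = character_error_rate(ground_truth, ocr_output)
print(f"CER: {cer:.3f}")
```

A measure like this depends on having ground-truth text for a sample of pages, which is itself costly to produce for large newspaper collections.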

Solutions

• Standardised metadata mapped to EDM

• Quality evaluation technologies for OCR

• Clarity over rights issues

• Dialogue with publishers

• More funding for digitisation

• Increase visibility

Thank you for your attention!

http://www.libereurope.eu

http://www.europeana-newspapers.eu/