Bibliography in the Digital Age - IFLA Satellite Meeting Warsaw, 9 August 2012

Online materials published in Austria: collecting, archiving and metadata


Topics

• Austrian National Library
• Legal situation in Austria
• Web@rchive Austria
• E-pubs
• Metadata
• Plans


Austrian National Library

• Largest scientific library in Austria
• Entitled “National Library” since 1920, with a history dating back to the 14th century
• About 350 employees
• Holdings: about 8 million objects


Legal situation in Austria

• Austrian Media Act 1981 included only printed works
• Amendment from 2000 included digital offline media (such as CD-ROM, DVD or diskette) in legal deposit legislation
• Amendment from 2009 mandates the Austrian National Library to collect and archive online media


Web@rchive Austria (1)

• Web archiving started in 2008
• Access started in May 2010
• Storage and back-up outsourced to Austrian Federal Computing Centre


Web@rchive Austria (2)

• Staff
  – 2 FTE, department Digital Library
• Hardware
  – 8 crawlers
• Software (open source only)
  – crawler Heritrix
  – crawl management NetarchiveSuite
  – access with Wayback Machine


Web@rchive Austria (3)

• Strategies: combination of
  – domain harvesting
  – selective harvesting
  – event harvesting


Web@rchive Austria (4)

• Domain harvesting
  – started in 2009
  – top-level domain .at
  – other domains with content related to Austria
  – harvested up to 100 MB
  – crawling every 2 years
  – currently second crawl finished
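The domain-harvesting rules above can be sketched as a scope check plus a byte budget. This is only an illustration in Python, not the Heritrix or NetarchiveSuite configuration used by the library. The names `in_scope`, `DomainBudget` and `EXTRA_DOMAINS` are hypothetical, and reading the 100 MB limit as a per-domain cap is an assumption.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit

# Assumption: the 100 MB limit is applied per harvested domain.
MAX_BYTES_PER_DOMAIN = 100 * 1024 * 1024

# Hypothetical hand-maintained list of non-.at domains with Austria-related content.
EXTRA_DOMAINS = {"example.org"}


def in_scope(url: str) -> bool:
    """Return True if the URL's host is under .at or under a listed extra domain."""
    host = (urlsplit(url).hostname or "").lower()
    if host.endswith(".at"):
        return True
    return any(host == d or host.endswith("." + d) for d in EXTRA_DOMAINS)


@dataclass
class DomainBudget:
    """Tracks how many bytes have been harvested for one domain."""
    used: int = 0

    def accept(self, size: int) -> bool:
        """Record a download if it still fits into the budget, else refuse it."""
        if self.used + size > MAX_BYTES_PER_DOMAIN:
            return False
        self.used += size
        return True
```

A crawler loop would call `in_scope` before queueing a URL and `accept` before storing a response; the two-year crawl interval would be handled by the scheduler and is not modelled here.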


Web@rchive Austria (5)

• Selective harvesting
  – started in 2010
  – highly dynamic, constantly changing sites
  – harvested in specific intervals
    • Media collection (ongoing)
    • Literature collection (starting)
  – harvested up to level 5 to 7
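Harvesting "up to level 5 to 7" limits how many links deep the crawler follows from a seed page. A minimal Python sketch of such a depth limit, using an in-memory link graph instead of real HTTP fetching; `crawl_to_depth` and the graph format are hypothetical, and the seed is assumed to count as level 0.

```python
from collections import deque


def crawl_to_depth(links, seed, max_level):
    """Breadth-first traversal that stops following links beyond max_level.

    links: dict mapping a page to the list of pages it links to.
    Returns a dict of page -> level at which the page was first reached.
    """
    levels = {seed: 0}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if levels[page] == max_level:
            continue  # pages at the deepest level are kept, but their links are not followed
        for target in links.get(page, []):
            if target not in levels:
                levels[target] = levels[page] + 1
                queue.append(target)
    return levels
```

For example, with `links = {"a": ["b"], "b": ["c"], "c": ["d"]}` and `max_level=2`, pages `a`, `b` and `c` are harvested and `d` is not.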


Web@rchive Austria (6)

• Event harvesting
  – started in 2008
  – mostly temporary sites
  – collected at special events/occasions
    • elections (Austrian and European elections 2008)
    • sport events (EURO 2008)
  – harvested up to level 5 to 7


Web@rchive Austria (7)

• Access
  – on site at the Austrian National Library (special terminals)
  – and at 20 other authorized libraries (National Archives, Parliament, state and university libraries)
  – open for everybody, not only researchers
  – restricted to reading and printing; download and forwarding are prohibited

See screencasts under http://www.screenr.com/user/AT_Webarchive


E-pubs (1)

• Staff
  – 2 FTE
• Hardware
  – 1 server
• Software
  – repository software DigiTool


E-pubs (2)

• Strategies
  – e-books, e-journals, e-dissertations
  – “e-only” publications
  – information from publisher or inquiry by Austrian National Library
  – contracts with libraries concerning e-dissertations
  – collecting manually (now) and automatically (planned)
  – format preferred: PDF/A and XML


E-pubs (3)

• Workflow
  – information by publisher/inquiry through ANL
  – identification and check (“e-only”)
  – definition of rights
  – collecting
    • retrieval
    • decision on document type
    • storage on server Scirius
  – 1st release


E-pubs (4)

• Workflow (continued)
  – descriptive cataloguing of the resource in the library system Aleph (record is marked as data set for the national bibliography and will be published in the next issue)
  – assignment of the Aleph system number to the document in DigiTool
  – 2nd release
  – subject indexing
  – 3rd release
  – ingest in DigiTool
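The e-pubs workflow is a fixed sequence of steps with three release points. One way to picture it is as an ordered list with a function that returns the next step. This is a Python sketch only; the step identifiers are hypothetical paraphrases of the workflow slides, not names from Aleph or DigiTool.

```python
from typing import Optional

# Step names paraphrase the workflow slides; the identifiers are hypothetical.
WORKFLOW = [
    "information_or_inquiry",
    "identification_and_check",
    "definition_of_rights",
    "collecting",
    "release_1",
    "descriptive_cataloguing",
    "assign_aleph_system_number",
    "release_2",
    "subject_indexing",
    "release_3",
    "ingest_in_digitool",
]


def next_step(current: Optional[str]) -> Optional[str]:
    """Return the step after `current`, the first step for None, or None when done."""
    if current is None:
        return WORKFLOW[0]
    index = WORKFLOW.index(current)  # raises ValueError for an unknown step
    return WORKFLOW[index + 1] if index + 1 < len(WORKFLOW) else None
```

Keeping the releases as explicit steps makes it visible that a document cannot reach ingest without passing all three.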


E-pubs (5)

• Access
  – on site at the Austrian National Library (special terminals)
  – with consent of the publisher also remote access
  – open for everybody, not only researchers
  – restricted to reading and printing; download and forwarding are prohibited
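The access rules for e-pubs amount to a small policy check. A hedged Python sketch follows; the function name `may_access` and the action labels are hypothetical, and the real enforcement happens in the repository and terminal setup, which the slides do not describe.

```python
ALLOWED_ACTIONS = {"read", "print"}  # download and forwarding are prohibited


def may_access(action: str, on_site: bool, publisher_consent: bool = False) -> bool:
    """Decide whether an action on an e-pub is permitted under the rules above."""
    if action not in ALLOWED_ACTIONS:
        return False
    # Remote access needs the publisher's consent; on-site terminals do not.
    return on_site or publisher_consent
```

For instance, `may_access("print", on_site=False, publisher_consent=True)` is permitted, while `may_access("download", on_site=True)` is not.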


Metadata for online materials

• Metadata
  – basically provide data for
    • description
    • organisation
    • exchange
  – in the case of digital objects, not only descriptive but also structural and administrative data are important for access and management


Metadata Web@rchive (1)

– no descriptive metadata
– only structural and administrative metadata


Metadata Web@rchive (2)

• Structural metadata
  – information on structure of digital objects
  – defines relations of multipart objects
  – METS is used at Austrian National Library
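To illustrate how METS expresses the relations of a multipart object, the following Python sketch builds a minimal METS tree with a file section and a structural map. It is not the METS profile used by the Austrian National Library; the function name, IDs, the `TYPE="physical"` value and the file names are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def build_mets(object_label, file_names):
    """Build a minimal METS tree: one <file> per part, one structMap <div> per part."""
    root = ET.Element(f"{{{METS}}}mets")
    file_sec = ET.SubElement(root, f"{{{METS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp")
    struct_map = ET.SubElement(root, f"{{{METS}}}structMap", TYPE="physical")
    top = ET.SubElement(struct_map, f"{{{METS}}}div", LABEL=object_label)
    for order, name in enumerate(file_names, start=1):
        file_id = f"FILE{order}"  # hypothetical ID scheme
        file_el = ET.SubElement(file_grp, f"{{{METS}}}file", ID=file_id)
        ET.SubElement(file_el, f"{{{METS}}}FLocat",
                      {"LOCTYPE": "URL", f"{{{XLINK}}}href": name})
        # Each part of the object points back to its file via FILEID.
        part = ET.SubElement(top, f"{{{METS}}}div", ORDER=str(order))
        ET.SubElement(part, f"{{{METS}}}fptr", FILEID=file_id)
    return root
```

Calling `ET.tostring(build_mets("Report", ["part1.pdf", "part2.pdf"]), encoding="unicode")` serialises the tree; the `structMap` is what records the order and grouping of the parts.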


Metadata Web@rchive (3)

• Administrative metadata
  – technical metadata
  – rights metadata
  – preservation metadata
  – identification metadata


Metadata Web@rchive (4)

• Technical metadata
  – information on technical characteristics, format etc. of an object
  – different according to the format of the object, e.g. Z39.87/MIX for images


Metadata Web@rchive (5)

• Rights metadata
  – information on copyright, access rights
  – a self-defined data model is used by the Austrian National Library


Metadata Web@rchive (6)

• Preservation metadata
  – information on the necessary technical environment (hardware and software)
  – PREMIS is used for electronic offline publications at the Austrian National Library


Metadata Web@rchive (7)

• Identification metadata
  – persistent identifiers to ensure stable access
  – the Austrian National Library will use URNs


Metadata e-pubs (1)

• Structural metadata
• Administrative metadata
  – Technical metadata (generated automatically at time of ingesting)
  – Rights metadata
  – Identification metadata (PID generated at time of ingesting, URN is planned)
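Generating technical and identification metadata at ingest time could look roughly like the Python sketch below. It is an assumption-laden illustration, not DigiTool's behaviour: the counter-based PID, the choice of file size and SHA-256 checksum as technical metadata, and all names are hypothetical.

```python
import hashlib
import itertools
from dataclasses import dataclass

# Hypothetical PID source; in practice the repository assigns the PID.
_pid_counter = itertools.count(1)


@dataclass
class IngestMetadata:
    pid: int          # identification metadata
    size_bytes: int   # technical metadata (assumed example)
    sha256: str       # technical metadata (assumed example)
    rights: str       # rights metadata, supplied by staff


def ingest(content: bytes, rights: str) -> IngestMetadata:
    """Derive technical metadata from the bytes and assign a PID at ingest time."""
    return IngestMetadata(
        pid=next(_pid_counter),
        size_bytes=len(content),
        sha256=hashlib.sha256(content).hexdigest(),
        rights=rights,
    )
```

The point of the sketch is the split of responsibilities: technical and identification values are computed automatically, while rights information has to be provided from the earlier "definition of rights" step.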


Metadata e-pubs (2)

• Descriptive metadata
  – created by cataloguing staff of Austrian National Library in Aleph
  – MAB data is transferred from Aleph to DigiTool (system number via port) and stored as Dublin Core Qualified
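The transfer from MAB to qualified Dublin Core is essentially a field crosswalk keyed by the Aleph system number. The Python sketch below shows the idea; the MAB tags and target terms (331 to title, 100 to creator, 412 to publisher, 425 to issued) are given for illustration only, and the actual mapping used between Aleph and DigiTool is not described in the slides. The function name, the flat record format and the sample system number are hypothetical.

```python
# Illustrative crosswalk; not the library's actual mapping.
MAB_TO_DC = {
    "331": "dcterms:title",
    "100": "dcterms:creator",
    "412": "dcterms:publisher",
    "425": "dcterms:issued",
}


def mab_to_dc(mab_record, system_number):
    """Map a flat {MAB tag: value} record to qualified Dublin Core terms."""
    dc = {"dcterms:identifier": system_number}
    for tag, value in mab_record.items():
        term = MAB_TO_DC.get(tag)
        if term is not None:  # unmapped fields are dropped in this sketch
            dc[term] = value
    return dc
```

A record such as `{"331": "Beispieltitel", "100": "Muster, Anna"}` with system number `"AC00000000"` yields a dictionary with `dcterms:identifier`, `dcterms:title` and `dcterms:creator`.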


Perspectives, plans, challenges

• pressure of time
  – highly dynamic websites
  – “collect now, ask later”
  – retrievability
• building more collections
  – literature collection just starting
  – collection of Austria(n) libraries in and outside Austria
• automation of collection for e-pubs
• preservation of web archives (partner institution of project SCAPE)
• one-stop shop for all digital resources


Thank you for your attention!

[email protected]

Digital reading room Austrian National Library: http://www.onb.ac.at/bibliothek/digitaler_lesesaal.htm
Nominate websites: http://www.onb.ac.at/FormsGen/form.jsp?formID=1
Follow us: http://twitter.com/AT_WebArchive