VanDyck Long-Term Preservation of Digital Scholarly Literature

Long-term Preservation of Digital Scholarly Literature

Craig Van Dyck
NISO-NFAIS Virtual Conference: Making Certain Digital Content is Preserved

7 December 2016

Why Preservation Matters

• End users
• Libraries
• Publishers
• Grant funders
• Research institutes

Stakeholders

• Scholars rely on permanent access to digital materials
• The scholarly literature is long-lived
• Libraries as the stewards of preservation
• Libraries may not own copies of the digital literature
• Publisher-provided access can be unstable

Stakeholders, cont’d

• Funders want the output from their funding to remain available
• Research institutes need their faculty to have access to materials, and need to be sure that their faculty’s output will be accessible

How CLOCKSS Works

• Introduction
• Technology; LOCKSS
• Processes
• Governance
• Statistics
• Triggers
• Challenges
• Priorities

CLOCKSS: Controlled LOCKSS (Lots of Copies Keep Stuff Safe)

• Began operations in 2006
• Ensuring long-term access to scholarly literature for researchers
• A diverse, robust ecosystem of digital preservation solutions
• CLOCKSS preserves and archives on behalf of libraries
• Libraries have insisted that publishers archive their content

CLOCKSS -- Technology

• CLOCKSS uses the open source LOCKSS technology, with 12 library server nodes:
  - NA: Indiana, OCLC, Rice, Stanford, Virginia, Alberta
  - Europe: Edinburgh, Humboldt/Germany, Universita Cattolica/Italy
  - APac: Hong Kong U, NII/Japan, Australia National U

CLOCKSS -- Technology, cont’d

• CLOCKSS is certified as a Trusted Digital Repository by the Center for Research Libraries
• TRAC audit perfect score for technology; see David Rosenthal blog: http://blog.dshr.org/2014/07/trac-certification-of-clockss-archive.html

A word about LOCKSS

• From the Stanford University Library
• Unique technology solution: multiple servers constantly cross-checking each other, ensuring the preserved data is valid
• Many instances:
  - Global LOCKSS Network: 150 nodes, each with their own collection; post-cancellation access
  - 14 Private LOCKSS Networks, e.g. CLOCKSS, Public Knowledge Project, Canadian Government Information, CARINIANA (Brazil), ADPN, US government documents
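
The cross-checking idea on this slide can be illustrated with a toy sketch. This is not the actual LOCKSS polling protocol; it only assumes that each node holds a copy of the same item, that copies are compared by SHA-256 digest, and that a node disagreeing with the majority is repaired from a node in the majority. The node names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one node's copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Compare all copies, overwrite minority copies with the majority copy.

    Hypothetical illustration only: real LOCKSS polling is more elaborate.
    Returns the names of the nodes that were repaired.
    """
    digests = {node: digest(data) for node, data in copies.items()}
    winning_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) / 2:
        # Without a strict majority there is no copy we can trust.
        raise ValueError("no majority agreement among nodes")

    # Pick any node that holds the majority version as the repair source.
    good_node = next(n for n, d in digests.items() if d == winning_digest)
    repaired = []
    for node, node_digest in digests.items():
        if node_digest != winning_digest:
            copies[node] = copies[good_node]
            repaired.append(node)
    return repaired


# Three hypothetical nodes; node_c holds a damaged copy.
copies = {
    "node_a": b"preserved article",
    "node_b": b"preserved article",
    "node_c": b"preserved artic1e",
}
repaired_nodes = audit_and_repair(copies)
```

After the call, `repaired_nodes` names the damaged node and all three copies are identical again. The more independent copies there are, the harder it is for damage on a few nodes to outvote the intact majority.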

CLOCKSS -- Processes

• Content submission via file transfer or web harvest:
  - https://www.clockss.org/clocksswiki/files/File_Transfer_Guidelines_-_CLOCKSS.pdf
  - https://www.clockss.org/clocksswiki/files/Web_Harvest_Guidelines_-_CLOCKSS.pdf
• Web harvest is particularly useful with “long tail” publishers

CLOCKSS -- Governance

• CLOCKSS is a “dark” archive
• Triggered content is made available as open access
• What does “trigger” mean?
  - When digital content ceases to be available to end users
  - Access must be ensured, to support scholarship

CLOCKSS -- Governance, cont’d

• Community governance: equal number of libraries and publishers on the Board of Directors
• Funded by publisher fees and voluntary library contributions
• Free-standing 501(c)(3) non-profit
• Financially stable

CLOCKSS -- Statistics

• 200 publisher participants, 750 library supporters
• 15 million journal articles and books, adding ~4 million/year
• 5 largest publishers = 70% of the content
• “Long tail” publishers = 65% of the publishers

CLOCKSS -- Triggering Content for Access

• Rigorous rules and practices
• Bylaws require 75% Board vote in favor, with no more than 2 voting against a trigger
• 29 triggered journals; 1 million downloads this year
• Triggered journals are open access via CLOCKSS, at Stanford and Edinburgh
• Creative Commons Attribution-Noncommercial-No Derivative Works License
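
The voting rule above can be written as a small check. This is a sketch under stated assumptions: the slide does not say whether the 75% is measured against the full Board or against votes cast, so the full Board is assumed here, and the board size of 12 in the examples is hypothetical. The function name `trigger_approved` is also an illustration, not anything from the CLOCKSS bylaws.

```python
def trigger_approved(votes_for: int, votes_against: int, board_size: int) -> bool:
    """Return True if a trigger vote passes under the rule on the slide.

    Assumptions (not confirmed by the source): the 75% threshold is
    measured against the full board, and abstentions count as neither
    for nor against.
    """
    if votes_for < 0 or votes_against < 0:
        raise ValueError("vote counts cannot be negative")
    if votes_for + votes_against > board_size:
        raise ValueError("more votes cast than board members")

    # Integer arithmetic avoids floating-point edge cases at exactly 75%.
    meets_supermajority = votes_for * 4 >= board_size * 3
    within_dissent_limit = votes_against <= 2
    return meets_supermajority and within_dissent_limit


# Hypothetical 12-member board.
passes = trigger_approved(votes_for=10, votes_against=2, board_size=12)
too_many_against = trigger_approved(votes_for=9, votes_against=3, board_size=12)
too_few_for = trigger_approved(votes_for=8, votes_against=1, board_size=12)
```

Both conditions must hold together: the second example reaches exactly 75% in favor but fails because three members vote against.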

Challenges: Two Asks

1. Preserving the “Long Tail”:
   - Long tail journals are the most at-risk, and the hardest to find and work with
   - We need libraries’ priorities for what to archive that is not yet archived

2. Financial support for a diverse and robust digital preservation environment

CLOCKSS -- 2017 Priorities

• Investing in hardware and software: capacity, timeliness
• Adding more large backfiles
• New content types, e.g. datasets, video, databases
• Stronger transparency
• Increased outreach

CLOCKSS, concluded

CLOCKSS holdings are publicly reported in the Keepers Registry: https://thekeepers.org/

https://clockss.org
cvandyck@clockss.org
973-600-7397
