P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Presentation by Peter Doorn (Data Archiving and Networked Services, DANS) given at the infoclio.ch conference in Bern on 16 September 2010.

Text of P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

  • Building national and international data infrastructures for humanities research

    Peter Doorn - director, Data Archiving and Networked Services (DANS); co-ordinator, Preparing DARIAH (Digital Research Infrastructure for the Arts and Humanities)

    Presentation for Digitale Forschungsinfrastrukturen für die Geschichtswissenschaften, Bern, 16 September 2010

    Driven by data

  • Contents:

    1. What is a data/research infrastructure?
    2. The changing needs of the researchers
    3. Setting up data infrastructures in the Netherlands, 1964-2010
    4. The next steps
    5. DARIAH and other international initiatives

  • What is a data/research infrastructure?

    In the natural sciences: something concrete, something physical....

    A building, a telescope, a particle accelerator, a nuclear icebreaker....

  • Research Infrastructures (R.I.)

    • R.I. in general: permanent and physical
    • R.I. for the arts and humanities?
    • Cultural heritage in all forms is the main source of humanities research
    • Libraries, archives and museums are the traditional laboratories for the humanities
    • In the digital age, essential for innovative humanities research is:
      • Access to digitised heritage data (databases, text corpora, speech, image collections, etc.)
      • Tools to process this information
    • The most important new research infrastructure for the humanities is therefore a digital one

  • What kind of infrastructure do humanities scholars need?

  • From Humanities computing to e-humanities

    Roots go back to the 1960s:

    • text analysis, e.g. bible studies
    • quantitative social and economic history
    • computer linguistics
    • digital archaeology

    E-humanities as analogy of e-science: science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, large-scale computing resources and high performance visualisation.

  • 3. Setting up data infrastructures in the Netherlands, 1964-2010

  • 1989: Netherlands Historical Data Archive

    • Initiative by Low Countries Association for History and Computing
    • Started with feasibility study, followed by inventory of databases and a pilot
    • Until 1995 on a project basis, supported with digitization projects
    • Organizational form flexible

  • 2004: Electronic Depot of Dutch Archaeology

    • Idea came up at a conference of historians and archaeologists in 2003
    • 1980: computer used during excavation
    • Initiative by university archaeologists, data archive and state archaeological service
    • Started as a series of projects, since 2005 hosted by DANS

  • DANS created in 2005

    • Merger of earlier existing data infrastructures
    • Serving humanities and social sciences
    • Mission: providing permanent access to research data
    • Funded by Academy of Arts and Sciences (KNAW) and Dutch Organisation for Research (NWO)
    • Budget: 2.5 M + 1 M projects
    • Staff grew from 10 to almost 40 (including projects)

  • What do we do?

    • Archive data and provide access
    • Data projects in connection with researchers
    • Data Seal of Approval
    • Persistent Identifiers
    • Symposia and Publications
    • Subsidize small data projects

    Three sections: Archive, Infrastructure, Software Development

  • Datasets according to disciplines

  • www.dans.knaw.nl

  • Electronic Archiving System for searching and depositing data

  • Dataset description

  • Download data after login

    • Data
    • Documentation
    • Publications

  • Download statistics visible to all registered users

  • Digitization Population Censuses

  • www.volkstellingen.nl

  • Spreadsheets are look-alikes of original published tables

  • Mapping the census data for the Dutch municipalities

  • Shipping in the Golden Age

  • Journal entries, 26-29 September 1758

    • Ship's name: Noordbeveland
    • Month: September
    • Year: 1758
    • Day: Tuesday
    • Date: 26th
    • Weather on board
    • Wind
    • Peculiarities

  • Dutch Shipping Routes 1750-1850

    Courtesy of CLIWOC project, KNMI

  • The research data:

    • can be found on the Internet
    • are accessible (clear rights and licenses)
    • are in a usable format
    • are reliable
    • can be referred to (persistent identifier)

    www.datasealofapproval.org

    5 criteria, 16 guidelines

  • 4. The next steps

    Broaden DANS into a discipline-independent data organisation.

    Many DANS activities are independent of discipline:

    • Data Quality Guidelines: Data Seal of Approval
    • Resolver for Persistent Identifiers
    • Selection criteria for data preservation
    • Deposit and Access Licenses, Intellectual Property Rights, Privacy
    • Standards (archival file formats, metadata)
    • Storage, conversion, backup, documentation services

  • New DANS strategy

    • In line with National Coalition for Digital Preservation
    • Build bridges between e-science and digital humanities
    • Connect to other data infrastructures and initiatives in the technical, natural and life sciences
    • Step by step approach
    • Many large-scale facilities on the National Roadmap have a data function


  • 5. DARIAH and other international initiatives

  • European infrastructure challenges

    • In spite of some achievements, existing research infrastructures are primarily national... if they are there at all!
    • European activities are until now funded on a project basis and carried out as voluntary activities by national partners
    • Stable, pan-European research infrastructures for the arts and humanities hardly exist
    • Increasing internationalisation of humanities research puts new requirements for such infrastructures
    • DARIAH is the only ESFRI proposal for the arts and humanities


  • Science Case for DARIAH

    • Changing research practice in a networked world:
      • Digital resources (data & tools) form the laboratory of the scholar in the arts and humanities
      • Computational technologies and methods of analysis
      • Resources on the web are highly distributed
      • The scale of research goes up: networked projects
    • European projects have no continuity
    • The existing structures are too weak (ad hoc networks, no permanence) and national in scope
    • Answer: strong European data infrastructure, providing continuity and support for digital A&H research and access to digital resources


  • DARIAH Mission

    The mission of DARIAH is to enhance and support digitally enabled research across the humanities and arts. DARIAH aims to develop and maintain an infrastructure in support of ICT-based research practices, working with communities of practice to:

    • Explore and apply ICT-based methods and tools to enable new research questions to be asked and old questions to be answered in new ways
    • Link and provide access to distributed digital source materials of many kinds
    • Exchange knowledge, expertise, methodologies and practices across domains and disciplines


  • DARIAH Partners

    • 14 members in 10 countries: Croatia, Cyprus, Denmark, France, Germany (2), Greece (2), Ireland, Netherlands, Slovenia, United Kingdom (3)
    • Associate members: Italy, Spain, Sweden
    • Aspiring partners: Austria, Switzerland
    • Other prospective partners in: Bulgaria, FYROM (Macedonia), Hungary, Lithuania, Norway, Serbia, Romania



  • Preparation Project: Overview of the Work Packages

    • Project management
    • Dissemination
    • Strategic work
    • Financial work
    • Governance and logistical work
    • Legal work
    • Technical reference architecture
    • Technical: Conceptual modelling


  • Preparing DARIAH: time schedule

    • October 2006: Publication ESFRI Roadmap
    • December 2006: Publication relevant FP7 call
    • May 2007: Deadline Capacities call ESFRI projects
    • Q3 2008: Agreement EC funding
    • Q4 2008: Start Preparing DARIAH
    • Q4 2009: Funders meeting
    • Q3 2010: DARIAH conference
    • Q1 2011: Start construction DARIAH
    • Financial commitment?


  • DARIAH Virtual Competency Centers (Hubs)

    • Research & Education: supporting research groups and centres in the 'digital humanities'; knowledge exchange and education, postgraduate programmes and researcher exchange
    • e-Infrastructure: service provision, systems & tools, connecting resources
    • Advocacy & Promotion (& Management): PR, encourage collaboration, community building, website, administration, demonstrate value and impact
    • Content & Legal: supporting scholarly data creation, access, curation and preservation; rights management, IPR licences, quality assurance


  • The VCC concept

  • DARIAH Governance and Costs in Construction Phase

    Governance structure of ERIC

  • Relations to other projects and networks

  • 19-21 October, Vienna: SDH

    DARIAH-CLARIN conference

    www.dariah.eu