ARIADNE is funded by the European Commission's Seventh Framework Programme
ARIADNE Integration Framework
Achille Felicetti
VAST-LAB - PIN, Università degli Studi di Firenze, Italy
Winter School Overview
1. The ARIADNE Infrastructure
2. Integrated registration and resource descriptions
3. Semantic integration of archaeological information
4. Mapping strategies
5. Mapping and conversion tools
What is ARIADNE
• ARIADNE is a Research Infrastructure aiming at the integration of archaeological datasets in Europe (and beyond)
• Four years’ duration
• Starting 1st February 2012
• 24+ partners
• Coordinated by PIN-University of Florence (IT)
Why ARIADNE
• Huge amount of archaeological data available in digital format
• Large number of non-communicating archaeological datasets
• Increasing interest of the research community in data sharing
• Social pressure for opening data vaults
Digital documentation
• Museum information • Library information • Images • 3D models
• RDBMS • GIS • XML • CSV • Excel • Unstructured files
Integration in ARIADNE
• Creation of an integrated ecosystem of archaeological information
• To guarantee interoperability among data coming from different archives
• To use data as if they were stored in a single archive
  – Single access point
  – Uniform interfaces
• To ensure retrieval of information in a coherent and meaningful way
  – Semantics
How to achieve integration
Data sharing requires
• Suitability of somebody else’s data for one’s purposes
• Interoperability of datasets
• Trust in data collected by others
• Guarantee of data “provenance”
• Common understanding of meanings
Project activities
• Networking activities
  – Community building: involving additional institutions in sharing data and jointly establishing guidelines
  – Standardization and good practices
• Trans-National Access to shared datasets and training in their creation, as well as to on-line repositories
  – Support for digitization and data organization
• Research activities
  – Knowledge organization
  – Data management
  – New or improved tools to extract information
  – Advances in methodology
Integration road map
– Digitisation of information on paper
– Online availability
  • Microsoft Word, Excel, Access, PDF, GIS …
– Online accessibility improvement
  • ADS, ARACHNE, ZENON, FastiOnline, …
– Consistency checking for mapping and information extraction
  • Descriptive information to ACDM (Registry)
  • Legacy data to CIDOC CRM
  • Legacy thesauri to SKOS
Archaeological resources registration
• Registration
  – Datasets inventory
  – Services inventory
• Description
  – ACDM model for describing datasets and services
• Ingestion into Registry
• Data Enrichment Policies
• Integration Strategies
Describing archaeological data
ARIADNE Registry (http://ariadne-registry.dcu.gr)
• Web interface for describing datasets and services:
  – http://ariadne-registry.dcu.gr/index.php?p=web
• XML file
  – http://ariadne-registry.dcu.gr/index.php?p=xml
• Excel templates
  – http://ariadne-registry.dcu.gr/index.php?p=excel
• Database export to RDF according to the ACDM record
  – http://ariadne-registry.dcu.gr/index.php?p=tools
ARIADNE Catalogue Data Model (ACDM)
• A model for resource description
  – To describe archaeological resources made available by partners for discovery, access and integration
  – Based on existing standards
    • DCAT -> Data resources
    • ISO/IEC 11179 -> Language resources
    • DBPedia -> Services
ARIADNE Catalogue Data Model (ACDM)
• ARIADNE resources
  – Archaeological datasets from 20+ countries
  – 24 languages
  – 1,800,000+ records
  – 50,000+ grey literature items
• ARIADNE information types
  – Archaeological excavations
  – Monuments and sites
  – Scientific analysis
ARIADNE Catalogue Data Model (ACDM)
• ARIADNE digital resource types
  – DBMS -> PostgreSQL, MySQL, Microsoft Access, …
  – Datasets -> Repositories of digital objects with the same structure
  – Collections -> Sets of text files/images in hierarchical systems
  – Multimedia -> 3D models, images, videos
  – GIS
  – Metadata and vocabularies
ARIADNE Services
• Web Services
  – Visual Media Service: easy publication and presentation on the web of complex visual media assets (http://visual.ariadne-infrastructure.eu/)
  – Landscape Service: large terrain dataset generation, 3D landscape composing and 3D model processing (http://landscape.ariadne-infrastructure.eu/)
  – DANS Dendrochronology Service: http://dendro.dans.knaw.nl/
  – DAI Vocabularies: http://archwort.dainst.org/thesaurus/de/vocab
  – DAI Gazetteer: http://gazetteer.dainst.org/
  – Vocabulary Matching Tool: http://heritagedata.org/vocabularyMatchingTool
• Stand-alone Services
  – MeshLab: open source, portable, and extensible system for the processing and editing of unstructured 3D triangular meshes (http://meshlab.sourceforge.net/)
Visual Media Service
• For each visual media asset:
  – The user fills in a simple form and uploads the data file
    • 3D models
    • High-resolution 2D images
    • Reflectance Transformation Imaging (RTI) images
  – An automatic service on the ARIADNE server opens the file, converts it and transforms it into a browsable page
    • Multi-resolution encoding
    • Progressive transmission
    • Immediate visualization
  – A URL (or a .zip file) is created and sent to the user
  – The user links the media asset in their archive using the provided URL
Media Service Integration
• Results of the Visual Media Service can be incorporated directly in any existing archive
• See an example of work accomplished with ADS
The ACDM model
[Figure: UML class diagram of the ACDM model. Class labels recoverable from the diagram: ArchaeologicalResource, DataResource, LanguageResource, Service, Collection, DataSet, Database, GIS, TextualDocument, Gazetteer, Vocabulary, Mapping, MetadataSchema, MetadataElement, MetadataAttribute, MetadataRecord, DataFormat, DigitalObjectDesc, AttachedDocuments, DBSchema, EncodingLanguage, Distribution, Version, Licence, AriadneConcept, skos:Concept, dcat:Catalog, dcat:Dataset, dcat:Distribution, foaf:Agent. Relation labels: dct:isPartOf, dct:hasParts, dct:publisher, dct:contributor, owner, dcat:distribution, hasAttachedObject, hasAttachedDocument, hasMetadataRecord, hasItemMetadataStructure, hasRecordStructure, hasSimpleDigitalType, hasSchema, hasElements, hasAttribute, hasVersion, hasLicence, usesVocabulary, conformsTo, expressedIn, isRealizedBy, applyTo, from, to, native-subject, ariadne-subject, provided-subject, derived-subject. The multiplicities (1, 0..1, 0..*, 1..*) cannot be reassigned to their relations from the extracted text.]
Resource associations
• dct:isPartOf: associates any archaeological resource with the catalogue
• dct:publisher: an agent responsible for making the resource publicly accessible
• dct:contributor: an agent responsible for describing the resource in the Catalogue
• dct:creator: an agent primarily responsible for creating the resource
• owner: an agent that is the legal owner of the resource
• legalResponsible: an agent holding the legal responsibility for the resource
• scientificResponsible: an agent holding the scientific responsibility
• technicalResponsible: an agent holding the technical responsibility
• ariadne-subject: a subject from the AAT vocabulary
  – provided-subject: manually specified subjects drawn from the Getty AAT vocabulary
  – derived-subject: subjects automatically derived from mapping local vocabularies to the Getty AAT vocabulary
• native-subject: a subject from a vocabulary in use by the original owner of the resource
• hasAttachedDocuments: the documents that are attached to the resource for illustration purposes
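The associations listed above could be held in a simple record structure. What follows is only a minimal sketch: the class name, the field names and the choice of plain strings for agents and URIs are assumptions made for illustration, not part of the ACDM specification. Treating ariadne-subject as the union of provided and derived subjects is one reading of the list above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ResourceDescription:
    """Hypothetical container mirroring a few ACDM resource associations."""
    title: str
    is_part_of: str   # dct:isPartOf -> the catalogue
    publisher: str    # dct:publisher
    creator: str      # dct:creator
    native_subjects: List[str] = field(default_factory=list)    # native-subject
    provided_subjects: List[str] = field(default_factory=list)  # provided-subject (AAT URIs)
    derived_subjects: List[str] = field(default_factory=list)   # derived-subject (AAT URIs)

    def ariadne_subjects(self) -> List[str]:
        """Provided plus derived AAT subjects, in order, without duplicates."""
        seen: List[str] = []
        for uri in self.provided_subjects + self.derived_subjects:
            if uri not in seen:
                seen.append(uri)
        return seen
```

Keeping native subjects apart from AAT subjects preserves the provider's original terminology alongside the shared one.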
ARIADNE integration pathways
• High-level description of the archaeological archives
  – Resource discovery
  – Identification of common elements
• Subject integration
  – Standardisation of concepts and terms
• Spatial integration
  – Unambiguous identification of geographic entities
  – Diachronic representation of places
• Temporal integration
  – Unambiguous representation of times and periods
Integration pathways
• “What” -> Subjects and topics
• “Where” -> Place names and geographic entities
• “When” -> Temporal aspects, periods and times
• Formal description -> ACDM model
• Linked Open Data philosophy -> RDF
• Shared knowledge base -> ARIADNE Registry
Integration pathways: “What”
• Thesauri and terminological tools for archaeology
• National or local validity
• Using Getty Art & Architecture Thesaurus (AAT) as a common spine
• Partners to enter their data in the ACDM Registry:
  – Using AAT terms
  – Providing a mapping from a national / regional vocabulary to the AAT
• Mapping tools for interoperability among vocabularies
  – USW Mapping Tool
Integration pathways: “Where”
• All spatial coordinates converted to WGS84
• Where place names are supplied, GeoNames is used to provide coordinates (where possible)
• Historical names: Pelagios project
  – http://pelagios-project.blogspot.it/
  – Linked Open Data and URIs from Pleiades
  – http://pleiades.stoa.org/
• Need to preserve hierarchy in place names (state, region, locality …)
• Resolution of modern place names starting from ancient ones
  – Byzantium, Constantinople, Istanbul …
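Resolving ancient names to a modern place while preserving the place hierarchy can be illustrated with a toy lookup. This is a sketch under stated assumptions: the two dictionaries, the function name `resolve` and the sample record (including its approximate WGS84 coordinates) are hypothetical and stand in for what a gazetteer service would return.

```python
from typing import Dict, Optional

# Hypothetical toy gazetteer: every name variant points to one place identifier.
NAME_TO_PLACE: Dict[str, str] = {
    "byzantium": "istanbul",
    "constantinople": "istanbul",
    "istanbul": "istanbul",
}

# Hypothetical place records: hierarchy (state, region, locality) plus
# approximate WGS84 coordinates, for illustration only.
PLACES: Dict[str, dict] = {
    "istanbul": {
        "modern_name": "Istanbul",
        "hierarchy": ["Turkey", "Istanbul Province", "Istanbul"],
        "lat": 41.01,
        "lon": 28.98,
    },
}


def resolve(name: str) -> Optional[dict]:
    """Return the place record for an ancient or modern name, or None if unknown."""
    key = NAME_TO_PLACE.get(name.strip().lower())
    return PLACES.get(key) if key else None
```

All three names in the example resolve to the same record, so data indexed under any of them can be retrieved together.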
Integration pathways: “When”
• Content providers: different periodisations and subdivisions of the past used in archaeology
• Controlled vocabularies -> supplied to ARIADNE with a start and end date for each term
• Vocabularies, and start and end dates, to be made available as URIs in Linked Data format
  – Collaboration with PeriodO project
  – http://perio.do/
  – Dedicated URIs for period collections created for ARIADNE in PeriodO
  – http://www.ariadne-infrastructure.eu/Resources/PeriodO/Documentation
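Attaching a start and end date to each period term makes terms from different providers comparable by date range. The sketch below assumes a simple in-memory representation; the `Period` class, the `overlapping` function and the sample labels and years are hypothetical and do not come from PeriodO or from any ARIADNE vocabulary.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Period:
    label: str   # provider-specific period term
    start: int   # start year; negative values are BCE
    end: int     # end year


def overlapping(periods: List[Period], start: int, end: int) -> List[str]:
    """Labels of all periods whose date range intersects [start, end]."""
    return [p.label for p in periods if p.start <= end and start <= p.end]


# Hypothetical terms from two providers with different periodisations.
SAMPLE = [
    Period("provider-a/early-phase", -800, -500),
    Period("provider-b/phase-1", -600, -300),
    Period("provider-b/phase-2", -300, 100),
]
```

A query for the years -550 to -400 would match the first two sample terms, even though the providers name and divide the past differently.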
Content publication pipeline
[Figure: content publication pipeline. Labels recoverable from the diagram: Data + Metadata (three content sources); Excel files, OAI-PMH, XML; Aggregation Infrastructure (Validation, Enrichment, Publication; Thesauri Mappings, Temporal Mappings); ACDM -> ACDM Enriched; Portal Infrastructure.]
Subject enrichment
• Data integration
• Native subjects
• AAT mappings
• Derived subjects
• Native subjects -> Derived subjects (AAT concepts)
• Content provider -> Mapping to AAT
• Example:
  – acdm:nativeSubject: roads
  – skos:prefLabel: roads
  – acdm:derivedSubject: http://vocab.getty.edu/aat/300008217
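The enrichment step shown above, where a native subject is matched against a provider's AAT mapping to produce a derived subject, can be sketched as follows. The function name, the dictionary-based record and the case-insensitive matching are assumptions for illustration; only the mapping of roads to http://vocab.getty.edu/aat/300008217 comes from the slide.

```python
from typing import Dict, List

# Mapping supplied by a content provider: native term -> AAT concept URI.
# The single entry is the example from the slide.
AAT_MAPPING: Dict[str, str] = {
    "roads": "http://vocab.getty.edu/aat/300008217",
}


def enrich_subjects(record: dict, mapping: Dict[str, str]) -> dict:
    """Return a copy of the record with derivedSubject filled from nativeSubject."""
    enriched = dict(record)
    derived: List[str] = []
    for term in record.get("nativeSubject", []):
        uri = mapping.get(term.strip().lower())  # assumed case-insensitive match
        if uri and uri not in derived:
            derived.append(uri)
    enriched["derivedSubject"] = derived
    return enriched
```

Native subjects without a mapping are left in place and simply produce no derived subject, so no provider information is lost.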
Querying the integrated archives
• User interfaces and the ARIADNE Portal
  – Main access point to system and services
• Archaeological objects, places, events, actors and types
  – Browse and refine with facet views
• Interaction with the Registry
  – Drives the queries towards the most relevant archives
• Interaction with terminological data and services
  – Vocabularies to provide support at query and retrieval time
ARIADNE Portal
• Version 1.0: released on 24 March 2016
• Version 1.2: released on 22 November 2016
• Official domain: http://portal.ariadne-infrastructure.eu
Content Discovery
• Lightning-fast search
• Content similarity
  – Geographical
  – Thematic
  – Temporal (TBA)
• Faceted browsing
• Multilingual AAT-based search
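Faceted browsing amounts to counting the values of a field across the current result set and letting the user narrow the set by picking one value. The following is a minimal in-memory sketch; the function names and the record layout (each facet holding a list of values) are hypothetical and say nothing about how the portal's search engine is actually implemented.

```python
from collections import Counter
from typing import Dict, List


def facet_counts(records: List[dict], facet: str) -> Dict[str, int]:
    """Count how many records carry each value of the given facet."""
    counts: Counter = Counter()
    for record in records:
        for value in record.get(facet, []):
            counts[value] += 1
    return dict(counts)


def refine(records: List[dict], facet: str, value: str) -> List[dict]:
    """Keep only the records that carry the chosen facet value."""
    return [r for r in records if value in r.get(facet, [])]
```

Calling `refine` and then `facet_counts` again on the smaller set reproduces the browse-and-refine loop a user follows in a facet view.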
Preserving legacy archives
• Legacy database synchronization
  – ARIADNE system constantly updated according to modifications of legacy archives
• References to legacy archives always provided
  – Data provenance
  – URLs to information on original portals/web applications
• User to navigate original information
  – To perform custom searches tailored to specific needs
The ARIADNE Infrastructure
17 Content Providers
> 1.85m Records
1 Common Schema
> 6k AAT Concepts
> 1.78m Spatial Entities
> 73k Periods & Dates
> 10 Services
> 1.77m Native Subjects
> 8k Users
Data ingestion / flow within ARIADNE
[Figure: data flow diagram. Labels recoverable from the diagram: Repository, Excel Sheet, Archive; MORe; ARIADNE Registry; Validation, Cleaning, Enrichment, Integration; RDF Store (RDF), RDF Store (CRM), Elastic Search; ARIADNE Portal.]
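The stage names in the ingestion flow (validation, cleaning, enrichment, integration) suggest a chain of processing steps applied to each record. The sketch below shows one way such a chain could be composed. It is an assumption-laden illustration: the stage functions, their rules (for example, rejecting records without a title) and the record layout are hypothetical and do not describe the actual aggregation software.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Stage = Callable[[Record], Optional[Record]]


def validate(record: Record) -> Optional[Record]:
    """Reject records without a title (hypothetical validation rule)."""
    return record if record.get("title") else None


def clean(record: Record) -> Record:
    """Strip surrounding whitespace from the title."""
    out = dict(record)
    out["title"] = str(record["title"]).strip()
    return out


def enrich(record: Record) -> Record:
    """Ensure the field for derived subjects exists."""
    out = dict(record)
    out.setdefault("derivedSubject", [])
    return out


def run_pipeline(records: List[Record], stages: List[Stage]) -> List[Record]:
    """Apply the stages in order; a stage returning None drops the record."""
    results: List[Record] = []
    for record in records:
        current: Optional[Record] = record
        for stage in stages:
            current = stage(current)
            if current is None:
                break
        if current is not None:
            results.append(current)
    return results
```

Passing the stages as a list keeps their order explicit and lets a stage be added or removed without touching the others.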
Integration Experiments
ARIADNE is a project funded by the European Commission under the Community’s Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193. The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.
Thank you …