Leveraging the DDI Model for Linked Statistical Data in the Social, Behavioural, and Economic Sciences

DC 2012, 05.09.2012

Thomas Bosch, GESIS – Leibniz Institute for the Social Sciences, Germany
Richard Cyganiak, Digital Enterprise Research Institute, Ireland
Joachim Wackerow, GESIS – Leibniz Institute for the Social Sciences, Germany
Benjamin Zapilko, GESIS – Leibniz Institute for the Social Sciences, Germany




Agenda

What is DDI?

• DDI (Data Documentation Initiative)
  • Established international standard for the documentation and management of data from the social, behavioral, and economic sciences
  • Data model for statistical data
• Supports the entire research data lifecycle
• Focus on microdata
• Structured, high-quality metadata
  • Enables secondary analysis without the need to contact the primary researcher
  • Enables the re-use of metadata of existing studies for designing new studies
• Currently specified in XML Schema

How was the DDI Ontology developed?

• Subset of the most important DDI elements
• Use cases
  • Experts in the statistics domain formulated the use cases they consider most significant for solving frequent problems
  • Most important use case: discover microdata connected with multiple studies
• Convert existing DDI-XML documents to DDI-RDF automatically
  • Direct mapping
  • Generic mapping (Bosch and Mathiak, 2011)
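To make the idea of a direct mapping from DDI-XML to RDF more concrete, here is a minimal sketch using only the Python standard library. It is not the mapping used by the authors: the XML fragment is a heavily simplified, hypothetical stand-in for DDI-XML, and the base URI, the `ex` vocabulary, and the `has<Element>` property naming are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a DDI-XML fragment.
# Real DDI documents use namespaces and a far richer structure.
DDI_XML = """
<Study id="study1">
  <Title>Example Survey 2010</Title>
  <Variable id="v1">
    <Name>AGE</Name>
    <Label>Age of respondent</Label>
  </Variable>
  <Variable id="v2">
    <Name>SEX</Name>
    <Label>Sex of respondent</Label>
  </Variable>
</Study>
"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"       # assumed base URI for resources
EX = "http://example.org/vocab#"   # hypothetical vocabulary, not DDI-RDF


def direct_mapping(xml_text):
    """Turn identified elements into resources and leaf children into literals."""
    triples = []

    def visit(element, parent_uri=None):
        uri = BASE + element.get("id")
        # The element name becomes the class of the resource.
        triples.append((uri, RDF_TYPE, EX + element.tag))
        if parent_uri is not None:
            # Nesting in XML becomes a link from parent to child resource.
            triples.append((parent_uri, EX + "has" + element.tag, uri))
        for child in element:
            if child.get("id") is not None:
                visit(child, uri)
            else:
                # Children without an id are treated as literal-valued properties.
                text = (child.text or "").strip()
                triples.append((uri, EX + child.tag.lower(), text))

    visit(ET.fromstring(xml_text))
    return triples


triples = direct_mapping(DDI_XML)
for triple in triples:
    print(triple)
```

A direct mapping of this kind follows the XML structure mechanically; a generic mapping would instead route elements through an intermediate model before emitting RDF.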

Why DDI as Linked Data?

• Currently no such ontology available
• To increase visibility of data holdings using mainstream Web technologies
• To open DDI to the Linked Data community
• To process DDI-RDF by RDF tools
• To link DDI-RDF to other RDF data
• To better identify opportunities for merging datasets
• To enable inferencing
• To research microdata within the LOD cloud

What other metadata standards and vocabularies are used?

• Dublin Core Metadata Element Set, Version 1.1
• DCMI Metadata Terms
• SKOS
• SDMX RDF Data Cube Vocabulary
• ISO/IEC 11179
• ISO 19115
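Reusing existing vocabularies in RDF usually means referring to their terms by namespace. The sketch below expands prefixed names into full URIs for the vocabularies listed above that publish RDF namespaces. The prefix labels (`dc`, `dcterms`, `skos`, `qb`) are the conventional ones but are a choice made here, and the `expand` helper is illustrative rather than part of any library. The ISO standards are left out because no RDF namespace for them is stated in the slides.

```python
# Namespace URIs of the RDF vocabularies named on the slide.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",        # Dublin Core Element Set 1.1
    "dcterms": "http://purl.org/dc/terms/",          # DCMI Metadata Terms
    "skos": "http://www.w3.org/2004/02/skos/core#",  # SKOS
    "qb": "http://purl.org/linked-data/cube#",       # RDF Data Cube Vocabulary
}


def expand(prefixed_name):
    """Expand a prefixed name such as 'skos:prefLabel' into a full URI."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError("unknown or missing prefix in %r" % prefixed_name)
    return PREFIXES[prefix] + local_name


print(expand("skos:prefLabel"))
print(expand("dcterms:title"))
```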

Discovery Use Case

• Which studies are connected with a specific coverage consisting of the 3 dimensions time, country, and subject?
• What questions with a specific question text are contained in the study questionnaire?
• What questions are connected with a concept with a specific label?
• What questions are combined with a variable with an associated coverage consisting of the 3 dimensions time, country, and subject?
• What concepts are linked to particular variables or questions?
• What representation does a specific variable have?
• What codes and what categories are part of this representation?
• What variable label does a variable with a particular variable name have?
• What is the maximum value of a certain variable?
• What are the absolute and relative frequencies of a specific code?
• What data files contain the entire dataset?
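One of the questions above, "What questions are connected with a concept with a specific label?", can be sketched as a triple-pattern lookup. The example below uses a tiny in-memory list of triples rather than a real RDF store or SPARQL endpoint. The `ex` properties `questionText` and `concept`, the resource identifiers, and the sample data are all hypothetical; only `skos:prefLabel` is a real vocabulary term.

```python
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
EX = "http://example.org/vocab#"  # hypothetical vocabulary, not DDI-RDF

# Made-up sample data: three questions linked to two labelled concepts.
TRIPLES = [
    ("ex:q1", EX + "questionText", "How old are you?"),
    ("ex:q1", EX + "concept", "ex:c_age"),
    ("ex:q2", EX + "questionText", "In which year were you born?"),
    ("ex:q2", EX + "concept", "ex:c_age"),
    ("ex:q3", EX + "questionText", "What is your sex?"),
    ("ex:q3", EX + "concept", "ex:c_sex"),
    ("ex:c_age", SKOS_PREF_LABEL, "Age"),
    ("ex:c_sex", SKOS_PREF_LABEL, "Sex"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def questions_for_concept_label(triples, label):
    """Find (question, question text) pairs whose concept carries the label."""
    concepts = {s for s, _, _ in match(triples, p=SKOS_PREF_LABEL, o=label)}
    results = []
    for concept in concepts:
        for question, _, _ in match(triples, p=EX + "concept", o=concept):
            for _, _, text in match(triples, s=question, p=EX + "questionText"):
                results.append((question, text))
    return sorted(results)


print(questions_for_concept_label(TRIPLES, "Age"))
```

In a real setting the same join over label, concept, and question would typically be written as a SPARQL query against published DDI-RDF data.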


study | coverage


instrument | question | concept


values | value labels


variable | descriptive statistics


logical dataset | dataset | data file


conceptual model


Acknowledgements

• Archana Bidargaddi (NSD - Norwegian Social Science Data Services, Norway)
• Franck Cotton (INSEE - Institut National de la Statistique et des Études Économiques, France)
• Richard Cyganiak (DERI - Digital Enterprise Research Institute, Ireland)
• Daniel Gilman (BLS - Bureau of Labor Statistics, USA)
• Marcel Hebing (SOEP - German Socio-Economic Panel Study, Germany)
• Larry Hoyle (University of Kansas, USA)
• Jannik Jensen (DDA - Danish Data Archive, Denmark)
• Stefan Kramer (CISER - Cornell Institute for Social and Economic Research, USA)
• Amber Leahey (Scholars Portal Project - University of Toronto, Canada)
• Abdul Rahim (Metadata Technologies Inc., USA)
• John Shepherdson (UK Data Archive, UK)
• Dan Smith (Algenta Technologies Inc., USA)
• Humphrey Southall (Department of Geography, UK Portsmouth University, UK)
• Wendy Thomas (MPC - Minnesota Population Center, USA)
• Johanna Vompras (University Bielefeld Library, Germany)


Thank you for your attention!