46
AAT LOD Microthesauri Marcia Lei Zeng International Terminology Working Group (ITWG) September 57, 2014 Dresden, Germany

AAT LOD Microthesauri - Getty

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: AAT LOD Microthesauri - Getty

AAT LOD Microthesauri

Marcia Lei ZengInternational Terminology Working Group (ITWG)

September 5‐7, 2014Dresden, Germany

Page 2: AAT LOD Microthesauri - Getty

1. DefinitionMicrothesaurus: designated subset of a thesaurus that is capable of functioning as a complete thesaurus.

‐‐ ISO25964‐2:2013

Microthesauri are different from:

• Derived vocabularies

S (source)

S

S

S

New

New

N - - N e w - -N

Derivation/Modeling

• adaptation • modification • expansion • partial adaptation• translation

Page 3: AAT LOD Microthesauri - Getty

3

Deriving new vocabularies from a source vocabularyNew vocabularies depending on a source vocabulary

(with new categories and terms)

Page 4: AAT LOD Microthesauri - Getty

2. Overview: Situations and decisions for a digital collection that wants to become LOD dataset  

1

2

3

Page 5: AAT LOD Microthesauri - Getty

3

4

AAT‐based Vocabularies

5

6

Full ATT orAAT Microthesaui

Other Non‐LODVocabs

Page 6: AAT LOD Microthesauri - Getty

11

22

3333

44

AAT‐based Vocabularies 55

66

Full ATT orAAT Microthesaui

Other Non‐LODVocabs

The need to• use,• create,• derive, and• map AAT&• go to LOD

2. Overview: Situations and decisions for a digital collection that wants to become LOD dataset  

Page 7: AAT LOD Microthesauri - Getty

3. Can a microthesaurus be made from an existing thesaurus?

Structure ExampleYES Classificatory

structure• EUROVOC• Chinese Classified Thesaurus• [English Heritage Thesauri]

YES Faceted structure • AAT• FAST (Faceted Application of Subject Terminology) 

YES Deep hierarchies (family trees)

• AAT• NASA Thesaurus • INSPEC Thesaurus

NO flat structure[alphabetically organized]

• LCSH• hundreds of thesauri

Microthesaurus: designated subset of a thesaurus that is capable of functioning as a complete thesaurus.    ‐‐ ISO25964‐2:2013

Page 8: AAT LOD Microthesauri - Getty

Eurovoc "EuroVoc is split into 21 domains and 127 microthesauri.Each domain is divided into a number of microthesauri. 

A microthesaurus is considered as a concept scheme with a subset of the concepts that are part of the complete EuroVocthesaurus."http://eurovoc.europa.eu/drupal/?q=node/555

Page 9: AAT LOD Microthesauri - Getty

CHIN listed CHIN listed 891 

recommended resources.

Only AAThas facets and hierarchies 

that are listed separately. 

Canadian Heritage Information Network (CHIN)

Page 10: AAT LOD Microthesauri - Getty

From: Getty Vocabularies: Linked Open DataSemantic Representation. Section 2.3.4 Top Concepts

http://vocab.getty.edu/doc/#The_Getty_Vocabularies_and_LOD

Page 11: AAT LOD Microthesauri - Getty

Art and Architecture Thesaurus (AAT)

Facet: Objects

Hierarchy: Furnishing and Equipment

Concept: containers (receptacles)

Guide term: <containers by form>

concept:vessels (containers)

concept:rhyta

Page 12: AAT LOD Microthesauri - Getty

Facet: Objects

Hierarchy: Furnishing and Equipment

Concept: containers (receptacles)

Guide term: <containers by form>

concept:vessels (containers)

concept:rhyta

What are special in AAT

Facets

Sub‐facets(Indicated by node labels)

Art and Architecture Thesaurus (AAT)

[large] Hierarchies(full coverage, deep layer)

The units were recommended to use by projects such as

The Canadian Heritage Information 

Network (CHIN) 

Page 13: AAT LOD Microthesauri - Getty

concept

concept:

Concept 

BT

NThttp://id.loc.gov/authorities/subjects/sh85142374.skos.rdf

What are usually available in a flat structured LOD

thesauri

Page 14: AAT LOD Microthesauri - Getty

Easy to get an immediate family members.This is true for all thesauri, LCSH, AGROVOC, etc.

Page 15: AAT LOD Microthesauri - Getty

Facet: Objects

Hierarchy: Furnishing and Equipment

Concept: containers (receptacles)

Guide term: <containers by form>

concept:vessels (containers)

concept:rhyta

What are special in AAT& are available as LOD

Facets

Sub‐facets(Indicated by node labels)

Art and Architecture Thesaurus (AAT)

[large] Hierarchies(full coverage, deep layer)

concept

concept:

Concept 

Page 16: AAT LOD Microthesauri - Getty

4. An example-- Use a <Guide Term> to obtain

all concept URIsin a facet or hierarchy

Part 1. Get Data

Page 17: AAT LOD Microthesauri - Getty

After choosing a facet or a hierarchy...1. Get the ID2. Go to SPARQL Endpointnext slide

Page 18: AAT LOD Microthesauri - Getty

2. Go to SPARQL Endpoint3. Choose "Descendants of a Given Parent"

http://vocab.getty.edu/sparql

Page 19: AAT LOD Microthesauri - Getty

4. Replace the ID in the Query template5. Submit6. Get all URIs and labels under this guide term.

Note: Here is the text of the query. I replaced the aat ID, also inserted a line to get the labels, and sort by label:

# 5.1.2 Descendants of a Given Parentselect * {?x  gvp:broaderExtended aat:300117143.

?x gvp:prefLabelGVP [xl:literalForm ?l]; skos:inScheme aat:} order by ?l

Page 20: AAT LOD Microthesauri - Getty
Page 21: AAT LOD Microthesauri - Getty

(Below: Checked if the results are at multiple levels in the hierarchy; display did not sort. )

Sparql gave me:

Page 22: AAT LOD Microthesauri - Getty

7. Download JSON format data, now I have a dataset.

Download Options: (1) JSON* (2) XML

*JSON (JavaScript Object Notation) is a lightweight data-interchange format.

Page 23: AAT LOD Microthesauri - Getty

AAT URIs and preferred labelsunder one facet or hierarchy

AAT URIs and labels according to a Contributor 

#5.1.3 Subjects by Contributor Idselect * {?x a gvp:Subject; dct:contributoraat_contrib:10000178. ?x gvp:prefLabelGVP [xl:literalForm ?l]} order by ?l

# 5.1.2 Descendants of a Given Parentselect * {?x  gvp:broaderExtendedaat:300117143.

?x gvp:prefLabelGVP[xl:literalForm ?l]; skos:inScheme aat:

} order by ?l

Page 24: AAT LOD Microthesauri - Getty

Part 2. Viewing the dataset by a non-techy person

Acknowledgement: Thanks to a Visiting Scholar En‐bo Jiang for 

helping the testing.

Page 25: AAT LOD Microthesauri - Getty

How to manage by a non-techy person?

Techy-person can prepare the file as:

1. From a JSON* file convert to CSV** file (can be opened as spreadsheet) using an open source converter

Non-techy person's wish:I can see what are in the dataset;I can use a spreadsheet to open and manage it.

**CSV = Comma Separated Value file format

*JSON = (JavaScript Object Notation), a lightweight data-interchange format.

Page 26: AAT LOD Microthesauri - Getty

"Form" view online

http://codebeautify.org/view/jsonviewer

Using an online converter, turn JSON to CSV.

Page 27: AAT LOD Microthesauri - Getty

JSON file "Form" view

http://codebeautify.org/view/jsonviewer

Page 28: AAT LOD Microthesauri - Getty

"Tree" view online

http://codebeautify.org/view/jsonviewer

Page 29: AAT LOD Microthesauri - Getty

How to manage by a non-techy person?

Techy-person can prepare the file as:

1. From a JSON* file convert to CSV** file (can be opened as spreadsheet) using an open source converter, or2. From a JSON file export to spreadsheet from OpenRefine

Non-techy person's wish:I can see what are in the dataset;I can use a spreadsheet to open and manage it.

Page 30: AAT LOD Microthesauri - Getty

When uploaded the JSONfile to OpenRefine, highlight the first enter in order for the software to tell the structure.

Page 31: AAT LOD Microthesauri - Getty

Establish a 'Project', then ready to edit.

Page 32: AAT LOD Microthesauri - Getty

Export

Page 33: AAT LOD Microthesauri - Getty

To do: need to double check if all node labels and preferred terms are in.

Open the JSON file from spreadsheet on my laptop

Page 34: AAT LOD Microthesauri - Getty

If open the XML file from spreadsheet, it looks like:

Page 35: AAT LOD Microthesauri - Getty

Summary of the steps

• Replace the ID in the Query template

• Submit• Get the URIs and labels in

under this guide term.• Sort by order (column x)

1. Choose the facet or hierarchy you like to start;2. Find the ID of that concept.3. Use this template to get the URIs and labels:

4. Use a tool that can treat JSON to view and manage.

# 5.1.2 Descendants of a Given Parentselect * {?x  gvp:broaderExtendedaat:300117143.

?x gvp:prefLabelGVP[xl:literalForm ?l]; skos:inScheme aat:

} order by ?l

Page 36: AAT LOD Microthesauri - Getty

Additional: Using RelFinder to Visualize 

Interactive Relationship Discovery in RDF Data

http://www.visualdataweb.org/relfinder.php

Page 37: AAT LOD Microthesauri - Getty

Example: Find relations between Leonardo da Vinci and Renaissance(based on DBpedia dataset) ‐1

1. SPARQL1. Pointing to a SPARQLend point

2. Type two terms to find matching entries

3. The tool will display 3. The tool will display the triples one by one

4. Click on any concept to highlight the relations

Page 38: AAT LOD Microthesauri - Getty

Leonardo da Vinci and Renaissance (based on DBpedia dataset) 2

Page 39: AAT LOD Microthesauri - Getty

SPARQL Query Creator [Beta]

My Plan: to create a friendly SPARQL query creator for generating AAT Microthesauri

Page 40: AAT LOD Microthesauri - Getty

5. ConclusionLOD AAT Microthesauri's importance in the Non‐AAT World

Page 41: AAT LOD Microthesauri - Getty
Page 42: AAT LOD Microthesauri - Getty
Page 43: AAT LOD Microthesauri - Getty
Page 44: AAT LOD Microthesauri - Getty
Page 45: AAT LOD Microthesauri - Getty

AAT's importance in the Non‐AAT World

THANK YOU!

Page 46: AAT LOD Microthesauri - Getty

Wish: Provide better SPARQL template interfaces, allowing all kinds of explorations