Giovanni Colavizza, Experts Workshop on Controlled Vocabularies, Mainz, 10-11/10/2013

Topic Introduction

Controlled vocabularies and humanities, a problematic relationship.

The functional categorization of historical place types and the problems it raises.

Giovanni Colavizza
Leibniz Institute of European History
[email protected]

Mainz Expert Workshop on Controlled Vocabularies 10/10/2013

The scenario

Controlled vocabulary: a selected list of terms, which refer to concepts, used for categorization. The criteria for selecting concepts are usually domain-specific.

Focus for this talk: vocabularies of concepts, not proper names.

The term-concept relation is often not specified: terms rely on the intended (?) use of natural language, which is context- and interpretation-specific. But there goes language independence!

@Dalia Varanka, A topographic feature taxonomy for a US national topographic mapping ontology, 2009.

The problem

Quantitative and computer-based methods scale up our responsibilities together with our means.

The data and metadata loop: Retrieve, Reuse - Extend, Share

Stricter requirements: classification systems must be shared, at least to some extent. The shared part must be formally specified (machine-readable). The term-concept bond has to become explicit.

New design requirements

• Allow for comparison beyond a single project (data integration)
• Interoperability and portability
• Scalability
• More accurate retrieval
• Automatic classification
• Named entity recognition
• Reasoning
• ...

One possible solution: integrate a stricter knowledge model on top of controlled vocabularies. Express it via ontologies: simplified specifications of (shared!) conceptualizations. Already possible! ISO 25964 (data model), SKOS (web format)

IEG proposal - concept

• Keep both natural language vocabularies AND formalized ontologies
• An integrated approach:

1. Develop back-end ontologies, well formalized and documented*
2. Vocabularies are built as needed, in natural language, associating tags with formally defined concepts (prevent late integration)

But! No 1-1 mapping between vocabularies and ontologies. Focus on what’s shared*. Pareto principle: 80% of the effects (tags we need) come from 20% of the causes (concepts).
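
The integrated approach can be sketched as follows, assuming each project binds its own tags to shared concept identifiers at the moment the tags are created. Everything here is hypothetical (the concept identifiers, both vocabularies, the record ids and the `integrate` helper). The sketch only shows that the mapping need not be 1-1: several tags may share one concept, and some tags stay project-local.

```python
from collections import defaultdict

# Shared back-end concepts (hypothetical identifiers).
CONCEPTS = {"c:trade", "c:worship", "c:healing"}

# Each project keeps its own natural-language tags.
# Several tags may point to one concept; a tag mapped to None stays local.
VOCAB_A = {"market": "c:trade", "fair": "c:trade", "church": "c:worship", "misc": None}
VOCAB_B = {"Marktplatz": "c:trade", "Spital": "c:healing", "Kapelle": "c:worship"}


def integrate(*tagged_datasets):
    """Group records from several projects by shared concept.

    Each argument is a pair (vocabulary, records), where records is a list
    of (record_id, tag) tuples. Returns (records by concept, unmapped ids).
    """
    by_concept = defaultdict(list)
    unmapped = []
    for vocabulary, records in tagged_datasets:
        for record_id, tag in records:
            concept = vocabulary.get(tag)
            if concept in CONCEPTS:
                by_concept[concept].append(record_id)
            else:
                unmapped.append(record_id)
    return dict(by_concept), unmapped


records_a = [("a1", "market"), ("a2", "fair"), ("a3", "misc")]
records_b = [("b1", "Marktplatz"), ("b2", "Spital")]

shared, local_only = integrate((VOCAB_A, records_a), (VOCAB_B, records_b))
print(shared)      # records from both projects grouped under shared concepts
print(local_only)  # records whose tags have no shared concept
```

Because the binding exists from the start, comparing the two projects is a lookup rather than a late, after-the-fact alignment of free-text tags.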

IEG proposal - implementation

Implementation is key:

1. Upper ontologies (integration among domains)
2. Domain ontologies (e.g. functions)
3. Labeling system
4. Controlled vocabularies

> Linked data enabled, user friendly (minimize learning curve and overhead), single entry-point to standards: bridges tags and concepts.

Large-scale collaborative and community-driven framework (numbers 1, 2, 3, in part 4), few experts for back-end, many users for front-end, everything open.

Could we think about a Consortium for controlled vocabularies (like TEI)?

Historical place types

Quite problematic:

• Same names mean different things in space, time, culture
• Generic tags for specific meanings: ambiguity
• Layers of interpretations: agents, socio-political context, historians

From nouns to verbs:

• Most vocabularies of place types/features are already loosely classified by functionality (economic activity, leisure facility, place of culture, etc.)
• There are fewer verbs than nouns (WordNet synsets: ~82k nouns, ~14k verbs)
• Verbs bring us closer to concrete events (and linked data triples...)
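
As a rough illustration of the move from nouns to verbs, a place type can be described by the functions it performs, written as subject-predicate-object triples. The place types, verbs and objects below are invented examples, and the two helper functions are hypothetical names, not a proposed model.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # place type (a noun)
    predicate: str  # function (a verb)
    obj: str        # what the function applies to


# Invented examples, for illustration only.
FUNCTIONS = [
    Triple("barber shop", "provides", "grooming"),
    Triple("barber shop", "hosts", "exchange of news"),
    Triple("pharmacy", "sells", "remedies"),
    Triple("pharmacy", "hosts", "exchange of news"),
    Triple("market square", "hosts", "trade"),
]


def place_types_with(predicate, obj, triples=FUNCTIONS):
    """Return the place types that perform a given function, sorted by name."""
    return sorted({t.subject for t in triples if t.predicate == predicate and t.obj == obj})


def functions_of(place_type, triples=FUNCTIONS):
    """Return the (verb, object) pairs recorded for one place type."""
    return sorted((t.predicate, t.obj) for t in triples if t.subject == place_type)


print(place_types_with("hosts", "exchange of news"))  # ['barber shop', 'pharmacy']
print(functions_of("pharmacy"))
```

Querying by function groups place types that share no name at all, which is the kind of comparison a noun-only vocabulary makes hard.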

Functional categorization - I

@Filippo De Vivo, Patrizi, informatori, barbieri. Politica e comunicazione a Venezia nella prima età moderna. Milan: Feltrinelli, 2012. In English: id., Information and communication in Venice: Rethinking Early Modern Politics. Oxford: Oxford University Press, 2007.

Open questions

Is all this useful and feasible? (let’s try it)

• Where to start (historical place types)
• What to model (functions)
• Design requirements
• Explore technical solutions
• How to integrate existing vocabularies

> Sketch guidelines

Partners, anyone? :)

Thanks!