MANAGING CHANGES IN CLASSIFICATION: the case of UDC Aida Slavic Editor-in-Chief, UDC [email protected]

Aida Slavic Managing KOS: Evolution of concepts and their representation

Page 1: Aida Slavic Managing KOS: Evolution of concepts and their representation

MANAGING CHANGES IN CLASSIFICATION: the case of UDC

Aida Slavic Editor-in-Chief, UDC

[email protected]

Page 2: Aida Slavic Managing KOS: Evolution of concepts and their representation

FOCUS

•  Bibliographic classification in the linked data environment

•  Practical issues to do with changes in classification scheme

§  Consequences these changes have on information exchange

§  Importance of publishing historical classification data as linked data

Page 3: Aida Slavic Managing KOS: Evolution of concepts and their representation

539.1 Nuclear physics. Atomic physics. Molecular physics
539.12 Elementary and simple particles
539.123/.124 Leptons. Including: Muons
539.123 Neutrinos
539.123.6 Antineutrinos
539.124 Electrons (including beta-particles)
539.124.6 Positrons
539.125/.126 Hadrons. Baryons and mesons
539.125 Nucleons
539.125.4 Protons
539.125.46 Antiprotons
539.125.5 Neutrons
539.125.56 Antineutrons
539.126.3 Mesons
539.126.4 Resonances
539.126.6 Hyperons

ALPHABETICAL vs SYSTEMATIC

Antineutrinos, Antineutrons, Antiprotons, Atomic physics, Baryons, Beta-particles, Bosons, Electrons, Hadrons, Hyperons, Leptons, Mesons, Molecular physics, Muons, Neutrinos, Neutrons, Nuclear physics, Nuclei, Nucleons, Positrons, Protons, Resonances

words alone can only be arranged or ordered alphabetically

Classification orders concepts systematically

Page 4: Aida Slavic Managing KOS: Evolution of concepts and their representation

539.1 Nuclear physics. Atomic physics. Molecular physics
539.12 Elementary and simple particles
539.123/.124 Leptons. Including: Muons
539.123 Neutrinos
539.123.6 Antineutrinos
539.124 Electrons (including beta-particles)
539.124.6 Positrons
539.125/.126 Hadrons. Baryons and mesons
539.125 Nucleons
539.125.4 Protons
539.125.46 Antiprotons
539.125.5 Neutrons
539.125.56 Antineutrons
539.126.3 Mesons
539.126.4 Resonances
539.126.6 Hyperons

NOTATION

Antineutrinos, Antineutrons, Antiprotons, Atomic physics, Baryons, Beta-particles, Bosons, Electrons, Hadrons, Hyperons, Leptons, Mesons, Molecular physics, Muons, Neutrinos, Neutrons, Nuclear physics, Nuclei, Nucleons, Positrons, Protons, Resonances

alphabetical order vs systematic order: semantic relationships fixed by notation

NOTATION – enables mechanical ordering of subjects
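The point about mechanical ordering can be illustrated with a short sketch. It sorts the notations from the schedule above using a simplified, hypothetical filing key (`filing_key` is not part of any UDC tool): dots are ignored, digits are compared as a decimal fraction, and a range such as 539.123/.124 files just before its first member. The real UDC filing rules cover many more symbols; this only reproduces the order shown on the slide.

```python
import random

# Notations copied from the schedule on the slide, in systematic order.
SCHEDULE = [
    "539.1", "539.12", "539.123/.124", "539.123", "539.123.6",
    "539.124", "539.124.6", "539.125/.126", "539.125", "539.125.4",
    "539.125.46", "539.125.5", "539.125.56", "539.126.3", "539.126.4",
    "539.126.6",
]


def filing_key(notation):
    """Simplified filing key (an assumption, not the full UDC filing rules).

    The digits are compared left to right like a decimal fraction, and a
    range ("a/b") sorts immediately before the single class "a".
    """
    is_single = 1
    start = notation
    if "/" in notation:
        start = notation.split("/", 1)[0]
        is_single = 0  # ranges file before their first member
    digits = start.replace(".", "")
    return (digits, is_single)


# Shuffle the notations, then restore the systematic order mechanically.
shuffled = SCHEDULE[:]
random.Random(0).shuffle(shuffled)
restored = sorted(shuffled, key=filing_key)
```

No captions are consulted at any point: the order is recovered from the notation alone, which is what makes it independent of the language of the captions.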

Page 5: Aida Slavic Managing KOS: Evolution of concepts and their representation

NOTATION - LANGUAGE INDEPENDENT

Class =162.3 Czech (SKOS export from UDC Summary)

Page 6: Aida Slavic Managing KOS: Evolution of concepts and their representation

CLASS vs CONCEPT

§  Class notation rarely represents a single concept

§  Sometimes the notation serves for practical grouping of phenomena

§  This causes many issues when it comes to using ontology-based standards as vehicles for presenting and managing classification schemes

Page 7: Aida Slavic Managing KOS: Evolution of concepts and their representation

NOTATION: A PLACE HOLDER

598.2 Aves (Birds)
598.24 Gruiformes. Charadriiformes. Ciconiiformes
598.244 Ciconiiformes
598.244.2 Ciconiidae

Including: Storks (genera Ciconia and Mycteria); the Jabiru (genus Jabiru); openbill storks (genus Anastomus) and adjutants (genus Leptoptilos)

Note: “storks” (in English) can be roughly taken as a common term for most of the extant species of class Ciconiidae ... in other languages species in this class do not have the same common name, e.g. the English word ‘storks’ cannot be translated accurately into other languages

Page 8: Aida Slavic Managing KOS: Evolution of concepts and their representation

NOTATION: A CONTAINER OF INFORMATION

582.53 Alismatales
Including: Strictly extinct genus Heleophyton
SN: Class here also Alismatidae (scientifically outdated)

... 597.2/.5 Pisces (fishes) (scientifically outdated)

Note: Bibliographic classifications often have to contain concepts even after these cease to be scientifically relevant.

Page 9: Aida Slavic Managing KOS: Evolution of concepts and their representation

BIBLIOGRAPHIC CLASSIFICATIONS

§  deal with recorded knowledge, i.e. after it has been embodied in documents

§  organize literature about entities and not entities themselves

§  have to fulfil additional requirements with respect to the context in which knowledge may be created, presented, recorded or used

Page 10: Aida Slavic Managing KOS: Evolution of concepts and their representation

NOT AN ONTOLOGY…

§  Bibliographic classifications are primarily concerned with subjects

Subject = systematized body of ideas
Concept = an idea

§  What is the subject (forms of knowledge)? Mining, Chemistry, Medicine

§  What is the subject about (topics)? mining of gold; physical properties of water; angina pectoris

Page 11: Aida Slavic Managing KOS: Evolution of concepts and their representation

BIBLIOGRAPHIC CLASSIFICATIONS

Two dominant characteristics:

§  disciplinary organization: organize the universe of knowledge by disciplines, i.e. forms of knowledge, based on some scientific and educational consensus

§  aspect classification: groups phenomena according to the way they are researched, described and studied in documents

Page 12: Aida Slavic Managing KOS: Evolution of concepts and their representation

POLYHIERARCHY

§  in the universe of knowledge one concept can belong to more than one broader category

Domestic animals

Pets

Carnivora

Canidae

Dog

Page 13: Aida Slavic Managing KOS: Evolution of concepts and their representation

“DISTRIBUTED RELATIVES”

Chemical industry > Pest-control chemicals > Chemicals for controlling rodents. Rodenticides > Mouse

Agriculture > Animal husbandry > Rodents kept for fur > Mouse

Zoology > Mammals > Rodentia. Lagomorpha > Myomorpha > Muridae. Mice and rats > Mouse

Agriculture > Plant protection > Control of plant diseases and pests > Destruction of vertebrate pests > Mouse

"see also" references link the occurrences of Mouse across these hierarchies

Page 14: Aida Slavic Managing KOS: Evolution of concepts and their representation

LINKING CONCEPTS ACROSS KNOWLEDGE

 

       

Natural Sciences > Biology > Animals > Vertebrata > Pisces (Fishes) > Elasmobranchii > Sharks

Arts. Recreation. Entertainment. Sport > Film. Cinema (motion pictures) > Film genres > Documentary films > Documentaries about sharks

Social Sciences > Economic science > Economic sectors > Tourism > Adventure tourism > Swimming with sharks

Arts. Recreation. Entertainment. Sport > Sport > Sport fishing > Sea fishing > Shark fishing

Applied Sciences > Agriculture > Fishing > Fishing for deep-sea species > Shark fishing

Applied Sciences > Industries > Leather industry > Fish skin > Sharkskin

Page 15: Aida Slavic Managing KOS: Evolution of concepts and their representation

681 PRECISION MECHANISMS AND INSTRUMENTS
681.1 Apparatus with wheel or motor mechanisms
681.2 Instrument-making in general. Instrumentation.
681.3 Computers first placed here before 1980s
681.5 Automatic control engineering
681.6 Graphic reproduction machines and equipment
681.7 Optical apparatus and instruments
681.8 Technical acoustics. Musical instruments

NEW KNOWLEDGE EMERGES

Relocated to a new class: 004 (UDC), 004/006 (Dewey)

Page 16: Aida Slavic Managing KOS: Evolution of concepts and their representation

IT HAPPENS ALL THE TIME... STARTS AS ONE CONCEPT...

§  Finding logical place for new and pervasive concepts

NANOTECHNOLOGY: medicine, technology, industry, computer technology, agriculture

BIOTECHNOLOGY: agriculture, biology, genetics, industry, medicine

Page 17: Aida Slavic Managing KOS: Evolution of concepts and their representation

=2 Western languages
=20 English
=3 Germanic languages
=4 Romance or Neo-Latin languages
=50 Italian
=60 Spanish
=690 Portuguese
=7 Classic languages. Latin and Greek
=81 Slavonic languages
=88 Baltic languages
=9 Oriental, African and other languages
=91 Various Indo-European languages
=92 Semitic languages
=94 Hamitic languages
...

REMOVING BIAS

Wrong classification of languages causes wrong classification of:
- peoples
- literatures
- philology

Page 18: Aida Slavic Managing KOS: Evolution of concepts and their representation

=2 Western languages
=20 English
=3 Germanic languages
=4 Romance or Neo-Latin languages
=50 Italian
=60 Spanish
=690 Portuguese
=7 Classic languages. Latin and Greek
=81 Slavonic languages
=88 Baltic languages
=9 Oriental, African and other languages
=91 Various Indo-European languages
=92 Semitic languages
=94 Hamitic languages
...

CORRECTED 25 YEARS AGO (UDC)

causes wrong classification of:
- peoples
- literatures
- linguistics

Change to new scientific classification (1980s)
=1/=2 Indo-European languages
=3 Caucasian & other languages. Basque
=4 Afro-Asiatic, Nilo-Saharan, Congo-Kordofanian, Khoisan
=5 Ural-Altaic, Japanese, Korean, Ainu, Palaeo-Siberian, Eskimo-Aleut, Dravidian, Sino-Tibetan
=6 Austro-Asiatic. Austronesian
=7 Indo-Pacific, Australian
=8 American Indian (Amerindian) languages
=9 Artificial languages

Page 19: Aida Slavic Managing KOS: Evolution of concepts and their representation

MORE CULTURAL BIAS…

2 RELIGION. FAITHS

21/28 CHRISTIANITY

21 Natural theology. Theodicy. De Deo
22 The Bible. Holy scripture
23 Dogmatic theology
24 Practical theology
25 Pastoral theology
26 Christian church in general
27 General history of the Christian church
28 Christian churches, sects

29 NON CHRISTIAN RELIGIONS

Page 20: Aida Slavic Managing KOS: Evolution of concepts and their representation

EXAMPLE 3: CORRECTED 15 YEARS AGO

2 RELIGION. FAITHS

21/28 Christianity
21 Natural theology. Theodicy. De Deo
22 The Bible. Holy scripture
23 Dogmatic theology
24 Practical theology
25 Pastoral theology
26 Christian church in general
27 General history of the Christian church
28 Christian churches, sects

29 NON CHRISTIAN RELIGIONS

NOW.....

2 RELIGION. FAITHS
21 Prehistoric and primitive religions
22 Religions of the Far East
23 Religions of the Indian subcontinent
24 Buddhism
25 Religions of antiquity
26 Judaism
27 Christianity
28 Islam
29 Modern spiritual movements

-1 Theory, nature of religion
-2 Evidence of religion
-3 Persons in religion
-4 Religious practice
-5 Worship. Rites. Cult
-6 Processes in religion
-7 Religious organization
-8 Various properties
-9 History of the faith, religion, denomination or church

Page 21: Aida Slavic Managing KOS: Evolution of concepts and their representation

GEO-POLITICAL ENTITIES

§  new entities are being created, many entities become ‘historical’

§  administrative subdivisions of modern countries change (approximately every 20 years)
    §  counties, districts, administrative units

§  at the same time...
    §  ‘old’ subjects have and will continue to have literature written about them
    §  Roman Empire, Venetian Republic, Austro-Hungarian Empire (Bukowina, Galizia), British Empire, USSR, Czechoslovakia, Yugoslavia
    §  Living and inanimate objects and cultural artefacts are studied and written about long after they are extinct, out of use or practice

Page 22: Aida Slavic Managing KOS: Evolution of concepts and their representation

TYPE OF CHANGES IN SCHEME

§  Relocation: moving/introducing entire hierarchies from one place in the classification structure to another, e.g. 40% of UDC changed between 1990 and 2008

§  class is cancelled

§  new classes added

§  class scope may change

§  description may change, references may change

Page 23: Aida Slavic Managing KOS: Evolution of concepts and their representation

TRADITIONAL APPROACH IN HANDLING CHANGES

Changes as published in the Extensions and Corrections to the UDC

More information about semantics of changes is kept in the UDC database (apart from revision field indicators, date of changes, date of introduction, source of change)

Page 24: Aida Slavic Managing KOS: Evolution of concepts and their representation

NOTATION BECOMES AMBIGUOUS

Reuse of a notation for different concepts

Bible: was represented by 22; now represented by 27-23, 26-23

22: reused for "22 Religions originating in Far East"

§  unpopular but unavoidable

§  can happen 10-50 years apart (desirable) or instantly (to be avoided)

Page 25: Aida Slavic Managing KOS: Evolution of concepts and their representation

CANCELLATION MAPPING DATA

Managing notation history in the UDC database:

CLASS ID: 16544
NOTATION: 22
CAPTION: Religions originating in the Far East
INTRODUCED/DATE: 0012
REPLACES ID 15991: 299.1 Religions of Oriental Peoples
NOTATION HISTORY: yes
USED FOR: ID:17054: Bible
REPLACED BY: ID:17355: Christian Bible
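As a rough illustration of why such a record is needed, the sketch below models the fields of this record in a small data structure and looks up every meaning that notation 22 has had. The class IDs and captions are those printed on this slide; the `live` flag, the notation 27-23 for "Christian Bible" (taken from the previous slide) and the function names are assumptions made for the example, not the actual UDC database schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassRecord:
    class_id: int                      # permanent database ID
    notation: str                      # UDC number, which may be reused
    caption: str
    live: bool = True                  # assumption: False once cancelled
    introduced: Optional[str] = None   # kept as printed on the slide
    replaces: Optional[int] = None     # ID of the record this one replaces
    replaced_by: Optional[int] = None  # ID of the record that took over


RECORDS = [
    ClassRecord(16544, "22", "Religions originating in the Far East",
                introduced="0012", replaces=15991),
    ClassRecord(17054, "22", "Bible", live=False, replaced_by=17355),
    ClassRecord(15991, "299.1", "Religions of Oriental Peoples",
                live=False, replaced_by=16544),
    ClassRecord(17355, "27-23", "Christian Bible"),
]


def notation_history(notation):
    """Return every record, current or cancelled, that has used the notation."""
    return [r for r in RECORDS if r.notation == notation]


def current_meaning(notation):
    """Return the live record for a notation, or None if none is live."""
    live = [r for r in notation_history(notation) if r.live]
    return live[0] if live else None
```

With this data, `notation_history("22")` returns both the "Bible" and the "Religions originating in the Far East" records, while `current_meaning("22")` returns only the latter. A catalogue that still uses 22 for the Bible can only be interpreted correctly through the history.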

Page 26: Aida Slavic Managing KOS: Evolution of concepts and their representation

UDC CHANGES AND LIBRARIES

§  libraries continue to use classification numbers for 20-50 years or longer; few libraries have resources to re-classify

§  libraries rarely record the UDC number provenance; if they do, this may represent a particular language edition

§  consequence: new and old concept representations are used side by side, causing many issues in managing/mapping changes to facilitate information exchange

Page 27: Aida Slavic Managing KOS: Evolution of concepts and their representation

COMPLEX CLASSIFICATION STRINGS

Any part of the complex subject description can change over time

Such complex UDC codes are typical of bibliographic databases/library catalogues

Page 28: Aida Slavic Managing KOS: Evolution of concepts and their representation

GOOD PRACTICE IN MANAGING SUBJECT ACCESS

DOCUMENT

IsDescribedBy

IsDescribedIn

Page 29: Aida Slavic Managing KOS: Evolution of concepts and their representation

CAN LINKED DATA SOLVE THE PROBLEM?

Page 30: Aida Slavic Managing KOS: Evolution of concepts and their representation

LINKED DATA THAT CANNOT BE LINKED

§  National library of Hungary

<bibo:Document rdf:about="http://nektar.oszk.hu/resource/manifestation/2645471">
  <dcterms:subject>
    <rdf:Description>
      <dcam:memberOf rdf:resource="http://purl.org/dc/terms/UDC"/>
      <rdf:value>894.511-32</rdf:value>
    </rdf:Description>
  </dcterms:subject>

§  Trondheim: Library of the Norwegian University of Science and Technology (NTNU), TEKORD (http://ckan.net/package/tekord)

•  all sets contain obsolete records cancelled from UDC 25 years ago or longer

•  all sets contain complex UDC numbers that need to be parsed in order to be validated and linked
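To give a feel for what "parsing" a complex number involves, here is a deliberately simplified tokenizer. It is a sketch under stated assumptions: the token classes and the name `tokenize` are invented for illustration, only a handful of UDC symbols are recognised, and no validation against the schedules is attempted.

```python
import re

# Simplified token classes (assumptions, far from the complete UDC syntax).
TOKEN = re.compile(r'''
    (?P<connector>::|[:+/])          # relation, addition and range signs
  | (?P<parenthesised>\([^)]*\))     # auxiliaries written in brackets
  | (?P<language>=[\d.]+)            # auxiliary introduced by "="
  | (?P<quoted>"[^"]*")              # auxiliary written in quotation marks
  | (?P<hyphen_aux>-[\d.]+)          # auxiliary introduced by a hyphen
  | (?P<number>\.?\d[\d.]*)          # main number, or the tail of a range
''', re.VERBOSE)


def tokenize(udc_string):
    """Split a UDC string into (kind, text) pairs; raise on unknown symbols."""
    text = udc_string.strip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse {text!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For the value in the Hungarian record above, `tokenize("894.511-32")` yields a main number `894.511` followed by the hyphen auxiliary `-32`. Each part would then have to be checked separately against current and historical UDC data, since either may have been cancelled.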

Page 31: Aida Slavic Managing KOS: Evolution of concepts and their representation

ON THE OTHER HAND…

§  UDC archive contains historical data and tracks changes of UDC numbers (from 1900-1990 in paper form)

§  from 1990-2014 changes in UDC recorded in the database – these can be accessed in the UDC Online service

§  UDC Online can be used as a vehicle for proper support of libraries: it allows validation, parsing and number building, and also storing and downloading UDC strings as authority records

Page 32: Aida Slavic Managing KOS: Evolution of concepts and their representation

URI

§  Option 1: using unique database ID for the class (avoiding notation as an identifier as it can have different meanings over time):

....//UDCMRF/17054 [Bible]
...//UDCMRF/16554 [Religions originating in the Far East]

This approach was used in UDC Summary LD http://udcdata.info/

§  Option 2: notation + database ID

....//UDCMRF/22_17054 [Bible]
...//UDCMRF/22_16554 [Religions originating in the Far East]

§  Option 3: notation + ‘release stamp’. Problem: does a notation introduced in UDC MRF93 continue to mean the same in the MRF00 release?

....//UDCMRF/MRF93/22 [Bible]

....//UDCMRF/MRF99/22 [Bible]

....//UDCMRF/MRF00/22 [Religions originating in the Far East]

Together with the ‘absolute’ MRF ….UDCMRF/22
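The three options can be compared in a small sketch. The base URI `http://example.org/UDCMRF/` is a placeholder (the slide elides the real one), the class IDs are those printed on this slide, and the assumption that each meaning holds from the release where it first appears until the next change is made only for the example.

```python
BASE = "http://example.org/UDCMRF/"  # placeholder; the real base is not given

# Releases named on the slide, oldest first.
RELEASES = ["MRF93", "MRF99", "MRF00"]

# notation -> (release from which the meaning applies, class ID, caption)
HISTORY = {
    "22": [
        ("MRF93", 17054, "Bible"),
        ("MRF00", 16554, "Religions originating in the Far East"),
    ],
}


def uri_option1(class_id):
    """Option 1: database ID only."""
    return f"{BASE}{class_id}"


def uri_option2(notation, class_id):
    """Option 2: notation plus database ID."""
    return f"{BASE}{notation}_{class_id}"


def uri_option3(notation, release):
    """Option 3: release stamp plus notation."""
    return f"{BASE}{release}/{notation}"


def resolve(notation, release):
    """Find which class a notation denoted in a given release."""
    target = RELEASES.index(release)
    found = None
    for start, class_id, caption in HISTORY[notation]:
        if RELEASES.index(start) <= target:
            found = (class_id, caption)
    if found is None:
        raise KeyError(f"{notation} not present in {release}")
    return found
```

Under Option 3, `uri_option3("22", "MRF99")` and `uri_option3("22", "MRF00")` are different URIs, and only a lookup such as `resolve` reveals that the first denotes "Bible" and the second "Religions originating in the Far East". Options 1 and 2 carry that distinction in the identifier itself.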

Page 33: Aida Slavic Managing KOS: Evolution of concepts and their representation

STANDARDS: LACKING APPROPRIATE SOLUTION

§  SKOS lacks a solution to represent historical data or to track historical changes, and one has to look for solutions in other ontology-type standards for representing vocabularies

§  Solution 1 (by A. Isaac): extending SKOS using DC Terms to model changes as isVersionOf and isReplacedBy relationships, introducing notation as a udcmrf:reference that can aggregate different concepts and, most importantly, allowing for the introduction of a concept into UDC (an empty node)

But it is not only about indicating the relationship; rather, it is about documenting the change. Hence a more complex model would be needed

§  Solution 2 (by C. Guéret): extending SKOS/MRF data with either

    §  event ontology (LODE http://linkedevents.org/ontology/)

    §  PROV ontology (provenance)

which would allow publishing/sharing information about what is actually happening with the class.

Page 34: Aida Slavic Managing KOS: Evolution of concepts and their representation

SOLUTION 1: TOWARDS UDC CONCEPT (A. Isaac)

[Diagram: nodes and properties]

udcmrf:reference/22
    skos:notation "22"^^udc:notation

udcmrf:22_17054
    skos:notation "22"^^udc:notation
    skos:prefLabel "Bible"@en

udcmrf:22_16544
    skos:notation "22"^^udc:notation
    skos:prefLabel "Far East religions"@en

udcmrf:299.1_15999
    skos:notation "299.1"^^udc:notation
    skos:prefLabel "Religions of Oriental Peoples"@en

udc:concept-FarEastReligion

Links between nodes: ore:aggregates (×2), dct:isReplacedBy (×2), dct:isVersionOf (×2)

Page 35: Aida Slavic Managing KOS: Evolution of concepts and their representation

SOLUTION 2: CLASS CHANGES AS AN EVENT (C. Guéret)

§  this would allow publishing/sharing all UDC classes that ever existed, with all data related to the class lifecycle as well as the various attributes relevant for automatic linking or replacement

§  Such an approach would have to be supported by an appropriate service model

§  Works with a URI that is based on a ‘release stamp’ and notation

udc:Creation rdfs:subClassOf lode:Event
udc:Replacement rdfs:subClassOf lode:Event
udc:Reuse rdfs:subClassOf lode:Event

to get something similar to the following:

udc:class/22 ical:hasEvent udc:event/1
udc:event/1 rdf:type udc:Creation
udc:event/1 lode:involved udc:release/MRF10
udc:event/1 rdfs:comment “new class”
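The event statements above can be read as plain subject-predicate-object triples. The sketch below stores them as Python tuples and retrieves the lifecycle events of a class; it uses no RDF library, writes the creation type as `udc:Creation` throughout, and the helper names `events_for` and `describe` are invented for illustration.

```python
# Triples transcribed from the slide as (subject, predicate, object) strings.
TRIPLES = [
    ("udc:Creation", "rdfs:subClassOf", "lode:Event"),
    ("udc:Replacement", "rdfs:subClassOf", "lode:Event"),
    ("udc:Reuse", "rdfs:subClassOf", "lode:Event"),
    ("udc:class/22", "ical:hasEvent", "udc:event/1"),
    ("udc:event/1", "rdf:type", "udc:Creation"),
    ("udc:event/1", "lode:involved", "udc:release/MRF10"),
    ("udc:event/1", "rdfs:comment", "new class"),
]


def events_for(udc_class):
    """List the events attached to a class via ical:hasEvent."""
    return [o for s, p, o in TRIPLES
            if s == udc_class and p == "ical:hasEvent"]


def describe(event):
    """Collect all properties of an event into a dictionary."""
    return {p: o for s, p, o in TRIPLES if s == event}


# Lifecycle of class 22 as (event type, release involved) pairs.
lifecycle = [
    (describe(e)["rdf:type"], describe(e)["lode:involved"])
    for e in events_for("udc:class/22")
]
```

Further events (a replacement or a reuse of notation 22) would simply be added as more triples, so the complete history of a class can be published without overwriting earlier statements.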

Page 36: Aida Slavic Managing KOS: Evolution of concepts and their representation

DATA NEED SHARING: NOTATION & CONCEPT HISTORY

§  Whenever UDC notation is re-used, e.g.
    §  notation used for: term describing concept for which the notation was previously used
    §  old concept moved to: ID of the class to which the concept was moved
    §  date of concept move
    §  source of concept move

§  Whenever a concept is moved from one class to another
    §  concept that moved: term representing concept
    §  concept previously at: ID of the class at which concept was before
    §  date of move
    §  source of move

Page 37: Aida Slavic Managing KOS: Evolution of concepts and their representation

DATA NEED SHARING: CANCELLATION

§  UDC number may be cancelled but its record and its ID stay permanently
    §  cancellation date (date of cancellation)
    §  cancellation source (issue of Extensions & Corrections in which this is published)
    §  replaced by: ID of the record to which the UDC number is redirected
    §  replacement type: controlled list of types, expressing what the cancelled number is replaced with: new class; colon combination; combination with common auxiliary; combination with special auxiliary; other
    §  replacement (semantic) alignment: controlled list: exact match, to broader, to narrower, approximation
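A minimal sketch of how the cancellation data listed above could be used to redirect an old number to its current record. The two controlled lists are taken from the slide; the record layout, the function `resolve_current` and the sample IDs, dates and sources are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ReplacementType(Enum):
    NEW_CLASS = "new class"
    COLON_COMBINATION = "colon combination"
    COMMON_AUXILIARY = "combination with common auxiliary"
    SPECIAL_AUXILIARY = "combination with special auxiliary"
    OTHER = "other"


class Alignment(Enum):
    EXACT = "exact match"
    BROADER = "to broader"
    NARROWER = "to narrower"
    APPROXIMATION = "approximation"


@dataclass(frozen=True)
class Cancellation:
    date: str                      # cancellation date
    source: str                    # issue in which the change was published
    replaced_by: int               # ID of the record redirected to
    replacement_type: ReplacementType
    alignment: Alignment


def resolve_current(class_id, cancellations):
    """Follow redirects from a cancelled record to the record now in use.

    Returns the final record ID and the alignment of every hop, so a caller
    can see whether the redirect chain preserved the meaning exactly.
    """
    seen = set()
    alignments = []
    while class_id in cancellations:
        if class_id in seen:
            raise ValueError(f"redirect loop at record {class_id}")
        seen.add(class_id)
        step = cancellations[class_id]
        alignments.append(step.alignment)
        class_id = step.replaced_by
    return class_id, alignments


# Invented sample data: record 1001 was cancelled twice over.
SAMPLE = {
    1001: Cancellation("date-1", "issue-A", 1002,
                       ReplacementType.NEW_CLASS, Alignment.EXACT),
    1002: Cancellation("date-2", "issue-B", 1003,
                       ReplacementType.COLON_COMBINATION, Alignment.BROADER),
}
```

Here `resolve_current(1001, SAMPLE)` ends at record 1003 and reports one exact and one broader hop, which tells a library that automatic replacement would lose some specificity.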

Page 38: Aida Slavic Managing KOS: Evolution of concepts and their representation

UDC LINKED DATA ARCHITECTURE WILL GET MORE COMPLICATED

Towards look-up service based on classification RDF triple store… (C. Guéret)

Page 39: Aida Slavic Managing KOS: Evolution of concepts and their representation

CONCLUDING REMARKS

§  UDC RDF triple store should contain all data necessary to resolve and interpret strings coming from library catalogues (including historical UDC data)

§  libraries should not need to worry about resolving the semantics of UDC codes

§  UDC linked data should be supported by a front-end service (number look-up/resolution service) – which would enable parsing, validating and resolving URI for UDC codes

Page 40: Aida Slavic Managing KOS: Evolution of concepts and their representation

CFP closes on 8th March http://seminar.udcc.org/2015/

THANK YOU