13
Noname manuscript No. (will be inserted by the editor) Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies Matias Frosterus · Jouni Tuominen · Sini Pessala · Eero Hyv¨ onen the date of receipt and acceptance should be inserted later Abstract Traditionally the structure of the controlled vo- cabularies used for annotation can be utilised for reasoning for information retrieval. However, this can be problematic when applied in the Linked Data context. Linked data typi- cally comes from different organisations and domains with mutually incompatible vocabularies without explicit links between them resulting in data silos. This paper argues that to solve the problem one has to transform the annotation vo- cabularies into a Linked Open Ontology cloud. We present a method for transforming a set of legacy thesauri into a cloud of interlinked ontologies while ensuring the validity of the transitive subclass relations and the means for main- taining the system when component ontologies are updated. Our approach has been used and evaluated in practice build- ing a cloud called KOKO of sixteen ontologies, with a total of 47,000 concepts. KOKO has been published as an ontol- ogy service and is in use in various organisations for both data indexing and semantic search. Keywords Light-weight Ontologies · Linked Open Ontology Cloud · Ontology change propagation · SKOS · Ontologisation · Cross-domain interoperability 1 Introduction Libraries, archives, museums, and other organisations have been using classifications and thesauri for content indexing (annotation) for a long time. There exist vast amounts of high-quality annotations describing vast amounts of docu- ments and other objects but these metadata descriptions are often divided into silos without machine-traversable links Semantic Computing Research Group (SeCo), Aalto University P.O. Box 15500, 00076 Aalto, Finland http://www.seco.tkk.fi/, firstname.lastname@aalto.fi between the values used in the metadata. Semantic interop- erability between heterogeneous cross-domain data reposi- tories of distributed content providers has become the crit- ical challenge for the Semantic Web and Linked Data [13]. Enabling different thesauri and vocabularies to link to one another in meaningful ways would allow different organisa- tions to benefit from each others’ work by enriching a com- mon pool of linked knowledge [15]. 1.1 Why a Linked Open Ontology Cloud? The Linked Data movement 1 has focused its efforts on build- ing cross-domain interoperability by creating and using (typ- ically) owl:sameAs mappings between the entities (e.g. places, persons) in the datasets of the Linked Open Data (LOD) cloud. However, when linking metadata, not only the data entities but also ontologies used in describing them need to be interlinked for interoperability. This calls for more refined ontology alignment techniques [8] maintaining the integrity of the concept hierarchies. This paper focuses on aligning light-weight domain on- tologies intended for metadata annotations. A light-weight ontology in our terminology is a hierarchy of concepts with subsumption, partitive, and associative relations like in a tra- ditional thesaurus [2], and can be represented using RDFS 2 , simple OWL 3 constructs, or SKOS 4 . The research hypothesis of this paper is that the LOD cloud could be complemented by developing one or more light-weight “Linked Open Ontology” (LOO) clouds. The idea of LOO is to provide a shared cross-domain ontology for data annotations based on a set of interlinked domain 1 http://linkeddata.org/ 2 http://www.w3.org/TR/rdf-schema/ 3 http://www.w3.org/standards/techs/owl#stds 4 http://www.w3.org/TR/skos-reference/

Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Noname manuscript No.(will be inserted by the editor)

Linked Open Ontology Cloud – Managing a System of InterlinkedCross-domain Light-weight Ontologies

Matias Frosterus · Jouni Tuominen · Sini Pessala · Eero Hyvonen

the date of receipt and acceptance should be inserted later

Abstract Traditionally the structure of the controlled vo-cabularies used for annotation can be utilised for reasoningfor information retrieval. However, this can be problematicwhen applied in the Linked Data context. Linked data typi-cally comes from different organisations and domains withmutually incompatible vocabularies without explicit linksbetween them resulting in data silos. This paper argues thatto solve the problem one has to transform the annotation vo-cabularies into a Linked Open Ontology cloud. We presenta method for transforming a set of legacy thesauri into acloud of interlinked ontologies while ensuring the validityof the transitive subclass relations and the means for main-taining the system when component ontologies are updated.Our approach has been used and evaluated in practice build-ing a cloud called KOKO of sixteen ontologies, with a totalof 47,000 concepts. KOKO has been published as an ontol-ogy service and is in use in various organisations for bothdata indexing and semantic search.

Keywords Light-weight Ontologies · Linked OpenOntology Cloud · Ontology change propagation · SKOS ·Ontologisation · Cross-domain interoperability

1 Introduction

Libraries, archives, museums, and other organisations havebeen using classifications and thesauri for content indexing(annotation) for a long time. There exist vast amounts ofhigh-quality annotations describing vast amounts of docu-ments and other objects but these metadata descriptions areoften divided into silos without machine-traversable links

Semantic Computing Research Group (SeCo), Aalto UniversityP.O. Box 15500, 00076 Aalto, Finlandhttp://www.seco.tkk.fi/, [email protected]

between the values used in the metadata. Semantic interop-erability between heterogeneous cross-domain data reposi-tories of distributed content providers has become the crit-ical challenge for the Semantic Web and Linked Data [13].Enabling different thesauri and vocabularies to link to oneanother in meaningful ways would allow different organisa-tions to benefit from each others’ work by enriching a com-mon pool of linked knowledge [15].

1.1 Why a Linked Open Ontology Cloud?

The Linked Data movement1 has focused its efforts on build-ing cross-domain interoperability by creating and using (typ-ically) owl:sameAs mappings between the entities (e.g.places, persons) in the datasets of the Linked Open Data(LOD) cloud. However, when linking metadata, not onlythe data entities but also ontologies used in describing themneed to be interlinked for interoperability. This calls for morerefined ontology alignment techniques [8] maintaining theintegrity of the concept hierarchies.

This paper focuses on aligning light-weight domain on-tologies intended for metadata annotations. A light-weightontology in our terminology is a hierarchy of concepts withsubsumption, partitive, and associative relations like in a tra-ditional thesaurus [2], and can be represented using RDFS2,simple OWL3 constructs, or SKOS4.

The research hypothesis of this paper is that the LODcloud could be complemented by developing one or morelight-weight “Linked Open Ontology” (LOO) clouds. Theidea of LOO is to provide a shared cross-domain ontologyfor data annotations based on a set of interlinked domain

1 http://linkeddata.org/2 http://www.w3.org/TR/rdf-schema/3 http://www.w3.org/standards/techs/owl#stds4 http://www.w3.org/TR/skos-reference/

Page 2: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

2 Matias Frosterus et al.

ontologies. This idea is also complementary with the idea of”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto eachother. In our implementation of the idea, we created the LOOcloud from light-weight ontologies based on existing the-sauri. If the resulting LOO cloud is then mapped to, e.g.,DBpedia, legacy data annotated using the original thesaurican be linked to the LOD cloud quite easily.

Developing a LOO cloud is in many ways different fromlinking entities in datasets or elements in metadata schemas.A major difference is that in LOO the linked structure is usedfor reasoning based on the hierarchical subclass relation, thebackbone of ontologies [31]. This fundamental task requiresspecial consideration at the ontology boundaries as other-wise cross-domain reasoning and ontology-based query anddocument expansion [3,17] in applications may fail [16].

For example, assume that the concept “Mirror” is presentin a given ontology A of daily utensils and has the subclass“Make-up-mirror”:

a:Make-up-mirror rdfs:subClassOf a:Mirror.In a related ontology B of furniture, the class Mirror is

used as a subclass of the class Furniture:b:Mirror rdfs:subClassOf b:Furniture.Without context, the concept of mirror looks like the

same in both ontologies, i.e.a:Mirror owl:sameAs b:Mirror.Reasoning and query expansion works fine in A and B

separately, but when using A and B linked together, expand-ing a search query for “furniture” would return falsely hand-held make-up mirrors in addition to pieces of furniture. Alarger context than the concept alone has to be consideredwhen linking ontologies.

There are also other difficulties specific to developing aLOO cloud. For example, the principle of dividing a sharedconcept X into subclasses in different ontologies may bedifferent. For example, “clothes” can be divided into sub-classes based on the gender or the age of their wearers. Then,from a human perspective, X may have a confusing mix-ture of subclasses in the linked ontology, hampering its usein user interfaces (e.g. as a search facet). Addressing issueslike these is hard to automate, and therefore LOO develop-ment in practice requires more coordinated collaboration be-tween the developers of linked ontologies than when linkingdatasets of instances. By collaboration, better quality linkscan be created and various critical issues of linked data qual-ity6 can be addressed. Coordinated collaboration also facili-tates larger scale ontology development, which prevents thecreation of interoperability problems, and minimises redun-dant ontology development work in overlapping areas of on-tologies.

5 http://lov.okfn.org/dataset/lov/6 Cf. e.g. http://pedantic-web.org/fops.html

Our approach therefore emphasises the systematic de-velopment process of a coherent aggregated ontology froma set of component ontologies that are maintained by differ-ent domain communities. This proactive idea of developinga larger ontology based on different domain ontologies [16]is different from traditional ontology mapping, where onetakes a set of existing, independently developed ontologiesand tries to map them together afterwards. In contrast tosimply mapping individual ontologies together we take themappings as an integral part of the ontologies. The ontolo-gies are considered both as individual entities but also asintegral parts of the cloud forming a greater whole. This as-pect is taken into account at all levels of the developmentand publication of the ontologies, thus leading to a differentprocess and underlying philosophy compared with the usualapproach of linking independently developed ontologies.

1.2 A National Effort in Building a LOO Cloud

In order to test and evaluate these hypotheses in practice, aLOO cloud called KOKO of cross-domain ontologies (e.g.health, cultural heritage, agriculture, seafaring, government,defence) has been realised in Finland on a national level dur-ing the FinnONTO research project (2003–2012). Variouslibraries, archives, museums, and governmental actors haveannotated vast amounts of documents, each with their ownthesauri. The aim of the KOKO cloud is to maintain back-wards compatibility with the existing annotations while al-lowing for better interoperability between different datasets.Thus, the system is based on transforming a set of legacythesauri in use into light-weight ontologies and interlinkingthem with each other. Currently, KOKO encompasses six-teen ontologised thesauri with more to be integrated into thesystem in the future. The current version of KOKO is a har-monised global ontology of some 47,000 concepts alignedinto a single hierarchy.

In the centre of the KOKO cloud is the General FinnishOntology YSO, based on the General Finnish Thesaurus YSA7,which has been developed since 1987 by the National Li-brary of Finland and is used across various organisations inFinland. YSO provides the upper hierarchy, originally in-spired by the ideas of “foundational ontologies”, such asDOLCE [9], and shared “upper ontologies”, such as IEEESUMO [27]. The idea is to provide the LOO cloud with thecommon upper concepts needed in many domains. YSO isthen complemented by more refined ontologies for specificdomains. These component ontologies are developed by theexperts responsible for the original thesauri preserving thedomain know-how while allowing for asynchronous updat-ing based on the resources of each organisation. From anend user’s point of view, KOKO ontology is seen and used

7 http://vesa.lib.helsinki.fi/ysa/

Page 3: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 3

as a single ontology without boundaries; for the ontologydevelopers, domain boundaries are needed in order to di-vide the responsibilities of distributed ontology work basedon domain expertise needed in different parts of the KOKOcloud.

KOKO ontology was originally published in the ontol-ogy library system ONKI8 [36,37] operating as a living labresearch environment. Over the years ONKI has been in-tegrated into, e.g. several museums, libraries, and web por-tals. In 2013 the National Library of Finland launched a jointproject with the Ministry of Finance and the Ministry of Ed-ucation and Culture to build a permanent, national ontologyservice Finto9 based on the ONKI system [35]. The NationalLibrary also took on the responsibility of coordinating fur-ther development of the KOKO cloud. Finto ontology ser-vice provides a centralised publication channel for the on-tologies with common interfaces for accessing them in vari-ous applications.

1.3 Challenges in Developing a Linked Ontology Cloud

Several issues need to be taken into account when mov-ing from developing individual thesauri into developing andmaintaining a cloud of interlinked ontologies. In this paper,the following challenges are discussed.

– Creation of ontologies. How is an existing legacy the-saurus transformed into an ontology? How is a new on-tology mapped into the linked ontology system? Howare the URIs of the concepts formed?

– Development of ontologies. How are the overlappingparts of ontologies recognised in order to minimise theduplicate work of the ontology developers? How are thechanges in one ontology communicated to other relatedontologies? How are the errors and other quality issuesrecognised in a system of several ontologies?

– Publication of ontologies. How should the linked ontol-ogy system be presented to the end user in order to makeit comprehensible? What kind of services for using theontologies are needed for different user groups?

1.4 Structure of the Paper

This paper is divided into four main parts. Section 2 presentsa model for creating and updating a linked ontology cloudand lists a set of seven principles guiding the process. Thefollowing Sections 3–5 describe the development cycle ofthe cloud in more detail with an emphasis on how a sin-gle domain ontology is handled by the process. Throughout,the KOKO cloud provides an illustrative use case on how

8 http://onki.fi/9 http://finto.fi/en/

the process has been applied to practice. Since the tradi-tional, comparative evaluation of the process is difficult, theextensive application to practice has been used as a proofof concept with the process being adjusted based on real-life experiences in accordance with the principles of actionresearch [5]. The main user groups for this have been theontology developers on the one hand, and the systems thatKOKO has been integrated into on the other hand. In the fi-nal Section 6, related work is presented and the contributionsof the paper are summarised.

2 A Model for Managing a Linked Ontology Cloud

Our ontology development work started by a field study onhow thesauri in use are actually developed. The result wasthat thesauri are typically developed by independent expertgroups focusing on concepts in their own domains of inter-est, with little collaboration between the groups. The situa-tion seems to be more or less similar in other countries, too.Obviously, this model leads to redundant work in develop-ing overlapping areas of thesauri and, at the same time, to in-teroperability problems between the thesauri, since differentparties define their concepts without considering each oth-ers’ choices. To address these problems we propose a morecoordinated collaborative model for developing a linked on-tology cloud, which is depicted in Fig. 1. Note that in thefollowing discussion the bolded numbers in parentheses re-fer to the numbered parts in the figure.

2.1 Ontology Creation Phase

First, existing thesauri (1) and ontologies (2) are selectedfor building blocks of the ontology cloud. A thesaurus isconverted into RDF format using a shared ontology schema(3) and aligned with a general upper ontology (GUO) (4).Aligning domain ontologies with a GUO forms the basis forinteroperability by providing a complete concept hierarchyand is much easier to maintain than direct, pair-wise map-pings between domain ontologies [12]. This idea was sug-gested, e.g. by the IEEE Standard Upper Ontology (SUO)10

working group.The alignment can be done in a semi-automatic fash-

ion by first generating equivalency mappings automaticallyand then correcting them manually. In addition to equiva-lency mappings, subsumption and partitive relations mightbe used in the case of light-weight ontologies. For exam-ple, the concept “antique furniture” in a museum domainontology may be aligned with the concept “furniture” in theupper ontology using a subclass-of relation. In order to cre-ate a complete, fully connected linked ontology, all concepts

10 http://suo.ieee.org/

Page 4: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

4 Matias Frosterus et al.

of a domain ontology should be mapped to the GUO eitherdirectly or through other concepts in the domain ontology.The transformation is discussed in more detail in [16]. Inour case study, a natural basis for a GUO was the GeneralFinnish Thesaurus YSA that was transformed into the Gen-eral Finnish Ontology YSO. already as a reference in manyspecialised thesauri.

Fig. 1 The model of Linked Ontology Cloud formation and manage-ment

2.2 Ontology Integration Phase

The ontology integration phase begins after the domain on-tologies have been aligned with the GUO (5). There maybe mutually overlapping parts between the domain ontolo-gies because the alignments were made between the domainontologies and the GUO only. To facilitate the integration ofdomain ontologies (DO) (6), processes, and tools for discov-ering overlapping parts of the ontologies are needed. Basedon the analysis, it is possible to eliminate redundant devel-opment work by deciding between domain ontology devel-opers which ontologies maintain which overlapping parts.

Changes in the GUO create pressure for the domain on-tologies to be updated accordingly to ensure the consistencyof the cloud (7). For example, in our case study system,some 2,000 changes are made annually in the upper ontol-ogy YSO. The changes should be taken into account in thedevelopment process of the domain ontologies. Fortunately,not all changes in the GUO are relevant to all domain on-

tologies, but only those related to them via equivalency, sub-sumption or partitive relations.

2.3 Cloud Publication Phase

When a development cycle of an ontology cloud has beencompleted, its logical consistency and other quality aspectsshould be validated (8) making sure that the resulting ontol-ogy adheres to the constraints of the properties and classesused. Some of the problems encountered can be fixed au-tomatically [34] (e.g. overlaps in disjoint semantic relationsand cycles in the concept hierarchy) but they should nonethe-less be communicated to the developers. However, automaticvalidation has difficulty in finding problems on the semanticlevel, which usually requires manual checking. Finally, theontology cloud can be published as services for humans andmachines, e.g. via user interfaces, APIs, and downloadablefiles (9).

2.4 Principles for Building a Linked Ontology Cloud

In our case study, the KOKO cloud based on the sixteenontologised thesauri presented in Table 1 was created. Be-low, the lessons learned during the work are summarised asa seven-point list of practical building principles. We con-sider the proposed principles novel, as we are not aware ofprevious ontology design patterns focused on managing anontology cloud as a whole.

I The ontology cloud consists of one general upper on-tology and several domain ontologies that are linked tothe upper ontology with subsumption, equivalency, as-sociative and partitive relations.Reason: This means that the domain ontologies do nothave to be linked to each other pairwise, as the upperontology acts as semantic glue for joining all the on-tologies together. Shared concepts are included in theupper ontology. The idea is to minimise the links be-tween the domain ontologies, which simplifies their de-velopment since a given developer needs only concernherself with her own domain ontology and the GUO.

II Every concept in a domain ontology has a subsumptionor equivalency relation to a concept in the GUO or asubsumption relation to a concept in the same domainontology.Reason: This means that every concept in a domain on-tology needs to be able to trace a subsumption relationto a concept in the general upper ontology. This en-sures a consistent concept hierarchy for the whole cloudand that domain ontologies can not define new top-levelconcepts.

Page 5: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 5

Name of theontology

Domain Number ofconcepts

YSO General upper ontology(GUO)

27,200

AFO Agriculture and forestry 7,000

JUHO Government 6,300

KAUNO Literature 5,000

KITO Literary research 850

KTO Linguistics 900

KULO Cultural research 1,500

LIITO Economics 3,000

MAO Museum artefacts 6,800

MERO Seafaring 1,300

MUSO Music 1,000

PUHO Military 2,000

TAO Design 3,000

TERO Health 6,500

TSR Working and employment 5,100

VALO Photography 2,000

Table 1 The ontologies comprising the LOO cloud KOKO

III If a concept in a domain ontology has an equivalency toa concept in the GUO, it may not have broader conceptsin the domain ontology, which lack an equivalency re-lation to a concept in the GUO.Reason: This is needed to avoid having dependenciesfrom the GUO towards a domain ontology by forbid-ding domain ontologies from introducing broader con-cepts to concepts in the GUO (through inference overequivalency relation). Otherwise domain ontology de-velopers could propagate contradictions or unwantedconcept (re)definitions into the GUO, especially in caseswhere more than one domain ontology is involved.

IV The domain ontologies are focused on a clearly boundeddomain and are as self-contained as possible.Reason: This allows the domain ontology developer toconcentrate on the area of her expertise. This also min-imises dependencies between domain ontologies andfacilitates ontology development work.

V A concept in a domain ontology may not have an equiv-alency to a concept in another domain ontology. A con-cept in a domain ontology may have an associative re-lation or have a broader concept in another domain on-tology at the discretion of the developers.Reason: Dependencies between domain ontologies can-not always be avoided due to inherent relations acrossthe borders of different domains. This means that thedevelopers need to monitor the changes to the other do-main ontology but this is allowed since it does not affect

the other domain ontology directly. The use of broaderand associative relations extends the target domain on-tology but does not affect its semantics (strongly), whereasthe equivalency relation would possibly introduce newbroader concepts to concepts in the target ontology andthus redefine their semantics.

VI The GUO and the domain ontologies use a shared on-tology schema and are based on domain-independentstandards.Reason: Thus the ontologies can be easily merged andused together with various data sources outside of thosedirectly linked to the cloud, using standard software,e.g. SKOS or RDFS/OWL tools.

VII The resulting ontology cloud should be logically con-sistent, e.g. by ensuring the integrity of the concept sub-sumption hierarchy over ontology boundaries (since tran-sitivity is assumed).Reason: This allows for reasoning and query expansionover the whole cloud.

3 Creating a Thesaurus-based Ontology for the LinkedOntology Cloud

The creation of a cloud of linked ontologies begins withthe formation of the general upper ontology. This ontologyprovides a completely connected hierarchy of general con-cepts including the topmost division, e.g. between abstract,endurant and perdurant concepts [9]. This forms the basicstructure for the domain ontologies to map into, thus savingthem from having to repeat the higher parts of the hierarchy.

Fig. 2 Ontology Creation Phase

Fig. 2 depicts the process of creating a new domain on-tology for the cloud of linked ontologies, which begins withthe conversion (3) of the domain thesaurus (2) from a legacy

Page 6: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

6 Matias Frosterus et al.

format into RDF, typically SKOS. This is often a straight-forward operation but in some cases the original thesauruscan include relations that are not easy to convert to SKOS.In these cases it can be a good idea to retain the original re-lations in the form of a temporary predicate which can thenbe harmonised in the publication phase of the process.

In the FinnONTO case, we used ad-hoc scripts for thetransformation since the domain ontologies were in differ-ent formats. We also transformed them originally into a cus-tom OWL-based format since at the beginning of the projectSKOS had not yet been established as a standard way of rep-resenting light-weight RDF ontologies. We continued thispractice in part because of existing tools, such as Protege11,but also because some thesauri used relations not present inSKOS. An example of this was the Agriforest12 thesaurus ofagriculture and forestry, which uses different relations to dif-ferentiate between names of concepts that were derived fromdifferent sources. These the were preserved for the benefit ofthe ontology developers but were then combined into a com-mon predicate for publication.

The next step of the process is the automatic mapping(4) to the GUO (1). This can be done roughly through stringcomparison between labels or, alternatively, by also utilisingthe structure of the ontologies, depending on the nature ofthe original thesaurus. Since different thesauri may have dif-ferent labelling conventions (for example plural vs. singularforms) some sort of stemming or lemmatisation may also beneeded for the label comparison. The result is a preliminarymapping which then needs to be checked manually in thenext part (5 and 6) by a domain expert ontologist since theaim is to produce as good and reliable a result as possible.Aside from checking the mapping, the human developmentpart also entails the ontologisation work proper, which needsto be done when changing vaguely defined terms into moreprecise ontology concepts [16,21]. In many cases, a term inthe original thesaurus becomes several concepts in the on-tology depending on whether the term has multiple mean-ings. When a single term corresponds to several concepts,we have added a qualifier in parentheses to the preferred la-bel for each concept in order to make it easier to differentiatebetween them. For example, the ambiguous keyword ”child”could be split into concepts ”child (age)” and ”child (familyrelation)”.

In the FinnONTO project, we did the preliminary map-ping between a domain ontology and the GUO by label com-parison using lemmatisation and custom scripts, while mostof the actual ontologisation work was done by the expertsfrom the organisations that maintained the thesauri. Nowthere exist specialised ontology matching tools, such as Agree-

11 http://protege.stanford.edu/12 http://www-db.helsinki.fi/agri/agrisanasto/Welcome eng.html

mentMakerLight13, which could be utilised for the initialmapping.

Finally, the URI scheme of the ontology needs to be con-sidered. A good starting point is the URI design principlespresented by W3C in Cool URIs for the Semantic Web14.Human-readable meaning-bearing URIs pose difficulties inthat they are language-dependent and, most importantly, inthe case where further development ends up changing thelabel of the concept, the persistent URI can not keep upwith the changes, thus gradually leading to the degenera-tion of the mapping between the labels and URIs. Since weare building on thesauri that have been in active use and de-velopment for long, we must also consider the long term ef-fects of our current decisions. Therefore, we decided to uselanguage- and meaning-neutral URIs where the local nameconsisting of a letter and a string of numbers (e.g. p3612).

Since ontologies evolve with new concepts and URIsintroduced and old ones possibly deleted, a system (7) isneeded for tracking the use of the URIs. In FinnONTO, wecreated our own tool for this called Purify15. Purify har-monises all local names in a given namespace to the letterfollowed by a number format. The tool keeps a log of themappings between the possible temporary URIs used duringthe early stages of the development since human-readableURIs were found to be useful at the beginning of the ontol-ogisation. Finally, Purify makes sure that no two resourcesget the same URI and that if the URI generation needs to berepeated, the result will be the same every time.

With the first version of a domain ontology completed(8) and successfully mapped to the GUO, the next step is tointegrate it into the ontology cloud proper. Table 1 lists allthe ontologies completed for our LOO cloud KOKO at themoment. The name of the ontology is followed by a descrip-tion of its domain and the number of concepts.

4 Maintaining a Cloud of Interlinked Ontologies

There are two main concerns when integrating and manag-ing domain ontologies in a cloud: how to manage the domainontologies with respect to one another and how to managetheir relation to GUO. This phase is depicted in Fig. 3, con-tinuing from Fig. 2, and is explained in depth below.

4.1 Avoiding Overlap Between Domain Ontologies

The Principles IV and V presented in Section 2 posit that inorder to keep relations between domain ontologies to a min-imum, the domains covered need to be precisely set. With

13 https://github.com/AgreementMakerLight14 http://www.w3.org/TR/cooluris/15 http://puri.onki.fi/info/

Page 7: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 7

Fig. 3 Ontology Integration Phase

minimal links between domain ontologies, they can be de-veloped independently from one another, but it also meansthat the curation of the ontological domains is an on-goingprocess. Overlaps can come into being especially in the bor-der areas of two domains where a given set of concepts cannot be unequivocally said to belong to either one of two dif-ferent ontology domains.

In Fig. 3, we can see that the process of discovering theseoverlaps between a given domain ontology (1) and the restof the domain ontologies in the cloud (3) makes use of atool (4) to help the domain ontology developers in findingand even analysing the overlaps. This tool can be based onstring matching similar labels and reporting on the poten-tially overlapping concepts, but it can also make use of theontological structure and relations to find overlapping con-cepts [8].

Once an overlap between two or more domain ontolo-gies has been discovered, a dialogue (5) should be startedbetween the developers of those ontologies. Its possible endresults are as follows:

a) The concepts that overlap are really the same concept andthe developers of the domain ontologies and the GUOcan agree that the concept is general enough to be in-cluded in the GUO. In this case, the concept can be re-moved from the domain ontologies.

b) The concepts that overlap are really the same concept butthe developers can agree that the concept is not neededin both (or all) of the domain ontologies and agree on asingle domain ontology that should host the concept inquestion and handle changes and development further onthat concept and its subconcepts. This is most commonin situations where the concept is clearly in the domainof one ontology and only included in the other due tohistorical reasons, for example.

c) The concepts that overlap are really the same conceptand the ontology developers wish for it to remain presentin both (or all) of the ontologies. A note should be madethat if the concept is changed or developed further, theother developers should be informed as needed.

d) The concepts that overlap are actually different conceptsbut might share a preferred label. In this case, the labelsshould be differentiated from one another if possible soas to avoid confusion when the ontologies are used to-gether.

Great care should be exercised when choosing option c)since it clashes with Principle V and can easily lead to in-creasing complexity in development. This should be avoidedas much as possible, since one of the main goals of the pre-sented system is to make the asynchronous development ofontologies possible and having the same concept in two do-main ontologies means that both sets of developers need toagree on possible further development.

In our case study, we implemented a tool called KOAN,based on simple label matching, for finding ontology over-laps since the light-weight ontologies do not offer much struc-ture to use as a basis for the discovery task. Furthermore,identical labels pose the most difficulty for the annotatorsusing the ontology since they would need to decide betweendifferent concepts with the same names based on their placein the hierarchy. Having concepts with different labels thatend up being the same is much less of a problem since, whenthe mistake is discovered, it is relatively easy to combine theannotations made using the duplicated concepts.

When applied to KOKO, KOAN found lots of overlap-ping concepts even between ontologies of seemingly verydistinct domains. Table 2 shows some of the comparison re-sults between ontologies by showing the per cent amount ofoverlap. For instance, from the first row, second cell, we cansee that the agriculture and forestry domain ontology AFOcontains 8 % of the concepts of the government domain on-tology JUHO. Comparing with Table 1 we can see that evenontologies from seemingly very distinct domains can havea lot of overlap between them. Maintaining the overlaps inmultiple places at the same time creates a lot of unnecessarywork and can lead to inconsistencies in the transitive rela-tions. Our aim is to implement a systematic process for theelimination of these overlaps.

In addition to overlapping concepts, a domain ontologymay have a concept with an associative relation or a broaderconcept in another domain ontology because cross-domainrelations cannot always be avoided. For example, in our usecase, the museum domain ontology MAO contains the con-cept ”catapults”, which is a subclass of the concept “weapons”in GUO. On other hand, the military domain ontology PUHOhas the concept “single-shot weapons”, which could be usedas the superclass of “catapults” for more refined semantics.However, using relations between domain ontologies may

Page 8: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

8 Matias Frosterus et al.

Ontology AFO JUHO KAUNO MERO TERO

AFO 8% 2% 3% 25%

JUHO 7% 16% 5% 40%

KAUNO 2% 12% 1% 28%

MERO 0% 1% 0% 2%

TERO 23% 41% 36% 13%

Table 2 Number of overlapping concepts in five domain ontologies ofKOKO

be a justification of moving such cross-domain concepts (“single-shot weapons” in the example) into GUO in order to min-imise dependencies between domain ontologies.

4.2 Keeping the Domain Ontologies up to Date RegardingGUO

The second part of the cloud management process, as de-picted on the right hand side of Fig. 3, is the handling of thechanges in the upper ontology. As an implication of Prin-ciple II, the domain ontologies react to the changes in theupper ontology. In other words, when the GUO developersrelease a new version (6), the domain ontologies need to bepotentially updated.

The structure of the cloud aims to allow for the asyn-chronous updating of the domain ontologies, which can re-sult in long intervals between the updates of a given domainontology. Additionally, the ontologies are often developed indifferent organisations, and using separate ontology editorscreates a challenge in how to communicate the changes be-tween the developers. The two main approaches to convey-ing the changes are push and pull synchronisation [6]. Thepush version propagates the changes in one ontology imme-diately to the other ontologies, whereas in the pull versionthe change listing is requested when needed by the ontologydeveloper. The push approach is good for situations wherethe ontologies are updated frequently, so that the ontologydeveloper can quickly ensure the consistency of the ontol-ogy after the changes. However, this approach is challeng-ing if the ontology reacting to changes is update infrequentlydue to, e.g., lack of resources. Then it would be preferablethat the changes could be acquired when needed and notpropagated immediately. A long update interval also meansthat the amount of changes can build up over time and whenthe update process of the domain ontology is started, thenumber of changes that need to be checked can be in thethousands.

In order to ease the work of the domain ontology devel-oper, a tool (7) needs to be used for propagating the changes.It would also be beneficial to order or categorise the changesof the GUO somehow according to their relevance to the do-

main ontology in question. A set of criteria for estimatingrelevance should take into account the differences in ontolo-gies and the relations between them as well as the likelychanges that are going to occur in development.

In the FinnONTO project, we created a change propaga-tion tool MUTU and a set of accompanying relevance crite-ria as described in [29]. The basic idea is that a change in theGUO is likely to be relevant to domain ontology developersif it concerns a concept that has been directly linked to fromthe domain ontology (a connecting concept). If a concept inthe GUO has been marked as equivalent or as a superclass toa concept in the domain ontology, any change to it is likelyto be of interest to the domain ontology developers. Further-more, if the concept hierarchy of an ancestor changes for aconnecting concept in GUO, this is likely to be relevant tothe domain ontology developer due to the transitive natureof the relation. If a concept is removed, rare though that is,that is deemed as interesting due to the fact that this con-cept could have been in use by the domain ontology usersbut has not been duplicated to the ontology itself. Finally, ifa concept has been added that has the same label as a con-cept already existing in the domain ontology, this is alwaysrelevant.

MUTU simply finds out the changes between the newversion of the GUO and the one that was used for the map-ping of the domain ontology, lists these changes accordingto the type of the change and relevance, and adds helperclasses to the development version of the domain ontologyfor grouping the concepts in the ontology editor suite thatneed to be checked by the developer (8). MUTU also allowsthe domain ontology developer to configure it according toher needs based on, e.g., a specific priority of languages oron blocking changes from certain properties deemed irrele-vant to the domain ontology development.

5 Publishing a Linked Open Ontology Cloud

Once the individual component ontologies have been devel-oped and the links between them have been curated, the on-tology cloud needs to be published. The aim is to providethe users with a single, unified whole that can be used es-sentially as a single ontology like any other. The processof publishing a Linked Open Ontology Cloud is depicted inFig. 4.

5.1 Validation, Merging and Publication

The publication phase starts when a new version of a do-main ontology (1) is ready and the developer wants to pub-lish it in the LOO cloud. The first step is to clean up theontology for publication (2), by removing structures needed

Page 9: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 9

only in the development of the ontology. For example, tem-porary concepts that are used for grouping concepts that areunder development and editorial notes for internal use bythe ontology developers can be removed. Similarly, tempo-rary ontology-specific predicates should be converted to thecommon schema.

Fig. 4 Cloud Publication Phase

In the FinnONTO case, the domain ontologies were de-veloped in a Protege project with the general upper ontologyincluded. To help the ontology developer to focus on theconcepts of the domain ontology, the topmost concepts ofthe domain ontology (the ones that do not have subclass-ofrelation to a concept in the domain ontology) are connectedwith subclass-of relation to a temporary concept acting as aroot concept for the ontology. This root concept is removedin the clean up process as the goal is to present the resultingontology cloud to the end users as a single complete hierar-chy with a single root concept.

Before the new domain ontology version is merged intothe cloud it is validated (5) for compliance with Principle VII,e.g. by checking the logical consistency and spotting the vi-olations of best practices. The results of the validation arecommunicated to the ontology developer who can then fixthe problems. A validator may also fix some of the prob-lems automatically, e.g. by removing cycles in the concepthierarchy.

For the validation of the ontologies we are using theSkosify tool [34], which converts RDFS/OWL/SKOS vo-cabularies into proper SKOS format. Overlaps in disjoint se-mantic relations (skos:related and skos:broaderTransitive)are checked and inconsistencies are corrected automaticallyby removing skos:related relations in problematic cases.In cases where a concept has more than one skos:prefLabel

per language one of the labels is arbitrarily selected for useas skos:prefLabel while the others are simply convertedinto skos:altLabels. A check is performed in order to de-tect overlaps in disjoint label properties (skos:prefLabel,skos:altLabel, and skos:hiddenLabel) and the super-fluous ones are removed. As a best practice, the cycles inthe concept hierarchy are removed, and finally extra whites-paces are removed from concept labels. The violations arealso reported to the ontology developer so he may fix theproblems in a controlled way instead of relying on the auto-matic fixing procedure. We also used Skosify to convert theontologies from the custom OWL format to SKOS for pub-lication. SKOS was chosen due to the amount of tools avail-able and for its suitability for our use case of light-weightontologies.

After the validation, the new version of the domain on-tology is merged (6) with the GUO (3) and the other domainontologies (4). The idea is to build a single representationof the linked ontology cloud (7) by merging equivalent con-cepts and giving them a single URI. The differing ontologyschemas are harmonised so that the end user does not getconfused with the ontology-specific structures and namingconventions.

When annotating using the linked ontology cloud, theannotator should be making choices between concepts asopposed to ontologies. To this end, the cloud is merged (6)into a single whole (7) by taking all the concepts linkedwith skos:exactMatch relations between them (marked asequivalent) and combining them. In case a clearer divisionbetween the development version of individual ontologiesand the published version of the cohesive cloud needs to bemade, new URIs can also be assigned to the concepts of thecloud. Here care needs to be taken in order to ensure that thesame concepts always map to the same URIs even in caseswhere that concept might at first originate from the upperontology and later have been moved to a domain ontologyor vice versa.

In FinnONTO, we gave all the concepts in the KOKOcloud new URIs in a new namespace. The combination ofthe ontologies listed in Table 1 resulted in a combined cloudof some 47,000 concepts. In order to allow the end user toselect only from a single domain, we assigned domain ontol-ogy specific concept types as subclasses of skos:Concept.

Once the ontology cloud is merged, it is ready for pub-lication (8). In accordance with the “open” part of the nameLOO, the cloud is published as Linked Data [13], so that in-dividual concept URIs can be referenced from various datasetsand URIs are resolvable to information about the conceptsin human- and machine-readable forms. The LOO cloud ispublished for humans via user interfaces for searching, brows-ing and visualising the ontologies, and as widgets for inte-grating ontologies into applications. For machine use, addi-tional REST and Web Service APIs are provided to facilitate

Page 10: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

10 Matias Frosterus et al.

even deeper integration of the ontologies. Finally, the ontol-ogy cloud is published as a SPARQL endpoint enabling adhoc queries for more complex needs.

These services were provided by the ONKI OntologyService [37], part of which was further developed by theNational Library of Finland in the form of Finto thesaurusand ontology service. These services act as repositories forvocabularies, thesauri, and ontologies, providing support forpublishing, finding and accessing them. The main user groupsof ONKI and Finto are ontology developers, content index-ers and information searchers. Ontology developers need away to visualise the ontology they are developing, and es-pecially in the context of the LOO cloud, where the domainontology developers map their ontologies only to the gen-eral upper ontology, there is a need for getting an overviewof the whole cloud. This way the developers can see howtheir domain ontologies are situated in the LOO cloud anddiscover possible overlaps with other domain ontologies.

Content indexers and information searchers need waysof finding ontologies and concepts suited for their needs.The Linked Open Ontology Cloud approach eliminates theneed for finding ontologies and making selections betweenthem, as one ontology system covering all domains of lifeaims to fulfill the needs of everyone. For finding suitableconcepts, ONKI/Finto service provides an ontology browserwhich visualises the ontologies as a tree hierarchy and showsother relations between concepts. The user may use auto-completion search for finding concepts based on their labels.In addition to dereferenceable URIs, machines are servedwith REST API, and a SPARQL endpoint16. The generalontology service software powering Finto is developed cur-rently by the National Library of Finland as an open sourceproject Skosmos17 and can be freely used to set up a similarservice. The ontology cloud is published under a permissiveCreative Commons Attribution license18.

5.2 Use Cases

During the FinnONTO research project, the adoption of KO-KO was hampered by the uncertainty regarding its future.With the ontology service project of the National Libraryof Finland, the development and publication of KOKO wasgiven sustainable governmental resources thus securing itsfuture. However, due to the length of the process of securingfunding, the wide-spread adoption of KOKO is only begin-ning.

KOKO and ONKI/Finto have been in daily use in manymuseums, such as the Espoo City Museum, where it has

16 http://api.finto.fi/sparql; The KOKO cloud is available in thenamed graph http://www.yso.fi/onto/koko/.

17 http://github.com/NatLibFi/Skosmos/18 http://creativecommons.org/licenses/by/3.0/

been integrated into the collection management system Kauko.The first large scale system using KOKO was the seman-tic cultural heritage portal CultureSampo19 [25] aggregatingcontents from various data sources. KOKO is a basis of thedeployed BookSampo20 [24] literature portal of the Finnishpublic libraries with some 60,000 monthly end users on theWeb.

At the moment, Finnish archives are integrating KOKOontologies via Finto into their common search registry ser-vice for metadata on both digital as well as conventionaldocuments with the AHAA project [19]. A recent companypilot user of KOKO has been the Swedish language divi-sion of the national public service broadcasting company ofFinland, Svenska YLE. They have used KOKO in the anno-tation of their web content and have complemented it withFreebase21 for instance references, such as people and or-ganisations. The pilot has been a success and they are con-sidering adopting the system in the Finnish part of YLE aswell as their archive.

The multi-disciplinary nature of KOKO means that itis especially suited to organisations that deal with cross-domain contents such as media organisations, museums, li-braries, and archives. Furthermore, using the concepts fromdomain ontologies links the content to more in-depth datafrom the specialist organisations that originally developedthe domain ontologies for their data annotation needs.

6 Conclusions and Discussion

We described a process for creating and managing a LinkedOpen Ontology Cloud, based on existing legacy thesauri. Toconclude, we compare our approach to related work, sum-marise our contributions, and suggest future work.

6.1 Related Work

Our work on linking ontological concepts used in annota-tions complements the idea of the Linked Open Data cloud,where emphasis is on data entity linking using (typically)owl:sameAs properties [11]. We also complement the workon Linked Open Vocabularies (LOV)22, which focuses onmetadata schema linking based on property hierarchies ratherthan linking domain ontologies. Linking the data throughontologies allows additional interoperability due to the in-ferenced knowledge gained through the shared ontology se-mantics [14]. Our approach follows this principle and pro-vides a framework for interlinked ontology cloud develop-

19 http://www.kulttuurisampo.fi/20 http://www.kirjasampo.fi/21 https://www.freebase.com/22 http://lov.okfn.org/dataset/lov/

Page 11: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 11

ment. Backwards compatibility with existing annotations isretained by basing the ontologies on legacy thesauri in use.

Another example of highly linked ontologies can be foundin BioPortal23, an ontology repository that has features sup-porting collaborative (inter-)ontology development. In addi-tion to uploading new ontologies to BioPortal, users can alsocreate and upload mappings between the concepts of dif-ferent ontologies [28]. The mappings can be used to bridgeoverlapping ontologies or to extend general ontologies withspecialised ones. Users can comment and create discussionthreads on ontologies, their parts, and mappings, thus sup-porting a collaborative and open inter-ontology developmentprocess. However, a set of mapped ontologies does not ap-pear as a single whole to the user though one can browsethe ontologies by following the mappings between concepts.Moreover, the mappings are not utilised in the ontology de-velopment process to propagate the changes of an (upper)ontology to ontologies extending it, as they are in our LOOmodel.

In order to move from a group of ontologies into a co-herent ontology system, the ontologies need to be recon-ciled. Ontology reconciliation [12] is a broad term, cover-ing ontology merging, alignment, and integration. Most ofthe reconciliation methods are automatic or semi-automatic,which can lead to lower quality, especially if the ontologieswere originally expert-made [7]. Our approach emphasisesthe importance of manual development work in the recon-ciliation process.

In ontology modularisation [1], ontologies are dividedinto smaller interlinked parts to facilitate distributed devel-opment and re-use. Our approach merges ontologies basedon existing thesauri to form a Linked Open Ontology Cloud,while the development of the individual ontologies contin-ues in a modularised way. Furthermore, we present the re-sulting interlinked cloud as single whole so that the endusers do not have to make selections between ontologies.

There have been several previous efforts on building ageneral upper ontology [26] that can be used as a founda-tional basis for domain ontologies. Some of the upper on-tologies have been developed from scratch, while, e.g. theSuggested Upper Merged Ontology SUMO [27] was cre-ated by merging existing ontologies. We used the GeneralFinnish Ontology YSO, transformed from an existing gen-eral thesaurus YSA, as the upper ontology in our LinkedOpen Ontology Cloud. In contrast to the previous work onupper ontologies, which focuses on the creation of an upperontology, our work emphasises the model for managing thedomain ontologies as part of the ontology cloud and keep-ing them up to date and synchronised with the changes ofthe upper ontology.

Related to the principles of forming and managing theontology cloud presented in this paper, similar guidelines

23 http://bioportal.bioontology.org/

have been presented in the context of the OBO Foundryinitiative [30]. However, in OBO Foundry the focus is oncoordinating the development of different ontologies undershared principles, but not on merging them into a single on-tology cloud and the challenges therein. General, domain-independent ontology design principles have been proposedby several researchers [10,38]. Our model uses their ideas asa foundation, e.g. by organising orthogonal concept domainsinto separate ontologies and supporting ontologies of differ-ent granularity levels in the form of a GUO and the domainontologies extending it. The principles presented in this pa-per extend the general ontology design principles by cover-ing modelling issues related to the management of changesin a linked ontology cloud.

Keeping domain ontologies up to date with the changesof the GUO is closely related to the topic of ontology evo-lution, which concentrates on addressing the changes in on-tologies over time. Different change types have been listedin [33], whereas [20] considers more abstract change pat-terns constructed from atomic changes. According to [4]users would have liked to see explicitly when a concept cre-ated by them had been modified, indicating that not all thechanges occurring in an ontology are equally relevant to alldevelopers providing the impetus for our work on buildinga set of priorities for different changes. The detection ofchanges in an updated ontology can be done using logs [32]or by comparing two versions of the ontology [22], whichwas the approach we chose. Additionally, the extra chal-lenges of distributed ontology development have been ad-dressed in [20,23,18]. In our approach there is no assump-tion that all the ontology developers use the same ontologyeditor, as the support tools are implemented separately fromthe ontology development suite.

6.2 Contributions and Future Work

We presented the idea of the Linked Open Ontologies (LOO)for fostering data integration. LOO complements dataset en-tity linking in LOD and metadata schema linking in LOV.The realisation of the LOO cloud was achieved by facili-tating an environment of interconnected but easily manage-able set of cross-domain ontologies allowing for distributedontology development. This can be more efficient since themappings between ontologies can be re-used for differentdatasets.

Furthermore, we presented a detailed description of man-aging the overlaps between domain ontologies and the in-consistencies resulting from the asynchronous updating ofthe GUO compared with the domain ontologies. Finally, weimplemented a LOO in practice with sixteen ontologies anda full set of tools for the development cycle from a set ofthesauri to a fully realised Linked Open Ontology Cloud.Based on this work, we accompanied the LOO model with

Page 12: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

12 Matias Frosterus et al.

a set of seven principles that guide the process of buildingand managing an ontology cloud. The principles are generalenough to be applied to other ontology clouds in addition tothe one we have built.

A central challenge in the linking process is in main-taining the integrity of the transitive relations across differ-ent ontologies and we have achieved this through the useof a general upper ontology acting as a central hub for thelinking of various domain ontologies. We consider proactivelinking as an integral part of the development process itselfas opposed to simply mapping independently developed on-tologies.

The model was piloted by implementing it into prac-tice in extensive scale in various organisations encompass-ing many different domains. The model evolved based onlessons learned during the multi-year implementation anddevelopment process. Feedback was gathered from ontologydevelopers as well from the systems integrating the linkedontology cloud. Based on the success of the prototype, themodel is now being applied on a national scale in archives,libraries, and museums, as well as in ministries and othergovernmental agencies.

For our future research, we are focusing on building bet-ter tools and refining the processes for tracking and com-municating the changes in the GUO to the domain ontologydevelopers. To this end, we are also devising a more formaladministrative process for the development and overall co-ordination of the KOKO cloud. Since it is a joint operationby over a dozen different organizations the particulars of theadministration need to be both flexible yet well-defined.

References

1. S. B. Abbes, A. Scheuermann, T. Meilender, and M. d’Aquin.Characterizing modular ontologies. In Proceedings of the 6th In-ternational Workshop on Modular Ontologies. CEUR WorkshopProceedings, http://CEUR-WS.org, July 2012.

2. J. Aitchison, A. Gilchrist, and D. Bawden. Thesaurus Construc-tion and Use: A Practical Manual. Aslib IMI, 2000.

3. J. Bhogal, A. Macfarlane, and P. Smith. A review of ontologybased query expansion. Information Processing & Management,43(4):866–886, 2007.

4. S. Braun, A. Schmidt, A. Walter, and V. Zacharias. The OntologyMaturing Approach to Collaborative and Work Integrated Ontol-ogy Development: Evaluation Results and Future Directions. InProceedings of the International Workshop on Emergent Seman-tics and Ontology Evolution at (ISWC 2007), pages 5–18, 2007.

5. F. Burstein and S. Gregor. The systems development or engineer-ing approach to research in information systems: An action re-search perspective. In Proceedings of the 10th Australasian Con-ference on Information Systems, pages 122–134. Victoria Univer-sity of Wellington, New Zealand, 1999.

6. P. Deolasee, A. Katkar, A. Panchbudhe, K. Ramamritham, andP. Shenoy. Adaptive Push-Pull: Disseminating Dynamic WebData. In Proceedings of the 10th international conference onWorld Wide Web (WWW 2001), pages 265–274, 2001.

7. Z. El Jerroudi and J. Ziegler. iMERGE: Interactive OntologyMerging. In Proceedings of the International Conference on

Knowledge Engineering and Knowledge Management (EKAW2008), pages 52–56, 2008.

8. J. Euzenat and P. Shvaiko. Ontology Matching. Springer–Verlag,2007.

9. A. Gangemi, N. Guarino, C. Masolo, A. Oltramari, and L. Schnei-der. Sweetening ontologies with dolce. In Proceedings of the 13thInternational Conference on Knowledge Engineering and Knowl-edge Management. Ontologies and the Semantic Web, EKAW ’02,pages 166–181. Springer-Verlag, 2002.

10. T. Gruber. Toward principles for the design of ontologies used forknowledge sharing. International Journal of Human-ComputerStudies, 43(5-6):907–928, 1995.

11. H. Halpin, P. J. Hayes, J. P. McCusker, D. L. McGuinness, andH. S. Thompson. When owl:sameAs isn’t the same: An analysis ofidentity in linked data. In P. F. Patel-Schneider, Y. Pan, P. Hitzler,P. Mika, L. Zhang, J. Z. Pan, I. Horrocks, and B. Glimm, editors,The Semantic Web – ISWC 2010, volume 6496 of Lecture Notesin Computer Science, pages 305–320. Springer Berlin Heidelberg,2010.

12. A. Hameed, A. Preece, and D. Sleeman. Ontology reconciliation.In S. Staab and R. Studer, editors, Handbook on ontologies, pages231–250. Springer-Verlag, Germany, 2004.

13. T. Heath and C. Bizer. Linked Data: Evolving the Web into aGlobal Data Space (1st edition). Morgan & Claypool, Palo Alto,California, 2011.

14. P. Hitzler and F. van Harmelen. A reasonable semantic web. Se-mantic Web Journal, 1(1-2):39–44, 2010.

15. E. Hyvonen. Publishing and using cultural heritage linked dataon the semantic web. Morgan & Claypool, Palo Alto, California,October 2012.

16. E. Hyvonen, K. Viljanen, J. Tuominen, and K. Seppala. Build-ing a National Semantic Web Ontology and Ontology ServiceInfrastructure—The FinnONTO Approach. In Proceedings of theEuropean Semantic Web Conference (ESWC 2008), pages 95–109,2008.

17. K. Jarvelin, J. Kekalainen, and T. Niemi. ExpansionTool:Concept-based query expansion and construction. Information Re-trieval, 4(3/4):231–255, 2001.

18. E. Jimenez Ruiz, B. Cuenca Grau, I. Horrocks, and R. Berlanga.Supporting concurrent ontology development: Framework, algo-rithms and tool. Data & Knowledge Engineering, 70(1):146–164,2011.

19. J. Kilkki, O. Hupaniittu, and P. Henttonen. Towards the new eraof archival description – the Finnish approach. In Proceedings ofthe International Council on Archives Congress, August 2012.

20. M. Klein. Change Management for Distributed Ontologies. PhDthesis, Vrije Universiteit Amsterdam, 2004.

21. D. Kless, S. Milton, and E. Kazmierczak. Relationships and relatain ontologies and thesauri: Differences and similarities. AppliedOntology, 7(4):401–428, 2012.

22. K. Kozaki, E. Sunagawa, Y. Kitamura, and R. Mizoguchi. Aframework for cooperative ontology construction based on depen-dency management of modules. In Proceedings of the Interna-tional Workshop on Emergent Semantics and Ontology Evolution(ISWC2007), pages 33–44, 2007.

23. A. Maedche, B. Motik, and L. Stojanovic. Managing multipleand distributed ontologies on the Semantic Web. The Interna-tional Journal on Very Large Data Bases (The VLDB Journal),12(4):286–302, 2003.

24. E. Makela, K. Hypen, and E. Hyvonen. BookSampo—lessonslearned in creating a semantic portal for fiction literature. In Pro-ceedings of ISWC-2011, Bonn, Germany. Springer–Verlag, 2011.

25. E. Makela, T. Ruotsalo, and E. Hyvonen. How to deal with mas-sively heterogeneous cultural heritage data—lessons learned inCultureSampo. 3(1), 2012.

26. V. Mascardi, V. Cordı, and P. Rosso. A comparison of upperontologies. In Proceedings of the 8th Workshop Dagli Oggetti

Page 13: Linked Open Ontology Cloud – Managing a System of ... · ”Linked Open Vocabularies”5 that focus on mapping meta-data schemas (e.g. Dublin Core, FOAF, and Bibo) onto each other

Linked Open Ontology Cloud – Managing a System of Interlinked Cross-domain Light-weight Ontologies 13

agli Agenti: ”Agenti e Industria: Applicazioni tecnologiche degliagenti software” (WOA 2007), pages 55–64, September 2007.

27. I. Niles and A. Pease. Towards a standard upper ontology. In Pro-ceedings of the 2nd International Conference on Formal Ontologyin Information Systems (FOIS 2001), pages 2–9. ACM, October2001.

28. Natalya F. Noy, Nicholas Griffith, and Mark A. Musen. Collect-ing community-based mappings in an ontology repository. In Pro-ceedings of the 7th International Semantic Web Conference (ISWC2008), pages 371–386. Springer–Verlag, October 2008.

29. S. Pessala, K. Seppala, O. Suominen, M. Frosterus, J. Tuominen,and E. Hyvonen. MUTU: An analysis tool for maintaining a sys-tem of hierarchically linked ontologies. In Proceedings of theWorkshop on Ontologies come of Age Workshop (ISWC 2011),2011.

30. B. Smith, M. Ashburner, C. Rosse, J. Bard, W. Bug, W. Ceusters,L. J. Goldberg, K. Eilbeck, A. Ireland, C. J. Mungall, The OBIConsortium, N. Leontis, P. Rocca-Serra, A. Ruttenberg, S.-A. San-sone, R. H. Scheuermann, N. Shah, P. L. Whetzel, and S. Lewis.The OBO Foundry: coordinated evolution of ontologies to supportbiomedical data integration. Nature Biotechnology, 25(11):1251–1255, 2007.

31. S. Staab and R. Studer, editors. Handbook on Ontologies (2ndEdition). Springer–Verlag, 2009.

32. L. Stojanovic, A. Maedche, B. Motik, and N. Stojanovic. User-driven ontology evolution management. In Proceedings of theInternational Conference on Knowledge Engineering and Knowl-edge Management (EKAW 2002), pages 133–140, 2002.

33. E. Sunagawa, K. Kozaki, Y. Kitamura, and R. Mizoguchi. An en-vironment for distributed ontology development based on depen-dency management. In Proceedings of the 2nd International Se-mantic Web Conference (ISWC 2003), pages 453–468. Springer–Verlag, 2003.

34. O. Suominen and E. Hyvonen. Improving the quality of SKOSvocabularies with Skosify. In Proceedings of the 18th Inter-national Conference on Knowledge Engineering and KnowledgeManagement (EKAW 2012), pages 383–397. Springer-Verlag, Oc-tober 2012.

35. O. Suominen, S. Pessala, J. Tuominen, M. Lappalainen, S. Nykyri,H. Ylikotila, M. Frosterus, and E. Hyvonen. Deploying nationalontology services: From ONKI to Finto. In Proceedings of the In-dustry Track at the International Semantic Web Conference 2014.CEUR Workshop Proceedings, http://CEUR-WS.org, Octo-ber 2014.

36. J. Tuominen, M. Frosterus, K. Viljanen, and E. Hyvonen. ONKISKOS server for publishing and utilizing SKOS vocabularies andontologies as services. In Proceedings of the 6th European Se-mantic Web Conference (ESWC 2009), pages 768–780. Springer–Verlag, 2009.

37. K. Viljanen, J. Tuominen, and E. Hyvonen. Ontology libraries forproduction use: The Finnish ontology library service ONKI. InProceedings of the ESWC 2009, Heraklion, Greece, pages 781–795. Springer–Verlag, 2009.

38. X. Wang, J. S. Almeida, and A. L. Oliveira. Ontology design prin-ciples and normalization techniques in the web. In Proceedingsof the 5th International Workshop on Data Integration in the LifeSciences (DILS 2008), pages 28–43. Springer–Verlag, June 2008.