Ontology based semantic conflicts resolution in collaborative editing of design documents

Ning Gu*, Jun Xu, Xiaoyuan Wu, Jiangming Yang, Wei Ye

Department of Computing and Information Technology, Fudan University, 220 Handan Road, Shanghai, People’s Republic of China

Abstract

Semantic conflicts happen frequently during collaborative editing of design documents. The semantic consistency of words and the maintenance of users’ editing intentions are the two major challenges in resolving those conflicts. Based on WordNet, this paper presents an ontology description language, FLoDL, and uses it to describe the global ontology library (GOL) and the individual ontology library (IOL) in collaborative editing. On this basis, we reconstruct the architecture of collaborative editing and propose a semantic collaborative editing architecture with a mixed peer-to-peer structure. A new algorithm for insertion operations, which often cause semantic conflicts, is then designed to solve the problem of semantic consistency of words. Moreover, by adding users’ individualized semantic information to their IOL, we provide users with individualized services and maintain their editing intentions. Finally, through detailed experiments, we perform a comparative analysis showing that semantic collaborative editing not only keeps clients small (less storage space) but also localizes many editing operations and thereby improves the performance of collaborative editing.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Collaborative design; Collaborative editing; Ontology; Ontology description language; Semantic conflicts; WordNet

Advanced Engineering Informatics 19 (2005) 103–111
www.elsevier.com/locate/aei
1474-0346/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.aei.2005.05.005
* Corresponding author. Tel: +86 21 5566 4465.
E-mail address: [email protected] (N. Gu).

1. Introduction

In collaborative design, designers often insert, delete and update design documents collaboratively [12]; that is, they edit design documents together over a network. There are two types of collaborative editing: synchronous and asynchronous. In this paper, we mainly discuss the synchronous case. Concurrency control is essential in synchronous collaborative editing. It consists of operation consistency control and semantic consistency control [1,5]. Many solutions to concurrency control in collaborative editing exist, such as locks. There are different types of locks: the pessimistic lock, the semi-optimistic lock, the optimistic lock, etc. The more optimistic the lock is, the more the system has to spend on undo, which lowers system efficiency; on the other hand, the more pessimistic the lock is, the weaker the system’s responsiveness is [15]. Therefore, locks fail to meet the strict response requirement of synchronous collaborative editing. Besides locks, serialization, operation transformation [2] and Reduce [3,4] have also been suggested for concurrency control in synchronous collaborative editing. However, all these methods solve only the syntactic problems.

There are also many semantic problems in synchronous collaborative editing, such as the semantic consistency of words, which belongs to the semantic conflict problem [5]. If the semantic consistency of words is not maintained, design documents completed by multiple users through collaborative editing will be full of semantic errors. So far, little progress has been made on the semantic consistency of words in synchronous collaborative editing, except for the user-centered model proposed by Xue [6]. In this model, different document versions are generated according to different users’ operations. The users then discuss these versions to arrive at one that most of them agree on. This model requires many rounds of discussion to solve the semantic consistency problem of a single word. Since many words may have semantic consistency problems in synchronous [7] collaborative editing of design documents, the model is inefficient. Moreover, because many sentences and phrases of the design documents are produced through users’ discussions, the writing style of the documents may not match every user’s own. This is the problem of maintaining users’ editing intentions, which is another kind of semantic conflict. Users feel uncomfortable with the unfamiliar writing style when they are synchronously editing design documents, which harms the quality of the documents. Therefore, the maintenance of users’ editing intentions is also of great importance in synchronous collaborative editing.

In this paper, we use ontology in synchronous collaborative editing to solve the above two semantic problems. An ontology is an explicit specification of a conceptualization [13,16]. It has powerful syntax description ability. With the appearance of ontology description languages such as OIL, SHOE and XOL [9,11], ontology can be used on the Web, which further improves its syntax description ability [10,13,14]. Based on ontology, this paper reconstructs the architecture of collaborative editing. We not only solve the semantic consistency of words efficiently in synchronous collaborative editing of design documents but also provide users with individualized services and resolve the problem of maintaining users’ editing intentions. As a result, the performance of collaborative editing is fundamentally improved.

This paper first introduces the semantic problems in the collaborative editing of design documents. We then discuss the ontology description and use ontology libraries to reconstruct the architecture of collaborative editing. After that, through detailed experiments, we perform a comparative analysis showing that semantic collaborative editing not only keeps clients small (less storage space) but also localizes many editing operations and thereby improves the performance of collaborative editing. Finally, we discuss the future development of collaborative editing.

2. Semantic conflicts in collaborative editing of design documents

2.1. The semantic consistency problem of words

Here is an example. Suppose the initial design document contains the sentence ‘Tom will buy a’, and two users are editing this document with traditional collaborative editing. User S1 adds ‘bike’ after ‘a’ in this sentence and user S2 adds ‘bicycle’ at the same position. If these operations are performed by traditional algorithms, the sentence becomes ‘Tom will buy a bike bicycle’, which is incorrect. However, users S1 and S2 both want to write the sentence ‘Tom will buy a bike (bicycle)’. This is a semantic consistency problem of words [5], which is quite common in collaborative editing of design documents.
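The conflict can be reproduced in a few lines. The sketch below is illustrative only: `naive_merge` is a hypothetical stand-in for a purely position-based merge that knows nothing about word meaning; it is not one of the algorithms cited above.

```python
def naive_merge(document: str, inserts: list[str]) -> str:
    """Append every concurrently inserted word at the same position.

    A purely syntactic merge keeps both insertions because it cannot
    tell that the two words mean the same thing.
    """
    return " ".join([document, *inserts])


merged = naive_merge("Tom will buy a", ["bike", "bicycle"])
print(merged)  # Tom will buy a bike bicycle
```

Both insertions survive, so the merged sentence is syntactically consistent across sites but semantically wrong.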

2.2. The maintenance of user’s editing intention

After the semantic consistency of words has been solved, the writing style of the design documents completed by collaborative editing may not match the writers’ own expression styles; that is, users’ editing intentions may not be maintained. For example, user S2 adds ‘bicycle’ but the sentence is changed into ‘Tom will buy a bike’. He will accept this sentence only if he knows that ‘bicycle’ and ‘bike’ are synonyms and does not mind the replacement. If he prefers ‘bicycle’ or does not know the word ‘bike’, he will not accept the modification. So we should provide users with individualized semantic services that meet different users’ editing intentions. We call these individualized semantic services the maintenance of users’ editing intentions, which is also of great importance in collaborative design editing.

3. Semantic collaborative editing

3.1. The ontology description in semantic collaborative

editing

In semantic collaborative editing, we use WordNet to resolve the semantic conflicts. WordNet is a semantic network of concepts organized as a taxonomy. It can describe the connotations and relations of concepts in a conceptualization and is a huge ontology resource [17]. Originally, it was designed on psycholinguistic grounds, trying to simulate the way concepts are organized in the human brain. However, its broad application in the field of Natural Language Processing has led to ports to the official languages of the European Union and to many other languages such as Catalan and Basque. To satisfy the needs of different fields, WordNet has become very intricate [18]. To meet the strict requirement on response time in semantic collaborative editing, we improve on WordNet by designing a Four-Layered ontology Description Language, FLoDL, and use it to describe the ontology in WordNet. Compared with WordNet, FLoDL is a concise ontology description language. It adopts the knowledge description method of framework theory and conforms to XML syntax, which makes it easy for computers to process and for users to understand.

FLoDL has four layers: the object layer, the attribute layer, the definition layer and the relation layer. The object layer describes the relation between the connotation concepts of objects and their different representations. The attribute layer describes objects’ attributes, each of which consists of an attribute name and its corresponding value. The definition layer defines objects in informal natural language. The relation layer describes the relations among different objects in the ontology. This paper mainly discusses the object layer, but the relation layer is also covered so as to give a comprehensive description of ontology.

The object layer description is based on concepts: it connects the different names and representations that belong to the same connotation concept. One concept may have different terms and representations. In this paper, we organize the different terms and representing strings of the same concept in a three-level structure made up of string (the first level), term (the second level) and concept (the third level). A string is the original word without abstraction and has a unique string identifier (SUI). The upper-case and lower-case forms of the same word are different strings and have different SUIs. A term is the linguistic expression of strings: different strings with the same original word but different grammatical forms all belong to the same term. A term is tied to its different strings by the term’s unique term identifier (TUI). A concept is an object abstracted from a group of terms having the same connotation and also has a unique concept identifier (CUI). In this way, different strings are associated with one term, and different terms with a similar meaning are associated with one concept. As a simple example, Fig. 1 shows how to use FLoDL to describe the ontology ‘bike’. In this figure, ‘bike’ can be replaced by ‘bicycle’ and ‘bikes’ can substitute for ‘bicycles’.

[Figure 1: the concept C0000001 ‘bike’ (CUI) groups the terms L0000001 ‘bike’ and L0000002 ‘bicycle’ (TUI); the strings (SUI) are S0000001 ‘bike’ (singular), S0000002 ‘bikes’ (plural), S0000003 ‘bicycle’ (singular) and S0000004 ‘bicycles’ (plural).]

Fig. 1. Description of ontology ‘bike’ using FLoDL.

Table 1
Eight object relations

| Relation | Meaning | Example |
|---|---|---|
| PAR | Parent | Vehicle ← car |
| CHD | Child | Roadster → car |
| SIB | Sibling | Car ↔ truck |
| RB | Broader | Car ≥ wagon |
| RN | Narrower | Wagon ≤ car |
| RS | Similarity | Car ∼ auto |
| AQ | Allowed | Excellent ↑ car |
| QB | Qualified-by | Car → excellent |

Table 2
‘Part_of_speech’ and ‘stringtype’ attributes

| Part_of_speech | String form |
|---|---|
| Noun | singular, plural |
| Verb | preterite, pluperfect |
| Adjective | comparative_degree, superlative_degree |
| Adverb | comparative_degree, superlative_degree |

The object relations referred to in FLoDL are not the specific relations between objects but abstract relations based on hierarchy. The most common object relations are the parent, child and sibling relations, which derive from the hierarchical relation. Besides, there are some non-hierarchical relations, such as the similarity relation, broader relation, narrower relation, allowed qualification relation and qualified-by relation. Table 1 shows the eight object relations defined by FLoDL.

Semantic collaborative editing needs to resolve two operations with a semantic conflict into one equivalent correct operation according to users’ writing styles. It should therefore be able to replace conflicting strings with others of the same grammatical form. For example, ‘bikes’ is replaced by ‘bicycles’, both being plural nouns. Such string replacement can hardly be implemented in WordNet. We propose the ‘part_of_speech’ and ‘string form’ attributes in FLoDL to realize it. Table 2 shows the values of these attributes in FLoDL.

The ‘part_of_speech’ attribute has four values: noun, verb, adjective and adverb. Each part of speech has different string forms. For instance, to replace ‘bikes’ with ‘bicycles’, we first query the string form of ‘bikes’ and then replace it with ‘bicycles’, which belongs to the term ‘bicycle’ and has the same string form as ‘bikes’. FLoDL completes this type of replacement efficiently and helps to solve the semantic conflicts in collaborative editing.
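The string, term and concept levels and the form-preserving replacement can be sketched as follows. This is a minimal illustration under assumptions: the `StringEntry` record and the `replace_with_term` helper are hypothetical names introduced here, and only the identifiers of the ‘bike’ example are taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringEntry:
    sui: str   # unique string identifier
    name: str  # surface form, e.g. "bikes"
    form: str  # string form, e.g. "plural"
    tui: str   # identifier of the term the string belongs to
    cui: str   # identifier of the concept the term belongs to


# Entries of the 'bike' example; a real library would hold many concepts.
ENTRIES = [
    StringEntry("S0000001", "bike", "singular", "L0000001", "C0000001"),
    StringEntry("S0000002", "bikes", "plural", "L0000001", "C0000001"),
    StringEntry("S0000003", "bicycle", "singular", "L0000002", "C0000001"),
    StringEntry("S0000004", "bicycles", "plural", "L0000002", "C0000001"),
]
BY_NAME = {entry.name: entry for entry in ENTRIES}


def replace_with_term(word: str, target_tui: str) -> str:
    """Return the string of `target_tui` that has the same form as `word`."""
    source = BY_NAME[word]  # step 1: query the string form of the word
    for entry in ENTRIES:   # step 2: find the matching form in the target term
        if (entry.tui == target_tui
                and entry.cui == source.cui
                and entry.form == source.form):
            return entry.name
    raise LookupError(f"no {source.form} string for term {target_tui}")


print(replace_with_term("bikes", "L0000002"))  # bicycles
```

Keeping the string form on each string entry is what lets the replacement preserve number (or tense, or degree) without any morphological analysis at editing time.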

FLoDL is a symbol system composed of a set of components, some of which are optional. We write component+ for components that must appear at least once and component* for those that may appear any number of times. The syntax of FLoDL is:

(1) Name: the name of the ontology.
(2) CUI: the unique concept identifier of the ontology.
(3) Documentation: the documents describing the ontology, for example its definitions and explanations.
(4) Part_of_speech: the part of speech of the ontology.
(5) Term+: a term of the concept that the ontology belongs to. A term is composed of strings and has the following components:
    (a) TUI: the unique term identifier.
    (b) String+: a string, with the components:
        • SUI: the unique string identifier.
        • Stringname: the name of the string.
        • Stringform: the type of the string.
(6) PAR*: the parent relation. It consists of the CUI of an object and a tree-number, the number of the tree in which that object lies; the same object may exist in different trees. The object being defined is the child of that object.
(7) CHD*: the child relation. It consists of the CUI of an object and a tree-number. The object being defined is the parent of that object.
(8) SIB*: the sibling relation. It consists of the CUI of an object and a tree-number. The object being defined is a sibling of that object.
(9) RS*: the similarity relation. It consists of the CUI of an object and a tree-number. The object being defined is the similar object of that object.
(10) RB*: the broader relation. It consists of the CUI of an object and a tree-number. The object being defined is the broader object of that object.
(11) RN*: the narrower relation. It consists of the CUI of an object and a tree-number. The object being defined is the narrower object of that object.
(12) AQ*: the allowed qualification relation. It consists of the CUI of an object and a tree-number. The object being defined is the allowed qualification object of that object.
(13) QB*: the qualified-by relation. It consists of the CUI of an object and a tree-number. The object being defined is the qualified-by object of that object.

For example, the ontology ‘bike’ is described in FLoDL as follows:

<ontology-def>
  <ontology name="bike"/>
  <CUI>C0000001</CUI>
  <documentation>A bike.</documentation>
  <part_of_speech>noun</part_of_speech>
  <term>
    <TUI>L0000001</TUI>
    <termname>bike</termname>
    <string>
      <SUI>S0000001</SUI>
      <stringname>bike</stringname>
      <stringform>singular</stringform>
    </string>
    <string>
      <SUI>S0000002</SUI>
      <stringname>bikes</stringname>
      <stringform>plural</stringform>
    </string>
  </term>
  <term>
    <TUI>L0000002</TUI>
    <termname>bicycle</termname>
    <string>
      <SUI>S0000003</SUI>
      <stringname>bicycle</stringname>
      <stringform>singular</stringform>
    </string>
    <string>
      <SUI>S0000004</SUI>
      <stringname>bicycles</stringname>
      <stringform>plural</stringform>
    </string>
  </term>
</ontology-def>
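Because FLoDL conforms to XML syntax, such a description can be read with any standard XML parser. The following sketch uses Python’s `xml.etree.ElementTree`; the `load_strings` helper and the shape of the index it returns are assumptions made for illustration, not part of FLoDL.

```python
import xml.etree.ElementTree as ET

FLODL_BIKE = """
<ontology-def>
  <ontology name="bike"/>
  <CUI>C0000001</CUI>
  <documentation>A bike.</documentation>
  <part_of_speech>noun</part_of_speech>
  <term>
    <TUI>L0000001</TUI>
    <termname>bike</termname>
    <string><SUI>S0000001</SUI><stringname>bike</stringname><stringform>singular</stringform></string>
    <string><SUI>S0000002</SUI><stringname>bikes</stringname><stringform>plural</stringform></string>
  </term>
  <term>
    <TUI>L0000002</TUI>
    <termname>bicycle</termname>
    <string><SUI>S0000003</SUI><stringname>bicycle</stringname><stringform>singular</stringform></string>
    <string><SUI>S0000004</SUI><stringname>bicycles</stringname><stringform>plural</stringform></string>
  </term>
</ontology-def>
"""


def load_strings(xml_text: str) -> dict[str, tuple[str, str, str]]:
    """Map each string name to (CUI, TUI, string form)."""
    root = ET.fromstring(xml_text.strip())
    cui = root.findtext("CUI").strip()
    index = {}
    for term in root.findall("term"):
        tui = term.findtext("TUI").strip()
        for string in term.findall("string"):
            name = string.findtext("stringname").strip()
            form = string.findtext("stringform").strip()
            index[name] = (cui, tui, form)
    return index


index = load_strings(FLODL_BIKE)
print(index["bicycles"])  # ('C0000001', 'L0000002', 'plural')
```

An index keyed by string name answers the two questions the editing algorithm asks most often: which concept a word belongs to, and which grammatical form it has.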

3.2. The architecture of semantic collaborative editing

The structure of traditional collaborative editing is peer-to-peer. Since it cannot express semantics, it can neither solve the semantic consistency problem of words nor maintain users’ editing intentions. The traditional ontology library is centralized. As one server must process all queries from the clients, the ontology library responds slowly, which largely holds back its application in collaborative editing. To overcome these drawbacks, we reconstruct the architecture of traditional collaborative editing and propose a new semantic collaborative editing architecture.

Semantic collaborative editing differs from traditional collaborative editing in having a mixed peer-to-peer architecture. It resolves operations that have semantic conflicts by querying the Global Ontology Library (Database) (GOL) and the Individual Ontology Library (Database) (IOL). The IOL contains the personal semantic information of a user and provides individualized services that fit the user’s expression style and maintain the user’s editing intentions. Moreover, every client’s IOL is a cache of the GOL, so that operations with semantic conflicts can be processed locally and the network communication load is greatly reduced. Fig. 2 shows the architecture of semantic collaborative editing.

[Figure 2: a server (S9) holding the GOL, connected to clients S1 to S8, each of which has its own IOL.]

Fig. 2. The architecture of the semantic collaborative editing.

This architecture consists of four parts: server, GOL (GODB), client and IOL (IODB). The server accepts operations from clients and solves the semantic consistency problems of words by querying the GOL. It then forwards the processed operations to the other clients. For an insertion, the server also queries the GOL to determine whether the clients already hold the ontology description information of the inserted words. If not, the server sends the needed information together with the forwarded operations to the clients, including the client that initiated the insertion. The GOL resides on the server. It stores the WordNet ontology information described in FLoDL and is used to verify semantic conflicts. A client executes local operations and concurrent operations from other clients. Clients also store the ontology description information received from the server in their IOL and resolve semantic conflicts by querying the IOL. Every client in collaborative editing has an IOL. It stores the user’s personal semantic information together with a subset of the ontology descriptions in the GOL, and clients can update it automatically.
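The role of the IOL as a client-side cache of the GOL can be sketched as below. The class names and the `remote_queries` counter are hypothetical, and the network round trip is replaced by a plain method call; the sketch only shows why repeated lookups stay local.

```python
from typing import Optional


class GlobalOntologyLibrary:
    """Server-side library; every lookup stands for one remote query."""

    def __init__(self, entries: dict[str, str]):
        self._entries = entries  # word -> CUI
        self.remote_queries = 0

    def lookup(self, word: str) -> Optional[str]:
        self.remote_queries += 1
        return self._entries.get(word)


class IndividualOntologyLibrary:
    """Client-side cache holding a subset of the GOL entries."""

    def __init__(self, gol: GlobalOntologyLibrary):
        self._gol = gol
        self._cache: dict[str, str] = {}

    def lookup(self, word: str) -> Optional[str]:
        if word in self._cache:       # hit: answered locally
            return self._cache[word]
        cui = self._gol.lookup(word)  # miss: ask the server
        if cui is not None:
            self._cache[word] = cui   # update the IOL for next time
        return cui


gol = GlobalOntologyLibrary({"bike": "C0000001", "bicycle": "C0000001"})
iol = IndividualOntologyLibrary(gol)
iol.lookup("bike")  # miss, fetched from the GOL and cached
iol.lookup("bike")  # hit, no remote query
print(gol.remote_queries)  # 1
```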

3.3. Solutions to the semantic conflict problem

3.3.1. Solution to the semantic consistency of words

With ontology introduced into collaborative editing, the semantic consistency of words in deletion, update and insertion operations is effectively solved, especially for concurrent insertions. Before presenting the insertion algorithm of semantic collaborative editing, we introduce three functions:

(1) Equal(string, string): determines whether two strings belong to the same concept.
(2) GetCUI(string): returns the CUI of the object.
(3) GetConcept(CUI) / GetConcept(string): returns the object’s name.
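A minimal sketch of the three functions over an in-memory lookup table is given below. The tables are assumptions for illustration: the ‘car’ entry and its identifier C0000002 are invented here, and a real client would query its IOL instead.

```python
# Hypothetical ontology tables: string -> CUI and CUI -> concept name.
STRING_TO_CUI = {
    "bike": "C0000001",
    "bikes": "C0000001",
    "bicycle": "C0000001",
    "bicycles": "C0000001",
    "car": "C0000002",  # invented entry, used only to show a non-synonym
}
CUI_TO_NAME = {"C0000001": "bike", "C0000002": "car"}


def get_cui(string: str) -> str:
    """GetCUI(string): return the CUI of the concept the string belongs to."""
    return STRING_TO_CUI[string]


def equal(first: str, second: str) -> bool:
    """Equal(string, string): do both strings belong to the same concept?"""
    cui_first = STRING_TO_CUI.get(first)
    cui_second = STRING_TO_CUI.get(second)
    return cui_first is not None and cui_first == cui_second


def get_concept(key: str) -> str:
    """GetConcept(CUI) / GetConcept(string): return the concept's name."""
    cui = key if key in CUI_TO_NAME else get_cui(key)
    return CUI_TO_NAME[cui]


print(equal("bike", "bicycle"))           # True
print(get_concept(get_cui("bicycle")))    # bike
```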

The insertion algorithm of semantic collaborative editing is designed as follows:

1. Client s1 calls the method r1 <1, V1, insert(string1, i)> and executes it locally. It then broadcasts r1 and checks its IOL. If the inserted word is not in its IOL, it queries the word’s ontology information from the GOL and updates its IOL.

2. Client s1 gets the TUI and CUI of string1 by querying its IOL. Using the CUI, it determines whether there are synonyms at the same position. If so, they are merged.

3. If there are no synonyms, client s1 asks the user to choose the preferred TUI representation and matches the words according to the SUI’s attributes. If the corresponding attributes of the SUI do not exist, it asks the user to input them.

4. Client s2 calls the method r2 <2, V2, insert(string2, i)> and executes it locally. It then receives the method call r1 from client s1 and repeats steps 1–3. When it finds, by querying its IOL, that r1 and r2 are concurrent insertions of synonyms at the same position, it changes the method call r2 <2, V2, insert(string2, i)> to <1, V1, insert(GetConcept(string1), i)> but does not change the document in client s2.

5. For other operations without insertion conflicts, clients execute steps 1–3 directly and ask users to choose their preferred personal representations.

6. Client s3 executes step 4 first. When the user of client s3 wants to define string3 as the individualized word for GetConcept(string1), client s3 replaces GetConcept(string1) with string3 and updates its IOL to record this personal choice.
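The core of step 4, merging two concurrent insertions of synonyms at the same position, can be sketched as follows. This is a simplified illustration under stated assumptions: the `Insert` record omits the state vectors V1 and V2, `CONCEPT_OF` is a hypothetical stand-in for an IOL query, and ordering of non-conflicting operations is left to the underlying operation consistency control.

```python
from dataclasses import dataclass

# Hypothetical IOL view: string -> name of its concept.
CONCEPT_OF = {"bike": "bike", "bicycle": "bike", "truck": "truck"}


@dataclass(frozen=True)
class Insert:
    site: int      # identifier of the client that issued the operation
    word: str      # inserted string
    position: int  # insertion position i


def resolve(local: Insert, remote: Insert) -> list[Insert]:
    """Return the operations kept after two concurrent insertions."""
    same_place = local.position == remote.position
    synonyms = (
        local.word in CONCEPT_OF
        and remote.word in CONCEPT_OF
        and CONCEPT_OF[local.word] == CONCEPT_OF[remote.word]
    )
    if same_place and synonyms:
        # Merge: the local operation is rewritten into one insertion of
        # the concept name, carrying the remote site's identifier.
        return [Insert(remote.site, CONCEPT_OF[remote.word], remote.position)]
    return [remote, local]  # no semantic conflict: keep both operations


r1 = Insert(site=1, word="bike", position=4)
r2 = Insert(site=2, word="bicycle", position=4)
print(resolve(local=r2, remote=r1))
```

For the running example the two insertions collapse into a single insertion of ‘bike’, so the shared document reads ‘Tom will buy a bike’.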

In the previous example, the algorithm finds that ‘bike’ and ‘bicycle’ belong to the same object concept; it then chooses the word designated by the CUI and drops the others. In this way, semantic collaborative editing solves the semantic consistency of words. Fig. 3 shows this process.

[Figure 3: S1 inserts ‘bike’ and S2 inserts ‘bicycle’ into ‘Tom will buy a’. Without semantic resolution the result is ‘Tom will buy a bike bicycle’; with Equal(bike, bicycle) and GetConcept(GetCUI(bicycle)) the result is ‘Tom will buy a bike’.]

Fig. 3. Solving the semantic consistency problem of words.

3.3.2. Maintenance of users’ editing intentions

To maintain users’ editing intentions, we add users’ individualized semantic information to their IOL. The extended IOL can then provide users with individualized services. The process works as follows: after the semantic consistency problem of words has been solved, the document in each client remains unchanged, preserving the user’s writing style. Users’ word selections are stored in their IOL so that future individualized semantic services can maintain their editing intentions.

In the previous example, after the semantic consistency of words has been solved, user S1 has input the same word as the selected one, so his word need not change. In contrast, since user S2 input a word different from the selected one, our algorithm queries the IOL of user S2 to get ‘bicycle’, which is S2’s preferred expression of ‘bike’, and changes the sentence ‘Tom will buy a bike’ in the design document into ‘Tom will buy a bicycle’, which is displayed in user S2’s client. At the same time, ‘bike’ and ‘bicycle’ are stored in the IOLs of users S1 and S2, respectively.

At the beginning, users can only use the default configuration of the IOL because individualized IOLs have not yet been established. Therefore, when dealing with synonyms, each user is asked to select a preferred one. This may seem bothersome, but it is an inevitable and most important step in building a personal IOL. Once enough information has been collected, the IOL can maintain users’ editing intentions by giving users different personal views of the document while the server stores one and the same design document. That is, after a user’s IOL has been established, the editing system selects the right word from the synonyms automatically according to the IOL instead of asking the user. For example, suppose there is a ‘bike’ in the design document. The document in client s1 shows ‘bike’ while that in client s2 shows ‘bicycle’, each conforming to its user’s editing intention. Meanwhile, the document on the server remains ‘bike’. Fig. 4 shows the process of maintaining users’ editing intentions.

[Figure 4: S1 inserts ‘bike’ and S2 inserts ‘bicycle’ into ‘Tom will buy a’; after Equal(bike, bicycle) and GetConcept(GetCUI(bicycle)), the client of S1 shows ‘Tom will buy a bike’ and the client of S2 shows ‘Tom will buy a bicycle’.]

Fig. 4. Maintenance of user’s editing intention.
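The idea of one server document with several personal views can be sketched as follows. The data structures are assumptions for illustration: `PREFERENCES` stands for the individualized part of each user’s IOL, and documents are simplified to lists of words.

```python
# The server keeps one shared document.
SERVER_DOCUMENT = ["Tom", "will", "buy", "a", "bike"]

# Hypothetical ontology lookup: string -> CUI.
CONCEPT_OF = {"bike": "C0000001", "bicycle": "C0000001"}

# Individualized part of each user's IOL: CUI -> preferred word.
PREFERENCES = {
    "S1": {"C0000001": "bike"},
    "S2": {"C0000001": "bicycle"},
}


def render_view(document: list[str], preferences: dict[str, str]) -> str:
    """Show the shared document with the user's preferred synonyms."""
    view = []
    for word in document:
        cui = CONCEPT_OF.get(word)
        # Words without a recorded preference are shown unchanged.
        view.append(preferences.get(cui, word))
    return " ".join(view)


print(render_view(SERVER_DOCUMENT, PREFERENCES["S1"]))  # Tom will buy a bike
print(render_view(SERVER_DOCUMENT, PREFERENCES["S2"]))  # Tom will buy a bicycle
```

The substitution happens only when the view is rendered, so the server copy stays identical for all users.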

Table 4

N. Gu et al. / Advanced Engineering Informatics 19 (2005) 103–111 109

4. Experiment and analysis

The relation between the words’ appearance frequencies range and the

query-hit ratio

The words’ appear-

ance frequencies

range

The number of words

appearing in the

established IOL

The query-hit ratio

of the established

IOL (%)

S1 63413 91.75

S2 60465 87.48

S3 58882 85.19

S4 56966 82.42

S5 55671 80.55

S6 54883 79.41

S7 53816 77.86

S8 52567 76.06

S9 51951 75.16

S10 51393 74.36

4.1. The performance of the semantic collaborative editing

In collaborative editing because of the network delay, the

cost of local operation is far smaller than that of remote

operation. So the ratio of locally successful query in IOL

which is the cache of GOL determines the performance of

collaborative editing. We measure this ratio by a group of

experiments. We select WordNet [17] as the ontology

source of GOL and use FLoDL to describe the ontology in

WordNet. There are 2,62,046 SUI-leveled words in

WordNet. We randomly select ten design documents

about ships as the training set. We suppose that the words

of these documents that appear in WordNet are likely to be

used by users and we put these words into IOL. Table 3

shows the appearance frequencies of these words in the

documents in WordNet.

According to Table 3 we reach a conclusion that with the

growth of words’ appearance frequencies, the ratio of the

size of IOL established by these words to the size of GOL

decreases.

We select another 40 design documents about ships as

the input set to check the query-hit ratio of the IOL

established by the above training sets. We find out that the

appearance times in GOL of the words in these documents

are 69,118 including repetitions. Table 4 shows the relation

between the words’ appearance frequencies range and the

query-hit ratio of the established IOL

According to Table 4 we arrive at the conclusion that

with the growth of the words’ appearance frequencies, the

query-hit ratio of the established IOL decreases relatively

slowly.

Table 3
The relation between the words’ appearance frequencies range and the size proportion

| The words’ appearance frequencies | The number of words | The ratio of the size of established IOL to the size of GOL (%) |
|---|---|---|
| S1 | 13041 | 2.43 |
| S2 | 12758 | 2.38 |
| S3 | 12506 | 2.33 |
| S4 | 12245 | 2.28 |
| S5 | 11997 | 2.23 |
| S6 | 11832 | 2.20 |
| S7 | 11652 | 2.17 |
| S8 | 11477 | 2.14 |
| S9 | 11293 | 2.10 |
| S10 | 11149 | 2.08 |

Comparing Tables 3 and 4 we conclude:

– The ratio of the size of the established IOL to the size of the GOL is 2.43%, which economizes the storage space of clients.

– The query-hit ratio using the IOL is very high (91.75%), which largely reduces the network communication load and improves the performance of collaborative editing.

– We can reduce the size of the IOL without a large decrease in the query-hit ratio. According to Tables 3 and 4, when the size proportion decreases from 2.43% to 2.08%, the query-hit ratio only decreases from 91.75% to 74.36%, which further saves the storage space of clients.
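The query-hit measurement described above can be sketched in a few lines of Python. This is only an illustrative model, not the implementation used in the experiments: the class `OntologyClient`, its methods, and the tiny stand-in dictionary for the GOL are all hypothetical names introduced here. The IOL is preloaded from a training set, and each lookup of a word that exists in the GOL counts as a hit when the IOL can answer it locally.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClient:
    """Hypothetical client whose IOL is a local cache in front of the GOL."""
    gol: dict                                  # stand-in for the server-side GOL
    iol: dict = field(default_factory=dict)    # local individual ontology library
    hits: int = 0
    lookups: int = 0

    def train(self, words):
        """Preload the IOL with training-set words that exist in the GOL."""
        for word in words:
            if word in self.gol:
                self.iol[word] = self.gol[word]

    def query(self, word, update=True):
        """Look a word up locally first and fall back to the GOL on a miss."""
        if word not in self.gol:
            return None                        # not an ontology word, not counted
        self.lookups += 1
        if word in self.iol:
            self.hits += 1                     # answered locally, no network cost
            return self.iol[word]
        entry = self.gol[word]                 # remote query (the costly path)
        if update:
            self.iol[word] = entry             # cache the result in the IOL
        return entry

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


# Toy data (assumed): four GOL entries, a training set, and an input set.
gol = {"hull": "n.", "keel": "n.", "deck": "n.", "mast": "n."}
client = OntologyClient(gol)
client.train(["hull", "keel", "engine"])       # "engine" is not in the toy GOL
for w in ["hull", "deck", "keel", "hull", "the"]:
    client.query(w, update=False)              # fixed IOL, repetitions counted
print(f"query-hit ratio: {client.hit_ratio():.2%}")
```

Passing `update=False` keeps the IOL fixed while the input set is replayed, which is one plausible reading of how a hit ratio counted "including repetitions" would be measured; with `update=True` the sketch instead follows the cache-and-update behaviour of a running client.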

4.2. Comparative analysis of the performance of collaborative editing

There are one server and $n$ clients in collaborative editing. We assume that the communication cost between the server and a client is the same as that between two clients, and define it as one. The overall cost of collaborative editing is $T$, the communication cost of the server is $S$, and that of the clients is $C$. The proportion of the server’s communication cost to the overall cost is $P_s$, and the proportion of the clients’ communication cost to the overall cost is $P_c$. In the user-centered collaborative editing [6], when the server receives the editing operations from $n$ clients and discovers semantic conflicts, it reports these conflicts to the users, who then work out a document version agreed on by most users through discussion. As the number of users grows, the discussions among users increase. Suppose each additional user results in one more round of discussion. Then the overall communication cost is:

$$T_1 = 3n + (n-1) \times n = n^2 + 2n \quad (n > 1)$$

Here $3n$ is the server’s communication cost and $(n-1) \times n$ is the clients’ communication cost. That is,

$$S_1 = 3n \quad (n > 1), \qquad C_1 = n^2 - n \quad (n > 1)$$

Table 5
Comparison of $P_s$ and $P_c$

| Collaborative editing | $P_s$ | $P_c$ |
|---|---|---|
| The user-centered | $3/(n+2)$ | $(n-1)/(n+2)$ |
| The C/S-structured | 100% | 0 |
| The composite Peer–Peer | $(2n+2)/(27n-23)$ | $(25n-25)/(27n-23)$ |

In C/S-structured collaborative editing [8], the ontology library resides on the server and the clients hold no ontology information. When the server receives the editing operations from $n$ clients and discovers semantic conflicts, it resolves these conflicts by querying the ontology library and sends the results to each client. In C/S-structured collaborative editing the overall communication cost is:

$$T_2 = 2n \quad (n > 1)$$

There is no communication among clients. That is,

$$S_2 = 2n, \qquad C_2 = 0$$

The semantic collaborative editing proposed in this paper has a mixed Peer–Peer structure. When a client wants to run an editing operation, it first runs the operation locally and then broadcasts the result to the other clients. After that it queries the local IOL for the corresponding ontology information. If the information is not there, it queries the GOL on the server and updates the local IOL according to the query results. At the same time, the server also sends the results to the other clients. From the above experiments we know that the probability of finding the corresponding ontology information in the IOL is about 92%. In these cases collaborative editing need not query the GOL. Therefore the overall communication cost is:

$$T_3 = 0.92(n-1) + 0.08\bigl(2 + (n-1) + (n-1)\bigr) = 1.08n - 0.92$$

The costs of the server and the clients are:

$$S_3 = 0.08\bigl(2 + (n-1)\bigr) = 0.08n + 0.08$$

$$C_3 = 0.92(n-1) + 0.08(n-1) = n - 1$$
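The three cost models above can be checked numerically. The following Python sketch encodes them directly from the formulas, using exact fractions so that the proportions $P_s = S/T$ and $P_c = C/T$ come out in closed form. The function names and the constant `HIT` are illustrative choices made here, and the 92% hit probability is the approximate value quoted in the text.

```python
from fractions import Fraction

# Probability that the IOL answers a query locally (about 92% in the text).
HIT = Fraction(92, 100)


def user_centered(n):
    """Return (T, S, C) for the user-centered model, assuming n > 1."""
    s = 3 * n                  # server-side communication cost
    c = (n - 1) * n            # client-side cost from rounds of discussion
    return s + c, s, c


def client_server(n):
    """Return (T, S, C) for the C/S-structured model, assuming n > 1."""
    s = 2 * n                  # all traffic goes through the server
    return s, s, 0


def mixed_peer(n, hit=HIT):
    """Return (T, S, C) for the mixed Peer-Peer model."""
    miss = 1 - hit
    s = miss * (2 + (n - 1))                  # GOL query plus server broadcast
    c = hit * (n - 1) + miss * (n - 1)        # client broadcast in both cases
    return s + c, s, c


def proportions(model, n):
    """Return (Ps, Pc) = (S/T, C/T) for a given model and client count."""
    t, s, c = model(n)
    return Fraction(s) / t, Fraction(c) / t


for n in (2, 8):
    for model in (user_centered, client_server, mixed_peer):
        ps, pc = proportions(model, n)
        print(f"n={n} {model.__name__:>14}: Ps={float(ps):.2%} Pc={float(pc):.2%}")
```

For the mixed Peer–Peer model the server share evaluates to 6/31 with two clients and 18/193 with eight clients, that is, roughly 19% and 9%.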

From the analysis above we draw the following conclusions. First, as the number of clients grows, the overall communication cost of the user-centered collaborative editing increases dramatically, while that of the C/S-structured collaborative editing grows relatively slowly. The overall communication cost of the semantic collaborative editing is the smallest. Second, the semantic collaborative editing incurs the smallest server-side communication cost, while the user-centered collaborative editing incurs the largest. The semantic collaborative editing also incurs a smaller client-side communication cost than the user-centered collaborative editing.

We work out the proportion of the server-side communication cost to the overall cost ($P_s$) and the proportion of the client-side communication cost to the overall cost ($P_c$) in the three kinds of collaborative editing, where $P_s = S/T$ and $P_c = C/T$.

According to Table 5, the C/S-structured collaborative editing depends most on the server (100%), while the semantic collaborative editing depends least: when the number of clients is two the proportion is 19%, and it decreases as the number of clients grows, reaching 9% when the number of clients is eight. The dependence on the server of the user-centered collaborative editing lies between the two.

4.3. Experiment and analysis conclusion

From the experiments and the comparative analysis of the three kinds of collaborative editing above we work out Table 6, which gives a comprehensive comparison of their performance.

According to Table 6, the semantic collaborative editing keeps the client small while providing an effective IOL that localizes most operations. The system’s overall communication cost is therefore reduced and its performance is improved. Moreover, the semantic collaborative editing depends only weakly on the server, so it is more robust. In short, the semantic collaborative editing efficiently resolves the semantic consistency of words and successfully maintains users’ editing intentions.

Table 6
Comprehensive comparison of the performance of three kinds of collaborative editing

| Comparison | The user-centered | The C/S-structured | The mixed Peer–Peer |
|---|---|---|---|
| The semantic consistency problem of words | Solved | Solved | Solved efficiently |
| The maintenance of users’ editing intention problem | Partly solved | Not solved | Solved efficiently |
| The overall communication cost | Much | Small | Very small |
| The communication cost of the server | Much | Small | Very small |
| The communication cost of the clients | Much | None | Small |
| $P_s$ | Small | Much | Very small |
| $P_c$ | Small | None | Much |
| The dependence on the server | Small | Much | Very small |
| The client’s size | No ontology information | No ontology information | A little ontology information |
| The system’s performance | Poor | Average | Excellent |


We have implemented the semantic collaborative editing system proposed in this paper on CSCWDP (CSCW Design Platform), which is developed by our group. A number of users have tried it and are still using it. They find that the system not only solves the semantic consistency problem of words and the problem of maintaining users’ editing intentions, but also has a short response latency and good stability.

5. Conclusion and future work

In this paper, we introduce ontology to collaborative

editing of design documents and propose a new architecture

and technology of semantic collaborative editing. It resolves

the semantic consistency problem of words and maintains

users’ editing intentions efficiently.

However, many problems in collaborative editing remain unsolved. For example, there is another kind of semantic conflict. Suppose the original sentence in the design document is ‘Tom forgot lock the door’. User S1 revises it to ‘Tom forgot locking the door’ while user S2 changes it to ‘Tom forgot to lock the door’. In traditional collaborative editing the sentence will be revised to ‘Tom forgot to locking the door’, which is grammatically wrong [5]. Such a semantic problem is called the semantic consistency problem of structure. The relationships between concepts in an ontology may help to solve this problem, which is why we adopt an ontology instead of a thesaurus in this paper. Ontology is an excellent tool for solving semantic problems. In addition, collaborative editing in Chinese is also an unsolved problem. Present research on collaborative editing mainly focuses on English, and many Chinese-specific techniques are needed to maintain the semantic consistency of Chinese words.

Acknowledgements

Project supported by the National Natural Science Foundation of China (Grant No. 60473124) and SEC E-Institute: Shanghai High Institutions Grid (Grant No. 200309).

References

[1] Li D, Patrao J. An approach towards customizable group editors. In:

ACM CSCW’2000 workshop on collaborative editing systems,

Philadelphia, PA, USA 2000.

[2] Vidot N, Cart M, Ferrie J, Suleiman M. Copies convergence in a distributed real-time collaborative environment. In: Proceedings of ACM conference on computer supported cooperative work, Philadelphia, PA, USA 2000; pp. 171–80.

[3] Sun C. Undo as concurrent inverse in group editors. ACM Trans

Comput–Hum Interact 2002;9(4):309–61.

[4] Sun C. Consistency maintenance in real-time collaborative graphics

editing systems. ACM Trans Comput–Hum Interact 2002;9(1):1–41.

[5] Sun C, Jia X, Zhang Y, Yang Y, Chen D. Achieving convergence, causality-preservation, and intention-preservation in real-time cooperative editing systems. ACM Trans Comput–Hum Interact 1998;5(1):63–108.

[6] Xue L, Orgun M, Zhang K. A user-centered consistency model in real-

time collaborative editing systems. In: Proceedings of fourth

international conference on distributed communities on the web,

Sydney, Australia, LNCS 2468, Springer, Berlin 2002; pp. 138–50.

[7] Neuwirth CM. Computer support for collaborative writing: a human–computer interaction perspective. In: ACM CSCW’2000 workshop on collaborative editing systems, Philadelphia, PA, USA 2000.

[8] Wu X. The research of semantic collaborative editing based on

ontology. Master’s thesis, Fudan University, Shanghai, P.R. China

2002.

[9] Web Ontology Language (OWL) Use cases and requirements. W3C

Working Draft, 3 February 2003. http://www.w3.org/TR/webont-req/.

[10] Resource Description Framework (RDF) Model and Syntax Specification. W3C Recommendation, 22 February 1999. http://www.w3.org/TR/REC-rdf-syntax/.

[11] Fensel D, Horrocks I, van Harmelen F, Decker S, Erdmann M, Klein M. Oil in a nutshell. In: Proceedings of the workshop on applications of ontologies and problem-solving methods, 14th European Conference on Artificial Intelligence ECAI’00, Berlin, Germany 2000; pp. 1–16.

[12] David C, Sun C. Categorization of operations in collaborative editing

systems. In: ACM CSCW’ workshop on collaborative editing

systems, Philadelphia, PA, USA 2000.

[13] Delgado J, Gallego I, Garcia R, Gil R. An ontology for intellectual

property rights: IPROnto. Extended poster abstract, first international

semantic web conference (ISWC2002), Sardinia, Italy 2002.

[14] Pease A, Niles I, Li J. The suggested upper merged ontology: a large

ontology for the semantic web and its applications. In: Working notes

of the AAAI-2002 Workshop on Ontologies and the Semantic Web,

Edmonton, Canada 2002.

[15] Sun C, Zhang Y, Jia X, Yang Y. A generic operation transformation

scheme for consistency maintenance in real-time cooperative editing

systems. In: Proceedings of the international ACM SIGGROUP

conference on supporting group work: the integration challenge,

Phoenix, AZ, USA 1997; pp. 425–34.

[16] Gruber TR. A translational approach to portable ontology specifications. Knowl Acquis 1993;5:199–220.

[17] Miller GA. WordNet: a lexical database for English. Commun ACM

1995;38(11):39–41.

[18] Alfonseca E. A WordNet interface to APL2. ACM SIGAPL: APL

Quote Quad 2002;42(4):7–16.