
Supporting the Dynamic Evolution of Web Service Protocols in Service-Oriented Architectures

SEUNG HWAN RYU

University of New South Wales

FABIO CASATI

University of Trento

HALVARD SKOGSRUD

ThoughtWorks, Australia

and

BOUALEM BENATALLAH and REGIS SAINT-PAUL

University of New South Wales

In service-oriented architectures, everything is a service and everyone is a service provider. Web services (or simply services) are loosely coupled software components that are published, discovered, and invoked across the Web. As the use of Web services grows, in order to correctly interact with them, it is important to understand the business protocols that provide clients with the information on how to interact with services. In dynamic Web service environments, service providers need to constantly adapt their business protocols to reflect the restrictions and requirements proposed by new applications, new business strategies, and new laws, or to fix problems found in the protocol definition. However, the effective management of such a protocol evolution raises critical problems: one of the most critical issues is how to handle instances running under the old protocol when it has been changed. Simple solutions, such as aborting them or allowing them to continue to run according to the old protocol, can be considered, but they are inapplicable for many reasons (for example, the loss of work already done and the critical nature of work). In this article, we present a framework that supports service managers in managing the business protocol evolution by providing several features, such as a variety of protocol change impact analyses automatically determining which ongoing instances can be migrated to the new version of protocol, and data mining techniques inferring interaction patterns used for classifying ongoing instances migrateable to the new protocol. To support the protocol evolution process, we have also developed

Authors’ addresses: S. H. Ryu, B. Benatallah, and R. Saint-Paul, CSE, University of New South

Wales, Sydney, NSW 2052, Australia; email: {seungr, boualem, regiss}@cse.unsw.edu.au; F. Casati,

DIT, University of Trento, Via Sommarive 14, 38050, Povo (Trento), Italy; email: [email protected];

H. Skogsrud, ThoughtWorks, 16 O’Connell Street, Sydney, NSW 2000, Australia; email:

[email protected].

Permission to make digital or hard copies of part or all of this work for personal or classroom use is

granted without fee provided that copies are not made or distributed for profit or direct commercial

advantage and that copies show this notice on the first page or initial screen of a display along

with the full citation. Copyrights for components of this work owned by others than ACM must be

honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,

to redistribute to lists, or to use any component of this work in other works requires prior specific

permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn

Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected]. © 2008 ACM 1559-1131/2008/04-ART13 $5.00 DOI 10.1145/1346237.1346241 http://doi.acm.org/10.1145/1346237.1346241

ACM Transactions on the Web, Vol. 2, No. 2, Article 13, Publication date: April 2008.


database-backed GUI tools on top of our existing system. The proposed approach and tools can help

service managers in managing the evolution of ongoing instances when the business protocols of

services with which they are interacting have changed.

Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution, Maintenance,

and Enhancement; H.3.5 [Information Storage and Retrieval]: Online Information Services

General Terms: Management

Additional Key Words and Phrases: Business protocols, change impact analysis, dynamic evolution,

ongoing instances, Web services, decision trees

ACM Reference Format:

Ryu, S. H., Casati, F., Skogsrud, H., Benatallah, B., and Saint-Paul, R. 2008. Supporting the

dynamic evolution of Web service protocols in service-oriented architectures. ACM Trans. Web

2, 2, Article 13 (April 2008), 46 pages. DOI = 10.1145/1346237.1346241 http://doi.acm.org/10.1145/1346237.1346241

1. INTRODUCTION

Web services and, more generally, service-oriented architectures (SOAs) are quickly becoming the preferred choice for distributed application development and application integration. Web service interfaces today are described using the Web Services Description Language (WSDL). Besides interfaces, business protocols are rapidly gaining impetus and awareness as a necessary part of the service description [Benatallah et al. 2006a]. A business protocol¹ for a service specifies the sequence of messages that a service and its clients exchange to achieve a certain business goal [Alonso et al. 2004], for example, booking flight tickets. Business protocols play an important role in Web service environments. They inform developers how to write clients that correctly interact with a given service, and they allow development tools and runtime middleware to deliver functionality that simplifies the service development lifecycle (for example, automatically generating code skeletons) [Benatallah et al. 2004b].

One of the main motivations behind the adoption of SOA is the need for dynamic applications that can be quickly adapted to changes in business needs and/or regulations. Correspondingly, it is necessary to provide services with the ability to evolve and the ability to minimize the impact of such business-driven evolution in terms of (i) development efforts to implement the changes and (ii) disruption in the services provided to clients while the change is applied. When a service changes, the externally visible behavior of a service, and in particular the protocols on which services base their interactions, can also evolve.

As an example scenario, used for illustration throughout this article, consider a service providing working visas in Australia. The immigration department is the service provider, and tens of thousands of protocol instances (conversations) are active at any given time. The completion of the entire service protocol, corresponding to the approval or rejection of the work permit, takes months. At a certain point, changes in immigration laws (quite frequent these days) may require changes in the service protocol. For example, new documents must be provided by the applicant, or the order in which documents have to be provided is modified.

¹In this article we will use “business protocol,” “protocol,” and “Web service protocol” interchangeably.

The dynamic protocol evolution problem is that of managing the ongoing conversations in the context of a protocol evolution.

Dynamic protocol evolution is an important and challenging problem. From a business continuity perspective, in most cases, we cannot abort all conversations and ask clients (users) to restart the service invocation from the beginning. In the visa example, we cannot ask immigrants to repeat the application and begin from scratch by resubmitting all the documents. However, continuing and completing ongoing conversations with the old protocol may be unacceptable, as the changes are introduced for a reason (for example, a digital passport should be submitted as well); not complying with the new regulation and with the required protocol changes may be unacceptable. Hence, the problem lies in finding some acceptable middle ground for each of the active conversations, especially when the number of such conversations is very large, so that individual manual analysis is unfeasible.

In this article we propose a set of methods and a tool for managing dynamic protocol evolution. The end goal is to provide techniques for understanding which conversations are affected by a protocol evolution and, for those that are, for facilitating the definition and enactment of criteria for managing them. We are not concerned with how a protocol is changed, or whether the protocol changes are syntactically correct. Rather, we focus on making ongoing conversations dynamically comply with protocol changes.

In particular, this article makes the following contributions:

—It presents a method to automatically classify conversations based on the impact protocol evolution has on them. We do so by studying two properties, called forward and backward protocol compatibility. The properties are used as requirements for determining conversations which can be migrated to a new protocol when an old protocol has been changed. Migrating a conversation to a protocol P means that in the future the conversation will have to obey the rules and constraints of protocol P. Then, based on the properties, we define operators for analyzing how the protocol changes impact on ongoing conversations, and for classifying them into migrateable and non-migrateable conversations (Section 3).

—For cases where this analysis cannot be applied (for example, we do not have a formal description of the protocol followed by the clients), we analyze service interaction logs recorded by a Web service monitoring tool and infer, using data mining techniques, interaction patterns of conversations that have completed their executions under an old protocol in the past. From this, we infer if it is likely that conversations may proceed without errors under a new protocol (Section 4).

—We provide management tools for modifying protocols, for supporting change impact analysis based on protocol models and data mining-based migration analysis, and for assisting users in determining migration strategies. These tools are crucial to facilitate evolution, particularly when the number of ongoing conversations is high (Section 6).



Besides the presentation of the technical contributions, we also discuss related work in Section 7 and conclude with a summary of results and directions for future work in Section 8.

2. VERSIONING AND EVOLUTION FRAMEWORK

In this section we first present an example of the protocol model that will be used throughout this article to illustrate our approach. We then introduce protocol evolution concepts: the possible migration strategies and the properties that should be preserved during protocol migration. Although our presentation is based on protocols modeled through a finite state machine, the concepts are generic and can be applied regardless of the modeling formalism.

2.1 Business Protocols Modeling

A business protocol specifies which message exchange sequences are supported by a service. Following our previous work [Benatallah et al. 2006a], we model protocols as finite state machines (FSMs). We use FSMs because they are a well-known paradigm based on a formalism that is easy for non-expert users to understand and that is appropriate for representing reactive behaviors. An FSM consists of states and transitions. States represent the different phases that a service may go through during its interaction with clients, while transitions are triggered by messages sent by the clients to the service provider. Thus, transitions are labeled with a message, corresponding to the invocation of a service operation.

Figure 1 shows a graphical representation of a protocol for an Australian working visa application service. State names, such as Eligible, are logical and do not affect the actual usage of a service. The visa application service is initially in the Start state, and service usage begins when a client sends a checkEligibility message, upon which the service moves to the Eligible state. In general, clients seeking to work in Australia can proceed to state Lodged by filling in the application, submitting their work experience, and testing their English ability, while clients reapplying for the working visa after visa expiry can go to the same state only by filling in the application and providing an employer reference letter. Overseas students who complete eligible studies in Australia proceed to state S-Lodged by filling in the application for overseas students, and submitting their graduation certificate and passport. Then, they check their application status and complete the application. Although, in reality, the protocol could be much more complex, for simplicity, we omit the other stages necessary for the visa application service.

An instance or conversation of the visa application service corresponds to a particular visa application process initiated by a particular client. Several instances of the service may be active at the same time; each instance may be in any of the possible states defined by the protocol. It is important to observe that the entire procedure takes weeks or months to complete. Therefore, any time a modification is applied, it is certain that there will be thousands of active conversations that need to be handled.



[Figure 1: state machine diagram. States: Start, Eligible, ApplicationReady, WESubmitted, Lodged, Checked, Processed, Reviewed, Cancelled, S-ApplicationReady, GCSubmitted, S-Lodged. Messages: checkEligibility, fillInApplication, submitWorkExperience, testEnglishAbility, submitReferenceLetter, fillInApplicationForOverseasStudent, submitGraduationCertificate, submitPassport, checkApproval, confirm, reassess, cancel.]

Fig. 1. Initial business protocol for an Australian working visa application service.

Formally, a business protocol is defined as follows:

Definition 2.1. A business protocol is a tuple $P = (S, s_0, F, M, R)$, which consists of the following elements:

—$S$ is a finite set of states.

—$s_0 \in S$ is the initial state.

—$F$ is a set of final states.

—$M$ is a finite set of messages. In our model, we assume that $M$ is a set of operation names.

—$R \subseteq S^2 \times M$ is the transition relation. Each transition $(s, s', m)$ identifies a source state $s$, a target state $s'$, and a message $m$ that is consumed during this transition.
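Definition 2.1 maps directly onto a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `Protocol`, its field names, and the `step` helper are hypothetical, and the example covers only the reference-letter branch described in the text, with `Lodged` treated as final purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Business protocol P = (S, s0, F, M, R) in the sense of Definition 2.1."""
    states: frozenset       # S
    initial: str            # s0
    finals: frozenset       # F
    messages: frozenset     # M
    transitions: frozenset  # R: (source, target, message) triples

    def __post_init__(self):
        # Well-formedness checks implied by the definition.
        assert self.initial in self.states
        assert self.finals <= self.states
        for src, dst, msg in self.transitions:
            assert src in self.states and dst in self.states
            assert msg in self.messages

    def step(self, state, message):
        """Return the set of states reachable from `state` by consuming `message`."""
        return {dst for src, dst, msg in self.transitions
                if src == state and msg == message}


# Fragment of the visa protocol (assumption: Lodged is final in this fragment only).
fragment = Protocol(
    states=frozenset({"Start", "Eligible", "ApplicationReady", "Lodged"}),
    initial="Start",
    finals=frozenset({"Lodged"}),
    messages=frozenset({"checkEligibility", "fillInApplication",
                        "submitReferenceLetter"}),
    transitions=frozenset({
        ("Start", "Eligible", "checkEligibility"),
        ("Eligible", "ApplicationReady", "fillInApplication"),
        ("ApplicationReady", "Lodged", "submitReferenceLetter"),
    }),
)
```

Storing R as a set of triples keeps the sketch close to the formal definition; an adjacency map would be faster for lookups but is not needed at this scale.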

2.2 Changed Visa Application Protocol

The initial version of the visa application protocol (Figure 1) may change for various reasons. For example, suppose that immigration laws are amended as follows:

—Applicants reapplying for the working visa after their visa expiry should submit an employer reference letter as well as the result of a medical examination to lodge their application.

—Applicants cannot ask for a review of the application result.





[Figure 2: state machine diagram of the changed protocol; a legend marks added (Add) and removed (Remove) elements. Labels not present in Figure 1: RLSubmitted, reportMedicalExamination.]

Fig. 2. Changed protocol for an Australian working visa application service.

The service protocol must be modified to meet the new requirements. The new protocol is depicted in Figure 2 (changed parts are in bold). In this protocol, a state and a transition have been added: the new state RLSubmitted, after state ApplicationReady, and the transition reportMedicalExamination, between RLSubmitted and Lodged. In addition, the state Reviewed and the transition reassess have been removed.
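The change on the reference-letter branch can be expressed as a difference between two transition sets. This is only an illustrative sketch: the helper names are hypothetical, and the transitions listed are our reading of the text and of Examples 1 and 2 rather than a complete encoding of Figures 1 and 2.

```python
def diff_transitions(old_r, new_r):
    """Return (added, removed) transitions between two transition relations."""
    return new_r - old_r, old_r - new_r


def states_of(transitions):
    """Collect every state that appears as a source or target."""
    return {s for src, dst, _ in transitions for s in (src, dst)}


# Reference-letter branch only (assumed reading of the figures).
old_r = {
    ("Eligible", "ApplicationReady", "fillInApplication"),
    ("ApplicationReady", "Lodged", "submitReferenceLetter"),
}
new_r = {
    ("Eligible", "ApplicationReady", "fillInApplication"),
    ("ApplicationReady", "RLSubmitted", "submitReferenceLetter"),
    ("RLSubmitted", "Lodged", "reportMedicalExamination"),
}

added, removed = diff_transitions(old_r, new_r)
added_states = states_of(new_r) - states_of(old_r)
```

Here `added_states` contains only `RLSubmitted`, which matches the state addition described above.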

2.3 Protocol Version and Migration Strategy

Within a dynamic Web service environment, service managers can define different versions of protocols over time to meet a variety of new requirements. In this subsection we explain the possible strategies that can be applied to migrate the instances running under an old protocol. We adopt the strategies suggested in Skogsrud et al. [2004].

—Continue: Active instances are allowed to continue to run according to the old protocol, while new instances will start following the new version. Service managers apply this strategy to ongoing instances when it is acceptable to complete them according to the old protocol. However, in some cases, this strategy could be inapplicable since letting the instances complete according to the old one may not be acceptable, for example, if there are security holes in the old immigration protocol.

—Migration to the new protocol: Active instances are migrated to the new version of protocol. Whenever there is a migration, the state in the new protocol from which the conversation will resume should be defined. This strategy is the most appealing, but it is not always applicable, as we will see in detail. For example, migration to a new version of the visa application protocol may not make sense if continuing with the new protocol means that certain legal requirements (e.g., submission of medical documents, to be done at the beginning of a conversation in the new protocol) are not met. Another possible problem is that clients’ implementations may be unable, unless changes are applied, to interact with the new protocol and to send and receive messages as required.

—Migration to ad hoc protocol. Service managers may define ad hoc protocols for the instances that cannot be migrated to the new protocol. Ad hoc protocols are defined to manage those active conversations for which the other strategies are not applicable. They mediate between the need for capturing the changes prescribed by the new protocol and allowing continuation of active conversations. The details will be described in Section 5.

2.4 Compatibility Properties to be Considered When Migrating Conversations

In order to determine whether a conversation can be migrated to the new protocol, we identify two different degrees of compatibility between conversations and protocols, corresponding to different requirements that service managers may impose on the migration process to assess which migration strategy should be applied: forward and backward compatibility.

—Forward compatibility refers to the ability for clients of active conversations to continue to interact correctly, without runtime errors, with a given service after it is migrated to the new protocol. In some cases, the effects of protocol changes may make the active conversations fail to send required messages when needed; clients are not prepared to interact with the new protocol since they have been developed to interact with the old one.

Example 1. Consider a conversation at state Eligible in the initial protocol (Figure 1). If the conversation is migrated to state Eligible in the new protocol (Figure 2), a violation of the forward compatibility property might occur because the applicant (client) might take one of the changed paths in the new protocol (Eligible.fillInApplication().ApplicationReady.submitReferenceLetter().RLSubmitted.reportMedicalExamination(). ...) with which the client cannot interact.

—Backward compatibility means that, after migration to the new protocol, the backward path (also called history) of an instance (the message sequence followed by the instance so far) must be compatible in the context of the new protocol. This property is not concerned with the possible future progression of the conversation, but with whether the past interactions, up to the time of evolution, correspond to a valid interaction as defined in the new protocol version.



Example 2. Consider again the old protocol (Figure 1) and the new protocol (Figure 2). We assume that there is an applicant currently in the state Lodged of the old protocol. This conversation cannot be directly migrated so that it continues from the same state of the new protocol, since the applicant might have followed the message sequence (Start. ... .Eligible.fillInApplication(). ... .submitReferenceLetter().Lodged) in the old one, which is incompatible with (is not allowed by) the new protocol. Hence, the migration causes a violation of the backward compatibility property.

These two properties are important in managing the dynamic protocol evolution. Forward compatibility is necessary in order to guarantee successful future interaction between clients and services, while backward compatibility is required if we need every conversation, at every instant (including at its completion), to be a valid instance of the protocol to which it is migrated, that is, its execution is one of the executions allowed by the protocol.

We next define these properties formally.

—Let $P = (S, s_0, F, M, R)$ be an old business protocol and $P' = (S', s'_0, F', M', R')$ be a new business protocol.

—$\mathit{State}_P^I$ denotes the current state of an instance $I$ in protocol $P$.

—Let an execution path $p = \langle s_1.m_0.m_1.\ldots.m_{k-2}.m_{k-1}.s_2 \rangle$ be a sequence of messages, from a state $s_1$ to a state $s_2$, such that for $0 \le i \le k-1$, $m_i \in M$ and $s_1 \in S \cup \{s_0\}$ and $s_2 \in S \cup F$.

—$\mathit{History}_I^{s_0,s}$ denotes an execution path actually taken by instance $I$ that starts from an initial state $s_0$ and ends at a state $s = \mathit{State}_P^I$, $s \in S$ in protocol $P$.

—$\mathit{PathsFromStart}_P^{s_0,s}$ denotes a set of execution paths that start from an initial state $s_0$ and end at a state $s = \mathit{State}_P^I$, $s \in S$ in protocol $P$.

—$\mathit{PathsToCompletion}_P^{s,f}$ denotes a set of execution paths that start from a state $s = \mathit{State}_P^I$, $s \in S$ and end at a final state $f \in F$ in protocol $P$.

Definition 2.2. In the migration of instance $I$ from protocol $P$ to protocol $P'$, forward paths that can be taken by the instance $I$ in protocol $P$ are forward compatible in the context of protocol $P'$ if and only if $\mathit{PathsToCompletion}_P^{s,f} \subseteq \mathit{PathsToCompletion}_{P'}^{s',f'}$, where $s'$ is a corresponding state of $s$.

This definition states that, if the new protocol includes all the possible forward paths that can occur between the instance’s current state and final states in the old protocol, the instance can correctly interact with the new protocol.

Definition 2.3. In the migration of instance $I$ from protocol $P$ to protocol $P'$, the execution history of instance $I$ in protocol $P$ is backward compatible in the context of protocol $P'$ if and only if $\mathit{History}_I^{s_0,s} \in \mathit{PathsFromStart}_{P'}^{s'_0,s'}$, where $s'$ is a corresponding state of $s$.

This definition states that, if the actual history taken by an instance belongs to the set of possible paths from the initial state to the state corresponding to the instance’s current state in the new protocol, the history is compatible in the context of the new protocol.



[Figure 3: process diagram. Labels: old service protocol, new service protocol, known client protocols, unknown client protocols, interactions, conversations running under old protocol, model-based analysis, future interaction analysis by inferring interaction patterns, non-migrateable conversations, handling non-migrateable conversations.]

Fig. 3. Process of managing the business protocol evolution.

Definition 2.4. Migration of instance $I$ from protocol $P$ to protocol $P'$ is safe if and only if the forward paths of $I$ satisfy Definition 2.2 and the history of $I$ satisfies Definition 2.3.
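Definitions 2.2 to 2.4 can be checked mechanically once paths are enumerated. The sketch below is an illustration under simplifying assumptions, not the operators defined in Section 3: it assumes an acyclic protocol so that path sets are finite, uses hypothetical function names, and encodes only the reference-letter branch with `Lodged` treated as final for this fragment.

```python
def paths(transitions, start, targets):
    """Message sequences leading from `start` to any state in `targets`.

    Simplifying assumption: the protocol graph is acyclic, so the set of
    paths is finite.
    """
    found = set()
    if start in targets:
        found.add(())  # the empty path: already at a target state
    for src, dst, msg in transitions:
        if src == start:
            found |= {(msg,) + rest for rest in paths(transitions, dst, targets)}
    return found


def forward_compatible(old_r, old_finals, s, new_r, new_finals, s_new):
    """Definition 2.2: paths to completion from s in P are a subset of those from s' in P'."""
    return paths(old_r, s, old_finals) <= paths(new_r, s_new, new_finals)


def backward_compatible(history, new_r, new_initial, s_new):
    """Definition 2.3: the history is one of the start-to-s' paths of P'."""
    return tuple(history) in paths(new_r, new_initial, {s_new})


def safe(history, old_r, old_finals, s, new_r, new_finals, new_initial, s_new):
    """Definition 2.4: both properties hold."""
    return (forward_compatible(old_r, old_finals, s, new_r, new_finals, s_new)
            and backward_compatible(history, new_r, new_initial, s_new))


# Reference-letter branch only; Lodged is treated as final for this fragment.
OLD_R = {("Start", "Eligible", "checkEligibility"),
         ("Eligible", "ApplicationReady", "fillInApplication"),
         ("ApplicationReady", "Lodged", "submitReferenceLetter")}
NEW_R = {("Start", "Eligible", "checkEligibility"),
         ("Eligible", "ApplicationReady", "fillInApplication"),
         ("ApplicationReady", "RLSubmitted", "submitReferenceLetter"),
         ("RLSubmitted", "Lodged", "reportMedicalExamination")}
FINALS = {"Lodged"}

# Example 2: an instance at Lodged whose history lacks the medical examination.
history = ["checkEligibility", "fillInApplication", "submitReferenceLetter"]
lodged_ok = backward_compatible(history, NEW_R, "Start", "Lodged")

# Example 1: an instance at Eligible whose forward path changes in the new protocol.
eligible_ok = forward_compatible(OLD_R, FINALS, "Eligible", NEW_R, FINALS, "Eligible")
```

On this fragment both checks fail, which is consistent with the violations described in Examples 1 and 2. A protocol with cycles would need a bounded or automata-based comparison instead of explicit path enumeration.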

2.5 Protocol Evolution Process

Protocol evolution management can be seen as a multi-step process illustrated in Figure 3. The first step is the modification, by service managers, of the protocol model and its implementation as a service. New instances of the service can directly interact in the realm of the new protocol version, and thus do not create migration problems (assuming, of course, that client implementations have been updated to interact with the new version of the service). Ongoing conversations that have been initiated according to the obsolete version of the protocol are more problematic. Service managers have to decide on an appropriate migration strategy for each instance. This involves classifying instances as either migrateable or non-migrateable. To achieve this classification, we propose a three-step approach as follows:

(1) Model-based analysis. This analysis is done on a model level (Section 3). Old and new protocols are compared to check for their replaceability (that is, whether they can support the same set of message exchanges). There are different ways in which old and new protocols of a service can be replaceable, with different implications in terms of migrateability of conversations:

—Full replaceability. In this case, the new protocol can support all the message exchanges the old protocol supports. This means that the protocol evolution problem is in fact trivial: all instances can be safely migrated. The change is transparent with respect to the clients: they can continue their conversations as the service will support them.

—State replaceability. If the new protocol cannot replace the old one in the general case, we need to compute which states of the old protocol are not affected by changes. Conversations in the unaffected states can be safely migrated to the new version of the protocol.

—Path replaceability. As for the states that are affected by changes, we look at the paths from the initial state to these states (backward path) and from the states to the final state (forward path) in order to determine to or from which states the changes are transparent to the clients. Conversations not in change-transparent states are further analyzed on the basis of their past and future interaction in the following analysis steps.

—Replaceability with respect to a history. We examine the past interaction of conversations to filter out migrateable ones from the states that have different backward paths in old and new protocols. In general, past interactions of conversations are known (e.g., documents for visa application are stored in the database and time stamped with the date of submission).

—Replaceability analysis based on client protocols. The future interaction analysis of conversations, based on client protocols, is conducted on the conversations in states that have different forward paths in the two protocols. This analysis enables service managers to identify which conversations can take an unaffected forward path from their current state to the end. The future interaction may or may not be known depending on the service considered. In Section 3.6, we discuss the case where future interaction, i.e., client protocol, is known to the service manager. The result of this step is a conclusive (deterministic) classification of instances as either migrateable or non-migrateable.

(2) Future interaction analysis by inferring interaction patterns. As outlined here, client protocols might not always be known. To overcome such a situation, we propose applying data mining techniques to infer interaction patterns of already terminated instances and then using the patterns to predict future interactions of ongoing instances. Service interactions are typically logged by a monitoring tool. Interaction logs contain valuable information, for example, data sent by clients during service invocation, which can be used to induce a classification model of instances. In a nutshell, instances that completed their interaction can be automatically classified as compatible or incompatible with the new version of the protocol. Their features can be used to train an automatic classification engine for building classification models; the underlying assumption is that the patterns generated from the models can serve as predictors for the future interaction of clients. If such predictors can be found, they allow further classification, although only probabilistically, of instances for which future interaction is not known. Section 4 details this approach.

(3) Handling non-migrateable conversations. After the best efforts detailed in the two previous steps have been applied to migrate instances, some may remain non-migrateable. We propose two approaches to handle these instances: protocol adapters and ad hoc protocols. Protocol adapters bridge the differences between the old and new protocols so that non-migrateable instances can continue to interact with the new protocol as if they were interacting with the old one. As another solution, we develop ad hoc protocols, which enable the non-migrateable instances to satisfy the requirements newly proposed in the new protocol without aborting them. The ad hoc protocols are defined to handle the cases for which adapters cannot meet the new requirements. Such protocols vanish when there are no active instances under them (Section 5).
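The labeling part of step (2), in which completed instances are marked as compatible or incompatible with the new protocol, might look roughly as follows. This is a sketch under assumptions: the log format, the feature name `client_type`, and the toy protocol are invented for illustration, the protocol is assumed to be deterministic, and the classifier that would consume the labeled pairs is left out.

```python
def accepts(protocol, messages):
    """True if `messages` is a complete execution of `protocol`.

    `protocol` is a (transitions, initial, finals) triple; the sketch assumes
    a deterministic protocol (at most one target per state and message).
    """
    transitions, state, finals = protocol
    for msg in messages:
        targets = [dst for src, dst, m in transitions
                   if src == state and m == msg]
        if not targets:
            return False  # message not allowed in the current state
        state = targets[0]
    return state in finals


def label_completed(log, new_protocol):
    """Attach a compatible/incompatible label to each completed conversation.

    `log` holds (features, messages) pairs; the feature names are invented
    for illustration.
    """
    return [(features, accepts(new_protocol, messages))
            for features, messages in log]


# Hypothetical new protocol and interaction log.
new_protocol = (
    {("Start", "A", "m1"), ("A", "End", "m2"), ("Start", "End", "m3")},
    "Start",
    {"End"},
)
log = [
    ({"client_type": "x"}, ["m1", "m2"]),
    ({"client_type": "y"}, ["m1", "m4"]),
]
training_set = label_completed(log, new_protocol)
```

The resulting `(features, label)` pairs are the kind of input a classification engine could be trained on; how features are chosen and which learner is used is the subject of Section 4.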

3. MODEL-BASED ANALYSIS

In this section, we propose a method to perform change impact analysis on ongoing instances, based on protocol models, and describe how to classify the active instances as migrateable or non-migrateable using the result of the analysis. We perform different kinds of analysis at different levels of details and complexity to identify the largest possible number of conversations that can be migrated.

3.1 Full Replaceability

This analysis checks whether a new protocol can support all the message exchanges that are supported by the old protocol [Benatallah et al. 2004a]. When this is the case, all instances can be safely migrated and continue their interaction with the new one. This situation typically occurs when changes to the protocol are additive, for example, when a new protocol path is added without discarding any of the existing ones.

Definition 3.1 (Full replaceability). Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. Let a complete execution path be a path that starts from an initial state and ends at a final state. A protocol $P'$ can replace $P$ if and only if $P'$ supports all the complete execution paths that $P$ supports.

Algorithm 1. F-replaceability

Input: P = (S, s0, F, M, R) and P′ = (S′, s′0, F′, M′, R′).
Output: Replaceable or not.

begin
 1: Let Replaceability := true;
 2: Let CompletePaths := ∅ and CompletePaths′ := ∅;
 3: CompletePaths := Recursive-ComputePaths(P, s0, F);
 4: CompletePaths′ := Recursive-ComputePaths(P′, s′0, F′);
 5: foreach path ∈ CompletePaths do
 6:     if path ∉ CompletePaths′ then
 7:         Replaceability := false;
 8:         break;
 9: endfor
10: return Replaceability;
end

Function Recursive-ComputePaths(P, s, F)
begin
31: Let Path := "";
32: Let PathSet := ∅ and ReturnPathSet := ∅;
33: foreach m ∈ outgoing messages of s do
34:     s′ := ending state of m;
35:     if s′ ∉ F then
36:         ReturnPathSet := Recursive-ComputePaths(P, s′, F);
37:         foreach RPath ∈ ReturnPathSet do
38:             Path := m + "." + RPath;
39:             PathSet := PathSet ∪ Path;
40:         endfor;
41:     else
42:         PathSet := PathSet ∪ m;
43: endfor
44: return PathSet;
end

We present an algorithm, called F-replaceability, that implements the full replaceability analysis. Informally, we obtain a set of complete paths from the initial state to each final state in old and new protocols (lines (3) to (4)). The procedure Recursive-ComputePaths(P, s, F) computes all the paths that can exist from a given state s (initially the start state) to a final state. If one of the complete paths of the old protocol does not belong to the set of complete paths of the new protocol, this means that replaceability between these protocols is not possible (lines (5) to (9)).
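For readers who prefer executable code, Algorithm 1 could be transcribed along the following lines. This is a sketch rather than the authors' tool: protocols are passed as hypothetical `(transitions, initial, finals)` triples, paths are tuples of messages instead of dot-joined strings, and, like the pseudocode, the recursion assumes an acyclic protocol and stops at the first final state reached.

```python
def compute_paths(transitions, state, finals):
    """Mirror of Recursive-ComputePaths: message sequences from `state`
    to the first final state reached (assumes no cycles)."""
    path_set = set()
    for src, dst, msg in transitions:
        if src != state:
            continue
        if dst in finals:
            path_set.add((msg,))
        else:
            for rest in compute_paths(transitions, dst, finals):
                path_set.add((msg,) + rest)
    return path_set


def f_replaceable(old, new):
    """True if every complete path of `old` is also a complete path of `new`.

    `old` and `new` are (transitions, initial, finals) triples.
    """
    return compute_paths(*old) <= compute_paths(*new)


# Hypothetical protocols: `new` adds an alternative path to `old`.
old = ({("Start", "A", "m1"), ("A", "End", "m2")}, "Start", {"End"})
new = (old[0] | {("Start", "End", "m3")}, "Start", {"End"})

additive_ok = f_replaceable(old, new)
reverse_ok = f_replaceable(new, old)
```

The subset test replaces the explicit loop of lines (5) to (9). In this example the additive change is replaceable, while the reverse direction is not, because the `m3` path is missing from the old protocol.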

3.2 State Replaceability

The previous analysis is a “black and white” replaceability analysis since it does not consider the current states of instances. In fact, even if protocols are not replaceable, it may still be possible to classify some instances as migrateable, depending on their current state.

In particular, this class of analysis determines the states of the old protocol such that conversations in those states are not affected by the changes. These states have the same forward and backward paths in the old and new protocols, which means that the instances in these states followed an unaffected backward path from the initial state to their current state and will follow an unaffected forward path from their state to a final state. Thus, all the instances in these states can be safely migrated to the new protocol without generating any kind of migration property violations. We call these states replaceable states. To perform this analysis, we develop a function that takes as input two protocols, old and new, and generates as output the replaceable states.

Example 3. In Figure 1, the analysis function returns four states: S-ApplicationReady, GCSubmitted, S-Lodged, and Cancelled, since their backward paths and forward paths are not changed in the new protocol (Figure 2). We can migrate all the instances in these states to the corresponding states of the new protocol.
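A function of this kind can be sketched by comparing, for each state, its backward and forward path sets in the two protocols. The code below is illustrative only: it assumes acyclic protocols and name-based state correspondence, uses hypothetical function names, and runs on a toy protocol rather than the visa protocol of Figures 1 and 2.

```python
def forward_paths(transitions, state, finals):
    """Message sequences from `state` to a final state (acyclic protocols only)."""
    out = set()
    if state in finals:
        out.add(())
    for src, dst, msg in transitions:
        if src == state:
            out |= {(msg,) + rest
                    for rest in forward_paths(transitions, dst, finals)}
    return out


def backward_paths(transitions, initial, state):
    """Message sequences from the initial state to `state`, built in reverse."""
    out = set()
    if state == initial:
        out.add(())
    for src, dst, msg in transitions:
        if dst == state:
            out |= {prefix + (msg,)
                    for prefix in backward_paths(transitions, initial, src)}
    return out


def replaceable_states(old, new):
    """States (matched by name) whose backward and forward paths are unchanged."""
    old_r, old_s0, old_f = old
    new_r, new_s0, new_f = new
    old_states = {s for a, b, _ in old_r for s in (a, b)}
    new_states = {s for a, b, _ in new_r for s in (a, b)}
    return {
        s for s in old_states & new_states
        if forward_paths(old_r, s, old_f) == forward_paths(new_r, s, new_f)
        and backward_paths(old_r, old_s0, s) == backward_paths(new_r, new_s0, s)
    }


# Toy protocols: the change inserts state Z on the branch through Y.
old = ({("Start", "X", "a"), ("X", "End", "b"),
        ("Start", "Y", "c"), ("Y", "End", "d")}, "Start", {"End"})
new = ({("Start", "X", "a"), ("X", "End", "b"),
        ("Start", "Y", "c"), ("Y", "Z", "e"), ("Z", "End", "d")}, "Start", {"End"})

unaffected = replaceable_states(old, new)
```

Only `X` is returned here: `Start` and `Y` have a changed forward path, and `End` has a changed backward path.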

Once we understand that an instance in a certain state is migrateable, we have to determine to which state it is migrated in the new protocol. There are several ways to find a state in the new protocol corresponding to one in the old protocol:

—Name-based mapping. The common situation is the one in which protocol changes add or remove states, but do not change state names. In this case we can simply migrate instances to the new protocol and keep them in the same state as they were in the old protocol.

—Path-based mapping. This is the general case. A corresponding state can be determined by looking at the paths leading to the state from the initial state. In this case, a corresponding state is the one that is obtained by following the same path from the initial state in the old and new protocols. Note that in this definition we assume state replaceability, therefore it is irrelevant which path we follow: if two paths (sets of message exchanges) p1 and p2 lead to the same state in the old protocol, they will also lead to the same state in the new protocol.

—User-defined mapping. Another approach is for users (e.g., service managers) to manually choose which state in the old protocol corresponds to which state in the new protocol. This approach is not recommended, as arbitrary state mappings not computed via path-based mappings cannot guarantee that migration properties hold, and in particular do not guarantee backward compatibility, which is instead guaranteed by path-based mapping.

Definition 3.2. Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. The corresponding state of a state $s \in S$ is a state $s' \in S'$ if any of the following holds:

—$\Phi(s) = s'$, where $\Phi : S \to S'$ is a partial function with $\Phi = \{(x, y) \mid x \in S \wedge y \in S' \wedge \mathit{Name}(x) = \mathit{Name}(y)\}$, and $\mathit{Name}(x)$ means the name of state $x$.

—$\bigvee_{i=1}^{n} \mathit{EqualPath}(p_i, p'_i)$, with $p_1, p_2, \ldots, p_n \in \mathit{PathsFromStart}^{s_0,s}_{P}$ and $p'_1, p'_2, \ldots, p'_n \in \mathit{PathsFromStart}^{s'_0,s'}_{P'}$, where $\mathit{EqualPath}(p_1, p_2)$ means that $p_1$ is equal to $p_2$ in terms of message sequences.

—$\Psi(s) = s'$, where $\Psi : S \to S'$ is a partial function with $\Psi = \{(x, y) \mid x \in S \wedge y \in S' \wedge \mathit{MAPUSER}(x, y)\}$, and $\mathit{MAPUSER}(x, y)$ symbolizes that $x$ is mapped to $y$ by the user.


13:14 • S. H. Ryu et al.

Using the corresponding state concept, we can formalize state replaceability as follows.

Definition 3.3. Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. The replaceable states are the set of states $s \in S$ such that $\forall b \in \mathit{PathsFromStart}^{s_0,s}_{P}$, $\forall f \in \mathit{PathsToCompletion}^{s,f}_{P}$: $b \in \mathit{PathsFromStart}^{s'_0,s'}_{P'}$ and $f \in \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.

Algorithm 2. S-replaceability

Input: P = (S, s0, F , M, R) and P ′ = (S ′, s′0, F ′, M′, R′).

Output: A set of states.

begin
1: Let Candidates := ∅;
2: foreach s ∈ S do
3:   s′ = findCorrespondingState(P, P′, s);
4:   if s′ ≠ null then
5:     PathsFromStart := GetBackwardPaths(P, s);
6:     PathsFromStart′ := GetBackwardPaths(P′, s′);
7:     foreach b ∈ PathsFromStart do
8:       if b ∉ PathsFromStart′ then
9:         go to 18;
10:    endfor
11:    PathsToComplete := GetForwardPaths(P, s);
12:    PathsToComplete′ := GetForwardPaths(P′, s′);
13:    foreach f ∈ PathsToComplete do
14:      if f ∉ PathsToComplete′ then
15:        go to 18;
16:    endfor
17:    Candidates := Candidates ∪ s;
18:  endif
19: endfor
20: return Candidates;
end

The S-replaceability algorithm implements the identification of replaceable states. Given the old and new protocols, the algorithm first obtains the corresponding state of a state by calling the procedure findCorrespondingState(P, P′, s) (line (3)). Then, if the corresponding state exists in the new protocol, it calculates two sets of backward paths from the initial state to the state in the two protocols (lines (5) to (6)). Next, the algorithm examines whether the set of backward paths in the new protocol includes the set of backward paths in the old protocol (lines (7) to (9)). For checking the forward paths, the algorithm proceeds similarly (lines (11) to (16)). If all the backward paths and forward paths to/from a state in the old protocol also exist in the new protocol, the state is added to the variable Candidates (line (17)).
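As an illustration, the state-level check can be sketched in Python under some simplifying assumptions: protocols are encoded as hypothetical transition dictionaries, only name-based mapping is used to find corresponding states, and paths visit each state at most once. The helper names (`paths`, `all_states`, `replaceable_states`) and the toy protocols are invented for this sketch.

```python
def paths(trans, source, targets):
    """Message sequences from `source` to any state in `targets` (simple paths only)."""
    found = set()

    def walk(state, visited, messages):
        if state in targets:
            found.add(messages)
        for message, nxt in trans.get(state, []):
            if nxt not in visited:
                walk(nxt, visited | {nxt}, messages + (message,))

    walk(source, frozenset({source}), ())
    return found


def all_states(trans):
    """Every state that appears as a source or a target of a transition."""
    return set(trans) | {nxt for outs in trans.values() for _, nxt in outs}


def replaceable_states(old, new, start, old_finals, new_finals):
    new_names = all_states(new)
    result = set()
    for s in all_states(old):
        if s not in new_names:  # name-based mapping: no corresponding state
            continue
        backward_ok = paths(old, start, {s}) <= paths(new, start, {s})
        forward_ok = paths(old, s, old_finals) <= paths(new, s, new_finals)
        if backward_ok and forward_ok:
            result.add(s)
    return result


# Toy protocols: the new version inserts a state X (message f) after C.
old = {"Start": [("a", "A")], "A": [("b", "B"), ("c", "C")],
       "B": [("d", "End")], "C": [("e", "End")]}
new = {"Start": [("a", "A")], "A": [("b", "B"), ("c", "C")],
       "B": [("d", "End")], "C": [("f", "X")], "X": [("e", "End")]}
print(replaceable_states(old, new, "Start", {"End"}, {"End"}))  # {'B'}
```

Only B keeps both its backward path (a, b) and its forward path (d) in the new version; every other state either precedes or follows the inserted state X.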



Algorithm 3. findCorrespondingState

Input: P = (S, s0, F , M, R), P ′ = (S ′, s′0, F ′, M′, R′) and a state.

Output: a state.

begin
1: Let FoundState := "";
2: foreach s′ ∈ S′ do
3:   if name of s = name of s′ then
4:     FoundState := s′;
5:     go to 17;
6: foreach p ∈ GetBackwardPaths(P, s) do
7:   foreach s′ ∈ S′ do
8:     foreach p′ ∈ GetBackwardPaths(P′, s′) do
9:       if p = p′ then
10:        FoundState := s′;
11:        go to 17;
12: endfor
13: foreach s′ ∈ S′ do
14:   if s is mapped to s′ then
15:     FoundState := s′;
16:     go to 17;
17: return FoundState;
end

Algorithm 4. GetBackwardPaths

Input: protocol P = (S, s0, F , M, R) and state s

Output: a set of paths from s0 to s.

begin
1: Let executionPaths := ∅;
2: Let executionPath := "";
3: if s = s0 then
4:   executionPaths := executionPaths ∪ executionPath;
5: else
6:   parentStates := parent states of s;
7:   incomingMessages := incoming messages of s;
8:   foreach parentState ∈ parentStates and message ∈ incomingMessages do
9:     parentPaths := getParentPaths(parentState);
10:    foreach parentPath ∈ parentPaths do
11:      executionPath := parentPath + "." + message;
12:      executionPaths := executionPaths ∪ executionPath;
13:    endfor
14:  endfor
15: return executionPaths;
16: end



We have omitted the details of the other algorithms for space reasons, but interested readers are referred to Ryu [2007] for further details.

3.3 Path Replaceability

After conducting the two analyses (full replaceability and state replaceability), we focus on states that are not replaceable, and specifically on the paths from the start to such a state (backward paths), and from the state to the end (forward paths). We observe that there are two properties that states may have with respect to such paths: forward path replaceability and backward path replaceability. Together, these properties guarantee state replaceability; however, it often happens that only one of them holds.

3.3.1 Forward Path Replaceability. We identify the states from which the new protocol can replace the old one, so that the changes are transparent to the clients. A function doing this analysis takes as input two protocols and generates the states whose forward paths are the same in the old and new protocols but whose backward paths are not. Some of the instances in these states might have followed an unaffected path (a compatible backward path in conformance with the new protocol), while others might have taken a path affected by changes (an incompatible backward path). So, if we migrate all the active instances in these states to the new protocol, the migration could cause a violation of backward compatibility.

Example 4. In our example protocol, the state Processed is a state of this kind, as one of the three backward paths leading to this state is affected by the protocol changes (the addition of state RLSubmitted and message reportMedicalExamination in Figure 2).

Definition 3.4. Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. The forward path replaceable states are the set of states $s \in S$ such that $\exists b \in \mathit{PathsFromStart}^{s_0,s}_{P}$, $\forall f \in \mathit{PathsToCompletion}^{s,f}_{P}$: $b \notin \mathit{PathsFromStart}^{s'_0,s'}_{P'}$ and $f \in \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.

3.3.2 Backward Path Replaceability. In contrast to forward path replaceability, this analysis identifies the states that might generate a violation of forward compatibility if conversations in these states are migrated. The new protocol can replace the old one only up to these states. The states have the same backward paths, but different forward paths in the old and new protocols. Hence, some of the instances in the states would follow an unaffected forward path while the others would take a forward path affected by changes (incorrect interaction with the new protocol). In this case, not all instances in these states can be safely migrated, but they are guaranteed to be backward compatible in the context of the new protocol. To filter out the migrateable ones, we need to predict the forward paths that the instances in the states will take in the future (Section 3.6).

Example 5. In Figure 1, this analysis generates as output the states Start, Eligible, ApplicationReady, and WESubmitted, which have the same backward paths, but changed forward paths in the new protocol (Figure 2).



Definition 3.5. Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. The backward path replaceable states are the set of states $s \in S$ such that $\forall b \in \mathit{PathsFromStart}^{s_0,s}_{P}$, $\exists f \in \mathit{PathsToCompletion}^{s,f}_{P}$: $b \in \mathit{PathsFromStart}^{s'_0,s'}_{P'}$ and $f \notin \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.

3.4 Non-Replaceability

The last overall analysis determines the states whose instances cannot all be safely migrated and are guaranteed to be neither forward nor backward compatible. In Section 3.6.3, for the instances in these states, we analyze whether they have both followed a compatible backward path and will take a compatible forward path in the context of the new protocol. In addition, removed states can be identified by this analysis.

Example 6. The analysis function returns states Lodged, Checked, and Reviewed.

Definition 3.6. Let $P = (S, s_0, F, M, R)$ and $P' = (S', s'_0, F', M', R')$ be two protocols. The nonreplaceable states are the set of states $s \in S$ such that $\exists b \in \mathit{PathsFromStart}^{s_0,s}_{P}$, $\exists f \in \mathit{PathsToCompletion}^{s,f}_{P}$: $b \notin \mathit{PathsFromStart}^{s'_0,s'}_{P'}$ and $f \notin \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.
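Definitions 3.3 to 3.6 partition states according to whether all backward paths and all forward paths of the old protocol survive in the new one. The following Python sketch expresses that four-way classification; it assumes the path sets of a state and of its corresponding state have already been computed and are given as sets of message tuples. The function name `classify_state` and the sample paths are hypothetical.

```python
def classify_state(back_old, back_new, fwd_old, fwd_new):
    """Classify one state from its old/new backward and forward path sets."""
    backward_ok = back_old <= back_new  # every old backward path exists in the new protocol
    forward_ok = fwd_old <= fwd_new     # every old forward path exists in the new protocol
    if backward_ok and forward_ok:
        return "replaceable"                 # Definition 3.3
    if forward_ok:
        return "forward path replaceable"    # Definition 3.4
    if backward_ok:
        return "backward path replaceable"   # Definition 3.5
    return "non-replaceable"                 # Definition 3.6


# Hypothetical path sets for a single state.
unchanged = {("a", "b")}
changed_old, changed_new = {("c",), ("d",)}, {("c",), ("x", "d")}

print(classify_state(unchanged, unchanged, unchanged, unchanged))        # replaceable
print(classify_state(changed_old, changed_new, unchanged, unchanged))    # forward path replaceable
print(classify_state(unchanged, unchanged, changed_old, changed_new))    # backward path replaceable
print(classify_state(changed_old, changed_new, changed_old, changed_new))  # non-replaceable
```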

3.5 Replaceability with Respect to a History

In order to filter out migrateable instances, it is not sufficient to classify active instances based on the overall replaceability analysis alone, since if we only compare old and new protocols, we cannot extract migrateable instances from the change-affected states, where migrateable and non-migrateable instances stay together (for example, states identified by forward path replaceability). Hence, we describe how to further categorize active instances in the states.

This refers to the analysis of the actual backward path (history) that an instance took, up to the evolution time. This analysis plays an important role in filtering out migrateable instances from the states identified by the forward path replaceability.

Definition 3.7. (Replaceability with respect to a history). An instance $I$ is migrateable to protocol $P'$, with respect to its history $\mathit{History}^{s_0,s}_{I}$, if and only if $\mathit{History}^{s_0,s}_{I}$ satisfies Definition 2.3.

Example 7. Consider the state Processed of the old protocol in Figure 1. If all the instances in the state are migrated to the state Processed of the new protocol, this might cause a violation of backward compatibility, since some of them might have followed the backward path (Start. ... .ApplicationReady.submitReferenceLetter().Lodged. ... .confirm().Processed), which cannot be simulated in the new protocol because they have not provided the result of a medical examination. So, to filter out migrateable instances, there is a need for analyzing the actual execution path taken by individual instances in the state. By doing this, the service manager knows that some of them followed path 1 (Start. ... .ApplicationReady.submitWorkExperience(). ... .Lodged. ... .confirm().Processed) while the others followed path 2 (Start. ... .ApplicationReady.submitReferenceLetter().Lodged. ... .confirm().Processed). In this case, only the instances that followed path 1 can be migrated to the state Processed of the new protocol.
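One way to picture the history check is to replay an instance's recorded message sequence on the new protocol and see whether it reaches the corresponding state. The Python sketch below does this under stated assumptions: it reads Definition 2.3 as "the history can be simulated in the new protocol", assumes a deterministic protocol, and uses a small hypothetical protocol whose message names only loosely echo Example 7 (it is not the protocol of Figure 2).

```python
def replay(trans, start, history):
    """Follow `history` from `start`; return the state reached, or None if a message is not allowed."""
    state = start
    for message in history:
        state = dict(trans.get(state, [])).get(message)
        if state is None:
            return None
    return state


def migrateable_by_history(new, start, history, corresponding_state):
    """Assumed reading of Definition 3.7: the history must lead to the corresponding state."""
    return replay(new, start, history) == corresponding_state


# Hypothetical new protocol: a reference letter now requires a medical examination.
new = {
    "Start": [("apply", "Ready")],
    "Ready": [("submitWorkExperience", "Lodged"), ("submitReferenceLetter", "RLSubmitted")],
    "RLSubmitted": [("reportMedicalExamination", "Lodged")],
    "Lodged": [("confirm", "Processed")],
}
path1 = ("apply", "submitWorkExperience", "confirm")
path2 = ("apply", "submitReferenceLetter", "confirm")
print(migrateable_by_history(new, "Start", path1, "Processed"))  # True
print(migrateable_by_history(new, "Start", path2, "Processed"))  # False
```

The second history fails because, in this toy protocol, confirm is not accepted before the medical examination has been reported.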

3.6 Replaceability Analysis Based on Client Protocols

After performing all the replaceability analyses described previously, there remain instances that followed compatible backward paths, but are not guaranteed to correctly interact with the new protocol. However, if we have knowledge of the clients’ protocols, we can assess whether the protocol changes leave clients unaffected because they always (as per their protocol) take certain paths (possibly untouched by the modifications) rather than others. Hence, virtually all replaceability analyses discussed here can be modified to take into account the client protocols, when such a definition is available.

3.6.1 Replaceability With Respect To a Client Protocol. In our previous work [Benatallah et al. 2004a, 2006a], this class of analysis was proposed for identifying whether a new version of a protocol can replace an old one when interacting with clients supporting a protocol Pc. If every legal message sequence between an old protocol Po and a client protocol Pc is also supported between a new protocol Pn and Pc, we say that Pn can replace Po with respect to Pc.

Example 8. Protocol Pn of Figure 2 can replace protocol Po of Figure 1 when interacting with client protocol Pc1 (Figure 4(a)). So, the instances having such a client protocol can be safely migrated to protocol Pn.

The formal definition of replaceability analysis is given in Benatallah et al. [2006a].

3.6.2 Replaceability With Respect To a State and a Client Protocol. Here, we propose a more fine-grained replaceability analysis by considering the current states of instances as well as client protocols. Using the state and client protocol information, we can see whether the client will never follow the changed forward path from its current state, in which case there is no problem in migrating the client’s instance to the corresponding state in the new protocol.

Example 9. As an example of using state information and a client protocol to identify the future interaction, assume that client 1 stays in the Eligible state of Figure 1, client 2 is in the WESubmitted state in the same protocol, and both of them have client protocol Pc2 of Figure 4. According to the replaceability analysis only considering client protocols, the clients’ instances are classified as non-migrateable, as Pn of Figure 2 allows a client to report a medical examination (message reportMedicalExamination at state RLSubmitting), while Po does not support this message.

However, if we consider the states and client protocols at the same time, we get different results. Namely, client 1’s instance in state Eligible cannot be migrated to the corresponding state of Pn; however, client 2’s instance in state WESubmitted can be migrated to the corresponding state of Pn, since the client will never take the changed forward path from its current state, according to its protocol. The migration does not cause a violation of forward compatibility.

Fig. 4. Client protocols interacting with the old Australian working visa application service: (a) a client protocol Pc1; (b) a client protocol Pc2.

Definition 3.8. (Replaceability with respect to a state and a client protocol). Let $P$ and $P'$ be two business protocols, and $CP$ be a client protocol. Let $\mathit{CommonPaths}^{s,f}(P, CP)$ be the set of common execution paths between $P$ and $CP$, from a state $s = \mathit{State}^{I}_{P}$ to a final state $f \in F$. An instance $I$ is migrateable to protocol $P'$, with respect to client protocol $CP$, if and only if $\forall p \in \mathit{CommonPaths}^{s,f}(P, CP)$, $p \in \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.

It should be noted that when performing the forward path analysis using client protocols and state information, we determine the migrateability of instances only if the set of all forward paths that an instance can follow from its current state to final states in the old protocol also exists in the context of the new protocol.
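The condition of Definition 3.8 reduces to a set inclusion once the relevant path sets are available. The following Python sketch assumes they are given as sets of message tuples from the instance's current state: the forward paths of the old protocol, the forward paths the client protocol allows, and the forward paths of the new protocol from the corresponding state. The function name and the sample paths are hypothetical.

```python
def migrateable_wrt_client(forward_old, forward_client, forward_new):
    """Check that every path shared by the old protocol and the client also exists in the new protocol."""
    common_paths = forward_old & forward_client  # paths the client can actually take
    return common_paths <= forward_new


# Hypothetical forward paths from one state.
forward_old = {("confirm",), ("reassess",)}
forward_new = {("confirm",), ("audit", "reassess")}  # the reassess path was changed

cautious_client = {("confirm",)}                 # never takes the changed path
general_client = {("confirm",), ("reassess",)}   # may take the changed path

print(migrateable_wrt_client(forward_old, cautious_client, forward_new))  # True
print(migrateable_wrt_client(forward_old, general_client, forward_new))   # False
```

Note that in this sketch an instance whose client shares no forward path with the old protocol is vacuously classified as migrateable; a real implementation would probably treat that case separately.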

3.6.3 Replaceability With Respect To a State, a Client Protocol and a History. This analysis is performed on the instances in the states returned by non-replaceability. To extract migrateable instances, the analysis function examines which backward paths the instances followed in the old protocol as well as which forward paths they will take based on the protocols of clients. Service managers apply the analysis to the instances in states Lodged and Checked of the old protocol.

Definition 3.9. (Replaceability with respect to a state, a client protocol and a history). Let $P$ and $P'$ be two business protocols, and $CP$ be a client protocol. An instance $I$ is migrateable to protocol $P'$, with respect to client protocol $CP$ and its history $\mathit{History}^{s_0,s}_{I}$, if and only if $\mathit{History}^{s_0,s}_{I}$ satisfies Definition 2.3 and $\forall p \in \mathit{CommonPaths}^{s,f}(P, CP)$, $p \in \mathit{PathsToCompletion}^{s',f'}_{P'}$, where $s'$ is a corresponding state of $s$.

4. FUTURE INTERACTION ANALYSIS BY INFERRING INTERACTION PATTERNS

In Section 3, we explained how to classify ongoing conversations as either migrateable or non-migrateable, through performing model-based analysis on protocols and conversations. However, this analysis (i) assumes that protocol models of clients are available, and (ii) is conservative, in that of multiple possible future paths a client may take through the protocol, it assumes a worst-case scenario (forward compatibility is not guaranteed unless all possible paths are compatible). In this section, to overcome these limitations, we present an approach for supporting service managers in further classifying conversations when the client protocols are unknown, based on statistically motivated assumptions on the possible future behavior of a client.

4.1 Approach

To carry out future interaction analysis, we propose to apply data mining techniques to audit trails of completed conversations (recorded in logs) and from them derive a set of models to predict the future behavior of each conversation, or at least its forward compatibility.

The data mining technique chosen for the interaction analysis is that of decision trees [Quinlan 1986; Witten and Frank 2005], because they combine the ability to derive classification and prediction rules, and due to the fact that they can be easily read by users to understand the classification logic. A decision tree is constructed from a training set of records, each of which has a set of attributes of an instance (e.g., an applicant in working visa application services) and a class label (e.g., “permit” or “reject” applicant). Each node in a tree identifies a set of instances that satisfy a node condition (their attributes have certain values) as well as all conditions in the path from the node to the root. Non-leaf nodes in a decision tree involve a test comparing a particular attribute with a constant value. Leaf nodes are associated with a class label that characterizes (classifies) all instances in the leaf. Hence, to classify an instance with a label, we start from the root node and traverse the tree, based on the instance attributes and the node conditions, until a leaf is reached. Note that conditions of nodes at the same level identify a partition on attribute values, so there is always one and only one path that can be taken. The combination of conditions involved in all the tests on a path from the root node to a leaf produces a classification pattern (rule) for determining the class label assigned to the leaf. For example, a classification pattern can be: applicants who have more than 10 years of experience and speak fluent English can obtain a visa permit.
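To make the traversal concrete, here is a small hand-built tree in Python that encodes the example pattern above. It is only an illustration of how a record is routed from the root to a leaf: the tree is written by hand rather than learned, and the nested-dictionary layout and the attribute names `workExperience` and `english` are assumptions of this sketch.

```python
# Non-leaf nodes hold a test and two branches; leaves are plain class labels.
tree = {
    "test": lambda record: record["workExperience"] > 10,
    "yes": {
        "test": lambda record: record["english"] == "fluent",
        "yes": "permit",
        "no": "reject",
    },
    "no": "reject",
}


def classify(node, record):
    """Walk from the root to a leaf, taking exactly one branch at every test."""
    while isinstance(node, dict):
        node = node["yes"] if node["test"](record) else node["no"]
    return node


print(classify(tree, {"workExperience": 12, "english": "fluent"}))   # permit
print(classify(tree, {"workExperience": 12, "english": "limited"}))  # reject
print(classify(tree, {"workExperience": 3, "english": "fluent"}))    # reject
```

The conjunction of tests along the path to the "permit" leaf is exactly the classification pattern quoted in the text.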

We map the interaction analysis problem to a decision tree (classification) problem where the conversations are the objects to be classified, the classification features are selected attributes of conversations, and the classes are migrateable (interaction compatible with the new protocol) or non-migrateable (interaction not compatible with the new one).

To this end, the approach we follow is inspired by the ones adopted in business process prediction [Grigori et al. 2001; Castellanos et al. 2005], with some modifications to make it applicable and practical for the problem at hand. The idea is to generate trees for each stage in the process (in our case, for each state in the protocol) where predictions need to be made. In particular, the idea is to generate one tree per state and per different path (modulo loops) that can be taken to reach that state.

In a nutshell, we look at completed conversations and build the tree by first labeling the conversations based on the forward path they took from the state, and hence based on whether they are migrateable or not (this can be determined by seeing if the forward path exists in the new protocol). In building the tree, we only use features of conversations (e.g., parameters of messages) that took the path and that are in the state for which the tree is computed, so that node conditions only include features that are known (defined) for conversations that took that path and are in that state.

At migration time, we look at active conversations and select the tree based on the path and state of the conversation. Then we traverse the tree using the conversation features and classify the instance as migrateable or not. We next detail the algorithm and the approach we took to make the problem tractable, in particular in terms of reducing the potentially unlimited number of features to be considered for building the trees.

We next discuss how to address these challenges: how the states are identified, which attributes are considered for the analysis, which trees are generated, and how the results are used to determine migration properties of conversations.

4.2 Feature Selection and Data Preparation

Trees are generated from service interaction logs (logs of messages exchanged among services). We assume that logs are clean of noise and that we have logs of conversations (logs of messages where it is known which message belongs to which conversation). We refer the reader to Motahari et al. [2007] and Nezhad et al. [2007] for approaches to obtain such a log when it is not available in the first place. We also assume that the service interaction logs contain: (i) invoked message name, (ii) information on sender, receiver, and timestamp, (iii) SOAP header, and (iv) SOAP body. This assumption is in agreement with the data logged by commercial service monitoring tools (e.g., HP SOA Manager).



4.2.1 Identifying Candidate States. In our problem domain, the goal of classification patterns generated by the decision tree is to identify which attributes can be exploited for predicting the likelihood of the conversation to actually follow a certain forward path (i.e., a path not affected by protocol changes). To build the decision trees, we determine the most relevant subset of states, as there is no need to build trees for every state, as discussed in the following.

The candidate states can be identified by Algorithm 5:

—Exclude the initial state from the candidate states because there is no information to be used for constructing a decision tree, and the final states where the already completed conversations stay. In addition, disregard the states returned by the state replaceability analysis (lines (3) to (4)). In our visa application protocol (Figure 1), those states are Start, Cancelled, Reviewed, Processed, S-ApplicationReady, GCSubmitted, and S-Lodged.

—Exclude the states that have only one forward path to a final state regardless of whether the path has been changed in the new protocol, since we can easily predict the forward path that ongoing conversations in those states will take (lines (6) to (7)).

—For the remaining states, determine whether there is more than one forward path to final states and one of the paths is changed in the new protocol. If they satisfy these conditions, we put them into a set of candidate states because we have to decide whether the ongoing conversations in those states might take the changed forward path in the future. In the example, the identified candidate states are Eligible, ApplicationReady, WESubmitted, Lodged, and Checked (lines (8) to (10)).

Algorithm 5. Identifying candidate states for building decision trees

Input: protocol P = (S, s0, F , M, R).

Output: a set of states.

begin
1: Let Candidates := { };
2: foreach s ∈ S do
3:   if s = s0 or s ∈ F or s ∈ ReplaceableStates(P) then
4:     continue;
5:   PathsToCompletion := GetForwardPaths(P, s);
6:   if |PathsToCompletion| = 1 then
7:     continue;
8:   else if |PathsToCompletion| ≥ 2 and one of PathsToCompletion is changed then
9:     Candidates := Candidates ∪ s;
10:    continue;
11: endfor
12: return Candidates;

end
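The selection performed by Algorithm 5 can be sketched in Python as below. The sketch assumes that the forward paths of every state have already been computed for both protocol versions and are passed in as dictionaries from state names to sets of message tuples, and that the set of replaceable states comes from the state replaceability analysis. All names and the toy data are hypothetical.

```python
def candidate_states(states, start, finals, replaceable, forward_old, forward_new):
    """Keep states with at least two forward paths, one of which is changed in the new protocol."""
    candidates = set()
    for s in states:
        if s == start or s in finals or s in replaceable:
            continue  # nothing to learn from, already completed, or safely migrateable
        old_paths = forward_old.get(s, set())
        if len(old_paths) < 2:
            continue  # a single forward path needs no prediction
        if any(p not in forward_new.get(s, set()) for p in old_paths):
            candidates.add(s)
    return candidates


# Toy data: the forward path through C is changed in the new protocol.
states = {"Start", "A", "B", "C", "End"}
forward_old = {"Start": {("a", "b", "d"), ("a", "c", "e")},
               "A": {("b", "d"), ("c", "e")}, "B": {("d",)}, "C": {("e",)}, "End": {()}}
forward_new = {"Start": {("a", "b", "d"), ("a", "c", "f", "e")},
               "A": {("b", "d"), ("c", "f", "e")}, "B": {("d",)}, "C": {("f", "e")}, "End": {()}}
print(candidate_states(states, "Start", {"End"}, {"B"}, forward_old, forward_new))  # {'A'}
```

State C has a changed forward path but only one of them, so no tree is needed for it; only A requires a prediction.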



Table I. Some Classification Attributes

Attribute        | Value                                  | Related Message
-----------------|----------------------------------------|------------------------------------
age              | Nominal (e.g., {20-29, 30-39, ... })   | checkEligibility
visaType         | Nominal (e.g., {New, Renew, ... })     | checkEligibility
odl              | Nominal (e.g., {Yes, No})              | fillInApplication
relevantLicense  | Nominal (e.g., {Yes, No})              | fillInApplication
sponsorship      | Nominal (e.g., {Yes, No})              | fillInApplication
workExperience   | Numeric (e.g., 1, 2, 3, ...)           | submitWorkExperience
englishTest      | Nominal (e.g., {LimitedUser, ... })    | testEnglishAbility
referenceResult  | Nominal (e.g., {Satisfaction, ... })   | submitReferenceLetter
checkResult      | Nominal (e.g., {accept, review})       | checkApproval
universityName   | Nominal (e.g., {UNSW, ...})            | fillInApplicationForOverseasStudent
country          | Nominal (e.g., {EU, ...})              | fillInApplicationForOverseasStudent

4.2.2 Identifying Candidate Attributes. Feeding all possible attributes to a decision tree algorithm would not be a good approach, because (i) the table may have undefined (NULL) values that might cause low accuracy of classification patterns and (ii) the computation of building the tree with all attributes becomes heavy. Therefore, based on the states computed by the technique described in the previous subsection, we identify which, among the many instance attributes (Table I) in the input data table, should be selected as the interesting attributes for building decision trees that tell us which future paths the conversations in the states will take. The approach we followed for this uses the path information as follows:

—Given a candidate state (CS), we identify only the attributes of the messages that belong to a backward path from the initial state to CS. We do not need to consider the attributes related to the messages exchanged from CS to final states, regardless of whether the conversations in CS will follow them or not. If we select the attributes from messages exchanged after the CS, targeted at deriving the state-based decision tree, the values of some of the attributes may be indeterminate for the conversations in progress to that point, and the classification rules generated by the decision tree including such attributes may not work very well (lines (4) to (6)).

Algorithm 6. Identifying Attributes as Input to Decision Tree

Input: candidate state CS, input data table T, and protocol P = (S, s0, F , M, R).

Output: a set of identified attributes.

begin
1: Let TotalAttr := an attribute set of T;
2: Let IdentifiedAttr := {} and AttrSubset := {};
3: PathsFromStart := GetBackwardPaths(P, CS);
4: if |PathsFromStart| = 1 then
5:   AttrSubset := an attribute set of backPath ∈ PathsFromStart;
6:   IdentifiedAttr := AttrSubset ∩ TotalAttr;
7: else if |PathsFromStart| ≥ 2 then
8:   if PathsFromStart contains incompatible path then
9:     foreach backPath ∈ PathsFromStart do
10:      AttrSubset := an attribute set of backPath;
11:      IdentifiedAttr := IdentifiedAttr ∪ AttrSubset;
12:    M := findMessagesOfIncompatiblePaths(PathsFromStart);
13:    foreach m ∈ M do
14:      IdentifiedAttr := IdentifiedAttr − attributes of m;
15:  else
16:    foreach backPath ∈ PathsFromStart do
17:      AttrSubset := an attribute set of backPath;
18:      IdentifiedAttr := IdentifiedAttr ∪ AttrSubset;
19: return IdentifiedAttr;

end

—If there is more than one backward path from the initial state to CS, the instances in CS might have taken different paths to lead to that state, which means that the instances could have different sets of attributes depending on the path taken. Hence, we divide the input data table into two or more tables, each having a subset of the attributes based on the path from the data table. In this case, if there are N paths leading to CS, we generate N tables (eventually, N different trees will be generated from them) (lines (16) to (18)).

—In particular, when one of the backward paths is incompatible in the context of the new protocol, we can prune the attributes of messages involved in the path. The pruning is based on the assumption that conversations (for which future interaction analysis is needed) in CS already followed a compatible backward path and thus we do not need to look at the attributes of messages M of the incompatible path. Specifically, the messages M should not belong to the intersection between the incompatible and compatible backward paths. For example, if state Lodged in Figure 1 is the CS, we can prune the attributes of message submitReferenceLetter (lines (8) to (14)).

Therefore, this approach identifies the attributes relevant to the goal of the decision tree construction and prevents the undefined values from being involved in the input data table.
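A rough Python rendering of the attribute selection follows. It mirrors the union-then-prune logic of Algorithm 6 rather than the per-path table split, and it rests on several assumptions: backward paths are given as message tuples, incompatible paths are flagged by the caller, and the mapping from messages to attributes is taken from Table I. The two backward paths in the example are a simplified, hypothetical stand-in for the paths leading to Lodged, whose full message sequences are not reproduced here.

```python
def identify_attributes(backward_paths, incompatible_paths, message_attrs, table_attrs):
    """Collect attributes of messages on backward paths, pruning messages that occur only on incompatible paths."""
    identified = set()
    for path in backward_paths:
        for message in path:
            identified |= message_attrs.get(message, set())

    # Messages shared with a compatible path must be kept.
    compatible_messages = {m for p in backward_paths if p not in incompatible_paths for m in p}
    for path in incompatible_paths:
        for message in path:
            if message not in compatible_messages:
                identified -= message_attrs.get(message, set())
    return identified & table_attrs


# Message-to-attribute mapping taken from Table I.
message_attrs = {
    "checkEligibility": {"age", "visaType"},
    "fillInApplication": {"odl", "relevantLicense", "sponsorship"},
    "submitWorkExperience": {"workExperience"},
    "submitReferenceLetter": {"referenceResult"},
}
table_attrs = set().union(*message_attrs.values())

# Hypothetical, simplified backward paths to the candidate state.
compatible = ("checkEligibility", "fillInApplication", "submitWorkExperience")
incompatible = ("checkEligibility", "fillInApplication", "submitReferenceLetter")

attrs = identify_attributes([compatible, incompatible], [incompatible], message_attrs, table_attrs)
print(sorted(attrs))
```

With this input, referenceResult is pruned because submitReferenceLetter appears only on the incompatible path, while the attributes of the shared messages are retained.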

4.2.3 Labeling Instances. After identifying the candidate attributes, we provide the instances in the data table with class label information. We classify the completed instances into two classes: compatible (C) and noncompatible (NC) instances. A compatible instance is one for which, from the candidate state to a final state, the message sequence taken by the instance is compatible in the context of the new protocol. The process of labelling instances can be done by Algorithm 7:

Algorithm 7. Labelling instances for constructing state-based trees

Input: candidate state CS, input data table T, and protocol P = (S, s0, F , M, R).

Output: labelled training set based on a state, which also includes candidate attributes.

begin
1: Let CandidateRecords := { };
2: Let CM := null;
3: foreach t ∈ T do
4:   CM := correlated message attribute of t;
5:   if CM includes all the incoming and outgoing messages to/from CS then
6:     if CM, from s0 to CS, is compatible then
7:       if CM, from CS to f ∈ F, is compatible then
8:         t is labelled with C;
9:       else
10:        t is labelled with NC;
11:      CandidateRecords := CandidateRecords ∪ t;
12: endfor
13: return CandidateRecords;

end

—We extract the instances that have passed over the candidate state (line 5).

—From the instances, we exclude the instances that have the incompatible backward path to the state (line 6).

—The instances filtered out by these procedures are labelled C or NC, depending on whether, from the candidate state to a final state, message sequences taken by the instances are compatible in the context of the new protocol (lines (8) to (11)).

The number of instances in the training data set for each candidate state might be different, depending on the state, since the message sequence followed by an instance is compatible with respect to a certain candidate state, but is incompatible with respect to another candidate state.
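The three labelling steps can be sketched in Python as follows. This is an approximation of Algorithm 7 under stated assumptions: each completed conversation is a tuple of message names, protocols are deterministic transition dictionaries, the corresponding state is found by name, and "compatible" is read as "can be replayed on the new protocol". The toy protocols and message names are hypothetical and much smaller than those of Figures 1 and 2.

```python
def step(trans, state, message):
    """Next state after `message`, or None if the message is not allowed in `state`."""
    return dict(trans.get(state, [])).get(message)


def run(trans, state, messages):
    for message in messages:
        state = step(trans, state, message)
        if state is None:
            return None
    return state


def label_instances(records, old, new, start, cs, new_finals):
    labelled = []
    for messages in records:
        # Step 1: keep only conversations that passed through the candidate state.
        state, split = start, None
        for i, message in enumerate(messages):
            state = step(old, state, message)
            if state is None:
                break
            if state == cs:
                split = i + 1
                break
        if split is None:
            continue
        backward, forward = messages[:split], messages[split:]
        # Step 2: drop conversations whose backward path is incompatible.
        if run(new, start, backward) != cs:
            continue
        # Step 3: label by the compatibility of the forward path.
        label = "C" if run(new, cs, forward) in new_finals else "NC"
        labelled.append((messages, label))
    return labelled


old = {"Start": [("apply", "Ready")],
       "Ready": [("workExp", "Lodged"), ("refLetter", "Lodged")],
       "Lodged": [("confirm", "Processed"), ("reassess", "Reviewed")]}
new = {"Start": [("apply", "Ready")],
       "Ready": [("workExp", "Lodged"), ("refLetter", "RLSubmitted")],
       "RLSubmitted": [("medical", "Lodged")],
       "Lodged": [("confirm", "Processed")]}
records = [("apply", "workExp", "confirm"), ("apply", "workExp", "reassess"),
           ("apply", "refLetter", "confirm"), ("apply",)]
for messages, label in label_instances(records, old, new, "Start", "Lodged", {"Processed"}):
    print(label, messages)
```

In this toy run, the third record is dropped for its incompatible backward path and the fourth because it never reached the candidate state, leaving one C and one NC record.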

Example 10. Suppose we create the labelled training set for the candidate state Lodged (Figure 1). The instances that did not pass the state Lodged should be excluded from the dataset. We disregard the instances that have taken the paths (Start. ... .Eligible.fillInApplicationForOverseasStudent(). ... Processed()). Among the remaining instances, we also exclude the instances that have taken the backward path (Start. ... .Eligible.fillInApplication().ApplicationReady.submitReferenceLetter().Lodged), which is incompatible in the context of the new protocol. In the end, the instances labelled with C are ones that followed the message sequence (Start. ... Eligible.fillInApplication(). ... Lodged. ... confirm().Processed), while the instances labelled with NC followed the message sequence (Start. ... Eligible.fillInApplication(). ... Lodged. ... reassess().Reviewed).

4.3 Tree Generation and Usage

4.3.1 Generating Tree and Interaction Patterns. Finally, we build the state-based decision trees for each of the candidate states by feeding the decision tree algorithm with the attributes from the labelled training data set. In our protocol example (in Figure 1), we build five state-based decision trees used to classify ongoing conversations migrateable to the new protocol, instead of constructing twelve trees corresponding to the number of states. The first tree, corresponding to the state Eligible, is constructed by using only data related to the checkEligibility message that visa applicants sent to the service provider. For the last tree, corresponding to the state Checked, the decision tree algorithm exploits the data obtained from the messages exchanged from the initial state Start to the state Checked.

Fig. 5. Simplified state-based decision tree for state Lodged.

Example 11. Figure 5 depicts the state-based tree for the state Lodged, showing which attributes conversations in that state should have in order to continue to interact correctly with the service after being migrated to the new protocol. The paths from the root to the middle leaves of the tree show that, if the applicants' number of years of experience is greater than four, their skills belong to the occupations in demand list (odl), and they obtained the English test result ModestUser or GoodUser, their application instances can be classified as migrateable because they could follow a compatible forward path in the context of the new protocol. The combination of branching conditions on these paths can be converted to classification (interaction) patterns in the form of IF-THEN statements. The generated patterns are:

• IF workExperience > 4 AND odl = "yes" AND englishResult = "ModestUser" THEN forward path = "C"

• IF workExperience > 4 AND odl = "yes" AND englishResult = "GoodUser" THEN forward path = "C"

For simplicity, we do not show the actual decision tree and some details of the results.



Fig. 6. Adapters bridging the protocol differences.

4.3.2 Applying Interaction Patterns to Ongoing Conversations. Once a decision tree for a certain candidate state (CS) has been built and interaction patterns have been generated from the tree, we use the patterns to make predictions about which future paths the ongoing conversations in CS will take. At that time, we cannot classify the conversations in CS directly, because the client protocols are unavailable. The interaction (classification) patterns for the state are retrieved and applied to the live conversation data in order to assess whether the values of the conversation attributes satisfy the IF condition of the patterns, which stipulate the features of instances that took compatible forward paths in the context of the new protocol. Furthermore, each rule is also associated with the probability that the examined conversation will take a compatible future interaction. If the data of conversations in CS satisfy the rules with a probability above a threshold set by service managers, those conversations can be classified as migrateable.
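The following Python fragment is a minimal sketch of how such patterns might be applied to a live conversation. The `Pattern` class, the `is_migrateable` function, and the probability values 0.80 and 0.90 are assumptions made for illustration; only the rule conditions themselves are taken from Example 11.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Attributes = Dict[str, object]  # attribute values of one ongoing conversation


@dataclass
class Pattern:
    condition: Callable[[Attributes], bool]  # the IF part of the rule
    probability: float  # estimated chance of a compatible forward path


def is_migrateable(conversation: Attributes,
                   patterns: List[Pattern],
                   threshold: float) -> bool:
    """True if some rule matches with a probability above the threshold."""
    return any(
        p.probability > threshold and p.condition(conversation)
        for p in patterns
    )


def lodged_rule(english_result: str) -> Callable[[Attributes], bool]:
    # Builds the condition of one pattern from Example 11.
    def condition(a: Attributes) -> bool:
        return (
            a.get("workExperience", 0) > 4
            and a.get("odl") == "yes"
            and a.get("englishResult") == english_result
        )
    return condition


# Probabilities below are made-up values for the sketch.
LODGED_PATTERNS = [
    Pattern(lodged_rule("ModestUser"), 0.80),
    Pattern(lodged_rule("GoodUser"), 0.90),
]
```

A service manager who sets the threshold to 0.75 would, under these assumed probabilities, migrate a conversation with six years of experience, odl = "yes", and englishResult = "GoodUser", while a threshold of 0.95 would leave it unclassified.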

5. HANDLING NON-MIGRATEABLE CONVERSATIONS

Conversations for which no satisfactory migration solutions have been found by the analysis methods described in previous sections are doomed to fail in interacting with the new version of the protocol. In this section, we examine the solutions that can be applied to avoid interrupting these conversations. In a nutshell, since the conversations concerned are non-migrateable to the new protocol, we have to examine if it is possible to adapt the new protocol temporarily for the sake of these remaining conversations. We first detail the methodology for adapting the protocol and then present how our tool supports service managers in building the adaptation.

5.1 Protocol Adaptation Methods

We examine two methods that can be used to temporarily modify the way clients interact with the new service protocol:

(1) Developing service adapter. An adapter is a mediator service. Its role is to make the new protocol look like the old protocol, so that any clients interacting correctly using the old protocol can continue to interact with the new one (Figure 6). In our previous work [Benatallah et al. 2005], we proposed an approach for developing adapters based on the concept of mismatch patterns (e.g., message ordering mismatch, extra message mismatch, and missing message mismatch). Mismatch patterns capture the possible differences between the two protocols to be adapted. Once the pattern to use has been identified, the adapter code can be generated automatically. However, this approach has the limitation of being dependent on the semantics of the protocol differences, which means that a given adaptation is effective only when the differences do not affect or change the functionality of the old protocol. We refer the interested reader to Benatallah et al. [2005] for the detailed description.

(2) Ad hoc protocols. To cope with the cases where we cannot develop adapters (e.g., an operation from the old protocol is removed in the new one and its features are not offered by any alternative operation), it is possible to construct an ad hoc protocol whose aim is to meet the new requirements without canceling ongoing conversations. For example, consider the working visa application protocol (Figure 1). There might still be some non-migrateable conversations in state Lodged after classifying conversations based on the model-based analysis or the inferred interaction patterns because, although their forward paths are compatible in the context of the new protocol, they followed an incompatible backward path. In this case, it is not necessary to cancel them to satisfactorily meet the updated visa law. It could be more efficient to make them fulfill the backward compatibility. To do so, we define the ad hoc protocol in Figure 7 by adding the transition reportMedicalExamination before the final states.

In the next section, we present how the development of ad hoc protocols can be facilitated using the protocol evolution tool.

5.2 Supporting Ad Hoc Protocol Development

In order to adapt their service with an ad hoc protocol, service managers need to answer a few questions: (i) which types of conversations will need ad hoc protocol development, and what is the minimum adaptation set to develop in order to satisfy as many conversations as possible? (ii) what are the features that an ad hoc protocol has to fulfill for a given group of conversations?

Regarding the first question, the tool provides support in the form of conversation clustering. Non-migrateable conversations are grouped with respect to their current state, their history, and the forward path they will be taking (when known). For example, consider the state Lodged (Figure 1), for which the tool computed the replaceability status (in this case, non-replaceability). Non-migrateable conversations that are currently in that state can be classified into three groups: (i) conversations that have a non-compatible history and will take a compatible forward path; (ii) conversations that have a compatible history and will take a non-compatible forward path; (iii) conversations for which both history and forward path are non-compatible.
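The grouping described above can be sketched as follows. This is an illustrative fragment only: the `Conversation` record and the `cluster_non_migrateable` function are hypothetical names, and the sketch assumes that the compatibility of both the history and the forward path has already been determined for every conversation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Conversation(NamedTuple):
    conv_id: str
    state: str                 # current state in the old protocol
    history_compatible: bool   # backward path compatible with the new protocol
    forward_compatible: bool   # forward path compatible with the new protocol


GroupKey = Tuple[str, bool, bool]


def cluster_non_migrateable(
    conversations: Iterable[Conversation],
) -> Dict[GroupKey, List[str]]:
    """Group non-migrateable conversations by state, history and forward path."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for c in conversations:
        if c.history_compatible and c.forward_compatible:
            continue  # migrateable, so no ad hoc protocol is needed
        key = (c.state, c.history_compatible, c.forward_compatible)
        groups[key].append(c.conv_id)
    return dict(groups)


EXAMPLE = [
    Conversation("c1", "Lodged", False, True),   # group (i)
    Conversation("c2", "Lodged", True, False),   # group (ii)
    Conversation("c3", "Lodged", False, False),  # group (iii)
    Conversation("c4", "Lodged", True, True),    # migrateable, skipped
    Conversation("c5", "Lodged", False, True),   # group (i)
]
GROUPS = cluster_non_migrateable(EXAMPLE)
```

Each resulting key corresponds to one group for which a separate template, and possibly a separate ad hoc protocol, would be considered.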

In order to assist the development of an ad hoc protocol, each group of conversations, obtained as detailed above, is associated with a template that can be used as a guideline during the ad hoc protocol definition phase. A template is a tuple <intersection protocol, mismatch types>, where the intersection protocol consists of the states and transitions that exist in the execution paths common to the old and new protocols, flowing from the initial state to final states and passing over the state (e.g., Lodged) in which conversations of a given group stay. The mismatch types specify the differences between the two protocols that relate only to the intersection protocol. In particular, service managers can automatically construct the intersection protocols of templates using protocol management operators [Benatallah et al. 2004a, 2006a] and then further (manually) refine the intersection protocols by looking at the mismatch type information [Benatallah et al. 2005], for example, a missing message mismatch type describing that the clients of group 1 (in the previous example) have to provide their medical examinations (reportMedicalExamination) to satisfy the new visa law.

Fig. 7. Example of an ad hoc protocol.

6. IMPLEMENTATION, USAGE, AND EXPERIMENTS

This section describes our prototype implementation and gives experimental results. It also illustrates, through a scenario, how the tool can help service managers handle the dynamic evolution of business protocols.

6.1 Prototype Implementation

The tool for dynamic protocol evolution is part of a larger project called ServiceMosaic (http://servicemosaic.isima.fr), which is a CASE tool set for Web service life-cycle management. The ServiceMosaic platform has been implemented using Java and J2EE technologies as an Eclipse plug-in. It is organized in three modules (see Figure 8): a development environment, model representation and manipulation components, and analysis and management components. The development environment provides means for building and editing protocol models. The model representation and manipulation components are a collection of methods for storing and managing service descriptions and protocols. The analysis and management components assist users in performing several types of analysis such as protocol discovery or log analysis. A detailed description of the platform can be found in Benatallah et al. [2006b]. This article is concerned with three components (darkly shaded in Figure 8): the business protocol editor, the protocol evolution manager (PEM), and the mining engine.

Fig. 8. Service mosaic platform.

Business protocol editor. The protocol editor offers a visual environment that allows service managers to create or edit protocol definitions. It has been implemented by modifying the tool that was developed to model security policies (trust negotiation) in our previous work [Skogsrud et al. 2004]. The editor enables the building of state machine diagrams and the setting of properties for states (e.g., state ID, state name) and transitions (e.g., transition name). The protocol editor uses the XML representations of models to generate control tables, which provide the information required to correlate conversation instances with the protocol's states and ensure that messages are being exchanged as specified by protocol definitions.

Protocol Evolution Manager. The protocol evolution manager (PEM) corresponds to the GUI front-end used by service managers to carry out model-based analysis and migrate active conversations to a new protocol. It presents the result of the analysis: for each group of conversations, the possible migration strategies that can be applied.



Using this tool, service managers are able to:

—Load and show old and new protocols, and current active conversations from the DB.

—Choose a particular state and show the conversations at the state.

—Choose a certain conversation, show its current state, and the history taken by it.

—Show a client's protocol, if possible, and the interaction path between the service and client protocol.

—Perform different types of model-based analysis, classify ongoing conversations as migrateable or non-migrateable, and show the migrateable ones.

—Migrate classified conversations to the new protocol, and show several migration statistics, for example, percentage of migration.

—When the protocols of clients are unavailable, ask the mining engine to analyze Web service interaction logs and infer interaction patterns that will be applied to ongoing conversations.

Mining engine. The mining engine takes a service interaction log as input and produces as output the interaction patterns inferred by the state-based decision trees. It consists of two modules, namely, a preprocessor and a decision tree builder. The preprocessor provides facilities for cleaning interaction logs, correlating messages, and extracting attributes from exchanged documents. It produces a table representation of the messages exchanged during the service execution, grouped by conversations. This table serves as input to the decision tree builder.
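As a rough illustration of the correlation step of the preprocessor, the fragment below groups log entries by conversation and orders them by time. The log field names (`conversation_id`, `timestamp`, `operation`) and the function name `correlate` are assumptions made for this sketch; the actual log schema is not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def correlate(log: Iterable[Mapping[str, object]]) -> Dict[str, List[str]]:
    """Group logged messages by conversation, ordered by timestamp."""
    table: Dict[str, List[str]] = defaultdict(list)
    for entry in sorted(log, key=lambda e: e["timestamp"]):
        table[str(entry["conversation_id"])].append(str(entry["operation"]))
    return dict(table)


# Toy log entries with hypothetical field names.
LOG = [
    {"conversation_id": "c2", "timestamp": 3, "operation": "checkEligibility"},
    {"conversation_id": "c1", "timestamp": 2, "operation": "fillInApplication"},
    {"conversation_id": "c1", "timestamp": 1, "operation": "checkEligibility"},
]
TABLE = correlate(LOG)
```

Each row of the resulting table holds the message sequence of one conversation, which is the shape of input that the labelling and tree-building steps rely on.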

The decision tree builder proceeds as follows:

(1) The states, in the old protocol, that can be used for state-based decision trees are selected (Section 4.2.1).

(2) For an identified candidate state, a collection of attributes is selected to characterize conversation instances (see Section 4.2.2).

(3) Historical conversation instances are labeled, and the labeled conversation instances form the training set for the candidate state (Section 4.2.3).

(4) The training set, characterized with the selection of identified attributes and with the label information in the previous steps, is used to build a decision tree for the considered state. In our prototype implementation, we use Weka software [Witten and Frank 2005] that implements the decision tree algorithm based on C4.5 [Quinlan 1993] (Section 4.2.4).

(5) From the decision tree, interaction patterns are generated. This last step of the algorithm is not yet implemented. It only affects the presentation of the results shown to the user.
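Step (5) amounts to enumerating the root-to-leaf paths that end in a leaf labelled C. The fragment below is a hypothetical sketch of that enumeration and is not part of the prototype: the nested-tuple tree encoding, the function `extract_patterns`, and the toy tree (including the branch value "LimitedUser") are assumptions chosen to reproduce the two patterns of Example 11.

```python
from typing import Dict, List, Tuple, Union

# A leaf is the label "C" or "NC"; an inner node is (attribute, {test: subtree}).
Tree = Union[str, Tuple[str, Dict[str, "Tree"]]]


def extract_patterns(tree: Tree, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Turn every root-to-leaf path ending in "C" into an IF-THEN pattern."""
    if isinstance(tree, str):  # reached a leaf
        if tree != "C":
            return []
        body = " AND ".join(conditions) if conditions else "TRUE"
        return [f'IF {body} THEN forward path = "C"']
    attribute, branches = tree
    patterns: List[str] = []
    for test, subtree in branches.items():
        patterns += extract_patterns(subtree, conditions + (f"{attribute} {test}",))
    return patterns


# Toy tree for the state Lodged (an assumption, not the actual tree of Figure 5).
TOY_TREE: Tree = (
    "workExperience",
    {
        "> 4": (
            "odl",
            {
                '= "yes"': (
                    "englishResult",
                    {
                        '= "ModestUser"': "C",
                        '= "GoodUser"': "C",
                        '= "LimitedUser"': "NC",
                    },
                ),
                '= "no"': "NC",
            },
        ),
        "<= 4": "NC",
    },
)

PATTERNS = extract_patterns(TOY_TREE)
```

For the toy tree, the function yields exactly two patterns, one per leaf labelled C, in the same IF-THEN form as in Example 11.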

6.2 Usage Scenario of the Tool

We propose the following scenario as an illustration of the support provided to service managers. First, the service manager creates the new version of the protocol by adding or removing states and transitions. (Figure 9(a) presents a screenshot of the editor.) After modifying the old protocol, she starts the state replaceability analysis to identify the states with conversations that can be safely migrated to the new protocol without any conditions (Figure 9(b)). These states, S-ApplicationReady, GCSubmitted, S-Lodged, and Cancelled, are displayed in green. In Figure 9(b), the old protocol appears in the center left and the new one in the center right. Current active conversations are displayed in the left pane, where highlighted conversations correspond to the conversations currently in the selected states. Conversations that are migrated are moved from the left pane to the right pane. The results of analysis and actions are shown in the bottom part of the window.

Fig. 9. Usage scenario for classifying migrateable conversations.

After migrating all compatible conversations, the service manager can investigate the remaining ones. She identifies the states of the old protocol that have compatible forward paths but different backward paths in the context of the new protocol (Figure 10(c)), and then classifies migrateable conversations from the computed states by looking at their histories (Figure 10(d)). The conversations in the identified states (i.e., Processed) are highlighted in the left pane. She can migrate these conversations to the new protocol.

Next, the service manager can investigate the states with compatible backward paths and for which the forward path is affected by the protocol modification (Figure 11(e)). The states identified by this analysis are Start, Eligible, ApplicationReady, and WESubmitted. If some of the conversations in these states have compatible forward paths, they can be migrated. Figure 11(f) shows the history of a migrated conversation in the center left window, while the client protocol corresponding to this conversation appears in the pop-up window and its current state in the new protocol is displayed in the center right window.

For the conversations whose client protocols are unavailable, the service manager identifies the candidate states for inferring interaction patterns (Figure 12(g)). Then, for example, to classify the conversations in the state WESubmitted, she builds the decision tree from which the interaction patterns (rules) can be generated. On the basis of the interaction patterns, conversations in the WESubmitted state can be further classified (Figure 12(h)).

6.3 Adapting a Service Implementation to Evolving Protocols

Using top-down development approaches [Baina et al. 2004], the external specifications of a Web service (protocol specification) can be automatically transformed into the internal specifications (service implementation templates/skeletons) that can be extended with business logic by developers. In particular, since the skeletons generated by the approach are BPEL-compatible, a BPEL execution engine such as IBM's BPWS4J (www.alphaworks.ibm.com/tech/bpws4j) can be used to execute the skeletons, including operation implementation logic.

If service managers modify the existing protocol specification, it is desirable to adapt the service implementation without regenerating the implementation skeletons from scratch and repeating the enhancement of the skeletons with business logic. In our previous work [Kongdenfha et al. 2006], we proposed a framework, based on aspect-oriented programming (AOP) [Courbis and Finkelstein 2005], for simplifying the adaptation of a service implementation in response to the protocol mismatches or changes by separating the adaptation logic from the business logic. Such a separation helps to maintain the internal specifications without destructively modifying them, since only the separated adaptation logic needs to evolve when the protocol specification changes. We identified mismatches between old and new protocols and wove the adaptation logic related to the mismatches with the internal specifications (service implementation). After the implementation modification, a service manager migrates the classified ongoing conversations under the old version of the protocol to the new version of the protocol controlled by the modified system.

Fig. 10. Usage scenario for classifying migrateable conversations.

Fig. 11. Usage scenario for classifying migrateable conversations when client protocols are known.

Fig. 12. Usage scenario for classifying migrateable conversations when client protocols are unknown.

Fig. 13. Results of experiments: (a) time taken to perform State Replaceability (SR) and Forward Path Replaceability (FPR); (b) time taken to perform Replaceability with respect to a history (R with respect to H); (c) time taken to compare two protocols and time taken to individually look at conversations; (d) precision and recall for different sizes of training data (# instances = 15000); (e) percentage of excluding irrelevant instances out of input dataset.

6.4 Experiments

We implemented the prototype in Java using PostgreSQL 8.1 as the database engine. All the experiments were performed on a notebook machine with a 1.73 GHz CPU and 1 GB of memory, running Microsoft Windows XP.

6.4.1 Model-Based Analysis Performance. In this experiment, we test the scalability of the model-based analysis (Section 3) in terms of protocol complexity and number of ongoing conversations. To this end, we defined five pairs of protocols (each pair consisting of an old and a new protocol) with a varying number of states (10, 20, 30, 40, and 50 states). We populated the system with a number of artificial conversations (1000, 2500, 5000, 7500, 10000). Each conversation was generated by randomly choosing a path from the initial state of the old protocol to an arbitrary state, considered the current state of the conversation.
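A conversation generator of this kind could look like the sketch below. It is an assumption-laden illustration: the protocol is encoded as a plain successor map, the toy transitions only loosely follow the visa example, and the `stop_probability` parameter is one possible way of picking the "arbitrary" current state, which the experiment description does not specify.

```python
import random
from typing import Dict, List

Protocol = Dict[str, List[str]]  # state -> states reachable in one transition


def random_conversation(protocol: Protocol, initial: str,
                        rng: random.Random,
                        stop_probability: float = 0.3) -> List[str]:
    """Random walk from the initial state; the last element is the current state."""
    path = [initial]
    # Continue while the current state has successors and we do not decide to stop.
    while protocol.get(path[-1]) and rng.random() > stop_probability:
        path.append(rng.choice(protocol[path[-1]]))
    return path


# Toy protocol (hypothetical simplification, not the protocol of Figure 1).
TOY_PROTOCOL: Protocol = {
    "Start": ["Eligible"],
    "Eligible": ["ApplicationReady"],
    "ApplicationReady": ["Lodged"],
    "Lodged": ["Checked", "Cancelled"],
    "Checked": ["Processed"],
    "Processed": [],
    "Cancelled": [],
}

rng = random.Random(42)  # fixed seed so that runs are repeatable
CONVERSATIONS = [random_conversation(TOY_PROTOCOL, "Start", rng) for _ in range(1000)]
```

Every generated path starts in the initial state and follows only transitions defined in the protocol, so each one is a valid history for its current state.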

In Figure 13, the graphs ((a)–(c)) show the performance of the replaceability analysis. The graph (a) shows the time needed to complete the state replaceability (SR) analysis and the forward path replaceability (FPR) analysis for protocols of a varying number of states. The graph (b) shows the time needed to compute the replaceability analysis with respect to a history (R with respect to H) when carried out for a varying number of conversations. For comparison, the graph (c) indicates the time that would be required if the analysis was performed one conversation at a time rather than being done directly at the protocol level. As can be seen from the graphs, the time taken to complete these analyses grows linearly with respect to the number of states and the number of conversations.

6.4.2 Future Interaction Analysis Evaluation. In a second experiment, we evaluated the applicability of the future interaction analysis method (Section 4). The actual accuracy of the method necessarily depends on the specifics of the business process considered (some processes may be more predictable than others). The evaluation we performed only aims at testing, in an artificial setting, whether the method can be used with large datasets containing between 1000 and 20,000 instances.

The experiments were conducted on a synthetic dataset obtained by simulating the working visa application protocol described in Figure 1. Messages corresponding to the records of each dataset are correlated and attributes of the correlated messages are extracted, for example, SequenceID, Timestamp, Age, RelevantLicense, EnglishResult, WorkExperience, ReferenceResult, and so on.

As a preprocessing step, attribute values, when needed, were discretized and a training set of conversation instances was labeled C or NC for each candidate state identified (see Section 4.2.3). For each candidate state, the corresponding decision tree was built from the training set using Weka software [Witten and Frank 2005]. We then measured the accuracy of the inferred interaction patterns in terms of precision and recall, varying the size of the training and validation data. In these tests, the accuracy value was based on the proportion of correctly classified instances.

Precision and recall can be defined as follows: given a set X of conversation instances having a message sequence compatible with the new protocol and a set Y of instances classified as compatible by the inferred interaction patterns, the recall corresponds to the ratio |X ∩ Y|/|X| and the precision to the ratio |Y ∩ X|/|Y|. The graph (d) shows these metrics computed from the same dataset using a different proportion of data for training and testing. From this graph, we can see that the size of the training and testing data affects the accuracy. As expected, using a larger training dataset increases the precision and recall of the interaction patterns. The graph (e) shows that the algorithm used for labeling instances (Algorithm 6) filters out about 75% of the instances, namely those that have undefined (NULL) or irrelevant attribute values. Eliminating irrelevant instances is important in the sense that the computation cost of trees with a reduced number of instances is much lower than that of trees with the whole number of instances.
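The two ratios can be computed directly from sets of conversation identifiers, as in the short sketch below. The function name `precision_recall` and the example identifiers are illustrative assumptions; returning 0.0 for an empty denominator is a convention chosen for the sketch.

```python
from typing import Set, Tuple


def precision_recall(compatible: Set[str], classified: Set[str]) -> Tuple[float, float]:
    """compatible is the set X, classified is the set Y of the definition."""
    hits = len(compatible & classified)                      # |X ∩ Y|
    precision = hits / len(classified) if classified else 0.0  # |Y ∩ X| / |Y|
    recall = hits / len(compatible) if compatible else 0.0     # |X ∩ Y| / |X|
    return precision, recall


X = {"c1", "c2", "c3", "c4"}   # instances that are actually compatible
Y = {"c3", "c4", "c5"}         # instances the patterns classified as compatible
PRECISION, RECALL = precision_recall(X, Y)
```

In this toy case two of the three classified instances are truly compatible, and two of the four compatible instances are found, so the precision is 2/3 and the recall is 1/2.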

7. RELATED WORK

The business protocol evolution is related to other evolution problems: database schema evolution, software component evolution, software refactoring, workflow evolution, and protocol evolution.



Database schema evolution: The database community has considered the problem of managing schema evolution, mainly in the field of object-oriented databases [Andany et al. 1991; Bertino and Martino 1993; Lautemann 1996; Ferrandina et al. 1994; Estublier and Nacer 2000]. To meet the new requirements of database applications, the schema definition is changed over time, by adding or removing schema elements. The work in this area has developed several techniques that support the mapping of schema elements from the old schema to the new one. Such approaches include class versioning, which allows applications started earlier to continue to use the old schema, and conversion, which transforms the data of the database to make them comply with the modified schema. However, business protocol evolution differs from database schema evolution in two significant ways. First, in the case of class versioning, it is acceptable for old applications to run according to the old schema, whereas there might be situations where it is not possible for ongoing conversations to continue to run according to the old protocol, for example, if there are security holes in the definition of a security protocol. Second, the database cannot be accessed by any application during the database reorganization (conversion) and, after conversion completion, the applications implemented based on an old schema should be updated, compiled, and restarted against a new schema. In contrast to this technique, protocol evolution needs to migrate ongoing conversations to the new protocol without adapting and restarting them. In addition, unlike the database conversion, it is not possible to migrate all the conversations to the new protocol. Instead of the conversion technique, if we adopt the approach of simulating the schema evolution by object-oriented views [Bratsberg 1992; Tresch and Scholl 1993], which enables different applications to use different views (seen as different schemas), there is no need for conversion. However, applying this approach to the protocol evolution causes a problem similar to that of class versioning. Therefore, we are unable to use techniques for database schema evolution in this context.

Software component evolution: Software component evolution has been considered important for getting the benefits of component-based software development, such as component reuse, easy maintenance, and greater flexibility. Components evolve because they are rarely flawless. The evolution is a result of satisfying new application requirements, ranging from software structure changes to problem and bug fixes. Most solutions to this problem are based on versioning mechanisms [Englander 2001; Rakic and Medvidovic 2001; Eisenbach et al. 2003; Stuckenholz 2005], which provide the ability to distinguish various versions of a software component evolving over time. To give the version information for compatibility checks, they support enhancing the filenames of the libraries with version numbers or the libraries with special meta-files (for example, manifest files of an XML format). So, such mechanisms enable multiple versions of a component to exist in a system and allow applications to use different versions of one component. However, these approaches are not applicable to our problem, for the objective of our work is different from theirs.

Software refactoring: In real-world environments, software is modified, improved, and adapted to meet new requirements by adding new features or fixing bugs. Refactoring is the process that reorganizes the internal structure of a software system without altering the external behavior [Fowler 1999]. Software refactoring has been successful in restructuring the source code of a software system to improve the quality of the software (e.g., reusability, complexity, modularity, etc.) [Bergstein 1991; Mens and Tourwe 2004; Tokuda and Batory 2001]. To facilitate adaptations and extensions for software evolution, refactoring performs edit functions, such as adding classes, variables, and methods, or moving variables up the class hierarchy. However, our work differs in that we are performing change impact analysis (based on protocol models) and data mining-based analysis on conversations, and classifying migrateable ones in order to deal with the protocol evolution problem.

Workflow evolution: Protocol evolution has some similarities to workflow evolution [Ellis et al. 1995; Casati et al. 1998; Sadiq 2000; Kradolfer and Geppert 1999; Agostini and Michelis 2000; Van der Aalst 2001; Vieira and Silva 2005]. Ellis et al. [1995] raised the issue of dynamic workflow change in their work. They exploited a Petri net abstraction for modeling dynamic change, which means change on the fly while workflow instances are running. Their approach is based on a change region that contains the parts of the Petri net directly affected by the change. However, the change region should be calculated manually.

In versioning mechanisms [Joeris and Herzog 1998; Kradolfer and Geppert 1999; Vieira and Silva 2005], every change causes the creation of a new version of workflow, and each instance is bound to a particular version of the workflow. The active instances bound to a certain version of the workflow will continue to run according to this version [Van der Aalst 2001]. They are not affected by workflow evolution because the version is not altered to reflect the changes.

The MILANO workflow management system [Agostini and Michelis 2000] provides techniques for determining whether an active instance can be migrated to the new workflow, and for automatically calculating the states where migrateable instances stay. Van der Aalst [2001] proposes an approach for tackling the dynamic change bug, which refers to errors caused by migrating an instance from the old workflow to the new one. As in MILANO, this approach determines the change region, that is, the part of the workflow that is affected by changes. If instances are in the change region, their migration is postponed until they exit the region. However, in the protocol evolution, it might be necessary to immediately transfer the instances in the change region to the new protocol without delaying their migration (e.g., by laws). To do so, we look at the properties of conversations within the region and filter out the ones migrateable to the new protocol.

Some works [Casati et al. 1998; Sadiq 2000; Rinderle et al. 2003] proposemechanisms for checking the compliance of all active instances with the newworkflow, using the information on the state and execution trace of an instance.In WIDE, Casati et al. [1998], present a set of basic operations that allow themodification of a workflow schema and preserve structural and behavioral cor-rectness when they are applied to the workflow modification. The workflowdeveloper should manually group instances according to the workflow evolu-tion impacts. However, compared to their work, the grouping and classifying



instances can be done automatically with replaceability analysis, or by the in-teraction patterns inferred through analyzing service interaction logs. Sadiqintroduces a three-phase modification process consisting of defining, conform-ing, and enacting the workflow modification. They provide two types of groupingmethods, while our framework enables service managers to conduct more fine-grained classification and to choose a greater variety of migration strategies.Also, the mechanisms [Casati et al. 1998; Sadiq 2000; Rinderle et al. 2003]are not sufficient for handling the business protocol evolution, since they can-not guarantee the correct future interaction of the migrated instances in thenew protocol, which is one of the important requirements in determining themigrateability of instances.Web service versioning: In the context of Web services, some recent works[Brown and Ellis 2004; Kaminski et al. 2006] have proposed versioningtechniques for managing the problem of Web service evolution. Brown andEllis [2004] proposed an approach based on the use of version namespaceand the use of version numbers in UDDI entry. The approach allows multipleversions of a Web service to support client services that are dependent onearlier versions of the service, like database schema evolution and softwarecomponent evolution. Kaminski et al. [2006] presented a design techniquecalled Chain of Adapters to handle the problem of managing the Web serviceversion and to achieve backward compatibility with clients written to workwith older versions of the Web service. 
Their approach has the limitation that, as new versions of a service come out, the chain of adapters becomes longer, and clients compatible with earlier versions of the service take much more time to interact with the most recent version of the service.

Protocol evolution: Several approaches have been developed to support dynamic protocol evolution, for example, of communication protocols or security protocols. Ryan and Wolf [2004] explored the problem of keeping a distributed application running when the intercomponent protocols on which its distributed components base their communication evolve, and they proposed a technology called event-based translation as a solution. When a new protocol has been defined, event-based translation techniques avoid the need to alter the application code by making the application handle the semantic concepts of elements in the protocol rather than its syntactic details. However, our approach differs in that we deal with business protocol evolution at the higher layers of the interoperability stack that support interaction among services, namely, business-level protocols, rather than at the lower layers such as the communication or transport layers (SOAP) [Alonso et al. 2004].

Ahmed [2006] proposed an approach for helping clients adapt to evolving business protocols. He defined a set of operations for modifying protocols, and provided an algorithm for calculating new client protocols compatible with the changed provider protocol, by considering the list of operations applied to change the old provider protocol. The newly created client protocols are propagated to the clients so that they can adjust their systems to continue to interact successfully with the new provider protocol. Compared to this client-side adaptation mechanism, our approach of adapting conversations to the new protocol is performed transparently to clients. His solution also has to compute a new client protocol for each client interacting with the provider service. This may make the computation expensive because there could be a very large number of clients.

Relationship with our previous work: This article presented our vision for resolving the dynamic protocol evolution problem in SOAs. This work is part of the project called ServiceMosaic [Benatallah et al. 2006b], which has been carried out by the Service Oriented Computing (SOC) group at the University of New South Wales. The ServiceMosaic platform (http://servicemosaic.isima.fr) is a CASE toolset for modelling, analyzing, and managing Web service models, including business protocols, orchestrations, and adapters.

Some of the replaceability analysis described in model-based analysis (Section 3) was presented in [Benatallah et al. 2004a, 2006a; Ryu et al. 2007]; this article has extended the previous work with a more fine-grained analysis of protocol evolution. In addition, this work is based on the methodology developed in our previous work [Skogsrud et al. 2007], which supports the evolution management of security protocols (i.e., trust negotiation protocols). That approach presents constraints specific to security protocols and provides analysis methods for the change impact of security protocols. The methodology is not directly applicable to business protocols, and hence the following changes are required: (i) the security protocol model specifies the set of credentials (signed statements describing attributes of a client) that must be disclosed before a state can be entered, rather than the sequence of messages, so the analysis of security protocol change impacts should be modified to analyze conversations in terms of message sequences; (ii) the violation of security-specific constraints occurs only when the condition for proceeding to a state is restricted (i.e., additional credentials are required), whereas in the management of business protocol evolution the violation detection technique should be extended to consider not only adding messages (in the security protocol, adding credentials), but also removing messages; (iii) to guarantee correct interaction after migrating conversations to the new business protocol, it is important to predict the forward paths that can be taken by each conversation, but the management of security protocol evolution does not consider which forward path a conversation can take in the new protocol; (iv) in addition, the approach of this article employs data mining techniques to handle the situation in which the analysis methods cannot be performed because client protocols are unavailable. Although the methodology is similar to that used in security protocol evolution, the solutions and results for the problem are different.

8. DISCUSSION AND FUTURE WORK

This article provided an approach to tackle the problem of dynamic evolution of business protocols. In particular, we identified properties that can be used as requirements in determining which conversations can be migrated to a new protocol when an old one has been changed. In addition, to analyze the impact of protocol changes, we presented an overall change impact analysis, comprising protocol replaceability analysis and other types of path/state-based analysis, and a detailed change impact analysis based on additional knowledge (i.e., forward path and backward path). According to the analysis, the grouping (classification) of ongoing conversations is performed automatically by our tool, rather than manually by service managers. The automatic classification plays an important role in supporting flexibility in service-oriented architectures, where large numbers of services interact and must adapt dynamically to the new requirements and opportunities that arise over time. In addition, we proposed a data mining approach that can be applied when client protocols are unknown. The main result of this article is a comprehensive approach to dynamic protocol evolution management, in which we provide a formal model; approaches for change impact analysis; techniques for inferring interaction patterns used for predicting forward paths of conversations; and tool support for migrating active conversations from the old to the new protocol without violating the identified properties.
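
To make the classification step concrete, the following minimal Python sketch groups ongoing conversations by replaying their message traces against the new protocol. It is an illustration under simplifying assumptions, not the tool described in this article: the protocol is modelled as a deterministic state machine, a conversation counts as migrateable when its trace can be replayed in the new protocol (backward path) and at least one final state remains reachable (a coarse stand-in for the forward-path analysis), and client protocols are ignored. The class `Protocol`, the function `classify`, and the example states and messages are hypothetical names introduced only for this sketch.

```python
from collections import deque


class Protocol:
    """Deterministic state machine (an assumption of this sketch).

    transitions maps (state, message) -> next state.
    """

    def __init__(self, initial, finals, transitions):
        self.initial = initial
        self.finals = set(finals)
        self.transitions = dict(transitions)

    def replay(self, trace):
        """Return the state reached after the trace, or None if it is rejected."""
        state = self.initial
        for message in trace:
            state = self.transitions.get((state, message))
            if state is None:
                return None
        return state

    def can_complete(self, state):
        """True if some final state is reachable from `state` (breadth-first search)."""
        seen, queue = {state}, deque([state])
        while queue:
            current = queue.popleft()
            if current in self.finals:
                return True
            for (source, _message), target in self.transitions.items():
                if source == current and target not in seen:
                    seen.add(target)
                    queue.append(target)
        return False


def classify(new_protocol, conversations):
    """Split conversations (id -> message trace) into migrateable and non-migrateable ids."""
    migrateable, non_migrateable = [], []
    for conv_id, trace in conversations.items():
        state = new_protocol.replay(trace)
        if state is not None and new_protocol.can_complete(state):
            migrateable.append(conv_id)
        else:
            non_migrateable.append(conv_id)
    return migrateable, non_migrateable


# Hypothetical new protocol version: goods must be searched before adding to the cart.
new_protocol = Protocol(
    initial="Start",
    finals={"Shipped"},
    transitions={
        ("Start", "login"): "LoggedIn",
        ("LoggedIn", "searchGoods"): "Searching",
        ("Searching", "addToCart"): "CartFilled",
        ("CartFilled", "pay"): "Paid",
        ("Paid", "ship"): "Shipped",
    },
)

# Traces of conversations started under the old protocol (hypothetical data).
conversations = {
    "c1": ["login", "searchGoods"],
    "c2": ["login", "addToCart"],  # addToCart directly after login is no longer allowed
}

migrateable, non_migrateable = classify(new_protocol, conversations)
print(migrateable, non_migrateable)
```

In this example, conversation c1 is classified as migrateable, whereas c2 is not, because its trace cannot be replayed in the new protocol.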

The approach presented in this article is not at all specific to business protocols; it can be applied to a large extent to processes and traditional services as well, as long as we have a trace of the execution. The only part that is specific is the analysis of the protocols of clients. The proposed approach might block the execution of ongoing conversations during the change impact analysis in order to guarantee the consistency of the analysis results with the current states of conversations. In addition, we need to discuss the degree of protocol complexity supported by the approach. This is one of the important considerations in the management of dynamic protocol evolution, as excessive complexity might impact performance. While protocol complexity can be measured in terms of state complexity (how many states a protocol has), control-flow complexity, data-flow complexity, and resource complexity [Cardoso 2007], we restrict the complexity discussion to the first two perspectives, which are most related to our problem context. First, our approach scales well for complex protocols with many states, as the time taken to perform the change impact analysis increases linearly with the number of states. The other complexity aspect is related to service interaction patterns [Barros et al. 2005] or workflow patterns [van der Aalst et al. 2003]. Our change impact analysis supports only part of the patterns (e.g., sequence, loop, or-split). Therefore, the proposed approach should be extended in order to achieve comprehensive pattern support in protocol evolution management. We leave this issue to our future research work. In addition, with respect to performance, we believe that performance is influenced more by how much a protocol is changed and where the changes happen in the protocol than by the protocol complexity per se.

In future work, first, the semantic equivalence of protocol changes will be addressed, since, when comparing two protocols, we currently examine only the syntactic differences between them, rather than the semantic changes. To do so, we will present a variety of protocol change operations (e.g., adding a message sequentially or in parallel, removing a message sequentially or in parallel, merging two messages into one) and we will consider change logs (i.e., which operations are applied) in analyzing compatibility properties. A good example is that, when two messages are merged into one, from the semantic point of view, the client that followed the protocol using the previous two messages is in the same state as the new client that will use the merged message. That is, the state located after the two merged messages in the old protocol can be considered semantically equivalent to the state after the merged message in the new protocol. We also plan to extend the change impact analysis to what-if analysis and other types of analysis that help service managers or business users to plan protocol changes and improve the quality of the services offered to their business partners. For example, such an analysis could show how many clients cannot be migrated to the new protocol after a protocol change and, if the protocol is relevant to business transactions, how the business profit is affected as a result. Second, we plan to identify areas of improvement in business protocol definitions and exploit the knowledge generated by the analysis in the context of service optimization. Finally, providing a variety of these analyses is not straightforward, since there are many different types of analyses that users may want to conduct, and it is difficult to satisfy their needs by predefining some queries. Hence, we plan to provide OLAP-style functionalities for service managers to perform analyses that fit their needs.
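
The merged-message example can be illustrated with a small Python sketch that uses a change-log entry to rewrite an old conversation trace into the vocabulary of the new protocol, so that semantically equivalent conversations can be compared state by state. This is only one possible realization of the idea, sketched under assumptions; the function `rewrite_merged` and the message names are hypothetical.

```python
def rewrite_merged(trace, first, second, merged):
    """Replace each consecutive pair (first, second) in the trace by `merged`.

    Messages that are not part of a complete pair are copied unchanged.
    """
    rewritten, i = [], 0
    while i < len(trace):
        if i + 1 < len(trace) and trace[i] == first and trace[i + 1] == second:
            rewritten.append(merged)
            i += 2
        else:
            rewritten.append(trace[i])
            i += 1
    return rewritten


# Hypothetical change-log entry: selectItem followed by confirmItem became orderItem.
old_trace = ["login", "selectItem", "confirmItem"]
new_trace = rewrite_merged(old_trace, "selectItem", "confirmItem", "orderItem")

# A conversation that has sent only the first of the two merged messages.
partial_trace = rewrite_merged(["login", "selectItem"], "selectItem", "confirmItem", "orderItem")
print(new_trace, partial_trace)
```

A conversation that has sent both original messages is rewritten to the trace of a new client that sent the merged message, so both can be treated as being in the same state. A conversation that has sent only the first of the two messages is left unchanged by this sketch and would need separate handling.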

REFERENCES

AGOSTINI, A. AND MICHELIS, G. 2000. Improving flexibility of workflow management systems. In Business Process Management, Models, Techniques, and Empirical Studies. Springer-Verlag.

AHMED, A. 2006. Management of the impact of change of Web service protocols. Internship report, National Institute of Applied Sciences, Lyon.

ALONSO, G., CASATI, F., KUNO, H., AND MACHIRAJU, V. 2004. Web Services: Concepts, Architectures and Application. Springer-Verlag.

ANDANY, J., LEONARD, M., AND PALISSER, C. 1991. Management of schema evolution in databases. In Proceedings of the 17th Conference on Very Large Data Bases (VLDB’91).

BAINA, K., BENATALLAH, B., CASATI, F., AND TOUMANI, F. 2004. Model-driven Web service development. In Proceedings of the 16th International Conference on Advanced Information Systems Engineering (CAiSE’04).

BARROS, A. P., DUMAS, M., AND TER HOFSTEDE, A. H. M. 2005. Service interaction patterns. In Business Process Management. 302–318. Springer-Verlag.

BENATALLAH, B., CASATI, F., GRIGORI, D., NEZHAD, H. M., AND TOUMANI, F. 2005. Developing adapters for Web services integration. In Proceedings of the 17th International Conference on Advanced Information Systems Eng. (CAiSE’05).

BENATALLAH, B., CASATI, F., AND TOUMANI, F. 2004a. Analysis and management of Web service protocols. In Proceedings of the 23rd International Conference on Conceptual Modeling (ER 2004).

BENATALLAH, B., CASATI, F., AND TOUMANI, F. 2004b. Web service conversation modeling: A Cornerstone for e-business automation. In IEEE Inter. Comput. 8, 1, 46–54.

BENATALLAH, B., CASATI, F., AND TOUMANI, F. 2006a. Representing, analysing, and managing Web service protocols. Data Knowl. Eng. 58, 3 (Sept.).

BENATALLAH, B., CASATI, F., TOUMANI, F., PONGE, J., AND NEZHAD, H. 2006b. Service mosaic: a model-driven framework for Web services life-cycle management. In IEEE Inter. Comput. 10, 4, 55–63.

BERGSTEIN, P. 1991. Maintenance of object-oriented systems during structural evolution. Theory Pract. Object Syst. 3, 3.

BERTINO, E. AND MARTINO, F. 1993. Object-Oriented Database Systems: Concepts and Architecture. Addison-Wesley.

BRATSBERG, S.-E. 1992. Unified class evolution by object-oriented views. In Proceedings of the 11th International Conference on the Entity-Relationship Approach.

BROWN, K. AND ELLIS, M. 2004. Best practices for Web services versioning. IBM Tech. Rep.

CARDOSO, J. 2007. Complexity analysis of BPEL Web processes. Softw. Proc. Improv. Pract. 12, 1, 35–49.



CASATI, F., CERI, S., PERNICI, B., AND POZZI, G. 1998. Workflow evolution. Data Knowl. Eng. 24, 3.

CASTELLANOS, M., CASATI, F., SHAN, M., AND DAYAL, U. 2005. iBOM: a platform for intelligent business operation management. In Proceedings of the 21st International Conference on Data Engineering (ICDE’05).

COURBIS, C. AND FINKELSTEIN, A. 2005. Towards aspect weaving applications. In Proceedings of the 27th International Conference on Software Eng. (ICSE’05).

EISENBACH, S., JURISIC, V., AND SADLER, C. 2003. Managing the evolution in .net programs. In Proceedings of the 6th IFIP International Conference on Formal Methods for Open Object-based Distributed Systems.

ELLIS, C., KEDDARA, K., AND ROZENBERG, G. 1995. Dynamic change within workflow systems. In Proceedings of the Conference on Organizational Computing Systems.

ENGLANDER, R. 2001. Developing Java Beans. O’Reilly.

ESTUBLIER, J. AND NACER, M. 2000. Schema evolution in software engineering databases: a new approach in adele environment. CAI Computer and Artificial Intelligence Journal 19, 183–203.

FERRANDINA, F., MEYER, T., AND ZICARI, R. 1994. Implementation of lazy database updates for an object database system. In Proceedings of the 20th Conference on Very Large Data Bases (VLDB’94).

FOWLER, M. 1999. Refactoring: Improving the Design of Existing Code. Addison-Wesley.

GRIGORI, D., CASATI, F., DAYAL, U., AND CASTELLANOS, M. 2001. Improving business process quality through exception understanding, prediction, and prevention. In Proceedings of the 27th Conference on Very Large Data Bases (VLDB’01).

JOERIS, G. AND HERZOG, O. 1998. Managing evolving workflow specifications. In Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems.

KAMINSKI, P., MULLER, H., AND LITOIU, M. 2006. A design for adaptive Web service evolution. In Proceedings of the ACM ICSE 2006 Workshop on Software Engineering for Adaptive and Self-Managing Systems (SEAMS).

KONGDENFHA, W., SAINT-PAUL, R., BENATALLAH, B., AND CASATI, F. 2006. An Aspect-Oriented Framework for Service Adaptation. In Proceedings of the 4th International Conference on Service Oriented Computing.

KRADOLFER, M. AND GEPPERT, A. 1999. Dynamic workflow schema evolution based on workflow type versioning and workflow migration. In Proceedings of the 4th IFCIS International Conference on Cooperative Information Systems.

LAUTEMANN, S.-E. 1996. An introduction to schema versioning in OODBMS. In DEXA Workshop 1996.

MENS, T. AND TOURWE, T. 2004. A survey of software refactoring. In Proceedings of the 20th International Conference on Data Eng. (ICDE’04).

MOTAHARI, H., SAINT-PAUL, R., BENATALLAH, B., AND CASATI, F. 2007. Protocol discovery from imperfect service interaction logs. In Proceedings of the 23rd International Conference on Data Eng. (ICDE’07).

NEZHAD, H. M., SAINT-PAUL, R., BENATALLAH, B., CASATI, F., AND ANDRITSOS, P. 2007. Message correlation for conversation reconstruction in service interaction logs. Tech. Rep. UNSW-CSE-TR-0709, University of New South Wales.

QUINLAN, J. R. 1986. Induction of decision trees. In Machine Learning. Morgan Kaufmann.

QUINLAN, J. R. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.

RAKIC, M. AND MEDVIDOVIC, N. 2001. Increasing the confidence in off-the-shelf components: a software connector-based approach. In Proceedings of the 2001 Symposium on Software Reusability.

RINDERLE, S., REICHERT, M., AND DADAM, P. 2003. Supporting workflow schema evolution by efficient compliance checks. Tech. Rep. 2003-02, University of Ulm.

RYAN, N. AND WOLF, A. 2004. Using event-based translation to support dynamic protocol evolution. In Proceedings of the 26th International Conference on Software Eng. (ICSE’04).

RYU, S. H. 2007. A framework for managing the evolving Web service protocols in service-oriented architectures. Master’s Dissertation, University of New South Wales.

RYU, S. H., SAINT-PAUL, R., BENATALLAH, B., AND CASATI, F. 2007. A framework for managing the evolution of business protocols in Web services. In APCCM ’07: Proceedings of the Fourth Asia-Pacific Conference on Conceptual Modelling. Australian Computer Society, Inc., Darlinghurst, Australia, 49–59.



SADIQ, S. 2000. Handling Dynamic Schema Change in Process Models. In Proceedings of the 11th Australian Database Conference.

SKOGSRUD, H., BENATALLAH, B., AND CASATI, F. 2004. Trust-Serv: model-driven lifecycle management of trust negotiation policies for Web services. In Proceedings of the 13th World Wide Web Conference (WWW2004).

SKOGSRUD, H., BENATALLAH, B., CASATI, F., AND TOUMANI, F. 2007. Managing impacts of security protocol changes in service-oriented applications. In Proceedings of the 29th International Conference on Software Engineering (ICSE’07).

STUCKENHOLZ, A. 2005. Component evolution and versioning state of the art. In ACM SIGSOFT Software Engineering Notes, 30, 1.

TOKUDA, L. AND BATORY, D. 2001. Evolving object-oriented designs with refactorings. In J. Autom. Softw. Eng., 8, 89–120.

TRESCH, M. AND SCHOLL, M. 1993. Schema transformation without database reorganization. In SIGMOD Record, 22, 1.

VAN DER AALST, W. 2001. Exterminating the dynamic change bug. A concrete approach to support workflow change. In Inform. Syst. Frontiers, 3, 3.

VAN DER AALST, W. M. P., TER HOFSTEDE, A. H. M., KIEPUSZEWSKI, B., AND BARROS, A. P. 2003. Workflow patterns. Distrib. Para. Data. 14, 1, 5–51.

VIEIRA, P. AND SILVA, A. 2005. Adaptive workflow management in WorkSCo. In 16th International Workshop on Database and Expert Systems Applications (DEXA 2005).

WITTEN, I. AND FRANK, E. 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann.

Received July 2007; revised December 2007; accepted January 2008

