
Semantic comparison of Visual Dataflow Programs
Anh Dang and Phil Cox, Dalhousie University



Introduction

• Despite 25 years of continuing research, visual programming languages (VPLs) have made few inroads into the world of industrial software development.
• Visual representations are used to specify the architecture of software systems, but have not replaced textual programming languages (TPLs) for representing algorithms.
• One reason is that VPLs lack source-code analysis tools, such as differencing tools.
• We present a semantic differencing algorithm for controlled dataflow VPLs such as Prograph.

Prograph

• In Prograph, a program is a set of methods together with a set of persistents, which are globally accessible storage locations.
• A method in Prograph consists of a sequence of cases, each of which is a dataflow diagram of operations connected by datalinks. For example, Figure 1 shows a Prograph method.

Semantic equivalence

• Two program elements in Prograph are semantically equivalent iff, given the same inputs, they produce the same outputs.
• The semantics of controlled dataflow programs, like those of functional programs and unlike those of imperative languages, are closely aligned with the syntax.
• Existing differencing tools for VPLs are quite primitive compared with those available for TPLs, and cannot detect semantic differences.

The comparison algorithm

• A set of subgraph isomorphisms between the two directed graphs that represent a pair of cases is computed, and for each isomorphism in this set a local difference is calculated as follows.
• The local difference due to an isomorphism is the sum of the differences between matched operations, plus the differences due to mismatched connections and extra program elements, as shown in Figure 4.
• To compare two methods, the algorithm traverses a search tree in which each node is either a pair of methods (M), a pair of cases (C), or an isomorphism (I), as shown in Figure 5.
• The value of a method node or an isomorphism node is the sum of the values of its children plus its local difference. The value of a case node is the smallest value among its children.
• The algorithm uses depth-first search to traverse the tree, guided by heuristics based on estimates of the number of differences between the items being compared at each node.
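The node-value rules above can be sketched as a plain recursive evaluation with no pruning. This is only an illustration under stated assumptions: the `Node` class, its field names, and the example tree are hypothetical, and every case node is assumed to have at least one isomorphism child.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical search-tree node: kind is "M", "C", or "I"."""
    kind: str
    local_diff: int = 0  # not used for case ("C") nodes
    children: List["Node"] = field(default_factory=list)


def value(node: Node) -> int:
    """Evaluate a node depth-first, following the rules in the text."""
    child_values = [value(child) for child in node.children]
    if node.kind == "C":
        # A case node takes the smallest value among its children.
        # Assumption: it has at least one isomorphism child.
        return min(child_values)
    # Method and isomorphism nodes add their local difference
    # to the sum of their children's values.
    return node.local_diff + sum(child_values)


# Hypothetical example tree: a method pair with two case pairs.
case_1 = Node("C", children=[
    Node("I", local_diff=3),
    Node("I", local_diff=0, children=[Node("M", local_diff=2)]),
])
case_2 = Node("C", children=[Node("I", local_diff=4)])
root = Node("M", local_diff=1, children=[case_1, case_2])

total_difference = value(root)
```

In this example the first case node takes the smaller of 3 and 2, the second contributes 4, and the root adds its own local difference of 1.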

Concluding remarks

The algorithm:
• Determines whether two programs are semantically equivalent, even if they are syntactically different.
• Finds semantic differences rather than syntactic differences.
• Can be adapted to any controlled dataflow VPL.

Future work
• Replace subgraph isomorphism with maximal common subgraph search.
• Investigate domain-specific matching algorithms for cases.
• Apply semantic differences to program integration and regression testing.

Figure 6. Difference search tree.

Comparing operations:

In Figure 2, the number of differences between the two operations is 5.

Comparing cases and methods:

• The method local difference is the difference in the number of cases between the two methods.
• Each case is treated as a directed acyclic graph whose vertices are the operations, with an edge from operation A to operation B iff there is a datalink or a synchro from A to B.
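As a rough illustration of the two definitions above, the sketch below builds the directed graph of a case from its datalinks and synchros and computes the method local difference. The `Case` class, its field names, and the operation names in the example are hypothetical and do not come from Prograph itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Case:
    """Hypothetical representation of one case of a method."""
    operations: List[str]
    datalinks: List[Tuple[str, str]] = field(default_factory=list)
    synchros: List[Tuple[str, str]] = field(default_factory=list)


def case_graph(case: Case) -> Dict[str, Set[str]]:
    """Adjacency sets: an edge A -> B for each datalink or synchro from A to B."""
    graph: Dict[str, Set[str]] = {op: set() for op in case.operations}
    for source, target in case.datalinks + case.synchros:
        graph[source].add(target)
    return graph


def method_local_difference(cases_a: List[Case], cases_b: List[Case]) -> int:
    """Difference in the number of cases between two methods."""
    return abs(len(cases_a) - len(cases_b))


# Hypothetical case with four operations, three datalinks, and one synchro.
example = Case(
    operations=["input", "compare", "multiply", "output"],
    datalinks=[("input", "compare"), ("input", "multiply"), ("multiply", "output")],
    synchros=[("compare", "multiply")],
)
graph = case_graph(example)
```

Datalinks and synchros are merged into a single edge set here because the text treats both as edges of the same graph.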

Figure 3. Directed acyclic graphs for two diagrams in Figure 4.

Figure 1. A method with two cases implementing the quicksort algorithm.

Figure 5. The search tree.

Figure 4. Counting differences between cases.
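The kind of counting that Figure 4 illustrates might be sketched as follows. The weights are assumptions, since the poster does not state them: each mismatched or extra connection counts 1, and each unmatched operation counts 1. The function name, its parameters, and the example graphs are hypothetical, and `mapping` is assumed to be a one-to-one map from operations of the first case to operations of the second.

```python
from typing import Callable, Dict, Set, Tuple

Edge = Tuple[str, str]


def isomorphism_local_difference(
    ops_a: Set[str], edges_a: Set[Edge],
    ops_b: Set[str], edges_b: Set[Edge],
    mapping: Dict[str, str],
    op_diff: Callable[[str, str], int],
) -> int:
    # Differences between matched operations.
    matched = sum(op_diff(a, b) for a, b in mapping.items())
    # Edges of the first graph whose endpoints are both matched,
    # translated into the second graph's operation names.
    mapped_edges = {
        (mapping[s], mapping[t])
        for s, t in edges_a if s in mapping and t in mapping
    }
    # Edges of the first graph that touch an unmatched operation.
    dangling_a = sum(1 for s, t in edges_a if s not in mapping or t not in mapping)
    # Connections present on one side only (assumed weight: 1 each).
    connections = len(mapped_edges ^ edges_b) + dangling_a
    # Operations left unmatched on either side (assumed weight: 1 each).
    extra_ops = (len(ops_a) - len(mapping)) + (len(ops_b) - len(mapping))
    return matched + connections + extra_ops


# Hypothetical example: the second case has one extra operation and link.
difference = isomorphism_local_difference(
    ops_a={"in", "x", "out"},
    edges_a={("in", "x"), ("x", "out")},
    ops_b={"in", "y", "z", "out"},
    edges_b={("in", "y"), ("y", "out"), ("in", "z")},
    mapping={"in": "in", "x": "y", "out": "out"},
    op_diff=lambda a, b: 0 if a == b else 1,
)
```

Here the matched pair `x`/`y` contributes 1, the extra link to `z` contributes 1, and the unmatched operation `z` contributes 1.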

Alpha-beta pruning

• The value of a case node can only get smaller as the search proceeds. Hence a modified version of the alpha-beta search algorithm is used to reduce the number of nodes explored in the search tree.
• If at any time the difference count of a method or isomorphism node is larger than the alpha value, the algorithm does not need to explore that node further (cut-off).
• If at any time the difference count of a case node is equal to zero, the algorithm does not need to explore that node further (cut-off).
• For a method node, the child nodes are sorted in increasing order of case local difference, which significantly reduces the total number of nodes searched.
• For an isomorphism node, the child nodes are sorted in decreasing order of method local difference.
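The two cut-off rules and the child ordering can be sketched as a bounded depth-first search. This is one possible reading rather than the poster's implementation: `alpha` is assumed to be the remaining budget implied by the best value found so far at the enclosing case node, the `estimate` field stands in for the local-difference heuristics, and the `Node` class and example tree are hypothetical. Every case node is assumed to have at least one child.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical search-tree node: kind is "M", "C", or "I"."""
    kind: str
    local_diff: int = 0
    estimate: int = 0  # assumed heuristic used only for ordering
    children: List["Node"] = field(default_factory=list)


def search(node: Node, alpha: float = math.inf,
           trace: Optional[List[Node]] = None) -> float:
    """Return the node's value, or a partial count above alpha after a cut-off."""
    if trace is not None:
        trace.append(node)  # record which nodes were actually explored
    if node.kind == "C":
        best = math.inf
        for child in node.children:
            # The best value so far tightens the bound for later siblings.
            best = min(best, search(child, min(best, alpha), trace))
            if best == 0:
                break  # cut-off: a case node cannot do better than zero
        return best
    total = node.local_diff
    # Method nodes: children in increasing order of estimate.
    # Isomorphism nodes: children in decreasing order of estimate.
    ordered = sorted(node.children, key=lambda c: c.estimate,
                     reverse=(node.kind == "I"))
    for child in ordered:
        if total > alpha:
            break  # cut-off: the running count already exceeds alpha
        total += search(child, alpha - total, trace)
    return total


# Hypothetical example: the second isomorphism is cut off immediately.
deep = Node("M", local_diff=5)
first = Node("I", local_diff=2)
second = Node("I", local_diff=10, children=[deep])
case = Node("C", children=[first, second])
root = Node("M", local_diff=1, children=[case])

explored: List[Node] = []
result = search(root, trace=explored)
```

In this example the first isomorphism sets the bound to 2, so the second one, whose local difference is already 10, is abandoned before its child is visited.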

Figure 2. Counting differences between operations.
• The numbers of roots are different: count 1.
• The first terminals have different types, as do the second terminals: count 2.
• The two operations have different controls and types: count 2.
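The tally annotated in Figure 2 can be reproduced with a small sketch. It covers only the categories named in the annotations (number of roots, terminal types, control, and operation type), each assumed to count 1. The `Operation` class, its fields, and the example values are hypothetical, and terminals are compared pairwise by position.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operation:
    """Hypothetical summary of the attributes compared in Figure 2."""
    kind: str                  # operation type
    control: str               # control annotation attached to the operation
    terminal_types: List[str]  # type of each terminal, in order
    root_count: int            # number of roots


def operation_difference(a: Operation, b: Operation) -> int:
    count = 0
    if a.root_count != b.root_count:
        count += 1  # different numbers of roots
    # One difference for each pair of corresponding terminals whose types differ.
    count += sum(1 for ta, tb in zip(a.terminal_types, b.terminal_types) if ta != tb)
    if a.control != b.control:
        count += 1  # different controls
    if a.kind != b.kind:
        count += 1  # different operation types
    return count


# Hypothetical operations that differ in every category.
left = Operation("method call", "none", ["integer", "list"], root_count=1)
right = Operation("primitive", "next case on failure", ["string", "integer"], root_count=2)
figure_2_style_count = operation_difference(left, right)
```

With these values the count is 1 for the roots, 2 for the terminals, and 2 for the control and type, which matches the total of 5 given for Figure 2.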

[Diagram labels. Method diagram: a method consists of a sequence of cases (C). Search trees: method (M), case (C), and isomorphism (I) nodes, including I1 to I3, M1, M2, and C1 to C3, two node values of 10, and a cut-off. Case comparison: input and output bars must be matched; extra synchro; extra nodes and datalinks; mismatched datalinks. Operations in the compared diagrams: Input, -1, <=, FactA, *, 1, Output.]