
K 2011

2nd International Workshop on the K Framework and its Applications

Cheile Gradistei, Romania

August 8-12, 2011

Pre-Proceedings


Contents

Preface

ANDREI ARUSOAIE, TRAIAN FLORIN SERBANUTA
Contextual transformations in K Framework

IRINA MARIUCA ASAVOAE
Simulating Simulations in the K Framework

IRINA MARIUCA ASAVOAE
Using Abstractions in the K Framework

MIHAIL ASAVOAE
Micro-architecture Modeling for Timing Analysis: A Case Study

MIHAIL ASAVOAE
Using the Executable Semantics for CFG Extraction and Unfolding

MIHAIL ASAVOAE, DOREL LUCANU, GRIGORE ROSU
Towards Semantics-Based WCET Analysis

KYLE BLOCHER, PETER DINGES
An XML Intermediate Language Format for K Definitions

PETER DINGES, KYLE BLOCHER
A Graphical Editor for K Definitions

CHUCKY ELLISON
An Executable Formal Semantics of C with Applications

CODRUTA GIRLEA
Abstract Semantics for K Module Composition

MARK HILLS, PAUL KLINT, JURGEN J. VINJU
KRunner: Linking Rascal with K

MICHAEL ILSEMAN, CHUCKY ELLISON
On Compiling Rewriting Logic Language Definitions into Competitive Interpreters

DAVID LAZAR
An Executable Formal Semantics of Haskell 98

RADU MEREUTA, GHEORGHE GRIGORAS
Parsing challenges in K-framework

ELENA NAUM
Automated heap pattern generation

ADRIAN RIESCO, IRINA MARIUCA ASAVOAE, MIHAIL ASAVOAE
Debugging Programs using the Language Definition

GRIGORE ROSU, MARK HILLS, TRAIAN FLORIN SERBANUTA
KOOL: Defining Object-Oriented Languages in K

GRIGORE ROSU, TRAIAN FLORIN SERBANUTA
FUN: Defining Functional Languages in K

GRIGORE ROSU, TRAIAN FLORIN SERBANUTA
SIMPLE: Defining Imperative Languages in K

VLAD RUSU, DOREL LUCANU
DSMLK

VLAD RUSU, DOREL LUCANU
K Semantics for OCL - a Proposal for a Formal Definition for OCL

ANDREI STEFANESCU
MatchC: Matching Logic Verification using the K Framework

TRAIAN FLORIN SERBANUTA
A concurrent semantics for the K framework

From Language Definitions to (Runtime) Analysis Tools

The K-framework Tool Chain, Towards version 2.0: lessons learned and new perspectives

Preface

This volume contains the preliminary proceedings of the Second International Workshop on the K Framework and its Applications (K 2011), held in Cheile Gradistei, Romania, on August 8-12, 2011. The previous K workshop was held in Nags Head, North Carolina, USA, August 15-22, 2010.

K is an executable semantic framework focused on the definition of programming languages, calculi, type systems, and formal analysis tools. K specifications are made up of configurations, computations, and rules. Configurations organize the system/program state in labeled, potentially nested cells. Computations then carry the "computational meaning" of the specification as special nested list structures sequentializing computational tasks, such as fragments of a program. K (rewrite) rules generalize conventional rewrite rules by making explicit which parts of the term are read-only, write-only, or ignored. This distinction makes K a suitable framework for defining truly concurrent languages or calculi even in the presence of subterm sharing between rules. Since computations can be handled like any other terms in a rewriting environment (they can be matched, moved, modified, or even deleted), K is particularly suitable for defining control-intensive language features such as abrupt termination, exceptions or call/cc.

The topics of the workshop comprised, but were not limited to,

• foundations and models of K

• languages based on K, including implementation issues

• K as a logical framework

• K as a semantic framework, including for:
  · object-oriented systems
  · concurrent and/or parallel systems
  · interactive, distributed, open ended and mobile systems

• uses of K to provide rigorous support for model-based software engineering

• comparisons of K with existing formalisms having analogous aims

The final proceedings of the workshop will appear in the Electronic Notes in Theoretical Computer Science (ENTCS) series. Papers will be based on the abstracts submitted to the workshop, and will go through a thorough review process.

Many colleagues and friends have contributed to the success of K 2011, including the authors who submitted their abstracts; the local organizers in Romania, who took care of many of the arrangements needed for the workshop to occur; and the steering committee, whose guidance has led to a workshop format with a good balance of research presentations and time for collaboration. Advance thanks are also due to the Program Committee members for assembling what should be a solid final proceedings for K 2011. It is my hope that K 2011 has provided an opportunity for the attendees to discuss new uses of K, collaborate on solving open problems, and pursue new, hopefully fruitful lines of research.

Amsterdam, August 3, 2011 Mark Hills


Program Committee

Maria Alpuente, Universidad Politecnica de Valencia
Santiago Escobar, Universidad Politecnica de Valencia
Robert Bruce Findler, Northwestern University
Fabio Gadducci, University of Pisa
Dan Ghica, University of Birmingham
Klaus Havelund, NASA Jet Propulsion Laboratory
Mark Hills (Chair), CWI
Lennart Kats, Delft University of Technology
Andrew Lenharth, The University of Texas at Austin
Dorel Lucanu, Alexandru Ioan Cuza University of Iasi
Salvador Lucas, Universidad Politecnica de Valencia
Narciso Marti-Oliet, Universidad Complutense de Madrid
Jose Meseguer, University of Illinois at Urbana-Champaign
Peter Mosses, Swansea University
Peter Olveczky, University of Oslo
John Regehr, University of Utah
Grigore Rosu, University of Illinois at Urbana-Champaign
Vlad Rusu, INRIA
Wolfram Schulte, Microsoft Research
Jurgen Vinju, CWI


K 2011: 2nd International Workshop on the K Framework and its Applications (Pre-Proceedings)

Contextual transformations in K Framework

Andrei Arusoaie and Traian Florin Serbanuta

Faculty of Computer Science, Alexandru Ioan Cuza University, Romania

{andrei.arusoaie,traian.serbanuta}@info.uaic.ro

July 7, 2011

A large part of today's programming languages research and development effort is motivated by the new requirements arising from hardware evolution and software development. Programming languages are becoming more and more complex, requiring precise specifications and tools for defining, testing and analyzing them. One of the most effective ways to define a programming language or paradigm is as an executable formal specification which can both execute programs and formally reason about them. A framework aiming at this goal is K. K is a rewriting-based [1] semantic definitional framework which started in 2003 as a means to define executable concurrent languages in Maude. A detailed description of K can be found in [5], and the first prototype of K, called K-Maude, is described in [2].

A K definition for a programming language is compiled into a rewrite theory by applying several transformation steps. One of them is the contextual transformation. To improve the modularity of definitions, a K rule only specifies the minimal context required for its application. The contextual transformation step uses static information about the structure of the global running configuration to infer sufficient additional context to make the rule match and apply on the running configuration.

Although the K-Maude prototype already provides an implementation of context transformations, this implementation lacks certain features which are important for the development of complex definitions, such as the K definition for the C language [3]. Some limitations of the existing approach are that (1) the configuration cannot contain cells with the same name, (2) the cases where the context transformations could have more than one solution are not analyzed, and (3) the locality principle does not yet have a unanimously agreed formal specification. An investigation of all these limitations is required in order to find correct formal definitions which can then be combined to obtain a well-defined contextual transformation algorithm.

When giving semantics to a programming language in K, one specifies the syntax using BNF (Backus-Naur Form) and the semantics using configurations and rules. The program configuration is a structure which contains the context needed for computation. Configurations are nested structures of cells which can contain standard items such as environments and stores, or other items specific to the given semantics. The rewrite mechanism of K works by repeatedly using K rules to update the running configuration until no rules can be applied any more. The configuration abstraction process allows one to specify only the minimal context needed for a K rule to apply, or, in other words, only the cells with relevant content for the rule.

[Figure 1 not reproduced. Cells shown in the configuration: T, threads, thread*, k, env, holds, control, fstack, xstack, genv, store, busy, in, out, nextLoc. The lookup rule rewrites X to V in the k cell, given X ↦ L in the env cell and L ↦ V in the store cell; its context adds the enclosing thread, threads and T cells.]

Figure 1: SIMPLE: configuration (left), variable lookup rule (top-right) and variable lookup rule context (bottom-right)

For example, the rule for variable lookup (Figure 1, top-right) in the K definition of the SIMPLE language [4] specifies only the cell containing the computation, where the value of the variable in memory replaces the variable at the top of the computation, and the cells specifying the memory, where the mapping from the variable to its value is located. The rest of the configuration (Figure 1, left) remains the same. The problem is that the left-hand side of this rule cannot match the configuration in this form, because some cells are missing and the matching algorithm is not able to infer the cell structure. This is where contextual transformations are needed: they infer the missing cells from the configuration, taking the different cell properties into account.
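The inference step described above can be illustrated with a small sketch. It is not the K-Maude algorithm: the `PARENT` table is an assumed nesting of the SIMPLE cells read off Figure 1, and the helper names `ancestors`, `complete` and `render` are hypothetical. Given the cells a rule mentions, the sketch wraps them in the enclosing cells up to their lowest common ancestor.

```python
# Assumed nesting of the SIMPLE configuration (child -> parent); hypothetical.
PARENT = {
    "threads": "T", "thread": "threads",
    "k": "thread", "env": "thread", "holds": "thread", "control": "thread",
    "fstack": "control", "xstack": "control",
    "genv": "T", "store": "T", "busy": "T", "in": "T", "out": "T",
    "nextLoc": "T",
}

def ancestors(cell):
    """Chain of cells from `cell` up to the root, inclusive."""
    chain = [cell]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def complete(rule_cells):
    """Nest the rule's cells inside the cells needed to reach a common root."""
    chains = [ancestors(c) for c in rule_cells]
    # Lowest common ancestor: first cell of the first chain found in all chains.
    root = next(c for c in chains[0] if all(c in ch for ch in chains))
    tree = {}
    for chain in chains:
        node = tree
        for cell in reversed(chain[:chain.index(root) + 1]):
            node = node.setdefault(cell, {})
    return tree

def render(tree):
    """Print the inferred cell skeleton; '...' stands for the rule's content."""
    return " ".join(f"<{n}> {render(sub) or '...'} </{n}>"
                    for n, sub in tree.items())

lookup_context = complete(["k", "env", "store"])
print(render(lookup_context))
```

For the lookup rule, the sketch wraps k and env in thread and threads, and places store next to threads inside T, which is the shape of the rule context in Figure 1. It ignores cell multiplicity and duplicate cell names, which are exactly the cases the abstract identifies as problematic.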

Because of contextual transformations, K has two big advantages:

• abstraction, given by the fact that a user specifies in an abstract way the items of interest from the configuration without caring about the concrete configuration, and

• modularity, given by the fact that in an existing definition a user can add more items to the configuration and more rules without modifying the initial definition.

The current implementation of context transformers has some limitations, and its formal definitions are not completely spelled out. Complex definitions, such as the C semantics, show that the existing algorithm sometimes leads to unexpected behaviors. Our approach aims to eliminate these drawbacks by finding a formalism which models contextual transformations entirely and which can be used to develop an algorithm that fits this formalism. To do that, we split the contextual transformation into the following sub-problems:

1. Determine the set of all matchings between a rule and a given configuration. A matching is represented by that component of the configuration which matches a rule in the sense that the cell nesting from the rule is preserved in the configuration. Having the set of all matchings is very useful for disambiguation, because different filters can be applied over this set until a unique context for a rule is obtained.

2. Find an appropriate formal definition for the locality principle. In [5] the locality principle is described as follows: "... rules are transformed in a way that makes them as local as possible, or, in other words, in a way that the resulting rule matches as deeply as possible in the concrete configuration." On the other hand, in [6] the locality principle has another meaning: the context should be as minimal as possible. However, both these definitions are informal and therefore harder to analyze and compare. We propose formal definitions for both approaches and then analyze and discuss the differences between them.

3. Apply the contextual transformation to a rule. A rule is split by the rewrite symbol ("=>") into a left-hand side (lhs) and a right-hand side (rhs). Each of them can contain cells, because K allows us to create, delete or replace cells with other cells. Context transformations should handle the lhs and rhs separately until they reach the same level, and from that point on both should be transformed together until the full context is reached. Sometimes we obtain ambiguities when solving contexts separately for the lhs and rhs, but often they can be disambiguated by putting them together.

4. Find disambiguation rules. We can disambiguate using cell types when searching for matchings, using the locality principle, or using consistency filters.

The solutions to these problems are then combined into an algorithm which computes the context for a K rule.
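Sub-problems 1 and 2 can be made concrete with a small sketch. The abstract does not give the formal definitions, so everything here is an illustrative assumption: the configuration `CONFIG` is invented and contains duplicate cell names, a matching maps each rule cell to the path of a same-named cell, "as deep as possible" is read as maximizing the depth of the lowest common ancestor, and "as minimal as possible" is read as minimizing the number of cells in the inferred context.

```python
from itertools import product

# Hypothetical configuration with duplicate cell names: (name, children).
CONFIG = ("T", [
    ("x", []),
    ("y", []),
    ("a", [("b", [
        ("x", []),
        ("c", [("d", [("y", [])])]),
    ])]),
])

def index_cells(cell, path=(), table=None):
    """Map every path (child indices from the root) to its cell name."""
    table = {} if table is None else table
    name, children = cell
    table[path] = name
    for i, child in enumerate(children):
        index_cells(child, path + (i,), table)
    return table

def all_matchings(rule_cells, table):
    """Every assignment of the rule's cell names to same-named cell paths."""
    candidates = [[p for p, n in table.items() if n == name]
                  for name in rule_cells]
    return [dict(zip(rule_cells, combo)) for combo in product(*candidates)]

def context(matching):
    """Lowest common ancestor and the set of cells wrapping the matching."""
    paths = list(matching.values())
    root = []
    for steps in zip(*paths):
        if len(set(steps)) != 1:
            break
        root.append(steps[0])
    root = tuple(root)
    cells = {p[:k] for p in paths for k in range(len(root), len(p) + 1)}
    return root, cells

table = index_cells(CONFIG)
matchings = all_matchings(["x", "y"], table)
# Two assumed readings of the locality principle, used as filters:
deepest = max(matchings, key=lambda m: len(context(m)[0]))
minimal = min(matchings, key=lambda m: len(context(m)[1]))
```

On this configuration the two readings select different matchings: the deepest one pairs the x and y cells below b, while the smallest context pairs the x and y cells directly under T. This is the kind of divergence that a formal comparison of the two principles would have to account for.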

This research will be incorporated into the new version of the K tool and is meant to greatly improve the usability and modularity of K definitions.

References

[1] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, Cambridge, 1998.

[2] Traian Florin Serbanuta and Grigore Rosu. K-Maude: A rewriting based tool for semantics of programming languages. In Peter Csaba Ölveczky, editor, Rewriting Logic and Its Applications - 8th International Workshop, WRLA 2010, volume 6381 of Lecture Notes in Computer Science, pages 104–122, 2010.

[3] Chucky Ellison and Grigore Rosu. A formal semantics of C with applications. Technical Report http://hdl.handle.net/2142/17414, University of Illinois, November 2010.

[4] K Framework Google Code Page: http://code.google.com/p/k framework/.

[5] Grigore Rosu and Traian Florin Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.

[6] Traian Florin Serbanuta. A Rewriting Approach to Concurrent Programming Language Design and Semantics. PhD thesis, University of Illinois at Urbana-Champaign, December 2010.



Simulating Simulations in the K Framework

Irina Mariuca Asavoae

Faculty of Computer Science
Alexandru Ioan Cuza University, Iasi, Romania

[email protected]

1 Abstract

This work describes a method for providing, in K, automatic correctness checks between concrete and abstract semantics for programming languages. The setting involves K definitions of the concrete and abstract semantics, followed by defining and checking at run time a simulation relation between the two. To do this, we formalize the notion of a K transition system (KTS) and establish the correctness of the K abstract execution by analyzing its associated KTS. We exploit here the fact that K is a unifying framework which places under the same umbrella both the programs' concrete executions and the programs' verification methods [3].

Our previous work on abstractions involved predicate abstraction [1], where we proved the existence of a Galois connection between the concrete and the abstract systems, while in the work on symbolic path execution [2] we relied on proving the existence of a stuttering simulation. More recently we have implemented the abstraction proposed in [4], for which the formal proof of the existence of a bisimulation is already provided in the theoretical work.

Out of these experiments, and considering the attempt to extract the abstract execution automatically from the concrete one, we draw the conclusion that the simulation relation can be expressed as an algorithm which manipulates the trace semantics of the concrete and abstract systems. Currently we rely on a trace semantics manually defined for each system, as a user-defined cell activated for each rule. However, we consider that embedding an add-on for trace semantics into the K prototype is reasonably straightforward, using Maude's ability to produce execution traces.
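To make the idea of a trace-based simulation check concrete, here is a minimal sketch. It is not the author's algorithm: the function `simulates`, the abstraction `sign` and the abstract step relation are assumptions chosen for illustration. The check maps a concrete trace through an abstraction function and verifies that every resulting step is either an abstract transition or, optionally, a stuttering step.

```python
def simulates(concrete_trace, alpha, abstract_steps, allow_stutter=True):
    """Check that the alpha-image of a concrete trace is an abstract path."""
    image = [alpha(state) for state in concrete_trace]
    for a, b in zip(image, image[1:]):
        if a == b and allow_stutter:
            continue  # the abstract system stays put (stuttering step)
        if (a, b) not in abstract_steps:
            return False
    return True

def sign(n):
    """Hypothetical abstraction: an integer is mapped to its sign."""
    return "pos" if n > 0 else "neg" if n < 0 else "zero"

# Hypothetical abstract system for a counter that decrements by one.
ABSTRACT_STEPS = {("pos", "zero"), ("zero", "neg")}

countdown = [2, 1, 0, -1]
ok_with_stutter = simulates(countdown, sign, ABSTRACT_STEPS)
ok_strict = simulates(countdown, sign, ABSTRACT_STEPS, allow_stutter=False)
bad_jump = simulates([2, -1], sign, ABSTRACT_STEPS)
```

The countdown trace is accepted only when stuttering is allowed, because the step from 2 to 1 leaves the abstract state unchanged. In a K setting, the traces would come from the user-defined trace cell mentioned above rather than from Python lists.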



References

[1] I.M. Asavoae and M. Asavoae. Collecting Semantics under Predicate Abstraction in the K Framework. In WRLA 10, vol. 6381 of LNCS, Springer, 2010.

[2] I.M. Asavoae, M. Asavoae, and D. Lucanu. Path Directed Symbolic Execution in the K Framework. In SYNASC 10, IEEE, 2011.

[3] G. Rosu and T. F. Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.

[4] J. Rot, M. Bonsangue, and F. de Boer. A Pushdown Automaton for Unbounded Object Creation. PhD thesis, Technische Universitat Munchen, 2002.



Using Abstractions in the K Framework

Irina Mariuca Asavoae

Faculty of Computer Science
Alexandru Ioan Cuza University, Iasi, Romania

[email protected]

1 Abstract

We present a method and a tool for verifying LTL properties of programs written in a prototype programming language which allows for the creation of an unbounded number of objects with reference fields. The method uses an abstraction technique which is precise and finitary when the size of the visible heap is bounded. We use K [4] to implement both the concrete and abstract semantics of the language, and then extend the abstract semantics into a collecting semantics to obtain the reachability automaton of a program, for which we employ the LTL model checker of Maude.

This work contains the results of the collaboration between a research team from the Netherlands and the research team from Romania, formed by Dorel Lucanu and the author. The collaboration started with the Romanian team implementing in K the previous work of the Dutch team on object-oriented languages [5]. That work introduces a novel technique for resolving name clashes in the context of reuse of object identities. It is based on the concept of cut points, as introduced in [3] to support static analysis via abstract interpretation techniques. Cut points are objects in the heap that are referred to from both local and global variables, and as such are subject to modifications during a procedure call. Recording cut points in extra logical variables allows for precise abstract executions of the program, which, given a bound on the visible heap, can be represented by a finitary structure, namely that of a pushdown system. This allows us to make use of the fact that the complexity of model checking LTL properties of pushdown systems is of the same order as for finite state systems [2, 6].

As such, we have faithfully implemented the formal semantics defined by the Dutch team, using it as a case study for our own goal, namely extracting a methodology for defining abstractions in K. This methodology would presumably allow for modularity of the definitions, such that we could also employ composition of abstractions. We have previously experimented with the definition of a different abstraction for a simple imperative language, namely predicate abstraction [1], for which we also defined a model checking procedure in K using collecting semantics. We use practically the same technique in the current work. The difference is that, for the moment, we use the Maude model checker instead of implementing our own model checking procedure. This latter topic is the subject of our future work.

References

[1] I.M. Asavoae and M. Asavoae. Collecting Semantics under Predicate Abstraction in the K Framework. In WRLA 10, vol. 6381 of LNCS, Springer, 2010.

[2] A. Bouajjani, J. Esparza, and O. Maler. Reachability analysis of pushdown automata: Application to model checking. In Proc. CONCUR 97, vol. 1243 of LNCS, Springer, 1997.

[3] N. Rinetzky, J. Bauer, T. W. Reps, S. Sagiv, and R. Wilhelm. A semantics for procedure local heaps and its abstractions. In POPL 2005, pp. 296–309, 2005.

[4] G. Rosu and T. F. Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.

[5] J. Rot, M. Bonsangue, and F. de Boer. A Pushdown Automaton for Unbounded Object Creation. PhD thesis, Technische Universitat Munchen, 2002.

[6] S. Schwoon. Model-Checking Pushdown Systems. PhD thesis, Technische Universitat Munchen, 2002.



Micro-architecture Modeling for Timing Analysis: A Case Study ∗

Mihail Asavoae

Alexandru Ioan Cuza University, Romania

[email protected]

1 Introduction

Modern processors feature aggressive optimizations that influence the execution of programs. Worst-case execution time (WCET) estimation in the presence of micro-architecture becomes harder, as micro-architecture introduces hard-to-predict or even non-deterministic behaviors. In the context of WCET estimation, the modeling of instruction and data caches and of in-order pipelines has been the most popular [3, 4, 5, 6]. We plan to cover a modular design for instruction and data caches and a simple main memory model. Since modularity is our modeling target, we expect to be able to plug in various micro-architecture elements without changing a programming language definition. Therefore, our design relies on a number of modules, corresponding to the micro-architecture features of interest, which communicate using predefined message names. We take advantage of the K framework [7, 2] to provide modularity to our design. Next, we explain how to refine a main memory model to accommodate instruction and data caches.

The model of the main memory follows the organization of a disassembled executable file, with two K cells standing for the code and data memory, cmem and dmem respectively. The k cell processes the requests for both instructions and data that come from the processor. The latter is abstractly represented by an assembly language definition [1]. The "processor" issues an instruction request, identified by the value in the program counter register, PC. The memory system interprets the PC value as an address and checks the code memory part, the cmem cell, for it. There are two possible cases. First, if the instruction is found in the code memory cmem, control returns to the processor. Second, if the instruction is not found, a special token, last, signals the termination of the execution. In a similar fashion, the main memory receives requests for a data read with the getd(Addr) message, or a data write with the putd(Addr, Data) message. The data memory cell is checked using the Addr value and, if necessary, updated with the Data value.
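The message protocol described above can be mimicked outside K with a few lines of Python. This is only a sketch of the behavior, not the K definition: the class `MainMemory`, the toy `run` loop, the instruction strings and the fixed 4-byte step are assumptions, and returning `None` when reading an unwritten address is a choice the abstract does not specify.

```python
LAST = "last"  # token signalling that execution terminates

class MainMemory:
    def __init__(self, cmem, dmem=None):
        self.cmem = dict(cmem)        # code memory: address -> instruction
        self.dmem = dict(dmem or {})  # data memory: address -> value

    def fetch(self, pc):
        """Instruction request: the instruction at pc, or LAST if absent."""
        return self.cmem.get(pc, LAST)

    def getd(self, addr):
        """Data read request (None for an unwritten address is assumed)."""
        return self.dmem.get(addr)

    def putd(self, addr, data):
        """Data write request: update the data memory."""
        self.dmem[addr] = data

def run(memory, pc=0, step=4):
    """Toy processor loop: fetch sequentially until LAST is returned."""
    fetched = []
    while (instr := memory.fetch(pc)) != LAST:
        fetched.append(instr)
        pc += step
    return fetched

mem = MainMemory({0: "addiu", 4: "lw", 8: "sw"})  # illustrative instructions
mem.putd(100, 7)
fetched = run(mem)
```

Keeping the memory behind `fetch`, `getd` and `putd` is what lets a cache be placed in front of it later without touching the processor side, which mirrors the modularity argument of the abstract.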

When we model instruction and data caches, we consider two kinds of features. First, a cache memory is characterized by a number of parameters, such as cache size, number of cache lines, and cache associativity. Second, a cache memory exhibits a number of behaviors, such as information management or replacement policies. The state information, described using K configurations, for the instruction and data caches (1) shares the cache parameters and profiling information and (2) differs in how cache misses are handled. Next, we present several generalities about (1). The cache size is the total number of bytes that can be stored. The cache line size is the number of bytes that can be transferred to and from the memory in one step. The associativity describes the relation between cache lines and memory blocks. A memory block can reside anywhere in the cache, in a group of cache lines, or in exactly one line. This leads to the standard terminology of fully associative caches for the first case, N-associative caches for the second (with N the number of cache lines), and direct-mapped caches for the last case. We assume the particular case of a cache line and a memory block having the same size. Regarding (2), the data cache miss on a write operation poses specific problems. For example, there are two possible policies to maintain coherent information between the cache and main memory contents: write-through and write-back. The former implies that the main memory is updated on every write, while the latter keeps the modified data in the cache until the eviction. Therefore, the state information contains a list of "dirty bits" to mark data that is inconsistent between the cache and the main memory.

∗ This work has been supported by Project POSDRU/88/1.5/S/47646 and by Contract ANCS POS-CCE, O2.1.2, ID nr 602/12516, ctr.nr 161/15.06.2010 (DAK).
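The difference between the write-through and write-back policies, and the role of the dirty bits, can be seen in a short sketch. It is a simplification made for illustration: the class `DataCache` is hypothetical, eviction is triggered explicitly instead of by a replacement policy, and the dirty bits are kept as a set of addresses.

```python
class DataCache:
    def __init__(self, memory, policy):
        assert policy in ("write-through", "write-back")
        self.memory = memory   # main memory: address -> value
        self.policy = policy
        self.lines = {}        # cached copies: address -> value
        self.dirty = set()     # addresses whose cached value is newer

    def write(self, addr, data):
        self.lines[addr] = data
        if self.policy == "write-through":
            self.memory[addr] = data   # memory updated on every write
        else:
            self.dirty.add(addr)       # memory updated only at eviction

    def evict(self, addr):
        data = self.lines.pop(addr)
        if addr in self.dirty:         # write the modified data back
            self.memory[addr] = data
            self.dirty.discard(addr)

wt_mem, wb_mem = {8: 0}, {8: 0}
wt = DataCache(wt_mem, "write-through")
wb = DataCache(wb_mem, "write-back")
wt.write(8, 5)
wb.write(8, 5)
stale_before_eviction = wb_mem[8]   # main memory still holds the old value
wb.evict(8)
```

With write-back, the main memory holds a stale value between the write and the eviction, which is the inconsistency the dirty bits record.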

The K configuration for the instruction cache consists of the cache content, the replacement policy, and information about parameters and profiling. The instruction cache cell, ic, contains information of the form Addr ↦ iwrap(PC, Instr), where Addr is the cache address that holds the value of the instruction Instr at the program point PC in the program. The replace cell captures, in a modular way, a number of possible replacement policies. The basic idea is to use a copy of the cache memory that is annotated with various "age" information for the classical FIFO and LRU policies. We recall that a replacement policy decides which memory block is evicted from the cache. The last cell, although not necessary for our timing analysis, keeps profiling information about the number of cache hits and cache misses. For this purpose, the profile cell contains only two counters. The replacement policy is important in the instruction cache miss situation. The configuration has a special cell called replace that maintains a shadow copy of the cache with "age" information, to enable a parametric implementation of two of the most popular replacement policies: FIFO (round-robin) and LRU. The time value sets the instruction "age" attribute, and we capture it using a corresponding K cell.
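The shadow copy with "age" information can be sketched as follows. This is an illustrative simplification and not the K configuration: the class `InstructionCache` models a fully associative cache keyed directly by PC (the abstract maps cache addresses to iwrap(PC, Instr)), and the `load` callback and the instruction strings are hypothetical. The only difference between the two policies is whether a hit refreshes the age stamp.

```python
class InstructionCache:
    def __init__(self, capacity, policy):
        assert policy in ("FIFO", "LRU")
        self.capacity, self.policy = capacity, policy
        self.ic = {}                   # cached instructions: PC -> instruction
        self.age = {}                  # shadow copy: PC -> time stamp
        self.time = 0
        self.hits = self.misses = 0    # the two profiling counters

    def fetch(self, pc, load):
        self.time += 1
        if pc in self.ic:
            self.hits += 1
            if self.policy == "LRU":
                self.age[pc] = self.time   # LRU refreshes the age on use
            return self.ic[pc]
        self.misses += 1
        if len(self.ic) >= self.capacity:
            victim = min(self.age, key=self.age.get)  # oldest stamp is evicted
            del self.ic[victim], self.age[victim]
        self.ic[pc] = load(pc)
        self.age[pc] = self.time
        return self.ic[pc]

def resident_after(policy, accesses, capacity=2):
    """Cache content and counters after a sequence of instruction fetches."""
    cache = InstructionCache(capacity, policy)
    for pc in accesses:
        cache.fetch(pc, load=lambda a: f"instr@{a}")
    return sorted(cache.ic), cache.hits, cache.misses

fifo = resident_after("FIFO", [0, 4, 0, 8])
lru = resident_after("LRU", [0, 4, 0, 8])
```

On the access sequence 0, 4, 0, 8 with two lines, FIFO evicts the instruction at 0 (loaded first) while LRU evicts the one at 4 (used least recently), although both record one hit and three misses.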

In this paper, we propose the design and implementation of various micro-architecture elements for the purpose of timing analysis. We specify a simplified main memory and instruction and data caches, following the same definitional style used to define programming languages. We use the K framework, therefore the definitions are executable and modular.

References

[1] Mihail Asavoae, Dorel Lucanu, and Grigore Rosu. Towards semantics-based WCET analysis. In WCET, 2011. To appear.

[2] Traian Florin Serbanuta and Grigore Rosu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA 2010, volume 6381 of LNCS, pages 104–122, 2010.

[3] Christian Ferdinand and Reinhard Wilhelm. Efficient and precise cache behavior prediction for real-time systems. Real-Time Systems, 17(2-3):131–181, 1999.

[4] Marc Langenbach, Stephan Thesing, and Reinhold Heckmann. Pipeline modeling for timing analysis. In SAS, pages 294–309, 2002.

[5] Xianfeng Li, Liang Yun, Tulika Mitra, and Abhik Roychoudhury. Chronos: A timing analyzer for embedded software. Sci. Comput. Program., 69(1-3):56–67, 2007.

[6] Yau-Tsun Steven Li, Sharad Malik, and Andrew Wolfe. Efficient microarchitecture modeling and path analysis for real-time software. In IEEE RTSS, pages 298–307, 1995.




Using the Executable Semantics for CFG Extraction and Unfolding ∗

Mihail Asavoae

Alexandru Ioan Cuza University, Romania

[email protected]

1 Introduction

Knowledge of program execution time bounds is important in the context of design and verification of embedded real-time systems. Such systems interact with the external environment, yielding a set of real-time constraints that ensures the correctness of the design. In the verification process, it is important to know a priori tight upper bounds on the worst-case execution time (WCET) [8] of the hard real-time software components of the system.

One may distinguish between two distinct approaches to the worst-case estimation problem: the high-level and the low-level analysis. Low-level analysis considers both the program, usually at the assembly language level, and a model of the underlying architecture. The expected results should be accurate, to ensure that the timing requirements are satisfied. In the context of low-level analysis, the problem at hand is the WCET estimation of a given program on a given processor. Thus, two important issues should be addressed: the longest path search and the micro-architecture modeling. The longest path analysis returns the sequence of instructions that will be executed in the extreme scenario. The micro-architecture modeling describes the hardware system that the program is executed on and determines the WCET of a known sequence of instructions. The most successful approaches include, among others, the use of integer linear programming (ILP) for the longest path [7, 4] and the use of abstract interpretation for micro-architecture modeling [3], or the use of ILP for both the program and the micro-architecture [5].

The longest path search problem is particularly important in the context of low-level worst-case execution time analysis. It implies that all program executions are exhibited and inspected, via convenient abstractions, for their timing behavior. We propose a new approach, based on the K framework [6, 2], to generate all executable paths of a program via unfolding. In this way we create the premises to apply useful abstractions to prune the search space while collecting timing information. We use the K framework to define the syntax and the executable semantics of the language of interest, and then we (1) explore the definition to extract control-flow information and (2) unfold the result, using manual loop bound annotations as well as further structure-related assertions. The result of (1) is a safe over-approximation of the control flow graph, which is further unfolded using the space exploration capabilities of the Maude system. We use a definition for an integer subset of the Simplescalar [1] PISA assembly language, which we call the SSRISC assembly language. We present a number of SSRISC instructions, grouped into arithmetic-logic instructions, branch and jump instructions, load and store instructions, and a special instruction for program errors, break. The second group includes the infamous indirect branch, with the instruction jr Rs, as well as branch instructions and the unconditional jump. The break instruction means an abrupt termination of the program.

O2.1.2, ID nr 602/12516, ctr.nr 161/15.06.2010 (DAK).

Our approach works on assembly programs obtained by disassembling Simplescalar [1] executable files. One of the problems with executables is that each instruction address is a potential target of an indirect jump, represented in our language by the jr instruction. The actual target address is known precisely only at runtime. We define the program unfolder in two phases. In the first phase, the modular definition of the formal executable semantics is extended to accommodate control-flow information. The only modified semantic rules are those of branch and jump instructions. The second phase uses the over-approximation of the control-flow graph computed during the first phase, together with a set of user-supplied loop bound annotations. The concrete semantics of the language is used to define a simplified abstract semantics that uses symbolic values instead of real values. The definitional program unfolder outputs a trace semantics for the program. Our method targets a particular class of programs, called hard real-time programs, which have a bounded number of loop iterations and recursive function calls.
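The second phase can be pictured with a small sketch in C. It is only an illustration under stated assumptions, not the K definition itself: the `CfgNode` type, the per-node `bound` field (standing in for the user loop bound annotations) and the function `count_unfolded_paths` are hypothetical names introduced here. The sketch counts the execution paths obtained when a control-flow graph is unfolded while respecting the bounds.

```c
#include <assert.h>

#define MAX_SUCC 2   /* at most two successors: fall-through and branch target */

/* Hypothetical CFG node used only for this illustration. */
typedef struct {
    int succ[MAX_SUCC];  /* successor node indices; -1 marks an unused slot */
    int bound;           /* maximum number of visits along a single path */
} CfgNode;

/* Count the complete paths from `node` to `exit_node` that never visit
 * a node more often than its bound allows. `visits` holds one counter
 * per node and must be zero-initialised by the caller. */
long count_unfolded_paths(const CfgNode *cfg, int *visits, int node, int exit_node)
{
    assert(node >= 0);
    if (node == exit_node)
        return 1;                        /* one complete execution path */
    if (visits[node] >= cfg[node].bound)
        return 0;                        /* bound exhausted: prune this unfolding */

    visits[node]++;
    long total = 0;
    for (int i = 0; i < MAX_SUCC; i++) {
        int next = cfg[node].succ[i];
        if (next >= 0)
            total += count_unfolded_paths(cfg, visits, next, exit_node);
    }
    visits[node]--;                      /* backtrack so sibling paths start clean */
    return total;
}
```

For a graph with an entry node, a loop header visited at most three times, a loop body visited at most twice, and an exit node, the sketch finds three paths, one for each of zero, one, or two loop iterations.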

We also rely on several assumptions. The analyzed code is structured, to ensure a well-specified unfolding of the program. This assumption triggers another one: in the case of an indirect jump, the address of the instruction that causes the jump is eliminated from the set of possible targets, so as not to introduce infinite executions. Also, the set of potential target addresses of an indirect jump is limited to the addresses that are in the same block of instructions as the indirect jump instruction. The current design and implementation is amenable to further extensions.
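The two restrictions on indirect jump targets can be sketched in C as follows. This is a hedged illustration, not part of the K definition: the function name `indirect_jump_targets`, the representation of a block as an address range, and the fixed instruction size passed as a parameter are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Collect the candidate targets of an indirect jump (jr) located at
 * `jr_addr`: every instruction address in the enclosing block
 * [block_start, block_end], except the address of the jr itself.
 * At most `capacity` addresses are written to `out`; the return value
 * is the total number of candidates, even if it exceeds `capacity`. */
size_t indirect_jump_targets(unsigned block_start, unsigned block_end,
                             unsigned jr_addr, unsigned insn_size,
                             unsigned *out, size_t capacity)
{
    assert(insn_size > 0 && block_start <= block_end);
    size_t slots = (size_t)((block_end - block_start) / insn_size) + 1;
    size_t count = 0;
    for (size_t i = 0; i < slots; i++) {
        unsigned addr = block_start + (unsigned)i * insn_size;
        if (addr == jr_addr)
            continue;                 /* a jr never targets itself */
        if (count < capacity)
            out[count] = addr;
        count++;
    }
    return count;
}
```

Excluding the jump's own address is what rules out the trivial infinite execution; restricting candidates to the enclosing block keeps the over-approximation small.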

We mention here a few advantages that our proposed method has over the existing approaches. The first advantage is the possibility of extending and executing the concrete semantics of the program to collect the information of interest, using the intrinsic modularity of the K framework. A second advantage is that we work in K-Maude, the implementation of K on top of the Maude system, and in this way we have access to integrated analysis and verification tools.

In this paper we presented a definitional program unfolder, based on the formal executable

semantics of a target language. We worked with K, a rewrite-based framework for design and

analysis of programming languages. Our methodology had two phases. First, it extracted,

via reachability analysis, a safe control-flow graph (CFG) approximation, directly from the

executable semantics of the language. Second, it unfolded the control-flow graph, annotated

with loop bounds, and generated the set of all possible program executions. The two-phase

methodology describes what we call a definitional program unfolder and is implemented using

the K-Maude tool, a prototype implementation of the K framework.

References

[1] Doug Burger and Todd M. Austin. The SimpleScalar tool set, version 2.0. SIGARCH Comput. Archit. News, 25:13–25, June 1997.

[2] Traian Florin Serbanuta and Grigore Rosu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA 2010, volume 6381 of LNCS, pages 104–122, 2010.

[3] Christian Ferdinand and Reinhard Wilhelm. Efficient and precise cache behavior prediction for real-time systems. Real-Time Systems, 17(2-3):131–181, 1999.

[4] Xianfeng Li, Liang Yun, Tulika Mitra, and Abhik Roychoudhury. Chronos: A timing analyzer for embedded software. Sci. Comput. Program., 69(1-3):56–67, 2007.

[5] Yau-Tsun Steven Li, Sharad Malik, and Andrew Wolfe. Efficient microarchitecture modeling and path analysis for real-time software. In IEEE RTSS, pages 298–307, 1995.

[6] Grigore Rosu and Traian Florin Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.

[7] Reinhard Wilhelm. Why AI + ILP is good for WCET, but MC is not, nor ILP alone. In VMCAI, pages 309–322, 2004.

[8] Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter Puschner, Jan Staschulat, and Per Stenstrom. The worst-case execution-time problem—overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst., 7(3):1–53, 2008.


Towards Semantics-Based WCET Analysis ∗

Mihail Asavoae1, Dorel Lucanu1, Grigore Rosu2

1Alexandru Ioan Cuza University, Romania

{mihail.asavoae, dlucanu}@info.uiac.ro

2University of Illinois at Urbana-Champaign, USA

[email protected]

1 Introduction

Ideally, program analysis tools should be based on rigorous semantics of the employed programming languages. Unfortunately, giving a formal semantics (using conventional approaches) to a real language is a non-trivial matter; moreover, even when a semantics is available, it is often not easy to use it for program analysis. Recent research in rewriting logic semantics and in tool development based on such semantics [13, 3] shows encouraging results with respect to both expressiveness and scalability. Moreover, the application of these techniques in the context of real-world low-level languages such as Verilog [9] gives us hope that the theoretically ideal semantics-based approach to program analysis may be, after all, also practically feasible.

We propose a general methodology for worst-case execution time (WCET) analysis centered around a formal executable semantics of the underlying language. We assert that the formal definition of a language has all the necessary information to be used for WCET program analysis and verification. We use the K rewrite-based semantic framework [13, 3] to define a formal executable semantics for a RISC assembly language, namely an integer-restricted fragment of Simplescalar [1]. With it, we can take C programs and “execute” them semantically as follows: first compile them into executables, then extract assembly programs from them using the Simplescalar disassembler [1], then execute the resulting assembly programs in our K semantics. The choice of the Simplescalar toolset is inspired by [8]. K is highly modular, allowing us to start with a high-level semantics of the language and then plug in the K descriptions of various micro-architecture components such as caches, pipelines or hardware speculation techniques. This way, we specialize the original high-level semantics to one specific to a particular processor, which is what we eventually analyze. The K tool suite (http://fsl.cs.uiuc.edu/K) provides support for concrete and symbolic execution, for state space exploration of concurrent and/or non-deterministic programs, for LTL model-checking, and for full-fledged verification (see also [12]).

To exemplify our semantics-based approach to WCET analysis, we picked a specific but important problem: detection and elimination of erroneous paths, a subset of infeasible execution paths. Indeed, being able to identify such paths in a program and eliminate them from the calculation of the WCET can significantly tighten the estimated WCET bounds.

WCET analysis determines, for all the possible input data, the longest execution path of a program that runs on a particular architecture. Thus, WCET analysis addresses two issues: longest path search and micro-architecture modeling. The former relies on the ability of path analysis to discover and eliminate the executions that cannot be exercised under any input, called infeasible executions. Existing solutions for the path analysis problem include static analyses [6, 14, 18] based on abstract interpretation [2], integer linear programming (ILP) approaches [16, 5, 17] and measurement-based methods [4, 10]. The longest path search can exhibit a particular problem that affects the tightness of the timing bounds, and the current work addresses only that problem.

O2.1.2, ID nr 602/12516, ctr.nr 161/15.06.2010 (DAK).

Most of the aforementioned approaches work on assembly language code, extracted from executable files. Path analysis classifies the execution paths into feasible and infeasible. Programs may also exhibit error execution paths, either for certain input values or under special conditions (e.g., linking of recompiled code fragments). The most common errors have numerical causes, such as overflow/underflow and division by zero, or are memory-related, as in the case of misaligned accesses. It is important to discover and to use error-related knowledge about programs to improve timing predictability [15]. For example, the single-path programming technique [11, 7] advocates predicated code generation when division by zero errors are possible. Timing analyzers further utilize this predicated instrumentation to improve the estimation of timing bounds. It is also possible that the underlying compiler generates preventive code to test whether certain numerical errors are possible. The extra tests in the generated code implicitly cover the erroneous paths during the WCET analysis.

In this particular work, we rely on the versatility of K to define both the programming language and the associated analysis methods for erroneous path detection in WCET estimation. The definitions of certain instructions explicitly state program error conditions such as integer overflow/underflow, division by zero or misalignment. In this way, the erroneous paths are explicitly exposed by the semantic rules. Using the formal semantics, we rely neither on the compiler to generate preventive code nor on manual instrumentation for error path detection. We use the concrete executable semantics, augmented with timing information, to derive an abstract semantics, and then we employ reachability analysis techniques to detect and eliminate erroneous paths in the context of WCET analysis.
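As a rough analogue of semantic rules that state their error conditions explicitly, the following C sketch spells out the three kinds of conditions named above for 32-bit integer instructions. It is an assumption-laden illustration, not the K rules of the paper: the `StepStatus` type, the function names and the 4-byte word alignment are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of executing one instruction. */
typedef enum {
    STEP_OK,
    STEP_OVERFLOW,
    STEP_DIV_BY_ZERO,
    STEP_MISALIGNED
} StepStatus;

/* Signed addition overflows (or underflows) when the exact sum leaves the int32 range. */
StepStatus check_add(int32_t a, int32_t b)
{
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b))
        return STEP_OVERFLOW;
    return STEP_OK;
}

/* Division fails on a zero divisor, and overflows for INT32_MIN / -1. */
StepStatus check_div(int32_t a, int32_t b)
{
    if (b == 0)
        return STEP_DIV_BY_ZERO;
    if (a == INT32_MIN && b == -1)
        return STEP_OVERFLOW;
    return STEP_OK;
}

/* A word load is assumed to require a 4-byte aligned address. */
StepStatus check_load_word(uint32_t address)
{
    return (address % 4u != 0) ? STEP_MISALIGNED : STEP_OK;
}

/* A path is erroneous as soon as one of its steps raises an error. */
int is_erroneous_path(const StepStatus *steps, int length)
{
    assert(length >= 0);
    for (int i = 0; i < length; i++)
        if (steps[i] != STEP_OK)
            return 1;
    return 0;
}
```

A path analysis built on such checks could discard every path for which `is_erroneous_path` holds before searching for the longest remaining path.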

References

[1] Doug Burger and Todd M. Austin. The SimpleScalar tool set, version 2.0. SIGARCH Comput. Archit. News, 25:13–25, June 1997.

[2] Patrick Cousot and Radhia Cousot. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238–252. ACM Press, 1977.

[3] Traian Florin Serbanuta and Grigore Rosu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA 2010, volume 6381 of LNCS, pages 104–122, 2010.

[4] Jean-Francois Deverge and Isabelle Puaut. Safe measurement-based WCET estimation. In WCET, 2005.

[5] Jakob Engblom and Andreas Ermedahl. Modeling complex flows for worst-case execution time analysis. In Proceedings of the 21st IEEE conference on Real-time systems symposium, RTSS’10, pages 163–174, 2000.

[6] Christopher A. Healy, Mikael Sjodin, Viresh Rustagi, and David B. Whalley. Bounding loop iterations for timing analysis. In IEEE Real Time Technology and Applications Symposium, pages 12–21, 1998.

[7] Raimund Kirner and Peter Puschner. Time-predictable computing. In Proc. 8th IFIP Workshop on Software Technologies for Future Embedded and Ubiquitous Systems, Oct. 2010.



[8] Xianfeng Li, Liang Yun, Tulika Mitra, and Abhik Roychoudhury. Chronos: A timing analyzer for embedded software. Sci. Comput. Program., 69(1-3):56–67, 2007.

[9] Patrick O’Neil Meredith, Michael Katelman, Jose Meseguer, and Grigore Rosu. A formal executable semantics of Verilog. In MEMOCODE’10, pages 179–188. IEEE, 2010.

[10] Stefan M. Petters. Comparison of trace generation methods for measurement based WCET analysis. In WCET, pages 75–78, 2003.

[11] Peter Puschner. The single-path approach towards WCET-analysable software. In Proc. IEEE International Conference on Industrial Technology, pages 699–704, Dec. 2003.

[12] Grigore Rosu, Chucky Ellison, and Wolfram Schulte. Matching logic: An alternative to Hoare/Floyd logic. In AMAST ’10. LNCS, 2010. forthcoming.


[14] Friedhelm Stappert and Peter Altenbernd. Complete worst-case execution time analysis of straight-line hard real-time programs. Journal of Systems Architecture, 46(4):339–355, 2000.

[15] Lothar Thiele and Reinhard Wilhelm. Design for time-predictability. In Design of Systems with Predictable Behaviour, 2004.

[16] Yau-Tsun Steven Li and Sharad Malik. Performance analysis of embedded software using implicit path enumeration. In Proceedings of the 32nd ACM/IEEE Design Automation Conference, pages 456–461, 1995.

[17] Reinhard Wilhelm. Why AI + ILP is good for WCET, but MC is not, nor ILP alone. In VMCAI, pages 309–322, 2004.

[18] Reinhard Wilhelm and Bjorn Wachter. Abstract interpretation with applications to timing validation. In CAV, pages 22–36, 2008.




An XML Intermediate Language Format for K Definitions

Blocher, Kyle1 Dinges, Peter2

Department of Computer Science

University of Illinois at Urbana–Champaign

Urbana, IL, USA

Keywords: K Framework, Parser, Definition File Format, XML

1 Introduction

One of the hurdles that stands between the conception of a tool for K definitions and its implementation is the lack of an easy-to-use file format for importing and exporting K definitions. Such an interchange format should focus on the needs of tool writers and (a) be easy to use, (b) have a formal definition, and (c) support an easy upgrade path. Verbosity and readability are of secondary concern, because as a communication format for K programs, the format is largely transparent to the user.

For tool writers, a file format is easy to use if the implementation language provides a tool set for it; a familiar notation is a bonus. When building their tool, writers most likely want to focus on their specific idea and not get bogged down by parsing intricacies. A formal definition provides additional guidelines: it is valuable for error detection and can give first hints at the organization of the tool’s internal state. Finally, a file format that allows extensions (within limits) without requiring a parser rewrite prevents bit-rot that otherwise would break tools at every evolutionary step of K.

1 Email: [email protected]

2 Email: [email protected]

© 2011 Published by Elsevier Science B. V.

While the labeled plain text syntax of K is easy to parse, it has two shortcomings that make it unsuitable as an interchange format: it neither has a formal definition, nor allows for easy extensions in the form of annotations. These drawbacks can be fixed, but we feel that the advantages of a custom interchange format are by far outweighed by the lack of default parsers, missing tool support (like schema verifiers), as well as the effort required for its definition.

In the first part of the full article, we propose an XML format for K definitions that avoids these problems. After discussing the requirements for an interchange format in greater depth, we define it formally using XML Schema and show how stock XML tools simplify semantic operations on K definitions like checking for nested rewrites.

In the second part of the article, we discuss our experience with building a K to XML converter. It mostly focuses on the issues we faced when parsing K with concrete syntax and the resulting final parser architecture.

2 Parsing and Converting Plain Text K

The current implementation of K-Maude effectively defines the plain text syntax of K. It uses Maude’s advanced parsing facilities and thus spreads the syntax definition across two layers of abstraction: Maude’s parser, and the definition of K in Maude. With our dedicated parser, we hope to gather all relevant information about the core syntax in one place, thereby making it easier to understand. Having a dedicated, standalone parser also solves some of the peculiarities of parsing with Maude: for example, it is easy to preserve parentheses in the parse tree. Currently, Maude gobbles them and consequently they are missing in the LaTeX output, an unnecessary source of ambiguities. Furthermore, a standalone parser allows for experimentation with new language features like generic collections that Maude’s parser cannot support.

We implement our parser in the JavaScript variant of OMeta [1], an extension to Parsing Expression Grammars that adds support for left-recursive rules. The choice of JavaScript was originally motivated by our goal to build a web-browser based graphical user interface for K. Implementations of OMeta in other languages better suited for standalone tools, such as Python, are available, however, and porting the parser to these languages should be straightforward.

Another benefit of JavaScript is its dynamic nature. It allows us to compile new rules and modify the parser as the parsing proceeds, which is exactly what syntax sentences in K do. Our design tries to exploit this dynamism and reduce the parser to a small kernel for handling the core constructs of K. Since we can disregard the semantics of operations (we only have to know the syntax), this allows us to define most of the built-ins of K within K itself: for example, our PL-INT and PL-ID modules are both defined using K syntax rules. The only support provided directly in OMeta consists of productions to parse literals into JavaScript objects. Thus, the syntax of K’s built-ins is readily accessible to K developers; there is no need to understand another programming language.

The current version of our parser is able to parse several examples from the

K distribution; the generated syntax tree can print itself in the XML format

described above.

3 Summary

In this article, we propose an XML file format for K definitions, show some of

its advantages, and describe a program for converting plain text K files into

the XML format. We furthermore discuss challenges of parsing plain text K and their influence on the architecture of a parser that we constructed.

References

[1] Alessandro Warth and Ian Piumarta. OMeta: An object-oriented language for pattern matching. In DLS ’07: Proceedings of the 2007 symposium on Dynamic languages, pages 11–19, New York, NY, USA, 2007. ACM.




A Graphical Editor for K Definitions

Dinges, Peter1

Blocher, Kyle2

Department of Computer Science

University of Illinois at Urbana–Champaign

Urbana, IL, USA

Keywords: K Framework, GUI, Editor

1 Introduction

The K semantic framework features an intuitive graphical representation for language definitions. We propose that an interactive editor for this representation, compared to plain text, helps to avoid trivial and non-trivial mistakes when defining a language while allowing at least the same development speed. In the full paper, we show use cases of several implemented and planned features of an editor prototype that we built as a JavaScript browser application. We plan to release the prototype to the K community as a foundation for rapid exploration of new editing ideas.

2 Benefits of Graphical Editing

A graphical editor prevents all syntactic errors. For example, elements that consist of several tokens in the source code, like cells, become atomic and the user can no longer forget one of the defining tokens. Likewise, it becomes impossible to accidentally intersperse nestings as in <k><stack></k></stack>, omit parentheses as in ((({(([[ ... ]])}))), or make typographic errors as in marco. With sufficient logic built in, the editor can also warn about semantic inconsistencies at edit time. It could, for instance, annotate cells if their element has the wrong type. These features free the user from the burden of having to keep the syntax and types in mind; instead, she can focus on the problem at hand.

1 Email: [email protected]

2 Email: [email protected]

© 2011 Published by Elsevier Science B. V.
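To make the mis-nesting example concrete, here is a minimal C sketch of the check that a plain text workflow would need and that a graphical editor makes unnecessary. It is an illustration only: the function name `cells_well_nested` and the size limits are hypothetical, and the sketch understands nothing of K beyond `<name>` and `</name>` tokens.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_DEPTH 32   /* illustrative limit on cell nesting depth */
#define MAX_NAME  32   /* illustrative limit on cell name length   */

/* Return true when every <cell> is closed by a matching </cell>
 * in last-opened, first-closed order. */
bool cells_well_nested(const char *text)
{
    assert(text != NULL);
    char stack[MAX_DEPTH][MAX_NAME];
    int depth = 0;
    const char *p = text;

    while ((p = strchr(p, '<')) != NULL) {
        const char *end = strchr(p, '>');
        if (end == NULL)
            return false;                      /* unterminated tag */
        bool closing = (p[1] == '/');
        const char *name = p + (closing ? 2 : 1);
        size_t len = (size_t)(end - name);
        if (len == 0 || len >= MAX_NAME)
            return false;                      /* empty or overlong name */

        if (closing) {
            if (depth == 0)
                return false;                  /* nothing left to close */
            const char *open = stack[depth - 1];
            if (strlen(open) != len || strncmp(open, name, len) != 0)
                return false;                  /* closes the wrong cell */
            depth--;
        } else {
            if (depth == MAX_DEPTH)
                return false;                  /* nesting too deep for this sketch */
            memcpy(stack[depth], name, len);
            stack[depth][len] = '\0';
            depth++;
        }
        p = end + 1;
    }
    return depth == 0;                         /* every opened cell was closed */
}
```

The interleaved input `<k><stack></k></stack>` is rejected at `</k>`, because the innermost open cell at that point is `stack`.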

The higher information density of a graphical representation yields similar benefits. First, more context information can be provided for language elements. For example, cells can be nested to a deeper level without getting lost because the scope of the ancestor cells is always visible in the form of a colored container. The coloring of cells also allows for faster visual scanning of rules on whether they are relevant for the problem at hand or not: if there is something wrong with the stack cell and it is red, then look for rules with red bubbles in them. If this fails, textual search is still available; the options are not exclusive. However, visual scanning will be faster for elements close by because it works without keyboard input.

Second, harnessing the expressiveness of 2-dimensional mathematical notation can shorten terms and make parentheses superfluous. For example, the rule fragment

++([HOLE] => l-value([HOLE]))

can now be written as

$\texttt{++}\;\dfrac{\square}{\textit{l-value}(\square)}$.

Therefore, more context fits on the screen and the components of a term are easier to identify. This also frees the user’s mind for other tasks. Allowing mathematical notation furthermore allows language designers to express their intention more clearly and therefore improves the readability of the language definition.

An additional benefit of using a graphical editor is integration of context-sensitive autocompletion of terms. When a term is recognized by the user and chosen from a list of valid options, the editor can automatically residualize this term by rendering it as a graphical object in the interface, regardless of how the term was chosen. The editor thus does not restrict the developer’s ability to type in syntax and commands; rather, the editor expands this ability by responding appropriately in the given context. This functionality is especially potent in an editor for K because under the assumption of an almost always valid state of the language definition, the editor can use this definition to “guess what you mean.” For example, while typing t at any point in a rule definition, the editor can detect if the developer meant to create a <thread>...</thread> cell, the variable T, or a syntactic element that begins with “T”. Likewise, it can eliminate choices invalid in a context, such as a cell within K terms.

The interactivity of the editor plays a central role in all of these arguments:

while a PDF can provide basically the same information density, it forces the

user to jump back and forth between two different representations. Even with

automatic linking to the right source code location, the user has to consciously

translate what she saw in the PDF into its plain text form. The mental

decompilation is slow and mistakes are easily made.

Also note that the benefits pointed out above apply especially to developers

new to the K framework. Thus a graphical editor is – in our opinion – a good

way to grow the user base.

3 Benefits of a Browser-Based Editor

Web browsers with JavaScript constitute a well-suited environment for implementing an editor of K’s graphical notation. First, web browsers are available on almost every modern personal computer. In combination with a server-side K backend, the editor can thus be published on the Web, which allows users to take a quick test drive of the environment without having to install any other software. Adequate libraries abstract away browser-specific peculiarities, so that browsers can be seen as a portable platform-independent graphical environment. (See, for example, the Lively Kernel [2] to get an impression of the possibilities.)

Second, modern browsers come equipped with sophisticated debuggers that give the user the power to interrupt the interpretation of the scripted graphical user interface. This not only facilitates quick investigation of bugs; it also allows interested users to drill deeper: just like in Morphic [1] interfaces, users can pick a widget from the interface and immediately access the associated code (for example via “Inspect Element” in the right-click popup menu of Chrome, or of Firefox with the Firebug extension). This can be seen as explorative meta-programming that enables users to quickly add desired features, advancing the utility of the editor for the user faster than possible through central updates. JavaScript’s dynamic typing is an advantage in this context because it gives the flexibility necessary to incorporate unanticipated changes that might be prevented by a static type system.

Because of this, we view the developed editor not so much as a fixed

entity, but rather as a platform for rapid prototyping of new concepts for

editing K. Once a stable feature set and interaction model have been reached,

both can be reimplemented on a more traditional and static platform that

enforces stricter standards and methodologies among the members of a larger

development team.


4 Summary

We propose that an interactive editor for the graphical representation of K helps to avoid mistakes when defining a language: it eliminates syntactic errors completely and, through better overview and smart autocompletion, reduces the number of semantic errors. We implement a prototype of such a graphical editor as a web browser application in JavaScript and plan to release it to the K community as a platform for rapid experimentation.

References

[1] John H. Maloney and Randall B. Smith. Directness and liveness in the Morphic user interface construction environment. In Proceedings of the 8th annual ACM symposium on User interface and software technology, UIST ’95, pages 21–28, New York, NY, USA, 1995. ACM.

[2] Antero Taivalsaari, Tommi Mikkonen, Dan Ingalls, and Krzysztof Palacz. Web browser as an application platform: the Lively Kernel experience. Technical report, Mountain View, CA, USA, 2008.



An Executable Formal Semantics of C with Applications

Chucky Ellison

Abstract

This paper describes an executable formal semantics of C. Being executable, the semantics has been

thoroughly tested against the GCC torture test suite and successfully passes 770 of 776 test programs.

It is the most complete and thoroughly tested formal definition of C to date. The semantics yields an

interpreter, debugger, state space search tool, and model checker “for free”. The semantics is shown

capable of automatically finding program errors, both statically and at runtime. It is also used to enumerate

nondeterministic behavior.

1 Introduction

C is one of the most frequently used programming languages. It provides just enough abstraction above assembly language for programmers to get their work done without having to worry about the details of the machines on which the programs run. Despite this abstraction, C is also known for the ease with which it allows programmers to write buggy programs. With no runtime checks, and little static checking, in C the programmer is to be trusted entirely. Despite the abstraction, the language is still low-level enough that programmers can take advantage of assumptions about the underlying architecture. Trust in the programmer and the ability to write non-portable code are actually two of the design principles under which the C standard was written. These ideas often work in concert to yield intricate, platform-dependent bugs. The potential subtlety of C bugs makes it an excellent candidate for formalization, as subtle bugs can often be caught only by more rigorous means.

In this paper, we present a complete formal semantics for C that can be used for finding program bugs.

Rather than being an “on paper” semantics, the definition is written in an executable, machine readable form

and has been tested against the GCC torture tests (see Section 2). The semantics describes the features of the

ISO/IEC 9899:1999 (C99) standard, but we often use the text from the proposed C1X standard when there are

any uncertainties about behavior. We use the C1X text because it will eventually supersede the C99 standard,

and because it offers clearer wording and more explicit descriptions of certain kinds of behavior.

Our semantics can be considered a freestanding implementation of C99. The standard defines a freestanding implementation as a version of C that includes every language feature except for _Complex and _Imaginary types, and that includes a subset of the standard library. We additionally provide a number of functions found in math.h, stdio.h, stdlib.h, and string.h, including malloc() and longjmp(). Our semantics is the first complete semantics of C, and to our knowledge, one of the few instances of a complete formal semantics of a “real” programming language.

Above all else, our semantics has been motivated by the desire to develop formal, yet practical tools. Our semantics was developed in such a way that the single definition could be used immediately for interpreting, debugging, or analysis. At the same time, this practicality does not mean that our definition is not formal. Being written in a subset of rewriting logic (RL), it comes with a complete proof system and initial model semantics. Briefly, a rewrite system is a set of rules over terms constructed from a signature. The rewrite rules match and apply everywhere, making RL a simple, uniform, and general formal computational paradigm.

Our C semantics defines 150 C syntactic operators. The definitions of these operators are given by 1,163 semantic rules spread over 5,884 source lines of code (SLOC). However, it takes only 77 of those rules (536 SLOC) to cover the behavior of statements, and another 163 for expressions (748 SLOC). There are 505 rules for dealing with declarations and types, 115 rules for memory, and 189 technical rules defining helper operators. Finally, there are 114 rules for the core of our standard library. The semantics itself is available at http://c-semantics.googlecode.com/.

2 Testing the Semantics

No matter what the intended use for a formal semantics, its use is limited if one cannot generate confidence in its correctness. To this aim, we ensured that our semantics remained executable and computationally practical.

2.1 GCC Torture Tests

As discussed in the previous section, our semantics is encapsulated inside a drop-in replacement for GCC, which we call “KCC”. This enables us to test the semantics as one would test a compiler. We were then able to run our semantics against the GCC C-torture-test [1] and compare its behavior to that of GCC 4.1.2, as well as the Intel C++ Compiler (ICC) 11.1 and Clang 3.0 r132915 (C compiler for LLVM). We ran all compilers with optimizations turned off.

We use the torture test for GCC 4.4.2, specifically those tests inside the “testsuite/gcc.c-torture/execute” directory. We chose these tests because they focus particularly on portable (machine independent) executable tests. The README.gcc for the tests says, “The ‘torture’ tests are meant to be generic tests that can run on any target.” We found that generally this is the case, although there are also tests that include GCC-specific features, which had to be excluded from our evaluation. There were originally 1093 tests, of which we excluded 267 tests because they used GCC-specific extensions or builtins, they used the _Complex data type or certain library functions (which are not required of a freestanding implementation of C), or they were machine dependent. This left us with 826 tests. Further manual inspection revealed an additional 50 tests that were non-conforming according to the standard (mostly signed overflow or reading from uninitialized memory), bringing us to a grand total of 776 viable tests.

In order to avoid “overfitting” our semantics to the tests, we randomly extracted about 30% of the conforming tests and developed our semantics using only this small subset (and other programs discussed in Section ??). After we were comfortable with the quality of our semantics when running this subset, we ran the remaining tests. Out of 541 previously untested programs, we successfully ran 514 (95%). After this initial test, we began to use all of the tests to help develop our semantics; we now run 770 (99.2%) of the 776 compliant tests.

Torture Tests Run (of 776)

Compiler   Count   Percent
GCC        768     99.0
ICC        771     99.4
Clang      763     98.3
KCC        770     99.2

The 776 tests represent about 23,500 SLOC, or 30 SLOC/file.
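The test counts and percentages reported above can be re-derived with a few lines of C. This is just a worked check of the arithmetic, with hypothetical helper names (`viable_tests`, `pass_rate_tenths`); percentages are computed in tenths of a percent with integer rounding to avoid floating point.

```c
#include <assert.h>

/* Tests that remain after removing excluded and non-conforming ones. */
int viable_tests(int original, int excluded, int nonconforming)
{
    return original - excluded - nonconforming;
}

/* Pass rate in tenths of a percent, rounded to nearest (992 means 99.2%). */
int pass_rate_tenths(int passed, int total)
{
    assert(total > 0 && passed >= 0 && passed <= total);
    return (passed * 1000 + total / 2) / total;
}
```

With the figures from the text, 1093 - 267 - 50 gives the 776 viable tests, and 770 of 776 gives 992 tenths, i.e. the 99.2% reported for KCC.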



Correctness Analysis Our executable formal semantics performed nearly as well as the best compiler we

tested, and better than the others. We incorporated the passing tests into our regression suite that gets run every

time we commit a change. This way, upon adding features or fixing mistakes, our accuracy can only increase.

Three of the six failed tests rely on floating point accuracy problems. Two more rely on evaluating

expressions inside of function declarators, as in:

int fun(int i, int array[i++]) { return i; }

which we are not handling properly. The last is a problem with the lifetime of variable length arrays.

Coverage Analysis. In order to have some measure of the effectiveness of our testing, we recorded the application of every semantic rule for all of the torture tests. Out of 887 core rules (non-library, non-helper operator), the GCC torture tests exercised 805 (91%).

In addition to getting a coverage measure, this process suggests an interesting application. For example, in the GCC tests looked at above, a rule that deals with casting large values to unsigned int was never applied. By looking at such rules, we can create new tests to trigger them. These tests would improve both confidence in the semantics and the test suite itself.
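A minimal sketch of this kind of rule-coverage bookkeeping is shown below. It assumes each test run yields a trace of applied rule names; the rule names and traces here are hypothetical and are not taken from the actual semantics.

```python
from collections import Counter

def rule_coverage(all_rules, traces):
    """Return (application counts, never-applied rules, coverage ratio)."""
    counts = Counter(rule for trace in traces for rule in trace)
    unused = set(all_rules) - set(counts)
    covered = len(all_rules) - len(unused)
    return counts, unused, covered / len(all_rules)

# Hypothetical rule names and per-test traces, for illustration only.
rules = ["assign", "deref", "cast-large-to-unsigned", "add"]
traces = [["assign", "add"], ["deref", "assign"]]
counts, unused, ratio = rule_coverage(rules, traces)
```

The `unused` set is the interesting output: each rule in it is a candidate for a new test written specifically to trigger it.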

3 Runtime Verification

When something lacks semantics (i.e., when its behavior is undefined according to the standard) then its evaluation in our semantics will simply stop when it reaches that point in the program. We use this mechanism to catch errors like signed overflow or array out-of-bounds.

In this small program, the programmer forgot to leave space for a string terminator ('\0'). The call to strcpy() will read off the end of the array:

int main(void) {
    char dest[5], src[5] = "hello";
    strcpy(dest, src);
}

GCC will happily execute this, and depending on the state of memory, even do what one would expect. It is still undefined, and our semantics will detect trying to read past the end of the array. Because this program has no meaning, our semantics “gets stuck” when exploring its behavior. It is through this simple mechanism that we can identify undefined programs and report them to the user. By default, when a program gets stuck, we report the state of the configuration (a concrete instance of that shown in Figure ??) and what exactly the semantics was trying to do at the time of the problem. We have also begun to add explicit error messages for common problems; here is the output from our tool for this code:

$ kcc buggy_strcpy.c ; ./a.out
ERROR encountered while executing this program.
Description: Reading outside the bounds of an object.
Function: strcpy
Line: 3
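The "getting stuck" mechanism can be imitated in a few lines with a bounds-checked memory model. The sketch below is a toy in Python, not the K semantics of C: the `Memory` and `Stuck` names and the wording of the write error are assumptions made for illustration, while the read error reuses the message shown above.

```python
class Stuck(Exception):
    """Raised when no rule applies, i.e. the behavior is undefined."""

class Memory:
    def __init__(self):
        self.objects = {}                  # object name -> list of byte cells

    def alloc(self, name, size, init=b""):
        cells = [None] * size              # None marks an uninitialized byte
        cells[:len(init)] = list(init)
        self.objects[name] = cells

    def read(self, name, i):
        cells = self.objects[name]
        if not 0 <= i < len(cells):
            raise Stuck("Reading outside the bounds of an object.")
        return cells[i]

    def write(self, name, i, value):
        cells = self.objects[name]
        if not 0 <= i < len(cells):
            raise Stuck("Writing outside the bounds of an object.")
        cells[i] = value

def strcpy(mem, dest, src):
    """Copy bytes up to and including the terminating 0."""
    i = 0
    while True:
        byte = mem.read(src, i)            # gets stuck once i passes the end of src
        mem.write(dest, i, byte)
        if byte == 0:
            return
        i += 1

mem = Memory()
mem.alloc("dest", 5)
mem.alloc("src", 5, b"hello")              # no room for the terminating 0
try:
    strcpy(mem, "dest", "src")
    outcome = "finished"
except Stuck as err:
    outcome = str(err)
```

Unlike a compiled binary, the model never touches memory beyond an object, so the undefined read surfaces as an error instead of silently succeeding.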

References

[1] FSF. C language testsuites: “C-torture” version 4.4.2, 2010. URL http://gcc.gnu.org/onlinedocs/gccint/C-Tests.html.



Abstract Semantics for K Module Composition

Codruta Girlea

Computer Science Department
University of Illinois at Urbana-Champaign

[email protected]

A structured K definition is easier to write, understand and debug than one single module containing the whole definition. Furthermore, modularization makes it easy to reuse work between definitions that share some principles or features. Therefore it is useful to have a semantics for module composition operations that allows the properties of the resulting modules to be well understood at every step of the composition process.

In this paper we present an abstract semantics for a module system for the K framework. We describe K modules and module transformations in terms of the institution-based model theory introduced by Goguen and Burstall in [GB92].

The semantics is similar to the module semantics described by Goguen and Rosu [Ro7,GR03]. Thus, a module is seen as a presentation in a given institution (the definition of the module), where the visible part of the module (visible signature or interface) is a sub-signature of the module definition signature (the working signature), and the visible theorems are the restriction to this signature of the set of theorems of the module [Ro7]. In addition to those, a K module may assume a part of its definition as already implemented and state this part as a required presentation. As such, the definition of a K module has the following general form:

module M {
  requires ρ, Kρ
  exports ψ
  Σ, K
}

Σ, ψ and ρ are the working, visible and required signatures, respectively (where ψ and ρ are subsignatures of Σ) and Kρ, K are the required and working theorems, respectively.

We assume the institution we are working in has an inclusion system [DGS93] on signatures, wherein each signature morphism σ can be factored (uniquely up to isomorphism) as the composition σ = e; ι of an abstract surjection e and an abstract inclusion ι. Furthermore, we assume the institution is inclusive (as defined in [GR03]) and that the model functor of the institution (Mod) preserves pushouts and coproducts.

The module operations we define the semantics of are: renaming, hiding, enriching and aggregation.

Renaming allows the reuse of modules with different names for the required and visible symbols and it only makes sense if it translates the symbols that those two signatures share in a consistent manner. Intuitively, renaming does not add new symbols to the signature, thus the morphisms that define the renaming are surjections.

Hiding allows a part of the visible signature to no longer be visible in the new module.

Enriching, as opposed to hiding, adds new symbols and sentences to the module. One can also add new symbols and particularly sentences to the set of requirements, which is still, by definition, an enriching, but in this case the effect is that of constraining (if new sentences are added), as the set of elements required to define the module grows.

Aggregation allows two modules to be combined into one single module.

If a module defines everything it requires, it is called complete. Using all the above operations on a set of modules, one can define a structured module. One can obtain a complete structured module even if some or all the base modules used are incomplete.
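To make the operations concrete, here is a deliberately naive set-level sketch. It is not the institution-based semantics of the paper: signatures are reduced to plain sets of symbol names, sentences are ignored, and the split into `defined` and `required` symbols (as well as every name below) is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Module:
    defined: frozenset     # symbols the module itself defines (sketch assumption)
    required: frozenset    # symbols assumed to be implemented elsewhere
    visible: frozenset     # exported part of the working signature

    @property
    def working(self):
        return self.defined | self.required

    def is_complete(self):
        """Complete: everything that is required is also defined."""
        return self.required <= self.defined

def hide(m, symbols):
    return Module(m.defined, m.required, m.visible - frozenset(symbols))

def enrich(m, symbols):
    new = frozenset(symbols)
    return Module(m.defined | new, m.required, m.visible | new)

def aggregate(a, b):
    return Module(a.defined | b.defined,
                  a.required | b.required,
                  a.visible | b.visible)

# Hypothetical modules: expressions need a lookup that the environment provides.
exp = Module(frozenset({"plus"}), frozenset({"lookup"}), frozenset({"plus"}))
env = Module(frozenset({"lookup"}), frozenset(), frozenset({"lookup"}))
both = aggregate(exp, env)
```

Here `exp` on its own is incomplete, while the aggregate `both` is complete, which mirrors the closing remark that a complete structured module can be built from incomplete base modules.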

References

[DGS93] Razvan Diaconescu, Joseph Goguen, and Petros Stefaneas. Logical support for modularisation. In Papers presented at the second annual Workshop on Logical environments, pages 83–130, New York, NY, USA, 1993. Cambridge University Press.

[GB92] Joseph A. Goguen and Rod M. Burstall. Institutions: abstract model theory for specification and programming. J. ACM, 39:95–146, January 1992.

[GR03] Joseph Goguen and Grigore Rosu. Composing hidden information modules over inclusive institutions. In From Object-Orientation to Formal Methods: Essays in Honor of Ole-Johan Dahl, pages 96–123. Springer, 2003.

[Ro7] Grigore Rosu. Abstract semantics for module composition. Technical Report CSE2000–0653, University of California at San Diego, May 2000. Written August 1997.


KRunner: Linking Rascal with K

Mark Hills, Paul Klint, Jurgen J. Vinju

Centrum Wiskunde & Informatica
Amsterdam, The Netherlands

INRIA Lille Nord Europe
Lille, France

Using the K framework [6,15], it is possible to define the semantics of programming languages and language calculi. This includes the semantics of a number of “real-world” or paradigmatic languages and language subsets, such as Verilog [12], KOOL [8], SILF [6,9], and KERNELC [14], a core of the C language. These definitions have been used for a number of purposes, including to provide semantics-based interpreters, program analysis tools, and verification environments such as matching logic [13].

Like most semantics frameworks, K focuses on assigning semantics to the abstract syntax of a program, not to its concrete syntax. Because of this, the current K tool suite [5,1] provides very little support for language front-ends, instead assuming that programs will be given in (or transformed into) a format easily consumed by Maude [4]. Front-ends are then created on an ad-hoc basis, using a number of different lexers, parsers, and pretty-printers. Graphical front-ends are also not directly supported, meaning that the typical user of a definition either uses Maude directly, uses some other console-based tool (such as an execution script), or uses a custom graphical front-end. This leads to a potentially poor user experience where, for instance, the user needs to work backwards from the given error messages, potentially through the generated version of the program, back to her original program, in order to find the actual source of an error message. This also leads to difficulties in distributing language definitions, which may require a number of tools to be bundled with the definition or installed separately.

The solution explored in this abstract is to provide the language front-end and user interface integration using the Rascal meta-programming language [11,10]. Rascal provides a number of features needed to build front-ends that can work with K language specifications. For lexing and parsing, Rascal provides a grammar notation which can be used to generate scannerless GLL-based parsers. Generated parse trees can then be manipulated using matching over concrete syntax patterns and standard (not parsing-specific) features of Rascal, including structure-shy traversals, string interpolations, rich built-in data types (e.g., sets, relations, lists, and tuples), pattern matching, user-defined algebraic data types, and higher-order functions. Rascal integration with the Eclipse IDE via IMP [3,2] provides an IDE for source programs in the defined language as well as interactive features for running and analyzing programs and for displaying results (output values, discovered errors, etc.).

This work is based on the RLSRunner tool [7], which focused on providing support for K (or earlier, K-style) definitions running directly in Maude. The tool described here is similar to RLSRunner, but is intended to work directly with K language definitions, instead of only working with definitions already converted into Maude format. This should make it easier to take advantage of current work on K, including (potentially) execution engines outside of Maude.

References

[1] http://code.google.com/p/k-framework/.

[2] P. Charles, R. M. Fuhrer, S. M. S. Jr., E. Duesterwald, and J. J. Vinju. Accelerating the Creation of Customized, Language-Specific IDEs in Eclipse. In Proceedings of OOPSLA’09, pages 191–206. ACM, 2009.

[3] P. Charles, R. M. Fuhrer, and S. M. Sutton. IMP: A Meta-Tooling Platform for Creating Language-Specific IDEs in Eclipse. In Proceedings of ASE’07, pages 485–488. ACM Press, 2007.

[4] M. Clavel, F. Duran, S. Eker, P. Lincoln, N. Martı-Oliet, J. Meseguer, and C. Talcott, editors. All About Maude - A High-Performance Logical Framework, How to Specify, Program and Verify Systems in Rewriting Logic, volume 4350 of LNCS. Springer-Verlag, 2007.

[5] T. F. Serbanuta and G. Rosu. K-Maude: A Rewriting Based Tool for Semantics of Programming Languages. In Proceedings of WRLA 2010, volume 6381 of LNCS, pages 104–122. Springer-Verlag, 2010.

[6] M. Hills, T. F. Serbanuta, and G. Rosu. A Rewrite Framework for Language Definitions and for Generation of Efficient Interpreters. In Proceedings of WRLA’06, volume 176(4) of ENTCS, pages 215–231. Elsevier, 2007.

[7] M. Hills, P. Klint, and J. Vinju. RLSRunner: Linking Rascal with K for Program Analysis. In Proceedings of SLE’11, LNCS. Springer-Verlag, 2011. To Appear.

[8] M. Hills and G. Rosu. KOOL: An Application of Rewriting Logic to Language Prototyping and Analysis. In Proceedings of RTA’07, volume 4533 of LNCS, pages 246–256. Springer-Verlag, 2007.

[9] M. Hills and G. Rosu. A Rewriting Logic Semantics Approach To Modular Program Analysis. In Proceedings of RTA’10, volume 6 of Leibniz International Proceedings in Informatics, pages 151–160. Schloss Dagstuhl - Leibniz Center of Informatics, 2010.

[10] P. Klint, T. van der Storm, and J. Vinju. RASCAL: A Domain Specific Language for Source Code Analysis and Manipulation. In Proceedings of SCAM’09, pages 168–177. IEEE, 2009.

[11] P. Klint, T. van der Storm, and J. Vinju. EASY Meta-programming with Rascal. In Post-Proceedings of GTTSE’09, volume 6491 of LNCS, pages 222–289. Springer-Verlag, 2011.

[12] P. O. Meredith, M. Katelman, J. Meseguer, and G. Rosu. A formal executable semantics of Verilog. In Proceedings of MEMOCODE 2010, pages 179–188. IEEE Computer Society, 2010.

[13] G. Rosu, C. Ellison, and W. Schulte. Matching Logic: An Alternative to Hoare/Floyd Logic. In Proceedings of AMAST’10, volume 6486 of LNCS, pages 142–162. Springer-Verlag, 2011.

[14] G. Rosu, W. Schulte, and T. F. Serbanuta. Runtime Verification of C Memory Safety. In Proceedings of RV 2009, volume 5779 of LNCS, pages 132–151. Springer-Verlag, 2009.

[15] G. Rosu and T. Serbanuta. An Overview of the K Semantic Framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.



On Compiling Rewriting Logic Language Definitions into Competitive Interpreters

Michael Ilseman Chucky Ellison

Abstract

This paper describes a completely automated method for generating efficient and competitive interpreters from formal semantics expressed in Rewriting Logic. The semantics are compiled into OCaml code, which then acts as the interpreter for the language being defined. This automatic translation is tested on the semantics of an imperative as well as a functional language, and these generated interpreters are then benchmarked across a number of programs. In all cases the compiled interpreter is faster than directly executing the definition in a rewriting system, with improvements of one to two orders of magnitude.

1 Introduction

Formal programming language semantics have been around almost as long as programming languages themselves. Numerous formalisms have been introduced, with differing strengths and weaknesses, yet most programming language development is still done informally. While there are likely many reasons why this is the case, one of the simplest is that people want interactive development: they want immediate feedback while they work on their design.

Our primary goal is to make working with formally defined programming languages easier than working with only implementations or natural language specifications. Although programmers are more familiar with simply writing compilers or interpreters based on informal specifications, the actual language in which semantics are expressed is only a small part of the entire package. If the only computer-readable “definition” is a compiler or interpreter, then the programmer still has to write debuggers, integrated development environments, refactoring tools, type checkers, model checkers, verification tools, etc.

Instead, if your language definition is a formal, mathematical description, computers can actually generate these secondary tools, or at the very least assist in generating them. This is because a formal definition can be easily analyzed and transformed. Using our particular language formalism, called the K Framework, we already generate a number of the above secondary tools. Most importantly, such semantic definitions are directly executable as interpreters in a rewriting system such as Maude.



2 Evaluation

Here we describe the experiments we made with our system. First we briefly describe the two languages we tried compiling, then the specific benchmarks and comparisons with other interpreted languages.

2.1 Languages

We worked with two languages during the creation and evaluation of our system. The first, IMP, is a simple imperative language with assignment, if, and while statements. The second, FUN, is a simple functional language that supports recursion, references, and higher-order functions. Both languages support arbitrarily sized integers. These languages were formalized in K-Maude.

2.2 Benchmarks

For the benchmarking we used a system with 2 CPUs running at 2.53GHz with 4GB memory. The versions of our software were as follows: Maude 2.4, OCaml 3.12.0, GNU bc 1.06, Ruby 1.8.7, and Python 2.6.5. Each benchmark was averaged over at least five non-consecutive runs. The benchmark programs are available for download on our website.¹

For the benchmarks, we defined a number of programs in IMP and FUN, and implemented their equivalents in Ruby, Python, and GNU bc. The aim of the benchmarks is not just to see how we compare against the current K-Maude implementation, but also to see how our generically generated interpreters fare against other, hand written interpreters. Since Ruby and Python implement many more advanced language features, these comparisons are not meant to be viewed as conclusive, but to serve as a baseline to compare against. One of the major goals in the K Project is to be able to automatically generate competitive implementations, and comparing against these languages can help show us where we are with respect to that goal.

We tested these languages by running a number of programs. For the IMP language, we ran a Fibonacci program Fib, a factorial program Fact, a sum-to-n program Sum, and a program Collatz that checks the Collatz conjecture up to n. For the FUN language, we tested an exponential Fibonacci program Fib, an exponential factorial program Fact, a sum-to-n program Sum, and a program Hanoi that solves the tower of Hanoi problem for n rings.
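The benchmark sources themselves are not reproduced here. As a rough idea of what the Python equivalents of Sum and Collatz and a five-run timing harness might look like, here is a sketch in which every function name and the exact shape of each program are assumptions, not the authors' code.

```python
import time

def collatz_steps(n):
    """Number of Collatz steps needed to reach 1 from n."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

def collatz_upto(limit):
    """Run the Collatz iteration for 1..limit; return the longest chain seen."""
    return max(collatz_steps(i) for i in range(1, limit + 1))

def sum_to(n):
    """Sum 1..n with a while loop, the only loop form IMP is said to have."""
    total, i = 0, 1
    while i <= n:
        total += i
        i += 1
    return total

def average_time(fn, arg, runs=5):
    """Mean wall-clock seconds of fn(arg) over the given number of runs."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        fn(arg)
        total += time.perf_counter() - start
    return total / runs
```

A call such as `average_time(collatz_upto, 1000)` would give one data point of the kind plotted against the input number.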

In every benchmark, K-OCaml out-performed K-Maude, and in many of these by orders of magnitude. The runtime profile of these programs under K-OCaml makes these generated interpreters a much more viable option for end-users than the one provided by K-Maude.

¹ http://fsl.cs.uiuc.edu/index.php/K_Compiler



[Figure 1: eight plots, one per benchmark: Fib (IMP), Fib (FUN), Fact (IMP), Fact (FUN), Sum (IMP), Sum (FUN), Collatz (IMP), Hanoi (FUN). Each plot compares k-ocaml, k-maude, python, ruby, and bc.]

Figure 1: Benchmark results. X-axis represents input number, Y-axis is time in seconds.



An Executable Formal Semantics of Haskell 98

David Lazar

University of Illinois at [email protected]

Haskell is a thriving programming language. People love Haskell because when Haskell code compiles, they can be fairly certain that their code will work correctly. Despite how safe and mathematically-rooted the language is, Haskell was not developed with formal semantics. In A History of Haskell, the creators of Haskell acknowledge the irony of this, “we always found it a little hard to admit that a language as principled as Haskell aspires to be has no formal definition” [3].

With the semantic frameworks available to them at the time, the creators of Haskell had a good argument against developing the language with a formal semantics. They argued that “the absence of a formal language definition does allow the language to evolve more easily, because the costs of producing fully formal specifications of any proposed change are heavy, and by themselves discourage changes.” Today, we have the K semantic framework [7] for formally defining programming languages without setting the definitions in stone and discouraging changes.

The K semantic framework does not constrain language evolution the way earlier operational or denotational styles do. K is modular. This means K definitions are easy to change, which actually encourages experimentation with new language features. With these advances, we have no reason to continue putting off a formal definition of Haskell. In this talk, we will present a work-in-progress K definition of Haskell 98 and discuss how this definition benefits the Haskell community.

All of the source code, test cases, and documentation for this project, including the executable K definition itself, is available on Google Code [1] under a permissive Open Source license.

The semantics of Haskell 98 is divided into three components:

◦ infrastructure,
◦ desugaring Haskell 98 into the Haskell Kernel,
◦ and the formal semantics of the Haskell Kernel.

The infrastructure component aims to provide a tool that looks and feels like GHC on the outside, but uses the formal definitions from semantic components on the inside. With such an interface, Haskell programmers will be able to run their code in the semantics with only a minor tweak to their build system. Currently, we provide an interface similar to the runghc command, but we hope to have a drop-in replacement for ghc (for at least Haskell 98) in the future.

Haskell’s syntax is mostly sugar. The Haskell 98 Report [5] gives equations for translating the syntactic sugar into simpler constructs. Exhaustively applying the equations from the Report produces a program that is mostly made up of case expressions. The syntax for list comprehensions, list enumerations, do expressions, and even let expressions and lambda abstractions desugars away. The resulting program is written in a small subset of Haskell 98 that the Report calls the Haskell Kernel.

We have implemented a translator from Haskell 98 to the Haskell Kernel in the K framework. In doing so, we have demonstrated that it is feasible (and preferable!) to use the K framework to implement a compiler. More importantly, we have turned the Haskell 98 Report into an executable document: the compiler is implemented with K rules that exactly match the equations in the Report. This lets us check the correctness of the Report through testing and gives us a playground for modifying the language and quickly seeing the results.
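To give a flavor of such equations, the sketch below applies the Report's translation for list comprehensions and for if expressions to a tiny tuple-based syntax tree. It is written in Python purely for illustration and is not the K translator: the tree encoding and all names are assumptions, and expressions are left as opaque strings.

```python
EMPTY = ("list", [])

def desugar_if(cond, then, other):
    # if e1 then e2 else e3  =  case e1 of { True -> e2; False -> e3 }
    return ("case", cond, [("True", then), ("False", other)])

def desugar_comp(e, quals):
    """Translate [e | quals] into case/let/concatMap terms."""
    if not quals:                        # [e | True] = [e]
        return ("list", [e])
    (kind, *args), rest = quals[0], quals[1:]
    body = desugar_comp(e, rest)
    if kind == "guard":                  # [e | b, Q] = if b then [e | Q] else []
        return desugar_if(args[0], body, EMPTY)
    if kind == "gen":                    # [e | p <- l, Q] = let ok p = [e | Q]
        pat, source = args               #                       ok _ = []
        ok = ("fun", "ok", [(pat, body), ("_", EMPTY)])
        return ("let", [ok], ("app", "concatMap", "ok", source))
    if kind == "let":                    # [e | let decls, Q] = let decls in [e | Q]
        return ("let", args[0], body)
    raise ValueError(f"unknown qualifier: {kind}")

# [x*x | x <- xs, even x]
squares = desugar_comp("x*x", [("gen", "x", "xs"), ("guard", "even x")])
```

Each branch corresponds to one equation, which is what makes it possible to test the equations one at a time.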

The final component of formalizing Haskell 98 is to give formal executable semantics to the Haskell Kernel. Beyond the desugaring equations, the Haskell Kernel is not specified at all in the Report. We will use the formal semantics of other Haskell core languages, such as GHC Core [8], to guide the semantics of the Kernel. The formalization of the Kernel will fill a large gap in the Report and will serve as a compiler-independent core language for Haskell. The Haskell Kernel will also be the first call-by-need language defined in K.

Together, these three components give us a complete formal executable definition of Haskell 98. Alongside the development of each component, we have been collecting Haskell code against which to test the semantics. So far, the infrastructure and translator from Haskell 98 to the Haskell Kernel have passed every test we’ve thrown at them, including several from the GHC test suite. We will eventually import the entire GHC test suite into our testing procedure.

After formalizing Haskell 98, the next logical step is to take advantage of K’s modularity to extend the definition with various GHC extensions. The latest revision of the Haskell language, Haskell 2010 [4], specifically adds the foreign function interface and pattern guards to the language. To implement the FFI, we could tie together the K definition of C [2] and our definition of Haskell. We could also take advantage of K’s ability to formalize concurrency primitives to formalize and standardize Haskell’s support for concurrency. This would be especially beneficial to the community since this feature of Haskell has yet to be standardized.

Besides filling in gaps in the Report, an executable semantics of Haskell has several applications. Once the semantics of the Kernel is complete, we will be able to compare the output of Haskell programs compiled with popular compilers and the output of the same programs run in our semantics. This practice has been shown to find bugs in compilers. We also intend to integrate our semantics with matching logic [6] to create a new method for doing program-level verification of Haskell software.

Thanks to Google for partly funding the project through their Google Summer of Code program, and thanks to Joe Hurd and Aaron Tomb from Galois and Chucky Ellison and Grigore Rosu from UIUC for mentoring the project.

References

[1] Formal executable semantics of Haskell in the K framework. http://code.google.com/p/haskell-semantics/, 2011.




[3] Paul Hudak, John Hughes, Simon Peyton Jones, and Philip Wadler. A history of Haskell: being lazy with class. In Proceedings of the third ACM SIGPLAN conference on History of programming languages, HOPL III, pages 12–1–12–55, New York, NY, USA, 2007. ACM.

[4] Simon Marlow. Haskell 2010 language report, 2010. http://www.haskell.org/onlinereport/haskell2010/.

[5] Simon Peyton Jones et al. The Haskell 98 language and libraries: The revised report. Journal of Functional Programming, 13(1):0–255, Jan 2003. http://www.haskell.org/definition/.

[6] Grigore Rosu and Andrei Stefanescu. Matching logic: A new program verification approach (NIER track). In 30th International Conference on Software Engineering (ICSE’11), pages 868–871, 2011.


[8] Andrew Tolmach, Tim Chevalier, and the GHC Team. An external representation for the GHC Core language. 2010.



Parsing challenges in K-framework

Radu Mereuta, Gheorghe Grigoras

Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi, Romania
[radu.mereuta,grigoras]@info.uaic.ro

Introduction

K is a rewriting-based semantic definitional framework suitable for defining semantics for programming languages and calculi, as well as type systems or formal analysis tools in an executable environment. The main advantage of having the executable semantic definition of a language is the ability to use it with analysis tools to verify programs. One example is MatchC (under development), a tool based on Matching Logic [10] that uses the K definition of a subset of C.

A definition in K includes configurations, computations and rules. Configurations organize the system/program state in units called cells, which are labeled and can be nested. Computations carry "computational meaning" such as fragments of program; in particular, computations extend the original language or calculus syntax. K (rewrite) rules generalize conventional rewrite rules by making it explicit which part of the term they read, write or do not care about [11].

K was introduced by Grigore Rosu in the lecture notes of a programming language design course at the University of Illinois at Urbana-Champaign (UIUC) in Fall 2003 [9]. Since then, it has been used continuously in teaching programming languages at UIUC and in seminars in Spain and Romania. Now, K is a part of several research initiatives intending to develop the theoretical foundations and the implementation that allow its use in both academia and industry.

K-Maude [12], the current tool supporting K, has proved to be quite scalable and applicable to real world programming languages such as Scheme [14], Verilog [13], Java 1.4 [6] and C [5] (others are underway). However, since K-Maude relies on the Maude parser to parse K definitions, there are cases when new definitions introduce ambiguities. Moreover, in order to be translated into Maude, these definitions need to be syntactically correct; therefore it is preferable to have a parser able to parse K definitions. The design of such a parser is not an easy task because K definitions are quite complex, combining K syntactical constructs with fragments of syntax from the defined language. The grammar for the defined language is a parameter for such a parser. In this paper we exhibit the main difficulties met in parsing K definitions and we sketch out a solution for them.

Problem Description

To understand the problem of parsing a K definition, let’s look at an example. We consider a small language that accepts addition and assignment over identifiers and integers:

syntax Exp ::= ID | INT | Exp "+" Exp | Exp "=" Exp
syntax Stm ::= Exp ";" | Stm Stm

The semantics of this simple language can be given using a configuration consisting of two cells, a cell <k> for computation structures and a cell <env> for binding the variables to values:

configuration <T> <k> .K </k> <env> .Map </env> </T>

The semantics of syntactical constructs is given by K rules, which describe how the configuration is changed when these constructs are executed. For instance, the semantics of the assignment operator is given by the following rule:

rule [store]: <k> I = V => V:Int </k> <env> I:Id |-> ? => V </env>


The question mark denotes an anonymous variable. We use colors to mark and exhibit that this simple example includes syntactical constructs belonging to different languages. The black parts highlight the grammar of K, the blue parts highlight the meta-variable declarations, and the only part that actually belongs to the syntax of the defined language is the assignment "=" symbol. The left and right sides of the assignment are meta-variables ranging over two particular syntactical categories of the defined language. Once declared, the variables I and V keep their meaning, their sort and their bindings throughout the context of the entire rule. They can be declared anywhere in their scope. Even if the grammars for the two languages are correctly defined, we face a parsing ambiguity here: the non-typed occurrences (colored in green) could also denote identifiers in the specified language. This sort of ambiguity can be solved only after the parse tree of the entire rule is built and analyzed.

The back-end of the K tool is designed to accept K definitions in the K Intermediate Format (KIF) which are obtained by transforming the syntactic constructs built with the grammar of the defined language into abstract syntactic trees (ASTs). As seen above, these constructs are often merged with K constructs. The goal is to write a tool able to parse definitions as above, solve the ambiguities, infer the types for each construct, and transform them into pure K definitions (using ASTs) like the following one:

rule(cell("k", =>(=(I:Id, V:Int), V:Int), "k"),
     cell("env", |->(I:Id, =>(?:Int, V:Int)), "env"))

All the variables should have a type now, the anonymous variables should have an inferred type, the priority of the operators should be resolved, and all of the language constructs should be in a prefix form that can be handled by a rewrite engine.
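The prefix form itself is straightforward to produce once a fully typed tree is available. The sketch below is an illustration in Python, not the K tool: it assumes the rule has already been parsed, disambiguated and typed into nested tuples (an encoding invented here) and simply prints it.

```python
def prefix(term):
    """Render a nested (label, *children) tuple in prefix form."""
    if isinstance(term, str):
        return term
    label, *children = term
    return f"{label}({', '.join(prefix(child) for child in children)})"

# The [store] rule, hand-encoded as nested tuples for this sketch.
store = ("rule",
         ("cell", '"k"', ("=>", ("=", "I:Id", "V:Int"), "V:Int"), '"k"'),
         ("cell", '"env"', ("|->", "I:Id", ("=>", "?:Int", "V:Int")), '"env"'))
text = prefix(store)
```

The hard part, which this sketch skips entirely, is getting from the mixed concrete syntax to such a tree.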

Such a tool should be general enough to allow definitions of languages belonging to various programming paradigms. The grammar of such a language will be a parameter for the tool.

The current implementation of the K-framework [12] has a front-end that transforms an annotated BNF definition of a language into a Maude [2] module which is later used to parse the K rewrite rules. This solution often leads to ambiguities because of the mixfix notation of the operators and the preregularity property that needs to be satisfied by the ordered sorts.

In the compilation of a definition, out of 21 transformation steps, 14 are necessary only for flattening the syntax into an AST form, and there still are some limitations. The need for a new tool becomes clear, one which can take the static grammar of K and extend it with a new one, defined by the user.

Proposed Solution

A potential solution we investigate is the use of SDF [7] and its scannerless generalized parser within Spoofax [3]. Currently a prototype is under development and it shows promising results. SDF’s modularity proved to be really helpful at the level of integration between the K grammar and the defined language grammar. A side effect of using this tool is the need to directly handle ambiguities. These could come from the similarity of K syntax with the user’s language. To solve them, a new step is necessary to filter out the unwanted parsing candidates depending on the context (in the above example, the variables "I" and "V" cannot be identifiers because they have been declared as variables in the first part of the rule). The downside is that the number of ambiguities could lead to an explosion of possible terms that could require high amounts of memory and computing power.
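The filtering step can be pictured as follows. This is a toy illustration in Python, not SDF or Spoofax: a parse candidate is reduced to the reading chosen for each untyped name ("var" or "id"), and the token list and function names are assumptions of the sketch.

```python
from itertools import product

def declared_variables(tokens):
    """Names that occur with a sort annotation (e.g. I:Id) anywhere in the rule."""
    return {tok.split(":")[0] for tok in tokens if ":" in tok}

def keep_consistent(candidates, declared):
    """Drop every candidate that reads a declared variable as an identifier."""
    return [cand for cand in candidates
            if not any(kind == "id" and name in declared for kind, name in cand)]

# Tokens of the [store] rule; the untyped occurrences of I and V are ambiguous.
rule_tokens = ["I", "=", "V", "=>", "V:Int", "I:Id", "|->", "?", "=>", "V"]
untyped = ["I", "V"]
candidates = [tuple(zip(kinds, untyped))
              for kinds in product(("var", "id"), repeat=len(untyped))]
kept = keep_consistent(candidates, declared_variables(rule_tokens))
```

The number of candidates doubles with each ambiguous name, which is the combinatorial explosion the paragraph above warns about; here four candidates shrink to one.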

Another possible tool that we are trying in parallel is Rascal [8]. Based on a theory similar to that of SDF, it could prove more scalable, because the parser allows more control, and the disambiguation could be done during parsing, and thus on a smaller portion of the AST.

A notable advantage of using these two tools is the ability to generate Eclipse plug-ins that can recognize the language specified by the user. This goes as far as syntax coloring, error reporting, code folding, and, most importantly, seeing the parse tree resulting from the parsing step. This allowed us to experiment with all kinds of solutions that could be very close to the expected end result.


Parsing technique

A K definition is composed of 3 main parts: the syntax declaration for the language in question, represented by G0(L); extra syntactic constructs that extend K, G0(D); and rewrite rules that give the semantics, R0(D).

[Figure 1: compilation overview diagram. Labels recoverable from the extraction: K definition = syntax + semantics (rewrite rules); G0(L), G0(D), R0(D); generate parser; G1(K), G1(L), G1(D); parser; parsing; disambiguation; syntax flattening; pure K.]

Fig. 1. The front end workflow

Level 0 in Figure 1 represents K code and is one of the entry points in the compilation process. From this, with an initial parsing, the language definition is extracted and an SDF definition is generated for the language. While only G1(L) is necessary to parse programs, to generate a complete parser for K, for that exact definition, G1(L) and G1(D) are composed with G1(K), the initial K definition without any syntactical constructs.

In the K framework, language constructs are embedded in the K syntax with the help of the sort "K"; in fact, this is where the name of the technique comes from. A term of sort "K" represents a computation, which from a parsing perspective is a language construct. In practice the embedding is very simple: the only thing to do is to make every language sort a subsort of the sort K. To make the rewrite rules as general as possible, a term of any sort can be replaced by a meta-variable (see Figure 2).

[Figure 2: diagram titled "Mixing the grammars", showing the syntax declarations of the example sorts Exp and Stm (over Id and Int) together with the sort K and variables.]

Fig. 2. The front end workflow
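The embedding described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the actual front end: the function name `embed_in_k`, the tuple encoding of productions, and the `VAR` placeholder token are all invented for the example.

```python
def embed_in_k(language_sorts):
    """Return the extra productions that embed a language grammar in K.

    A production is encoded as (right-hand-side symbols, result sort).
    This encoding is hypothetical and chosen only for the example.
    """
    productions = []
    for sort in language_sorts:
        # Subsort declaration: every term of `sort` is also a computation.
        productions.append(((sort,), "K"))
        # Meta-variable production: a variable may stand for a term of `sort`.
        productions.append((("VAR",), sort))
    # A variable may also be of sort K itself.
    productions.append((("VAR",), "K"))
    return productions


prods = embed_in_k(["Id", "Int", "Exp", "Stm"])
for rhs, result in prods:
    print(" ".join(rhs), "->", result)
```

Because the same `VAR` token is accepted at every sort, a generalized parser has several ways to derive a single variable, which is the source of the ambiguities discussed next.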

These simple steps already generate some problems, as exemplified by the following rule and the ambiguous parse it produces:

rule <k> X </k> ...

rule(cell("k", K(amb([X:K,X:Id,X:Int,X:Exp,X:Stm, "k"))

The parser generates all the possible variable types by following all the subsorting paths from K to the variable. This can later be used to do sort inference on variables.
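To make the origin of the ambiguity node concrete, here is a small sketch that enumerates the sorts a variable can take by following subsort edges downward from K. The subsort relation in `SUBSORTS` and the helper names are assumptions made for the example; only the resulting set of candidates mirrors the parse shown above.

```python
def candidate_sorts(subsorts, top="K"):
    """Collect every sort reachable from `top` by following subsort edges.

    `subsorts` maps a sort to its direct subsorts (hypothetical encoding).
    """
    seen, stack = [], [top]
    while stack:
        sort = stack.pop()
        if sort not in seen:
            seen.append(sort)
            stack.extend(subsorts.get(sort, []))
    return seen


def ambiguity_node(var, subsorts):
    """Build an amb node listing one alternative per candidate sort."""
    return ("amb", [f"{var}:{s}" for s in sorted(candidate_sorts(subsorts))])


# Assumed subsort relation for the example language.
SUBSORTS = {"K": ["Exp", "Stm"], "Exp": ["Id", "Int"]}
node = ambiguity_node("X", SUBSORTS)
print(node)
```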

A disambiguation mechanism for context-free languages is a procedure that chooses, from a range of possible parses for a sentence, the most appropriate one according to some criteria [4].

There are many ways to disambiguate ambiguous grammars, ranging from simple syntactic criteria to semantic criteria [4]. SDF concentrates on ambiguities caused by integrating lexical and context-free syntax. Some classes of disambiguation rules turn out to be adequate for declarative filters [1]. In the case of a K definition, however, a more complex approach was needed, one that uses both the declarative methods and custom procedures based on semantic information.

The main steps of the disambiguation are:

1. Collect configuration info, then apply the disambiguation filter for cell types:

   $\texttt{rule}\ \langle cell \rangle\ \mathrm{amb}(T : \{S_1, \ldots, S_n\})\ \langle /cell \rangle \;\Rightarrow\; \texttt{rule}\ \langle cell \rangle\ \mathrm{amb}(T : \{S'_1, \ldots, S'_m\})\ \langle /cell \rangle$, where $S'_i \leq \mathrm{type}(cell)$

2. Collect variable declarations, then eliminate the other sort possibilities for variables:

   $\mathrm{amb}(Var : \{S_1, \ldots, S_n\}) \;\Rightarrow\; Var : S$, where the rule contains a variable declaration $Var : S$

3. Choose the maximal sort for everything except named variables:

   $\mathrm{amb}(T : \{S_1, \ldots, S_n\}) \;\Rightarrow\; T : \max(\{S_1, \ldots, S_n\})$, where $T$ is NOT a named variable

4. Infer the sort of the remaining variables:

   $\mathrm{amb}(T : \{S_1, \ldots, S_n\}),\ \mathrm{amb}(T : \{S'_1, \ldots, S'_n\}) \;\Rightarrow\; T : \max(\{S_1, \ldots, S_n\} \cap \{S'_1, \ldots, S'_n\})$, where $T$ is a named variable

If any other ambiguity is left, it is reported to the user, who can take further action. If what remains is a clean AST, it can be converted into a pure K definition and later sent to a rewrite engine.
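The four steps and the final reporting can be sketched as one filtering pass over the ambiguity nodes of a rule. This is a minimal sketch under assumptions, not the prototype's implementation: the dictionary encoding of occurrences, the cell name `acc`, the subsort relation, and all function names are hypothetical.

```python
# Assumed subsort relation: K > Exp, Stm and Exp > Id, Int.
SUBSORTS = {"K": {"Exp", "Stm"}, "Exp": {"Id", "Int"}}


def leq(a, b):
    """True if sort `a` is `b` or a (transitive) subsort of `b`."""
    return a == b or any(leq(a, c) for c in SUBSORTS.get(b, ()))


def max_sort(sorts):
    """Return the unique greatest sort of the set, or None if there is none."""
    tops = [s for s in sorts if all(leq(t, s) for t in sorts)]
    return tops[0] if len(tops) == 1 else None


def disambiguate(occurrences, cell_types, declared):
    """Apply the four filtering steps; return (resolved, unresolved)."""
    # Step 1: keep only the sorts allowed by the type of the enclosing cell.
    for occ in occurrences:
        bound = cell_types[occ["cell"]]
        occ["sorts"] = {s for s in occ["sorts"] if leq(s, bound)}
    # Step 2: a declared variable keeps only its declared sort.
    for occ in occurrences:
        if occ["is_var"] and occ["term"] in declared:
            occ["sorts"] &= {declared[occ["term"]]}
    # Step 4 (preparation): intersect the sort sets of each named variable.
    shared = {}
    for occ in occurrences:
        if occ["is_var"]:
            name = occ["term"]
            shared[name] = shared.get(name, occ["sorts"]) & occ["sorts"]
    resolved, unresolved = [], []
    for occ in occurrences:
        # Step 3 uses the term's own set, step 4 the shared set of a variable.
        sorts = shared[occ["term"]] if occ["is_var"] else occ["sorts"]
        best = max_sort(sorts)
        (resolved if best else unresolved).append((occ["term"], best))
    return resolved, unresolved


ALL = {"K", "Exp", "Stm", "Id", "Int"}
occurrences = [
    {"term": "X", "cell": "k", "is_var": True, "sorts": set(ALL)},
    {"term": "X", "cell": "acc", "is_var": True, "sorts": set(ALL)},
    {"term": "I", "cell": "k", "is_var": True, "sorts": set(ALL)},
    {"term": "1 + 2", "cell": "k", "is_var": False, "sorts": {"Exp", "K"}},
]
resolved, unresolved = disambiguate(
    occurrences, cell_types={"k": "K", "acc": "Exp"}, declared={"I": "Id"}
)
print(resolved, unresolved)
```

Anything that ends up in `unresolved` corresponds to an ambiguity that would be reported to the user.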

Conclusion

This work started because of the need to replace the K-Maude tool, currently used to parse K definitions. After a period of research, the most promising solution turned out to be SDF and its generalized parser. The possibility of writing modular grammars allowed us to easily integrate the two grammars under discussion (the K grammar and the defined language grammar). Because the downside of context-free declarative grammars is ambiguity, special procedures needed to be developed to cope with the nondeterminism of the parsing step (most of these problems are caused by the integration method).

To sum up, the novelty of this work is a new technique for parsing a K definition. The first step is to read the definition and extract the syntax declarations. From these, a new parser is generated that can cope with the complexity of a K rewrite rule. The second step is to parse the entire definition and get a parse forest. The last step is necessary to filter out the unwanted parsing possibilities. The final result should be a clean AST that represents the intended definition and which can then be used in the next steps of the compilation, towards a rewrite engine.

Future work in this direction includes finishing the implementation and testing it on a wide variety of programming languages. Because SDF has a very good connection with Eclipse with the help of Spoofax, we are also aiming to create a user-friendly interface that will speed up the editing and testing phase.


References

1. M.G.J. van den Brand, J. Scheerder, J. J. Vinju and E. Visser. Disambiguation Filters for Scannerless Generalized LR Parsers. Compiler Construction CC02, pages 143–158, Springer-Verlag. 2002

2. Manuel Clavel, Francisco Duran, Steven Eker, Patrick Lincoln, Narciso Martí-Oliet, Jose Meseguer and Carolyn Talcott. The maude 2.0 system. In Robert Nieuwenhuis, editor, Rewriting Techniques and Applications (RTA 2003), number 2706 in Lecture Notes in Computer Science, pages 76-87. Springer-Verlag, June 2003.

3. Lennart C. L. Kats and Eelco Visser. The Spoofax Language Workbench. Rules for Declarative Specification of Languages and IDEs. Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2010, October 17-21, 2010, Reno, NV, USA. Pages 444-463

4. Paul Klint and Eelco Visser. Using Filters for the Disambiguation of Context-free Grammars. Proc. ASMICS Workshop on Parsing Theory, pages 1–20. 1994

5. Chucky Ellison and Grigore Rosu, A Formal Semantics of C with Applications, University of Illinois, URL: http://hdl.handle.net/2142/17414, November, 2010

6. Azadeh Farzan, Feng Chen, Jose Meseguer and Grigore Rosu, Formal Analysis of Java Programs in JavaFAN, Proceedings of Computer-aided Verification (CAV'04), LNCS, volume 3114, pages 501-505, 2004

7. J. Heering, P.R.H. Hendriks, P. Klint, and J. Rekers. The syntax definition formalism sdf - reference manual, 2001.

8. Paul Klint, Tijs Storm and Jurgen Vinju. RASCAL: a Domain Specific Language for Source Code Analysis and Manipulation. SCAM 2009

9. G. Rosu, CS322, Fall 2003 - Programming Language Design: Lecture Notes, Tech. Rep. UIUCDCS-R-2003-2897, Department of Computer Science, University of Illinois at Urbana-Champaign, lecture notes of a course taught at UIUC (December 2003).

10. Grigore Rosu, Chucky Ellison and Wolfram Schulte. Matching Logic: An Alternative to Hoare/Floyd Logic. Proceedings of the 13th International Conference on Algebraic Methodology And Software Technology (AMAST '10), volume 6486, pages 142-162, 2010

11. Grigore Rosu and Traian Florin Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397-434, 2010.

12. Grigore Rosu and Traian Florin Serbanuta. K-Maude: A Rewriting Based Tool for Semantics of Programming Languages. Rewriting Logic and Its Applications - 8th International Workshop, WRLA, 6381:104-122, 2010.

13. Patrick O'Neil Meredith, Michael Katelman, Jose Meseguer and Grigore Rosu, A Formal Executable Semantics of Verilog, Eighth ACM/IEEE International Conference on Formal Methods and Models for Codesign (MEMOCODE'10), IEEE, pages 179-188, doi:10.1109/MEMCOD.2010.555863, 2010

14. Patrick Meredith, Mark Hills and Grigore Rosu, A K Definition of Scheme, University of Illinois at Urbana-Champaign, Department of Computer Science UIUCDCS-R-2007-2907, 2007

Automated heap pattern generation

Naum Elena

[email protected]

Faculty of Computer Science, Alexandru Ioan Cuza University

July 8, 2011

1 Introduction

Matching logic is an alternative to Hoare logics in which the state structure plays a crucial role. Program states are represented as algebraic datatypes called (concrete) configurations, and program state specifications are represented as configuration terms with variables and constraints on them, called (configuration) patterns. A pattern specifies those configurations that match it. Patterns can bind variables to their scope, allowing both for pattern abstraction and for expressing loop invariants.

The tool associated with matching logic, MatchC, works with configurations written as XML elements. MatchC is a verifier defined over a kernel of the C programming language. The most important cells in the MatchC configuration are ⟨k⟩, ⟨env⟩, ⟨heap⟩ and ⟨form⟩. These cells are the only ones used in the front-end of the application. Of these cells, one holds particular significance: the heap cell and its implied operation of matching.

2 Problem Description

Consider the following structure declaration in the C programming language:

struct listNode {
    int val;
    struct listNode* next;
};

In order to be able to work with large sections of the heap without specifying each element, the concept of heap pattern was introduced. A heap pattern matches a portion of the heap. The heap is populated when a program variable that is not of elementary type is instantiated. Take the following example of C code:

struct listNode* variable = (struct listNode*)malloc(sizeof(struct listNode));

variable->val = 5;

The env cell links the variable used in the program to its location in the heap. The heap holds two elements that represent a list with one node.

<env>

variable |-> L

</env>

<heap>

L |-> 5 : listNode.val

(L + 1) |-> L’ : listNode.next

</heap>

This is the representation of only one element of the singly linked list. Adding one more node to the list adds two more elements to the heap. But all the elements of the list can be summarized by the corresponding heap pattern. This is the reason why, when structures are declared, corresponding heap patterns have to be written.
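The way a single list pattern summarizes many heap cells can be sketched as follows. This is an illustration only, assuming a flat integer-addressed heap, contiguous two-cell nodes, and 0 as the null pointer; the function names are hypothetical and MatchC's actual pattern definitions are not reproduced here.

```python
def build_list_heap(values, base=100):
    """Lay out a singly linked list in a heap map: each node takes two cells."""
    heap = {}
    for i, value in enumerate(values):
        loc = base + 2 * i
        nxt = loc + 2 if i + 1 < len(values) else 0  # 0 stands for NULL
        heap[loc] = value      # listNode.val
        heap[loc + 1] = nxt    # listNode.next
    return heap


def match_list(heap, loc):
    """Unroll the list pattern from `loc`.

    Returns (values, matched cells), or None if the heap does not contain
    a well-formed, acyclic list starting at `loc`.
    """
    values, used = [], set()
    while loc != 0:
        if loc not in heap or loc + 1 not in heap or loc in used:
            return None
        values.append(heap[loc])
        used.update({loc, loc + 1})
        loc = heap[loc + 1]
    return values, used


heap = build_list_heap([5, 7])
print(heap)
print(match_list(heap, 100))
```

A two-node list occupies four heap cells, yet the specification only needs to mention one pattern instance together with the abstract sequence of values it holds.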

So far, static heap patterns have been written to correspond to default structures (e.g. list as the heap pattern for the listNode structure, both representing a singly linked list). But should someone change any characteristic of the default structures (name, order of the field declarations, field names etc.), an entirely new heap pattern must be written for the new structure (a heap pattern definition takes between 100 and 300 lines of code with all the auxiliaries needed). This is where the problem becomes clear: the need for an automated process for generating new heap patterns.

3 Proposed Solution

Writing a new heap pattern by hand for each defined structure, though not completely impossible, becomes rather cumbersome. The solution comes from the idea that theoretical structures (such as linked lists, binary trees etc.) can be summed up into templates. On this basis, using explicit parameters (actual C code) for the defined templates, heap patterns can be generated with minimal human intervention. The only input that is needed is an annotation (just a comment in C):

/*@ pattern structName<(listInformationFields), listPointers> patternName*/.

A template is a parametrized set of rules covering the rolling and unrolling operations and the rules used for the construction of the heap pattern. Filling in the gaps of the template with the information from the C source file meets all the requirements for writing a new heap pattern.
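As a rough illustration of the template idea, the sketch below parses an annotation of the shape shown above and fills a one-line template with the extracted names. Everything beyond the annotation shape is an assumption: the regular expression, the function names, the example annotation, and in particular the text of `LIST_TEMPLATE`, which is a stand-in and not one of the real MatchC templates.

```python
import re

# Matches: /*@ pattern structName<(infoFields), pointers> patternName */
ANNOTATION = re.compile(
    r"/\*@\s*pattern\s+(\w+)\s*<\s*\(([^)]*)\)\s*,\s*([^>]*)>\s*(\w+)\s*\*/"
)

# Hypothetical template text, invented for this example.
LIST_TEMPLATE = "{pattern}(p) unrolls to [{fields}] followed by {pattern}(p.{pointer})"


def split_names(text):
    """Split a comma-separated list of identifiers, ignoring blanks."""
    return [name.strip() for name in text.split(",") if name.strip()]


def parse_annotation(comment):
    """Extract the template parameters from a pattern annotation."""
    match = ANNOTATION.search(comment)
    if match is None:
        raise ValueError("not a pattern annotation")
    struct, info, pointers, pattern = match.groups()
    return {
        "struct": struct,
        "info": split_names(info),
        "pointers": split_names(pointers),
        "pattern": pattern,
    }


def instantiate(parsed):
    """Fill the (hypothetical) list template with the parsed parameters."""
    fields = ", ".join(f"{parsed['struct']}.{f}" for f in parsed["info"])
    return LIST_TEMPLATE.format(
        pattern=parsed["pattern"], fields=fields, pointer=parsed["pointers"][0]
    )


parsed = parse_annotation("/*@ pattern listNode<(val), next> list */")
print(instantiate(parsed))
```

The point of the design is that renaming a field or the structure only changes the annotation, while the template itself stays untouched.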

4 Future Work

The work presented up until now represents an extension for MatchC. So far we have dealt with the problem of creating the correct heap patterns based on existing C code for simple structures (no pointers or structures as fields), linked lists and binary trees. Yet we can achieve more.

The templates and the generator mechanism can be broadened to work for arrays. Up to this moment, due to the properties of arrays, writing a heap pattern would not have been feasible, the costs greatly outweighing the benefits. Using the tool, all necessary modules can be generated in less than a minute. This means that having different heap patterns written for different arrays becomes possible.

References

[1] Grigore Rosu, Andrei Stefanescu. Matching Logic: A New Program Verification Approach. NIER ICSE, 2011.

[2] Grigore Rosu, Chucky Ellison, Wolfram Schulte. Matching Logic: An Alternative to Hoare/Floyd Logic. AMAST, 2010.

[3] Grigore Rosu, Chucky Ellison, Wolfram Schulte. From Rewriting Logic Executable Semantics to Matching Logic Program Verification. POPL, 2010.

Debugging Programs using the Language Definition

Adrian Riesco¹, Irina Mariuca Asavoae², and Mihail Asavoae²

¹Universidad Complutense de Madrid, Spain
[email protected]

²Alexandru Ioan Cuza University, Romania
{mariuca.asavoae, mihail.asavoae}@info.uiac.ro

1 Introduction

K [10] is a rewriting-based framework to define programming languages and analysis tools. It comes with a highly concurrent rewrite abstract machine, a definitional technique, and a specialized notation. Due to all these, the languages defined using the K framework range from the simple imperative language IMP to highly complex, real-life languages such as C [4] or Scheme [7]. One important characteristic of a K language definition is executability; therefore, one could use the definition to run (and debug) programs.

We propose a debugging methodology for IMP programs that employs a procedure of counterexample-driven refinement. The process works on a target program, annotated at points of interest with expected (partial) state information. For simplicity, we consider only one such annotation, at the end of the program. The methodology combines two steps: (1) a forward propagation to collect the set of reachable states, which is used by (2) a backward propagation to detect potential bugs.

Step (1) starts with the formal executable semantics of the IMP language and, using reachability analysis, it determines the set of all the possible configurations, computing, along the process, the sets of dependencies between program variables. The forward propagation results, combined with the expected state annotation, produce the initial state for the backward propagation. As such, out of all possible configurations, we select only those that do not match the expected configuration. A similar selection determines the set of dependent program variables that also appear in the annotation.
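The forward step can be sketched on a toy example. The sketch below is not built on the K semantics of IMP: it assumes a straight-line program encoded as Python tuples, explores it from a finite set of initial states, records variable dependencies, and keeps the final states that do not match the expected annotation. The encoding and all names are hypothetical.

```python
# Program: list of assignments (target, variables read, function computing the value).
PROGRAM = [
    ("y", {"x"}, lambda s: s["x"] + 1),
    ("z", {"y"}, lambda s: s["y"] * 2),
]


def run(program, state):
    """Execute the assignments in order and return the final state."""
    state = dict(state)
    for target, _reads, compute in program:
        state[target] = compute(state)
    return state


def dependencies(program):
    """Map each assigned variable to the input variables it depends on."""
    deps = {}
    for target, reads, _compute in program:
        deps[target] = set().union(*(deps.get(v, {v}) for v in reads))
    return deps


def failing_states(program, initial_states, expected):
    """Final states that do NOT match the expected (partial) annotation."""
    finals = [run(program, s) for s in initial_states]
    return [f for f in finals if any(f.get(k) != v for k, v in expected.items())]


expected = {"z": 2}  # annotation at the end of the program
bad = failing_states(PROGRAM, [{"x": 0}, {"x": 1}], expected)
relevant = {v: dependencies(PROGRAM)[v] for v in expected}
print(bad, relevant)
```

The failing states and the dependency sets restricted to annotated variables are what the backward step would start from.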

Step (2) uses the weakest precondition mechanism to propagate the results from step (1). The novelty lies here, as the inference rules for the weakest precondition computation derive from the concrete rewrite rules of the formal semantics. To make the program instructions amenable to "backward execution," we enrich the concrete program configuration with additional information, i.e. a representation of the program itself. We use the K framework to define the new abstract state and the "reversed" rewrite rules.
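For readers unfamiliar with weakest preconditions, the sketch below shows the textbook rule for assignments, wp(x := e, Q) = Q[e/x], applied backward over a sequence of assignments. It is not the derivation from rewrite rules proposed here; the tuple encoding of expressions and the function names are assumptions made for the example.

```python
def substitute(expr, var, replacement):
    """Replace every occurrence of variable `var` in `expr` by `replacement`."""
    if expr[0] == "var":
        return replacement if expr[1] == var else expr
    if expr[0] == "num":
        return expr
    op, left, right = expr
    return (op, substitute(left, var, replacement), substitute(right, var, replacement))


def wp(assignments, post):
    """Weakest precondition of a sequence of assignments (target, expr)."""
    for target, expr in reversed(assignments):
        post = substitute(post, target, expr)  # wp(x := e, Q) = Q[e/x]
    return post


def evaluate(expr, state):
    """Evaluate an expression tree in a state mapping variables to integers."""
    if expr[0] == "var":
        return state[expr[1]]
    if expr[0] == "num":
        return expr[1]
    op, left, right = expr
    a, b = evaluate(left, state), evaluate(right, state)
    return {"+": a + b, "*": a * b, "==": a == b}[op]


# y := x + 1; z := y * 2, with the expected annotation z == 2 at the end.
program = [
    ("y", ("+", ("var", "x"), ("num", 1))),
    ("z", ("*", ("var", "y"), ("num", 2))),
]
post = ("==", ("var", "z"), ("num", 2))
pre = wp(program, post)
print(pre)
```

Evaluating `pre` in an initial state tells whether that state can reach the annotated final state, so initial states falsifying it are candidates for locating the bug.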

Apart from the classical testing-based approaches to software bug detection, the model checking and abstract interpretation methods have drawn much interest, because their complete coverage of execution paths leads to formal program debugging. The methodology of abstract debugging [2] combines program verification and analysis techniques, such as invariant assertions and intermittent abstractions, to find the origin of potential bugs. Lazy abstraction [6] exploits the potential locality of software bugs via local refinements, whereas reachability analysis over an abstract boolean program [1] is applied to find errors in device drivers. Both lazy abstraction and boolean program abstraction specialize predicate abstraction [5], a general bug detection technique.

The Maude system supports several approaches for debugging Maude programs: tracing, term coloring, and using an internal debugger [3, Chap. 22]. The tracing facilities allow us to follow the execution of a specification, that is, the sequence of applications of statements that take place. Term coloring uses different colors to print the operators used to build a term that does not fully reduce. Finally, the Maude internal debugger allows the definition of breakpoints in the execution by selecting some operators or statements. When a breakpoint is found the debugger is entered, where we can see the current term and execute the next rewrite with tracing turned on.

Complementing these mechanisms, a declarative debugger [9], which helps the user to find the statement that produced a bug by asking questions about the correctness of the computation steps, has been developed for Maude specifications. However, when using Maude as a logical framework we can only use these approaches to debug the semantics of the programs, but not the programs themselves, as illustrated in [8].

Hence, rewriting logic based definitions of programming languages support program executability but, at the same time, lack debugging facilities. The current state-of-the-art approaches to discovering bugs either help debug the formal semantics or work with an abstract representation of the semantics. Our proposal attempts to bridge the gap between these two general methodologies.

References

[1] Thomas Ball and Sriram K. Rajamani. The slam project: debugging system software via static analysis. POPL '02, pages 1–3, New York, NY, USA, 2002. ACM.

[2] Francois Bourdoncle. Abstract debugging of higher-order imperative languages. PLDI '93, pages 46–55, New York, NY, USA, 1993. ACM.

[3] Manuel Clavel, Francisco Duran, Steven Eker, Patrick Lincoln, Narciso Martí-Oliet, Jose Meseguer, and Carolyn L. Talcott, editors. All About Maude - A High-Performance Logical Framework, How to Specify, Program and Verify Systems in Rewriting Logic, volume 4350 of LNCS. Springer, 2007.


[5] Susanne Graf and Hassen Saïdi. Construction of abstract state graphs with pvs. In CAV, pages 72–83, 1997.

[6] Thomas A. Henzinger, Ranjit Jhala, Rupak Majumdar, and Gregoire Sutre. Lazy abstraction. POPL '02, pages 58–70, New York, NY, USA, 2002. ACM.

[7] Patrick Meredith, Mark Hills, and Grigore Rosu. An executable rewriting logic semantics of K-scheme. In Danny Dube, editor, SCHEME'07, Technical Report DIUL-RT-0701, pages 91–103. Laval University, 2007.

[8] Adrian Riesco, Alberto Verdejo, Narciso Martí-Oliet, and Rafael Caballero. A declarative debugger for Maude. In Jose Meseguer and Grigore Rosu, editors, AMAST 2008, volume 5140 of Lecture Notes in Computer Science, pages 116–121. Springer, 2008.

[9] Adrian Riesco, Alberto Verdejo, Narciso Martí-Oliet, and Rafael Caballero. Declarative debugging of rewriting logic specifications. Journal of Logic and Algebraic Programming, 2011. To appear.


KOOL: Defining Object-Oriented Languages in K

Grigore Roșu¹, Mark Hills², and Traian Florin Șerbănuță¹,³

¹University of Illinois at Urbana-Champaign

²Centrum Wiskunde & Informatica, Amsterdam, Netherlands

³Alexandru Ioan Cuza University of Iași

This paper discusses the K [5, 6, 4] semantic definition of the KOOL language. KOOL is intended to be a pedagogical and research language that captures the essence of the object-oriented programming paradigm. A program consists of a set of class declarations, each containing a set of fields and a set of methods. To distinguish the entry point of the program, one of the classes must be named main and contain a method also named main, which is invoked when the program is executed. KOOL includes the following object-oriented features in addition to the conventional imperative expression and statement constructs: object creation and self references, (single) inheritance, dynamic method dispatch, super calls (with static dispatch), and dynamic testing for class type (instanceOf).

A particularly interesting issue from a definitional point of view is the type system associated with KOOL. The paper discusses both a dynamically typed semantics and a static type checker, which are aware of class subtyping, upcasting and downcasting.

All definitions are executable, being amenable for test and use as interpreters.

Although sharing a name with another language defined in K [2], this version of KOOL has a different focus. The earlier version was designed to provide an environment for experimenting with various language features. Part of this work was investigating the impact of language design decisions on the performance of the definition, including both runtime and verification performance [3], while other work focused on using rewriting-based tools to aid in the creation of language extensions [1, 3]. The new version of KOOL shares some of the same interest in providing a core OO language as a platform for research, but is more focused on pedagogy.

References

[1] Feng Chen, Mark Hills, and Grigore Roșu. A Rewrite Logic Approach to Semantic Definition, Design and Analysis of Object-Oriented Languages. Technical Report UIUCDCS-R-2006-2702, Department of Computer Science, University of Illinois at Urbana-Champaign, 2006.

[2] Mark Hills and Grigore Roșu. KOOL: An application of rewriting logic to language prototyping and analysis. In Franz Baader, editor, RTA, volume 4533 of Lecture Notes in Computer Science, pages 246–256. Springer, 2007. ISBN 978-3-540-73447-5. doi: 10.1007/978-3-540-73449-9_19.

[3] Mark Hills and Grigore Roșu. On formal analysis of OO languages using rewriting logic: Designing for performance. In Proceedings of the 9th IFIP International Conference on Formal Methods for Open Object-Based Distributed Systems (FMOODS'07), volume 4468 of Lecture Notes in Computer Science, pages 107–121. Springer, 2007. doi: 10.1007/978-3-540-72952-5_7. Also appeared as Technical Report UIUCDCS-R-2007-2809, January 2007.

[4] K. The K Framework, 2010. URL http://k-framework.googlecode.com.

[5] Grigore Roșu and Traian Florin Șerbănuță. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010. doi: 10.1016/j.jlap.2010.03.012.

[6] Traian Florin Șerbănuță and Grigore Roșu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA, pages 104–122, 2010. doi: 10.1007/978-3-642-16310-4_8.

FUN: Defining Functional Languages in K

Grigore Roșu and Traian Florin Șerbănuță

University of Illinois at Urbana-Champaign
Alexandru Ioan Cuza University of Iași

This paper discusses the K [2, 3, 1] semantic definition of (the untyped version of) the FUN language. FUN is intended to be a pedagogical and research language that captures the essence of the functional programming paradigm, extended with several features often encountered in functional programming languages. Like many functional languages, FUN is an expression language, that is, everything, including the main program, is an expression. Functions can be declared anywhere and evaluate to closures, which are first-class values in the language. To make it more interesting and to highlight some of K's strengths, FUN includes the following features in addition to the conventional functional constructs encountered in similar languages used as teaching material:

• Functions can take multiple arguments in two different ways. First, they can take space-separated arguments whose semantics is given by currying. Second, they can take comma-separated tuple arguments, whose semantics is given directly, not via currying. For example, FUN allows function declarations/invocations of the form "f (a,b) c (d,e)".

• Similarly, we allow let and letrec binders which work with lists of variables and expressions, and we give their semantics directly, without desugaring them to one-argument variants. We also allow the usual syntactic sugar for declaring-and-binding functions with "let f (a,b) c (d,e) = ...".

• We include a callcc construct, for two reasons: first, several functional languages support this construct; second, some semantic frameworks have a hard time defining it.

• Finally, we include mutables by means of referencing, dereferencing and assignments. We include these for the same reasons as above: there are functional languages which have them, and they are not easy to define in some semantic frameworks.
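The difference between curried and tuple arguments in the first item above can be mimicked with closures. The sketch is written in Python rather than FUN, and the function body (a sum) is an arbitrary choice for the example; only the argument shape "f (a,b) c (d,e)" comes from the text.

```python
def f(ab):
    a, b = ab                  # first argument: a tuple, taken directly
    def take_c(c):             # second argument: supplied by currying
        def take_de(de):
            d, e = de          # third argument: another tuple
            return a + b + c + d + e
        return take_de
    return take_c


# Corresponds to the invocation shape  f (a,b) c (d,e).
result = f((1, 2))(3)((4, 5))

# Partial application yields a closure, itself a first-class value.
partial = f((1, 2))
print(result, partial(10)((0, 0)))
```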

The definition of FUN is executable; therefore it can be used to execute FUN programs, to test them and to explore/trace their behaviors.

References


[2] Grigore Roșu and Traian Florin Șerbănuță. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010. doi: 10.1016/j.jlap.2010.03.012.

[3] Traian Florin Șerbănuță and Grigore Roșu. K-Maude: A rewriting based tool for semantics of programming languages. In Peter Csaba Olveczky, editor, WRLA, volume 6381 of Lecture Notes in Computer Science, pages 104–122. Springer, 2010. ISBN 978-3-642-16309-8. doi: 10.1007/978-3-642-16310-4_8.

SIMPLE: Defining Imperative Languages in K

Grigore Roșu and Traian Florin Șerbănuță

University of Illinois at Urbana-Champaign
Alexandru Ioan Cuza University of Iași

This paper discusses the K [2, 3, 1] semantic definition of the SIMPLE language. SIMPLE is intended to be a pedagogical and research language that captures the essence of the imperative programming paradigm, extended with several features often encountered in imperative programming languages. A program consists of a set of global variable declarations and function definitions. As in C, function definitions cannot be nested and each program must have one function called main, which is invoked when the program is executed. To make it more interesting and to highlight some of K's strengths, SIMPLE includes the following features in addition to the conventional imperative expression and statement constructs:

• Multidimensional arrays and array references. An array evaluates to an array reference, which is a special value holding a location (where the elements of the array start) together with the size of the array; the elements of the array can be array references themselves (particularly when the array is multi-dimensional). Array references are ordinary values, so they can be assigned to variables and passed/received by functions.

• Functions and function references. Functions can have zero or more parameters and can return abruptly using a return statement. SIMPLE follows a call-by-value parameter passing style, with static scoping. Function names evaluate to function references, which thereby become ordinary values in the language, just like array references.

• Blocks with locals. SIMPLE variables can be declared anywhere, their scope being the most nested enclosing block.

• Input/Output. The expression read() evaluates to the next value in the input buffer, and the statement write(e) evaluates e and outputs its value to the output buffer. The input and output buffers are lists of values.

• Exceptions. SIMPLE has parametric exceptions (the value thrown as an exception can be caught and bound).

• Concurrency via dynamic thread creation/termination and synchronization. One can spawn a thread to execute any statement. The spawned thread shares with its parent its environment at creation time. Threads can be synchronized via reentrant locks which can be acquired and released, as well as through rendezvous.
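The array-reference representation described in the first item of the list above can be sketched with a flat store. This is an illustration under assumptions, not the K definition of SIMPLE: the `Store` class, the `("arrayRef", location, size)` tuple encoding, and the default element value 0 are all hypothetical.

```python
class Store:
    """A flat store mapping integer locations to values."""

    def __init__(self):
        self.cells = {}
        self.next_loc = 0

    def alloc_array(self, dims):
        """Allocate an array with the given dimensions.

        Returns an array reference: a value holding the start location
        and the size of the array.
        """
        size, rest = dims[0], dims[1:]
        base = self.next_loc
        self.next_loc += size
        for i in range(size):
            # Elements of a multi-dimensional array are array references.
            self.cells[base + i] = self.alloc_array(rest) if rest else 0
        return ("arrayRef", base, size)

    def lookup(self, ref, index):
        """Read element `index` of the array denoted by `ref`."""
        _tag, base, size = ref
        if not 0 <= index < size:
            raise IndexError(index)
        return self.cells[base + index]


store = Store()
matrix = store.alloc_array([2, 3])   # a 2 x 3 array
row = store.lookup(matrix, 1)        # an ordinary value: another array reference
print(matrix, row, store.lookup(row, 2))
```

Since a row is just a value, it can be bound to a variable or passed to a function without copying its elements.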

In addition to the basic, untyped version of SIMPLE, the paper will also discuss a type system for SIMPLE, together with the K definition of a static type checker and a dynamically typed semantics. All definitions are executable, being amenable for test and use as interpreters.

References




DSMLK

Vlad Rusu and Dorel Lucanu

Inria Lille, France ([email protected])
University of Iasi, Romania ([email protected])

1 Introduction and Motivation

Domain-Specific Modelling Languages (dsmls) are languages dedicated to modelling in specific application areas. Recently, the design of dsmls has become widely accessible to engineers trained in the basics of Model-Driven Engineering (mde): one designs a metamodel for the language's abstract syntax; then, the language's operational semantics is expressed using model transformations over the metamodel. Such designs can be implemented in mde tools [1, 2, 3].

The democratisation of dsml design catalysed by mde is likely to give birth to numerous languages. One can also reasonably expect that there shall be numerous errors in those languages. Indeed, getting a language right (especially its operational semantics) is hard, regardless of whether the language is defined in the modern mde framework or in more traditional ones.

Formal approaches can benefit language designers by helping them avoid or detect errors. But, in order to be accepted by nonexpert users, formal approaches have to operate in the background of a familiar language design process, such as the mde-based one mentioned above.

We propose here such an approach, which uses the K semantic framework [4] to formalise the basic mde ingredients used in dsml definition: models, metamodels, model transformations.

Hence, users can define dsmls using familiar mde ingredients, but, thanks to our mapping of those ingredients to K, users gain access to K's verification tools: a bounded model checker for safety properties, and a verifier for matching logic properties [5]. In this work we focus on using the bounded model checker, and leave the matching-logic verifications to future work.

2 Outline of the Approach

Our approach is not to define dsmls directly in K, since K is unlikely to be accepted by nonexperts, and one main motivation for this work is to gain acceptance, possibly furthering the cause of formal methods in practice. Rather, we have defined in K the basic mde artifacts:

• a language for metamodels. We have adopted the dedicated km3 language [6] for the diagrammatic part of a metamodel, and the standard ocl [7] for textual constraints.

• a language for models (i.e., instances of metamodels). We have adapted km3 for this task.

• a language for model transformations, called kmrl, which draws inspiration from the atl [2] model transformation language as well as from K itself. From atl we retain the combination of declarative (rewrite rules) and imperative constructs (assignments, loops, and conditionals). From K we retain the local rewriting mechanism, which shows exactly where a rewriting takes place, and which leads to simple, clear, and compact rewrite rules.

The approach will be illustrated in a full paper on the xspem language [8]. xspem is a language for activities constrained by time, by resources, and by precedence relations. It is based on the omg standard [9]. We show how users can automatically: check model-to-metamodel conformance (including ocl constraints); execute a dsml's semantics, starting from a given initial model; and perform bounded model checking for safety properties over dsml executions.
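Bounded model checking of a safety property, as mentioned above, amounts to exploring the executions up to a fixed number of steps and looking for a state that violates the property. The sketch below is a generic breadth-first search and does not reproduce K's model checker; the function name, its parameters, and the integer-counter example standing in for a dsml model are assumptions.

```python
from collections import deque


def bounded_check(initial, successors, is_safe, bound):
    """Search executions of at most `bound` steps for an unsafe state.

    Returns the path from `initial` to the first unsafe state found,
    or None if every state reachable within the bound is safe.
    """
    queue = deque([(initial, [initial])])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if not is_safe(state):
            return path
        if len(path) - 1 < bound:
            for nxt in successors(state):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [nxt]))
    return None


def successors(n):
    """Toy transition relation standing in for model transformation steps."""
    return [n + 1, n * 2]


def is_safe(n):
    """Toy safety property: the counter never equals 6."""
    return n != 6


counterexample = bounded_check(1, successors, is_safe, bound=3)
print(counterexample)
```

A returned path plays the role of a counterexample execution; `None` only says that no violation exists within the bound, not that the model is safe in general.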

3 Comparison with Related Work

kermeta [1] is an object-oriented language extended with metamodelling features, which allowsusers to define the syntax of dsmls using metamodels, and their operational semantics by meansof imperative commands of the language. Compared to kermeta, kmrl also has declarativefeatures (rewrite rules), it is formally defined, and allows for formal verification.

The atl language [2] is a mixed declarative/imperative model transformation language. A formal definition of atl in Maude [10] has been given in [11]. We took inspiration from atl in this work. Compared to atl, the declarative features of kmrl are more developed: in atl one can only match over one model element, whereas in kmrl we allow for matching over arbitrary model patterns. On the other hand, atl’s imperative features are more developed than kmrl’s: in atl rules can call each other, and can invoke methods of class diagrams.

Several other approaches [12, 13, 14] use the Maude algebraic and rewriting-based formal specification language [10]. In these approaches, model transformations (in particular, for dsml operational semantics) can only be specified in a declarative manner, by mapping them to Maude equations/rewrite rules. Compared to these approaches, ours also includes imperative features, which are lower-level but allow for better control and efficiency. The same comparison can be drawn with declarative model transformations based on graph rewriting [15, 16].

Finally, the so-called translational approach [17] consists in endowing a source language (in particular, a dsml) with a formal semantics by translating it to a target language that does have a formally defined semantics. For example, xspem has been defined by translation to timed Petri nets [17]. Our approach differs in that we define not individual dsmls, but a dsml definition framework (here, the mde-based one). Our approach is more general than the translational one, and is more likely to be accepted by nonexperts since it does not require from them specialised knowledge of a target language (for writing a translation from the dsml to it). The downside of our approach, which arises from its generality, is that we are likely to be less efficient for execution/verification than specialised, “hard-coded” translational approaches.

References

[1] Pierre-Alain Muller, Franck Fleurey, and Jean-Marc Jezequel. Weaving executability into object-oriented meta-languages. In MoDELS, volume 3713 of Lecture Notes in Computer Science, pages 264–278. Springer, 2005.

[2] Frederic Jouault, Freddy Allilaire, Jean Bezivin, and Ivan Kurtev. ATL: A model transformation tool. Sci. Comput. Program., 72(1-2):31–39, 2008.

[3] The Object Management Group. Meta Object Facility (MOF) 2.0 query/view/transformation specification. Technical report, 2011. http://www.omg.org/spec/QVT/1.1/PDF/.




[5] Grigore Rosu, Chucky Ellison, and Wolfram Schulte. Matching logic: An alternative to Hoare/Floyd logic. In Michael Johnson and Dusko Pavlovic, editors, Proceedings of the 13th International Conference on Algebraic Methodology And Software Technology (AMAST ’10), volume 6486, pages 142–162. LNCS, 2010.

[6] Frederic Jouault and Jean Bezivin. KM3: A DSL for metamodel specification. In Roberto Gorrieri and Heike Wehrheim, editors, FMOODS, volume 4037 of Lecture Notes in Computer Science, pages 171–185. Springer, 2006.

[7] The Object Management Group. The object constraint language, version 2.2. Technical report, 2010. http://www.omg.org/spec/OCL/2.2/.

[8] Reda Bendraou, Benoit Combemale, Xavier Cregut, and Marie-Pierre Gervais. Definition of an executable SPEM 2.0. In APSEC, pages 390–397. IEEE Computer Society, 2007.

[9] Software & systems process engineering metamodel specification (SPEM). http://www.omg.org/spec/SPEM/2.0/.

[10] M. Clavel, F. Duran, S. Eker, P. Lincoln, N. Marti-Oliet, J. Meseguer, and C. L. Talcott. All About Maude, A High-Performance Logical Framework, volume 4350 of Lecture Notes in Computer Science. Springer, 2007.

[11] Javier Troya and Antonio Vallecillo. Towards a rewriting logic semantics for ATL. In Laurence Tratt and Martin Gogolla, editors, ICMT, volume 6142 of Lecture Notes in Computer Science, pages 230–244. Springer, 2010.

[12] Artur Boronat, Reiko Heckel, and Jose Meseguer. Rewriting logic semantics and verification of model transformations. In Marsha Chechik and Martin Wirsing, editors, FASE, volume 5503 of Lecture Notes in Computer Science, pages 18–33. Springer, 2009.

[13] Jose Eduardo Rivera, Francisco Duran, and Antonio Vallecillo. Formal specification and analysis of domain specific languages using Maude. Simulation: Transactions of the Society for Modeling and Simulation International, 85(11-12):778–792, 2009.

[14] Vlad Rusu. Embedding domain-specific modelling languages into Maude specifications. ACM Software Engineering Notes, 2011. To appear. Extended version available at http://researchers.lille.inria.fr/~rusu/SoSym/.

[15] Gabriele Taentzer. AGG: A graph transformation environment for modeling and validation of software. In John L. Pfaltz, Manfred Nagl, and Boris Bohlen, editors, AGTIVE, volume 3062 of Lecture Notes in Computer Science, pages 446–453. Springer, 2003.

[16] Gyorgy Csertan, Gabor Huszerl, Istvan Majzik, Zsigmond Pap, Andras Pataricza, and Daniel Varro. VIATRA - visual automated transformations for formal verification and validation of UML models. In ASE, pages 267–270. IEEE Computer Society, 2002.

[17] B. Combemale, X. Cregut, P.-L. Garoche, and X. Thirioux. Essay on Semantics Definition in MDE. An Instrumented Approach for Model Verification. Journal of Software, 4(9):943–958, November 2009.



K Semantics for OCL - a Proposal for a Formal Definition for OCL

Vlad Rusu and Dorel Lucanu

Inria Lille, France; University of Iasi

1 Introduction and Motivation

Object Constraint Language (OCL) is a formal language used to describe expressions, such as constraints or queries, over objects in a UML model. The constraints are used to give an exact description of the information contained in the models, and the queries are used to analyze these models and to validate them. The evaluation of OCL expressions does not have side effects. Although OCL was defined more than ten years ago, it is not yet widely adopted in industry; one reason is the lack of proper and integrated tool support for OCL. Another reason is that, although OCL was designed to be a formal language, experience has shown that the language definition is not precise enough. Even the latest OMG standard includes underspecified parts and some inconsistencies.

In this paper we present an executable formal semantics for OCL described in K, a semantic framework suitable for defining programming languages, type systems, formal analysis tools and calculi. K has already been used successfully for giving formal definitions to several programming languages and for developing analysis tools for these languages. Having a formal definition for OCL in this framework therefore has several advantages, including executability (the K semantics of OCL can be used to evaluate OCL expressions for different models) and an easy integration with model languages and object-oriented languages defined in K; it could also constitute a first complete formal definition for OCL.

2 Outline of the Approach

The OCL definition we formalize in K is that of the OMG standard [6]. A K definition has three main ingredients: computations, which carry “computational meaning” as special list structures sequentializing computational tasks; a configuration, which organizes the system/program state; and a set of K rules, which give operational semantics to syntactic constructs.

A configuration for OCL consists of a sub-configuration for storing the state of the model(s), a cell $\langle\ \rangle_{k}$ meant to store the computation tasks derived from the evaluation of the OCL expressions, a cell $\langle\ \rangle_{mem}$ storing the partial results, and a cell $\langle\ \rangle_{result}$ for the result value. The computational tasks are those derived from the K syntax of OCL together with some auxiliary tasks needed to compute certain partial results. The semantics is given by a set of K rules, one for each computational task.

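The value $v = [\![e]\!]_M$ obtained by the evaluation of an OCL expression $e$ over a model $M$ is given by

$$\langle e\rangle_{k}\ \langle\cdot\rangle_{mem}\ \langle\cdot\rangle_{result}\ \langle M\rangle_{model} \;\xrightarrow{\ *\ }\; \langle\cdot\rangle_{k}\ \langle\cdot\rangle_{mem}\ \langle v\rangle_{result}\ \langle M\rangle_{model}$$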



The fact that the evaluation of the expression has no side effects is reflected by the fact that the cell $\langle\ \rangle_{model}$ contains the same model instance $M$ at both the beginning and the end of the computation. We write $M \models e \Rightarrow v$. If $e$ is a constraint expression, then we say that $M$ satisfies $e$, written $M \models e$, if and only if $M \models e \Rightarrow \mathit{true}$.
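The shape of such an evaluation can be illustrated with a minimal Python sketch. It is not the K definition itself: the tuple encoding of expressions, the `step` and `evaluate` functions, and the tiny expression language (literals, attribute access, `gt`, `and`) are assumptions made only for illustration. A configuration is a tuple (k, mem, result, model); each step rewrites the first task in k, and the model component is only ever read.

```python
# Illustrative sketch only: a tiny expression language standing in for OCL.
# All names (step, evaluate, the tuple encoding) are hypothetical.

NO_RESULT = object()  # marks an empty <result> cell


def step(cfg):
    """Apply one rewrite step to a configuration (k, mem, result, model)."""
    k, mem, result, model = cfg
    if not k:
        # Computation finished: move the single partial result to <result>.
        return ((), (), mem[-1], model)
    task, rest = k[0], k[1:]
    tag = task[0]
    if tag == "lit":                      # literal value
        return (rest, mem + (task[1],), result, model)
    if tag == "attr":                     # read an attribute from the model
        _, obj, name = task
        return (rest, mem + (model[obj][name],), result, model)
    if tag in ("gt", "and"):              # schedule operands, then the operator
        _, left, right = task
        return ((left, right, ("apply", tag)) + rest, mem, result, model)
    if tag == "apply":                    # auxiliary task: combine partial results
        a, b = mem[-2], mem[-1]
        value = (a > b) if task[1] == "gt" else (a and b)
        return (rest, mem[:-2] + (value,), result, model)
    raise ValueError(f"unknown task: {task!r}")


def evaluate(expr, model):
    """Rewrite <expr>k <.>mem <.>result <M>model until <result> is filled."""
    cfg = ((expr,), (), NO_RESULT, model)
    while cfg[2] is NO_RESULT:
        cfg = step(cfg)
    return cfg


model = {"task1": {"duration": 5, "started": True}}
expr = ("and",
        ("gt", ("attr", "task1", "duration"), ("lit", 0)),
        ("attr", "task1", "started"))
k, mem, result, final_model = evaluate(expr, model)
```

In the final configuration the k and mem components are empty, the result component holds the value, and the model component is the very object that was passed in, which mirrors the side-effect-free evaluation described above.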

3 Related Work

One of the first formalizations of OCL is given in [2], where a conservative shallow embedding of the Object Constraint Language (OCL) in Isabelle/HOL is given. This encoding aimed to accompany the development process of OCL and, at the same time, to be a foundation for tool-supported reasoning over OCL specifications, for example as a basis for test case generation.

An interesting extension of OCL with temporal expressions is given in [5], where a semantics for the integration of OCL with UML Statecharts is given. This semantics is given through a mapping of temporal OCL expressions to temporal logic formulae.

In [7] a tool is presented which analyzes the syntax and semantics of OCL constraints together with a UML model and translates them into the language of the theorem prover PVS.

A semantics aimed to express the desired well-formedness constraints in OCL with respect to the metamodel of the target modeling language is presented in [3]. This semantics maps OCL constraints to propositional formulae, which are then fed into a SAT solver.

A formal semantics for a significant subset of OCL is presented in [4]. This semantics is based on a mapping from UML models with OCL expressions to Church-Rosser and terminating equational theories. A formal tool built in Maude is able to automatically evaluate OCL expressions over selected scenarios.

Within the project MOMENT2, an algebraic framework for MOF metamodeling, the OCL constraints can be used for both static and dynamic analysis [1]. To achieve this goal, the concept of OCL-constrained metamodel conformance is formally defined.

References

[1] Artur Boronat and Jose Meseguer. Algebraic semantics of OCL-constrained metamodel specifications. In TOOLS (47), volume 33 of Lecture Notes in Business Information Processing, pages 96–115. Springer, 2009.

[2] Achim D. Brucker and Burkhart Wolff. A proposal for a formal OCL semantics in Isabelle/HOL. In TPHOLs, volume 2410 of Lecture Notes in Computer Science, pages 99–114. Springer, 2002.

[3] Krzysztof Czarnecki and Krzysztof Pietroszek. Verifying feature-based model templates against well-formedness OCL constraints. In GPCE, pages 211–220. ACM, 2006.

[4] Marina Egea. An executable formal semantics for OCL with applications to model analysis and validation. PhD thesis, Universidad Complutense de Madrid, 2008.

[5] Stephan Flake and Wolfgang Muller. Formal semantics of static and temporal state-oriented OCL constraints. Software and System Modeling, 2(3):164–186, 2003.

[6] The Object Management Group. The object constraint language, version 2.2. Technical report, 2010. http://www.omg.org/spec/OCL/2.2/.



[7] Marcel Kyas, Harald Fecher, Frank S. de Boer, Joost Jacob, Jozef Hooman, Mark van der Zwaag, Tamarah Arons, and Hillel Kugler. Formalizing UML models and OCL constraints in PVS. Electr. Notes Theor. Comput. Sci., 115:39–47, 2005.



MatchC: Matching Logic Verification using the K Framework

Extended Abstract

Andrei Stefanescu

University of Illinois at Urbana-Champaign

Matching logic [2,1] is a new logic designed to state and reason about structural properties over program configurations. Syntactically, it introduces a new first-order formula construct, called a pattern, which is a configuration term, possibly containing variables. Semantically, its models are concrete program configurations, where a configuration satisfies a pattern iff it matches it. MatchC [1] is a matching logic verifier for a deterministic fragment of C implemented in the K framework [3]. In this paper we discuss: (1) the architecture of the current version of the verifier, with emphasis on the components implemented in K; and (2) the design of a new version of the verifier based on the recently proposed deduction system of matching logic, again with emphasis on the K part. The matching logic verifier is the first project using K for symbolic execution and logic reasoning.

1 Current version

Generally, matching logic specifications are rewrite rules between matching logic formulae. The tool accepts specifications in a more restricted format:

$$\langle \mathit{code}\ \cdots\rangle_{k}\ \pi_l \wedge \psi_l \;\Rightarrow\; \langle \cdot\ \cdots\rangle_{k}\ \pi_r \wedge \psi_r$$

where $\pi_l$, $\pi_r$ are patterns (symbolic program configurations), and $\psi_l$, $\psi_r$ are existentially quantified first-order logic formulae. The rule captures partial correctness: if the program fragment code is executed in a configuration that matches $\pi_l$ and satisfies $\psi_l$, and the execution terminates, then the resulting configuration matches $\pi_r$ and satisfies $\psi_r$.
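The notion that a configuration satisfies a formula when it matches the pattern and the resulting bindings satisfy the constraint can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not MatchC's implementation: configurations are encoded as nested dictionaries, `Var`, `match`, and `satisfies` are hypothetical names, and the constraint is an ordinary Python predicate over the substitution rather than a first-order formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A logical variable occurring in a pattern (hypothetical encoding)."""
    name: str


def match(pattern, term, subst=None):
    """Return a substitution making `pattern` equal to `term`, or None."""
    subst = dict(subst) if subst else {}
    if isinstance(pattern, Var):
        if pattern.name in subst:
            # A repeated variable must be bound to the same value everywhere.
            return subst if subst[pattern.name] == term else None
        subst[pattern.name] = term
        return subst
    if isinstance(pattern, dict):         # a cell with named sub-cells
        if not isinstance(term, dict) or pattern.keys() != term.keys():
            return None
        for key in pattern:
            subst = match(pattern[key], term[key], subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def satisfies(config, pattern, constraint):
    """A configuration satisfies (pattern, constraint) iff it matches the
    pattern and the bindings found by matching satisfy the constraint."""
    subst = match(pattern, config)
    return subst is not None and constraint(subst)


# Pattern: computation is empty, x and y hold the same value v, with v > 0.
pattern = {"k": ".", "env": {"x": Var("v"), "y": Var("v")}}
positive = lambda s: s["v"] > 0

config_ok = {"k": ".", "env": {"x": 3, "y": 3}}
config_mismatch = {"k": ".", "env": {"x": 3, "y": 4}}
config_negative = {"k": ".", "env": {"x": -1, "y": -1}}
```

Here `config_ok` satisfies the formula, `config_mismatch` fails structurally because `v` cannot be bound consistently, and `config_negative` matches the structure but violates the constraint.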

Preprint submitted to Electronic Notes in Theoretical Computer Science 15 July 2011



Currently, three components of MatchC are implemented in K: the semantics of the C fragment, the matching logic deduction, and the matching logic formulae implication. The C fragment is defined in a straightforward manner. The matching logic deduction module makes use of K's modularity in extending configurations. A formula $\pi \wedge \psi$ is represented as a task cell containing a config cell and a form cell. The top configuration is a bag of tasks. To prove a specification correct, the prover rewrites the task for the left-hand side of the rule and at the end checks whether the resulting configuration implies the right-hand side of the rule. The (unmodified) original semantics is extended with rules for executing annotated functions and loops, for applying abstraction axioms, and for splitting the state in the case of an if with a symbolic condition. Note that although the original semantics is intended for concrete execution, due to the nature of K rewriting, it works for symbolic execution as well. The matching logic formulae implication consists of two parts: matching the structure and checking the constraints. Structure matching is implemented in K as a set of rules that attempt to match corresponding parts of each cell’s contents and generate the associated constraints. Context transformation is essential in keeping the implementation to a reasonable size. The formulae implication is implemented as a search for a proof in a rule side condition, a case that does not occur in any other K definition.

2 New version

We discuss the design of a new version of MatchC based on a recently proposed matching logic deduction system consisting of eight proof rules. The deduction module attempts to orient the proof rules from left to right as much as possible. Each axiom has a set of triggers that needs to be matched in the configuration in order to apply the axiom. Specifications can be associated with any fragment of code, not just functions and loops. The verifier attempts to use the specifications every time the fragment of code is on top of the computation.

References

[1] Grigore Rosu and Andrei Stefanescu. Matching logic: A new program verification approach (NIER track). In Proceedings of the 33rd International Conference on Software Engineering (ICSE’11), pages 868–871. ACM, 2011.

[2] Grigore Rosu, Chucky Ellison, and Wolfram Schulte. Matching logic: An alternative to Hoare/Floyd logic. In AMAST ’10, volume 6486. LNCS, 2010.

[3] Grigore Rosu and Traian Florin Serbanuta. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010.



A concurrent semantics for the K framework

Traian Florin Șerbănuță

Alexandru Ioan Cuza University of Iași
University of Illinois at Urbana-Champaign

Given the intrinsic potential for concurrency of rewriting, it is natural to attempt (and succeed in) defining concurrent programming languages using rewriting. However, a question arising in this context is whether the framework is generous enough to be able to offer the amount of concurrency desired by the language designer.

One of the most important characteristics of the K framework [7, 8], making it appropriate for defining programming languages, is its natural way of capturing concurrency. Besides being truly concurrent like the chemical abstract machine [1], K rules also allow capturing concurrency with sharing of resources. However, the direct representation of K in rewriting logic prohibits concurrent access to resources, as concurrent applications of rewrite rules are not allowed for overlapping rules.

This paper defines K rewriting, a faithful concurrent semantics for K which captures the intended concurrency specified by the K rules. Noticing that the read-only pattern of a K rule is similar to the interface graph of a graph rewriting rule, we have formalized K rewriting with the help of graph rewriting, adapting existing representations [6, 2] of terms and rules as graphs and graph rewrite rules to maximize their potential for concurrent application.

Classical results in the algebraic theory of graph rewriting [4, 3, 5] ensure that concurrent applications of graph rules are possible and also serializable if they only overlap on the parts matched by their interface graphs. This is precisely the intended semantics for K.

Our main result shows that K rewriting is sound and complete w.r.t. standard term rewriting using the direct representation of K rules as rewrite rules, and that the concurrent application of K rules is serializable. Soundness means that applying one K rule can be simulated by applying its corresponding direct representation as a rewrite rule. Completeness means the converse, i.e., that one application of the direct representation of a K rule can be simulated by applying the K rule directly. Finally, the serialization result ensures that applying multiple K rules in parallel can be simulated by applying them one by one, maybe multiple times; hence, if one does not care about the amount of progress achievable in one step, the direct rewriting logic representation can simulate K rewriting for all practical purposes.
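The serializability condition can be illustrated with a small Python sketch. It is a deliberately coarse approximation and not the graph-rewriting formalization of the paper: configurations are flat dictionaries of cells, a rule instance declares which cells it only reads and which it rewrites, and the names `RuleInstance`, `independent`, `apply_one`, and `apply_concurrently` are hypothetical. Two instances may fire in the same step when they overlap at most on read-only cells, and the combined step then agrees with either sequential order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class RuleInstance:
    reads: FrozenSet[str]             # cells matched but left unchanged
    writes: FrozenSet[str]            # cells actually rewritten
    rewrite: Callable[[Dict], Dict]   # new contents for the written cells


def apply_one(config, rule):
    """Sequential application of a single rule instance."""
    return {**config, **rule.rewrite(config)}


def independent(r1, r2):
    """True if the two instances overlap at most on their read-only parts."""
    return (r1.writes.isdisjoint(r2.reads | r2.writes)
            and r2.writes.isdisjoint(r1.reads | r1.writes))


def apply_concurrently(config, r1, r2):
    """One concurrent step: both instances see the same starting configuration."""
    if not independent(r1, r2):
        raise ValueError("rule instances conflict on a rewritten cell")
    return {**config, **r1.rewrite(config), **r2.rewrite(config)}


# Two threads read the shared store concurrently; a third rule writes to it.
config = {"store": 7, "t1": "pending", "t2": "pending"}
read1 = RuleInstance(frozenset({"store"}), frozenset({"t1"}),
                     lambda c: {"t1": c["store"]})
read2 = RuleInstance(frozenset({"store"}), frozenset({"t2"}),
                     lambda c: {"t2": c["store"] + 1})
write = RuleInstance(frozenset(), frozenset({"store"}),
                     lambda c: {"store": 0})

both = apply_concurrently(config, read1, read2)
seq12 = apply_one(apply_one(config, read1), read2)
seq21 = apply_one(apply_one(config, read2), read1)
```

The two reads share the `store` cell yet remain independent, so `both`, `seq12`, and `seq21` coincide; `read1` and `write` are not independent because one rewrites a cell the other matches.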



References

[1] Gerard Berry and Gerard Boudol. The chemical abstract machine. Theoretical Computer Science, 96(1):217–248, 1992. doi: 10.1145/96709.96717.

[2] Andrea Corradini and Francesca Rossi. Hyperedge replacement jungle rewriting for term-rewriting systems and logic programming. Theoretical Computer Science, 109(1&2):7–48, 1993. doi: 10.1016/0304-3975(93)90063-Y.

[3] Andrea Corradini, Ugo Montanari, Francesca Rossi, Hartmut Ehrig, Reiko Heckel, and Michael Lowe. Algebraic approaches to graph transformation: Basic concepts and double pushout approach. In Handbook of graph grammars and computing by graph transformations, volume 1, pages 163–246. World Scientific, 1997.

[4] Hartmut Ehrig and Hans-Jorg Kreowski. Parallelism of manipulations in multidimensional information structures. In MFCS’76, volume 45 of Lecture Notes in Computer Science, pages 284–293. Springer, 1976. doi: 10.1007/3-540-07854-1_188.

[5] Hartmut Ehrig, Karsten Ehrig, Ulrike Prange, and Gabriele Taentzer. Fundamentals of algebraic graph transformation. Monographs in Theoretical Computer Science. An EATCS Series. Springer, 2006.

[6] Annegret Habel, Hans-Jorg Kreowski, and Detlef Plump. Jungle evaluation. In ADT’87, volume 332 of Lecture Notes in Computer Science, pages 92–112. Springer, 1987. doi: 10.1007/3-540-50325-0_5.

[7] Grigore Roșu and Traian Florin Șerbănuță. An overview of the K semantic framework. Journal of Logic and Algebraic Programming, 79(6):397–434, 2010. doi: 10.1016/j.jlap.2010.03.012.

[8] Traian Florin Șerbănuță and Grigore Roșu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA, pages 104–122, 2010. doi: 10.1007/978-3-642-16310-4_8.



From Language Definitions to (Runtime) Analysis Tools



The K framework [7, 9] is a rewriting-based framework specializing in designing executable definitions for programming languages. The rewriting logic representation of K definitions gives them access to the arsenal of generic tools for rewriting logic available through the Maude rewrite engine [1]: state space exploration, LTL model checking, inductive theorem proving, and so on. This collection of analysis tools is by itself enough to provide more information about the behaviors of a program than one would get by simply testing the program using an interpreter or a compiler for that language. Nevertheless, the effort of defining the semantics pays back in more than just one way: with relatively few alterations to the definition, one can use the same generic tools to obtain type checkers and type inferencers [2], static policy checking tools [4, 3], runtime verification tools [8], and even Hoare-like program verification tools [6].

In this paper we argue that K definitions can be used to test and analyze the executions of programs written in real-life languages either directly or by transforming the definitions into runtime analysis tools. To stress the “real-life” aspect, we choose as our running example a subset of the C programming language, named KernelC. KernelC allows writing C programs with addition and subtraction, increment, assignment, basic comparison operators and logical connectives, ternary if, basic input/output library functions, expression statements, statement composition and blocks, conditional, while loop, function declaration and invocations, variable declarations, memory allocation, freeing, and dereferencing (including array dereferencing). We first show how this definition can be easily turned into a runtime verification tool for strong memory safety [8]. Then we extend KernelC with thread creation and synchronization constructs, and show how this definition can be adjusted for (1) checking whether the executions of a program are datarace free; or (2) instrumenting the execution to obtain traces for applying predictive runtime analysis techniques. Finally, we use the executability of definitions and the ability to explore execution paths to compare, test, and analyze the differences between two definitions of the concurrency features for KernelC: a very simple sequentially consistent one, and one based on a relaxed memory model inspired by the x86-TSO memory model [5].
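To make the comparison between the two memory models concrete, the following Python sketch exhaustively explores the executions of the classic store-buffering litmus test under two simplified semantics. It is an illustration only, not the KernelC definitions: the program encoding, the `successors` and `outcomes` functions, and the state layout are hypothetical, and the relaxed model is a stripped-down store-buffer model in the spirit of x86-TSO (one FIFO buffer per thread, no fences or locked instructions).

```python
# Store-buffering litmus test (hypothetical encoding):
#   thread 0: x = 1; r0 = y        thread 1: y = 1; r1 = x
PROGRAM = (
    (("store", "x", 1), ("load", "y", "r0")),
    (("store", "y", 1), ("load", "x", "r1")),
)


def write(mem, var, val):
    """Return memory (a sorted tuple of pairs) with var updated to val."""
    return tuple(sorted({**dict(mem), var: val}.items()))


def replace(items, index, value):
    return items[:index] + (value,) + items[index + 1:]


def successors(state, buffered):
    """Yield every state reachable in one step (thread step or buffer flush)."""
    pcs, mem, bufs, regs = state
    for t, thread in enumerate(PROGRAM):
        if pcs[t] < len(thread):
            op = thread[pcs[t]]
            new_pcs = replace(pcs, t, pcs[t] + 1)
            if op[0] == "store":
                _, var, val = op
                if buffered:   # relaxed model: the store waits in t's buffer
                    new_bufs = replace(bufs, t, bufs[t] + ((var, val),))
                    yield (new_pcs, mem, new_bufs, regs)
                else:          # sequential consistency: write memory at once
                    yield (new_pcs, write(mem, var, val), bufs, regs)
            else:
                _, var, reg = op
                # A load sees the thread's own newest buffered store, if any.
                pending = [v for (x, v) in bufs[t] if x == var]
                val = pending[-1] if pending else dict(mem)[var]
                yield (new_pcs, mem, bufs, tuple(sorted(regs + ((reg, val),))))
        if bufs[t]:            # flush the oldest buffered store to memory
            (var, val), rest = bufs[t][0], bufs[t][1:]
            yield (pcs, write(mem, var, val), replace(bufs, t, rest), regs)


def outcomes(buffered):
    """Collect the final register values over all execution paths."""
    init = ((0, 0), (("x", 0), ("y", 0)), ((), ()), ())
    seen, todo, final = {init}, [init], set()
    while todo:
        state = todo.pop()
        nexts = list(successors(state, buffered))
        if not nexts:          # both threads finished and all buffers drained
            final.add(state[3])
        for nxt in nexts:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return final


sc = outcomes(buffered=False)
tso = outcomes(buffered=True)
both_zero = (("r0", 0), ("r1", 0))
```

Under the sequentially consistent semantics the outcome in which both registers read 0 is unreachable, whereas the buffered semantics admits it in addition to every sequentially consistent outcome. This is the kind of difference that exploring all execution paths of two definitions can expose.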



References

[1] M. Clavel, F. Duran, S. Eker, J. Meseguer, P. Lincoln, N. Marti-Oliet, and C. Talcott. All About Maude, A High-Performance Logical Framework, volume 4350 of Lecture Notes in Computer Science. Springer, 2007. doi: 10.1007/978-3-540-71999-1.

[2] Chucky Ellison, Traian Florin Șerbănuță, and Grigore Roșu. A rewriting logic approach to type inference. In Recent Trends in Algebraic Development Techniques, 19th International Workshop, WADT 2008, Pisa, Italy, June 13-16, 2008, Revised Selected Papers, volume 5486 of Lecture Notes in Computer Science, pages 135–151. Springer, 2009. doi: 10.1007/978-3-642-03429-9_10.

[3] Mark Hills and Grigore Roșu. A rewriting logic semantics approach to modular program analysis. In Christopher Lynch, editor, Proceedings of the 21st International Conference on Rewriting Techniques and Applications, volume 6 of Leibniz International Proceedings in Informatics (LIPIcs), pages 151–160, Dagstuhl, Germany, 2010. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. ISBN 978-3-939897-18-7. doi: 10.4230/LIPIcs.RTA.2010.151.

[4] Mark Hills, Feng Chen, and Grigore Roșu. A rewriting logic approach to static checking of units of measurement in C. In Proceedings of the 9th International Workshop on Rule-Based Programming (RULE’08), volume To Appear of Electronic Notes in Theoretical Computer Science. Elsevier, 2008.

[5] Scott Owens, Susmit Sarkar, and Peter Sewell. A better x86 memory model: x86-TSO. In Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, TPHOLs, volume 5674 of Lecture Notes in Computer Science, pages 391–407. Springer, 2009. ISBN 978-3-642-03358-2. doi: 10.1007/978-3-642-03359-9_27.

[6] Grigore Roșu, Chucky Ellison, and Wolfram Schulte. Matching logic: An alternative to Hoare/Floyd logic. In Thirteenth International Conference on Algebraic Methodology And Software Technology (AMAST ’10). LNCS, 2010. To appear.


[8] Grigore Roșu, Wolfram Schulte, and Traian Florin Șerbănuță. Runtime verification of C memory safety. In Runtime Verification (RV’09), volume 5779 of Lecture Notes in Computer Science, pages 132–152, 2009. doi: 10.1007/978-3-642-04694-0_10.




The K-framework Tool Chain

Towards version 2.0: lessons learned and new perspectives



The K framework [6, 7] has been continuously evolving [4, 2, 3, 5, 7], both in theory and implementation, ever since its beginnings in 2003 [4], when it was just a simple but effective technique for defining languages and systems using rewriting, based on a first-order representation of computations. This paper is mainly concerned with the implementation of the K framework [1] (here named the K tool), its current evolution, and future prospects.

Among the evolution steps towards what is now regarded as the K tool, one can mention the following (in approximate chronological order):

• configurations as (nested) cells;

• automatic inference of strictness rules;

• localized rewriting;

• cell comprehension and anonymous variables;

• initial configuration, configuration abstraction, and default configurations;

• K as abstract syntax;

• specialized notation for both syntax and semantics;

• definitions across multiple modules;

• non-deterministic strictness.

In addition to the above, there are several features which have been discussed but are not yet fully integrated in the tool:

• code generation for backends other than Maude;

• strictness based on computation type;

• defining results using predicates instead of grammar productions;

• K/programming languages specific module system;



• the K intermediate format and its concrete representation;

• K-specific parsing and parser generation;

• improving the configuration abstraction algorithm;

• integrated development/execution/debugging/analyzing environments.

This paper proposes all of the above, and maybe more, for discussion, with the hope of starting the development of the next version of the K tool with the confidence gained from absorbing the lessons learned from past development and taking into account the new avenues for development.

References


[2] Grigore Roșu. K: a Rewrite-based Framework for Modular Language Design, Semantics, Analysis and Implementation. Technical Report UIUCDCS-R-2005-2672, Department of Computer Science, University of Illinois at Urbana-Champaign, 2005.

[3] Grigore Roșu. K: a Rewrite-based Framework for Modular Language Design, Semantics, Analysis and Implementation. Technical Report UIUCDCS-R-2006-2802, Computer Science Department, University of Illinois at Urbana-Champaign, 2006.

[4] Grigore Roșu. CS322, Fall 2003 - Programming Language Design: Lecture Notes. Technical Report UIUCDCS-R-2003-2897, University of Illinois at Urbana-Champaign, December 2003. Lecture notes of a course taught at UIUC.

[5] Grigore Roșu. K: A rewriting-based framework for computations – preliminary version. Technical Report Department of Computer Science UIUCDCS-R-2007-2926 and College of Engineering UILU-ENG-2007-1827, University of Illinois at Urbana-Champaign, 2007.


[7] Traian Florin Șerbănuță and Grigore Roșu. K-Maude: A rewriting based tool for semantics of programming languages. In WRLA, pages 104–122, 2010. doi: 10.1007/978-3-642-16310-4_8.
