An Embedded Microprocessor for Intelligent Control

Journal of Intelligent and Robotic Systems (2005) 42: 179–211 © Springer 2005


IOANNIS PANAGOPOULOS, CHRISTOS PAVLATOS and GEORGE PAPAKONSTANTINOU
National Technical University of Athens, Zografou Campus, Athens, Greece; e-mail: [email protected]

(Received: 21 January 2004; in final form 6 September 2004)

Abstract. The conventional approach for the implementation of the knowledge base of a planning agent, on an intelligent embedded system, is solely of software nature. It requires the existence of a compiler that transforms the initial declarative logic program, specifying the knowledge base, to its equivalent procedural one, to be programmed to the embedded system’s microprocessor. This practice increases the complexity of the final implementation (the declarative to sequential transformation adds a great amount of software code for simulating the declarative execution) and reduces the overall system’s performance (logic derivations require the use of a stack and a great number of jump instructions for their evaluation). Specialized hardware implementations that are only capable of supporting logic programs have been designed to resolve these problems, but they are of limited use in applications where logic programs need to be intertwined with traditional procedural ones. In this paper, we exploit HW/SW codesign methods to present a microprocessor capable of supporting hybrid applications using both programming approaches. We take advantage of the close relationship between attribute grammar (AG) evaluation and knowledge engineering methods to present a programmable hardware parser that performs logic derivations, and we combine it with an extension of a conventional RISC microprocessor that performs the unification process to report the success or failure of logic derivations. The extended RISC microprocessor is still capable of executing conventional procedural programs, thus hybrid applications can be implemented. The presented implementation increases the performance of logic derivations for the control inference process (experimental analysis yields an approximate 1000% increase, i.e. about 10 times the performance) and reduces the complexity of the final implemented code through the introduction of an extended C language called C-AG that simplifies the programming of hybrid procedural-declarative applications.

Key words: declarative programs, embedded systems, intelligent control, logic programming, microprocessor, RISC

1. Introduction

Knowledge engineering approaches have been used extensively in many application domains such as medicine, scheduling and planning, control, artificial intelligence [16], etc. Especially in the field of intelligent control and robotics [18], knowledge engineering approaches have been successfully exploited for enhancing control applications with built-in intelligence. The low power requirements, small dimensions, and real-time limitations, which are usually specified in such applications, impose the need to design specialized embedded systems for their implementation. Therefore, the possibility of exploiting knowledge engineering approaches in embedded systems is of crucial importance. The two most important factors that influence the efficiency of such designs are programming simplicity and the performance of the final implementation.

Knowledge engineering methods are realized by tools which are based on the declarative programming model. On the other hand, the nature of computation supported today by existing microprocessors is solely procedural. As a consequence, the implementation of a knowledge base with inference rules (logic program) in existing microprocessors is bound to the use of a software compiler performing the declarative to procedural transformation for its execution. This transformation mechanism affects both the complexity and speed of the final implementation, since it usually introduces a great amount of additional code for simulating the declarative execution and imposes the implementation of a software stack for logic derivations, exponentially increasing memory I/O references. As a consequence, software implementations of logic programs, in existing embedded platforms, negatively affect design efficiency in terms of complexity and speed. The existence of processors capable of supporting the declarative programming model would greatly improve execution performance and simplicity of the generated code.

The first machine introduced for the implementation of logic programs (PROLOG) was the Warren Abstract Machine (WAM) [2]. WAM is an abstract microprocessor architecture with a specialized instruction set capable of performing logic derivations. It is, however, targeted exclusively at logic programs. Extensive efforts in the implementation of machines for logic programming, targeting an increase in performance, were also made in the 5th generation computing era, which envisioned a number of interconnected parallel machines for AI applications [5, 8]. Powerful processors working on UMA and NUMA computers [1, 8] have been introduced in an effort to increase the parallelization of declarative programs implemented for PROLOG inference engines. Although the overall speed-up achieved by such approaches has been satisfactory, the cost of implementing such systems, along with their size, has prevented their use in small-scale applications in embedded system environments. In general, up to now, the implemented “logic” machines are solely optimized for the logic programming model, which is not suited to all application domains.

The introduction of embedded systems [19] seems to present new challenges and requirements in the implementation of processors with optimized logic inference capabilities. Embedded systems do not target generality, since they are oriented towards small-scale applications running on dedicated hardware. Additionally, their restricted computational power (required for constraint satisfaction) makes approaches for increasing performance extremely useful for design efficiency. As a result, the effort of designing hardware capable of supporting the declarative programming model for logic derivations can now lead to intelligent embedded designs which are considerably more efficient compared to the traditional procedural ones.

In this paper, we propose an extension of the RISC-architecture microprocessor for knowledge engineering, based on attribute grammar evaluation, in an effort to increase performance and favor programming simplicity of intelligent embedded systems. We have chosen to follow the attribute grammar (AG) approach for the implementation of inference engines, since AGs have been proven to support both the declarative and procedural programming models encountered in existing knowledge engineering systems [6, 12]. Our contribution is summarized in the following:

• We introduce a programmable hardware implementation of an extended parser which is capable of handling all required derivations in logic programming applications, achieving a considerable 1000% speed-up compared to the conventional purely software approach (performance increase, 10 times faster than the conventional approach).

• We propose a modified version of the RISC microprocessor which allows programs following the declarative execution model to be executed. This modification extends rather than substitutes the conventional procedural execution, and thus hybrid procedural-declarative applications may be programmed easily (reduction of programming complexity).

• We allow design flexibility, since both the microprocessor and the extended hardware parser are programmable, allowing the implementation of any desired knowledge engineering application (design flexibility).

The rest of the paper is organized as follows. Section 2 presents related work and exposes the differences in our approach. Section 3 provides the theoretical foundation of the proposed implementation. Section 4 gives an overview of the proposed extended RISC microprocessor. Section 5 presents the software compilation details by providing information for the C-AG language and the encoding and transformations performed. Section 6 presents the hardware implementation issues concerning our approach. Section 7 evaluates the proposed implementation and presents quantitative and qualitative experimental results. Finally, in Section 8 conclusions and future work are presented.

2. Related Work

To the best of the authors’ knowledge, this is the first effort to design a specialized microprocessor for attribute grammar evaluation and to exploit its merits in performance and programming simplicity in knowledge engineering applications. Existing hardware implementations [3, 4] are restricted to the parsing portion of an attribute grammar evaluator, targeting the increase in performance of syntactic analysis without allowing any semantic analysis to take place, and therefore they are not expressive enough to support knowledge engineering applications. In the effort of presenting a complete programmable attribute grammar evaluator, we had to choose one of the several software implementations [12, 14, 17] that already existed and adapt it to the proposed design. We have chosen to use the AG evaluator presented in [12]. Due to its simplicity, this AG evaluator allows a simple and fast hardware implementation and can easily be made reprogrammable. It is also semantically driven (it supports dynamic parsing by exploiting tree-pruning techniques to increase efficiency and prevent the memory explosion problem of storing all possible parse trees) and gives all possible solutions (non-deterministic), i.e. it can provide all possible parse trees for a specific input string. Due to the close relation between attribute grammars and logic programs, it can be easily extended to be used in intelligent embedded systems for constraint logic programming applications [20] and can be extended to support fuzziness and uncertainty [10]. Finally, there are already software implementations of the parser and of its use in various application domains, providing a sufficient development environment for the evaluation of the design [15].

3. Theoretical Definitions

An attribute grammar (AG) is based upon a context-free grammar (CFG). A CFG is a 4-tuple $G = (N, T, P, Z)$, where $N$ is the set of non-terminal symbols, $T$ is the set of terminal symbols, $P$ is the set of grammar rules (a subset of $N \times (N \cup T)^*$ written in the form $A \to \alpha$, where $A \in N$ and $\alpha \in (N \cup T)^*$) and $Z$ ($Z \in N$) is the start symbol (the root of the grammar). An AG is a 4-tuple $AG = \{G, A, SR, d\}$, where $G$ is a context-free grammar and $A = \bigcup_{X \in V} A(X)$, where $A(X)$ is a finite set of attributes associated with each symbol $X \in V$ ($V = N \cup T$). Each attribute represents a specific context-sensitive property of the corresponding symbol. The notation $X.a$ is used to indicate that attribute $a$ is an element of $A(X)$. $A(X)$ is partitioned into two disjoint sets: the set of synthesized attributes $A_S(X)$ and the set of inherited attributes $A_I(X)$. Synthesized attributes $X.s$ are the values defined in terms of attributes at descendant nodes of node $X$ of the corresponding semantic tree (decorated tree). Inherited attributes $X.i$ are values defined in terms of attributes at the parent and (possibly) sibling nodes of node $X$ of the corresponding semantic tree. From the definition, the start symbol does not have inherited attributes while the terminal symbols do not have synthesized attributes. Each of the productions $p \in P$ ($p\colon X_0 \to X_1 \ldots X_n$) of the CFG is augmented by a set of semantic rules $SR(p)$ that define attribute evaluation rules and conditions in terms of other attributes of terminals and non-terminals appearing in the same production. The way attributes are evaluated depends both on their dependencies on other attributes in the tree and on the way the tree is traversed. Finally, each attribute $a$ is associated with a specific domain $d(a)$.

The syntax rules of the AG define all possible derivations from a specific non-terminal symbol. If only terminal symbols are used to determine the success of those derivations (based on comparisons with the tokens of an input string), then parsing is performed. If terminal symbols and attribute instance values determine the success of those derivations, then semantically driven parsing is performed. It is possible to omit all terminal symbols in the AG (replace them with nil tokens) and store their information in attributes at the leaf nodes of the tree (definite clause AG approach [15]), in order to perform semantically driven parsing without the use of terminal symbols. Finally, if no terminal symbols exist and no input string is used, then parsing is degenerate and tree derivations are only controlled by semantic conditions on attribute instance values. In our approach, we do not use any terminal symbols, since the implementation mainly targets logic programming applications. Apart from that, using the proposed implementation, semantically driven parsing can be performed through the definite clause AG approach, while degenerate parsing can be realized through the evaluation and checking of attribute instance values. As a consequence, all the expressive power of AGs is preserved. This, as will be shown in later sections, ensures the maximum possible flexibility in the design.

Attribute grammars have been used extensively for logic programming applications [6, 11]. In [12] an effective method based on an extension of Floyd’s parser [7] is presented that transforms a logic program to its AG equivalent representation. The basic concepts underlying this approach will be exposed through the following example:

Consider the knowledge base (logic program) illustrated in Table I (first column). We want to ask the question “p is successor of whom?”, i.e. Successor (p, ?). In order to answer this question, the inference engine needs to start a logic derivation process through which inference rules are combined to produce a possible logic derivation tree. Logic derivations assign values to the variables of inference rules, according to the question asked and the values from facts, returning True or False according to whether such a rule can be valid. The process of assigning values to the variables of the inference rules and trying to find out whether those values form a true sentence is known as the unification process.
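To make the expected outcome of this query concrete, the following C sketch enumerates the answers to Successor (p, ?) directly from the facts of Table I by plain recursion. It is only an illustration of the logic program's meaning, not the mechanism proposed in this paper: the names ParentFact, successor and query_successor are hypothetical, constants are reduced to single characters, and the recursion is assumed to terminate because the facts contain no cycles.

```c
#include <assert.h>
#include <stddef.h>

/* One fact Parent(parent, child), as listed in Table I. */
typedef struct {
    char parent;
    char child;
} ParentFact;

static const ParentFact FACTS[] = {
    {'j', 'b'}, {'j', 'l'}, {'b', 'a'}, {'b', 'p'}
};
#define NUM_FACTS (sizeof FACTS / sizeof FACTS[0])

/* Collects every Y with Successor(x, Y) into out[n..cap-1];
 * returns the number of answers stored so far. */
size_t successor(char x, char *out, size_t cap, size_t n)
{
    size_t i;

    /* Alternative 1: Successor(X, Y) if Parent(Z, X) and Successor(Z, Y) */
    for (i = 0; i < NUM_FACTS; i++)
        if (FACTS[i].child == x)
            n = successor(FACTS[i].parent, out, cap, n);

    /* Alternative 2: Successor(X, Y) if Parent(Y, X) */
    for (i = 0; i < NUM_FACTS; i++)
        if (FACTS[i].child == x && n < cap)
            out[n++] = FACTS[i].parent;

    return n;
}

/* Goal(X, Y) if Successor(X, Y) */
size_t query_successor(char x, char *out, size_t cap)
{
    assert(out != NULL || cap == 0);
    return successor(x, out, cap, 0);
}
```

Calling query_successor('p', out, cap) with a sufficiently large buffer stores the two constants j and b, in that order, because the recursive alternative is tried before the direct one.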

The AG transformation introduces a number of equivalent syntax rules for the inference rules and a number of attributes and semantic conditions for the unification process in the inference procedure, and can be used as a complete inference engine. Specifically:

Every inference rule in the initial logic program can be transformed to an equivalent syntax rule consisting solely of non-terminal symbols:

$$R_0(t_{01}, t_{02}, \ldots, t_{0k_0}) \leftarrow R_1(t_{11}, t_{12}, \ldots, t_{1k_1})\, R_2(t_{21}, t_{22}, \ldots, t_{2k_2}) \ldots R_m(t_{m1}, t_{m2}, \ldots, t_{mk_m})$$

is transformed to the syntax rule:

$$R_0 = R_1 R_2 \ldots R_m\,|.$$

(“|.” represents the end of the rule). For example, Goal(X, Y) ← Successor(X, Y) is transformed to the syntax rule:

Goal = Successor|.


Table I. (Left column) An informal definition of the logic program/knowledge base of the “Successor” example. Inference rules and facts are provided here. (Right column) Informal representation of the AG equivalent program with its syntax rules (in bold) and its semantic rules

Informal definition of the knowledge      Equivalent AG evaluation
base (Logic program)                      Syntax and Semantic rules
------------------------------------      ------------------------------------------------
Inference rules                           Goal = Successor |.
  Goal (X, Y) if Successor (X, Y)           Successor.ia1 = Goal.ia1;
  Successor (X, Y) if Parent (Z, X)         Goal.sa2 = Successor.sa2;
    and Successor (Z, Y)
  Successor (X, Y) if Parent (Y, X)       Successor = Parent Successor | Parent |.
Facts                                       Parent[1].ia2 = Successor[1].ia1;
  Parent (j, b)                             Successor[2].ia1 = Parent.sa1;
  Parent (j, l)                             Parent[2].ia2 = Successor[1].ia1;
  Parent (b, a)                             Successor[1].sa2 = Parent[2].sa1;
  Parent (b, p)                           Parent = ||||.
                                            nil1: if ((Parent.ia1 != nil) && (Parent.ia1 != “j”))
                                                    flag = 0; else Parent.sa1 = “j”;
                                                  if ((Parent.ia2 != nil) && (Parent.ia2 != “b”))
                                                    flag = 0; else Parent.sa2 = “b”;
                                            nil2: if ((Parent.ia1 != nil) && (Parent.ia1 != “j”))
                                                    flag = 0; else Parent.sa1 = “j”;
                                                  if ((Parent.ia2 != nil) && (Parent.ia2 != “l”))
                                                    flag = 0; else Parent.sa2 = “l”;
                                            . . .

In case there are two or more alternatives for a single inference rule, those are transformed to an equal number of alternatives in the corresponding syntax rule. For example,

$$R_0(t_{01}, t_{02}, \ldots, t_{0k_0}) \leftarrow R_1(t_{11}, t_{12}, \ldots, t_{1k_1})\, R_2(t_{21}, t_{22}, \ldots, t_{2k_2}) \ldots R_p(t_{p1}, t_{p2}, \ldots, t_{pk_p}),$$

$$R_0(t_{01}, t_{02}, \ldots, t_{0k_0}) \leftarrow D_1(t_{11}, t_{12}, \ldots, t_{1l_1})\, D_2(t_{21}, t_{22}, \ldots, t_{2l_2}) \ldots D_q(t_{q1}, t_{q2}, \ldots, t_{ql_q})$$

are transformed to the syntax rule $R_0 = R_1 R_2 \ldots R_p \,|\, D_1 D_2 \ldots D_q\,|.$ (“|” represents the alternative meta-symbol). For example, Successor(X, Y) ← Parent(Z, X), Successor(Z, Y) and Successor(X, Y) ← Parent(Y, X) are transformed to Successor = Parent Successor|Parent|.

Finally, facts of the inference rules are transformed to terminal leaf nodes of the syntax tree referring to the empty string. For example, the facts $R_g(a, b)$, $R_g(c, d)$, $R_g(e, f)$ are transformed to $R_g = |||.$ (which are three “nil” symbols separated by the alternative meta-symbol). Likewise, Parent(j, b), Parent(j, l), Parent(b, a), Parent(b, p) are transformed to Parent = ||||.
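The three transformation cases above (a single rule, several alternatives, and facts) can be summarized in one small C sketch that prints the syntax rule for a predicate from the bodies of its clauses. This is a minimal illustration under our own assumptions, not the preprocessor of [12]: the type Alternative, the limit MAX_BODY and the function format_syntax_rule are hypothetical names, and a fact is modelled simply as an alternative with an empty body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BODY 8 /* hypothetical limit on predicates per clause body */

/* One clause body: predicate names, NULL-terminated when shorter than
 * MAX_BODY. A fact is an alternative whose first entry is NULL. */
typedef struct {
    const char *body[MAX_BODY];
} Alternative;

/* Appends s to buf; returns -1 if the result would not fit in cap bytes. */
static int append(char *buf, size_t cap, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n + 1 > cap)
        return -1;
    memcpy(buf + *len, s, n + 1);
    *len += n;
    return 0;
}

/* Writes "Head = alt1|alt2|...|." into buf. Returns 0 on success, -1 if
 * buf is too small. */
int format_syntax_rule(const char *head, const Alternative *alts,
                       size_t n_alts, char *buf, size_t cap)
{
    size_t len = 0, i, j;

    assert(head != NULL && buf != NULL);
    if (cap == 0)
        return -1;
    buf[0] = '\0';
    if (append(buf, cap, &len, head) || append(buf, cap, &len, " = "))
        return -1;
    for (i = 0; i < n_alts; i++) {
        for (j = 0; j < MAX_BODY && alts[i].body[j] != NULL; j++) {
            if (j > 0 && append(buf, cap, &len, " "))
                return -1;
            if (append(buf, cap, &len, alts[i].body[j]))
                return -1;
        }
        if (append(buf, cap, &len, "|")) /* close this alternative */
            return -1;
    }
    return append(buf, cap, &len, "."); /* end of the rule */
}
```

Passing the two Successor clause bodies yields the string "Successor = Parent Successor|Parent|.", and four empty bodies for Parent yield "Parent = ||||.", matching the examples in the text.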

The syntax rules which form the equivalent AG evaluator are illustrated in Table I (second column) in bold. Obviously, parsing will be degenerate since there are no terminal symbols.

For every variable existing in the predicates of the initial inference rules, two attributes are attached to the corresponding non-terminal symbols of the syntax rules, one synthesized and one inherited. Those attributes assist in the unification process of the inference engine. The attribute evaluation rules are constructed based on the initial logic program. A detailed method for specifying those transformation rules can be found in [12] and can easily be performed automatically by a suitable tool. If desired, additional semantic rules may be added in order to increase the inference power beyond that offered by the PROLOG rules, leading to the implementation of semantically driven parsers, theorem provers and inference engines with fuzziness and uncertainty. Attributes at the leaf nodes of the tree are assigned values from the facts of the logic program. The inference process is carried out during tree derivations, and an EVAL function is called at the insertion/visit of each node that computes the attribute rules performing the unification process. Semantic conditions on the result of the unification process determine the success or failure of those derivations in the inference procedure (a meta-variable flag is used to hold this information). In Figure 1, a snapshot of the aforementioned mechanism is illustrated. The inference process is carried out during attribute evaluation, based on the attribute evaluation rules provided in Table I (second column), in order to give the two answers to the initial problem.

Figure 1. Two snapshots of the constructed parsing trees decorated with attributes for the unification process. Dashed lines represent the flow of constants during the inference process to report the solutions that match the criteria requested.
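The leaf-node rules labelled nil1, nil2, … in Table I all follow one pattern: an inherited attribute that already carries a constant must match the constant of the fact, otherwise the flag is cleared; an unset or matching attribute lets the fact's constant flow upwards as a synthesized attribute. The C sketch below restates that pattern for a Parent fact. It is an illustration under our own assumptions: the struct ParentAttrs, the NIL marker and the function eval_parent_fact are hypothetical names, and constants are reduced to single characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NIL '\0' /* hypothetical marker: attribute carries no value yet */

/* Attribute instances of one Parent node. */
typedef struct {
    char ia1, ia2; /* inherited attributes   */
    char sa1, sa2; /* synthesized attributes */
} ParentAttrs;

/* Evaluates the semantic rules of the fact Parent(c1, c2) on a node.
 * Returns the flag: true if unification succeeded, false otherwise. */
bool eval_parent_fact(ParentAttrs *node, char c1, char c2)
{
    bool flag = true;

    assert(node != NULL);
    if (node->ia1 != NIL && node->ia1 != c1)
        flag = false;
    else
        node->sa1 = c1;

    if (node->ia2 != NIL && node->ia2 != c2)
        flag = false;
    else
        node->sa2 = c2;

    return flag;
}
```

With Parent.ia2 set to “p” and Parent.ia1 unset, evaluating the fact Parent(b, p) succeeds and yields Parent.sa1 = “b”, whereas evaluating Parent(j, b) on the same inherited values clears the flag.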

4. Overview of Our Approach

Conventional approaches for the incorporation of a logic program/inference engine in embedded systems follow the method illustrated in Figure 2(a). A logic program is initially captured using a logic programming language (such as PROLOG). Then, a compiler is used that performs the transition from the declarative algorithm to the behaviorally equivalent procedural one which performs the logic derivations/inference process. The final algorithm is programmed to the microprocessor for execution. Our approach (Figure 2(b)) starts from a conventional PROLOG-like language for the definition of the logic program. The initial specification is transformed, through the use of a preprocessor, to its AG equivalent representation, written in the proposed extended C language, called C-AG. We further introduce an extension to the RISC microprocessor which consists of a programmable hardware parser, for the definition of the syntax rules, that performs the required tree derivations (logic derivations), and a conventional RISC part that handles the attribute evaluation rules (unification process) and possibly executes conventional sequential code that co-exists in the hybrid procedural-declarative application. The C-AG language is finally compiled for the generation of the machine language code that is programmed to the microprocessor. The implementation of a compiler supporting C-AG is straightforward, since the only extension to a conventional C compiler that needs to be taken care of is the incorporation of the advanced instruction set of the hybrid microprocessor (as inline code, see Section 5.4) that handles the declarative and procedural execution and the transition from one to the other.

Figure 2. (a) The logic program is fed to a compiler which transforms the declarative code to its procedural equivalent and generates the final executable that is loaded to the microprocessor. (b) The logic program is transformed through a logic preprocessor to its AG equivalent. Syntax rules are encoded and programmed to the hardware parser, and semantic rules are compiled and programmed to the RISC microprocessor.

The basic innovative idea in the proposed architecture is to equip the microprocessor with two modes of operation: one declarative and one procedural. In the procedural mode, the processor functions in the conventional way (the hardware parser extension is disabled) and program execution is sequential through consecutive increases or jumps of the Program Counter (PC). In the declarative mode, the hardware parser is enabled and performs the required tree derivations (logic derivations) based on its internally stored syntax rules (capturing the knowledge base). Nodes of the constructed tree are stored in a specially designed stack within the hardware parser. There is a one-to-one correspondence between a stack line and the node in the parse tree it represents. Consequently, the position of the node in the stack can be used as the identification number of the specific node (NID, Node IDentification number). A stack line also holds an encoding of the non-terminal symbol (predicate) associated with the node, along with information on the dependency of the node on other nodes (predicates) of the tree (represented as locations in the stack).

Upon each creation/visit to a node, attribute evaluation rules need to be executed in order to perform unification and possibly determine the success or failure of the unification process. Attribute evaluation rules are organized as blocks of code in the RISC microprocessor. Each block is associated with its corresponding non-terminal symbol. The hardware parser uses the associated non-terminal symbol’s encoding (stored in the stack line), for each node, to determine (by controlling the value of the PC) the block of code that needs to be executed (the block of code that performs unification). A space is also reserved in the embedded system’s memory for storing attribute instances. The latter are dynamically organized in blocks (associated with a specific stack line-node) and can be referenced based on the NID of the node.
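A software model may help to picture the roles of the stack line, the NID and the symbol-driven selection of an evaluation block. The C sketch below is only such a model under our own assumptions, not the hardware design: the names StackLine, Parser, parser_push and parser_evaluate are hypothetical, the dependency information is reduced to a single parent location, and a table of function pointers stands in for the hardware parser steering the PC to the block associated with a non-terminal symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_DEPTH 64  /* hypothetical number of stack lines       */
#define NUM_SYMBOLS 16  /* hypothetical number of symbol encodings  */
#define NO_PARENT   (-1)

/* One stack line = one node of the parse tree; its index is the NID. */
typedef struct {
    uint8_t symbol;     /* encoding of the non-terminal (predicate)  */
    int     parent_nid; /* dependency, stored as a stack location    */
} StackLine;

/* Evaluation block of a non-terminal: returns the flag (1 ok, 0 fail). */
typedef int (*EvalBlock)(int nid);

typedef struct {
    StackLine lines[STACK_DEPTH];
    int       top;               /* number of lines in use            */
    EvalBlock eval[NUM_SYMBOLS]; /* one block per symbol encoding     */
} Parser;

void parser_init(Parser *p)
{
    size_t i;
    assert(p != NULL);
    p->top = 0;
    for (i = 0; i < NUM_SYMBOLS; i++)
        p->eval[i] = NULL;
}

/* Creates a node and returns its NID, or -1 if it cannot be created. */
int parser_push(Parser *p, uint8_t symbol, int parent_nid)
{
    assert(p != NULL);
    if (p->top >= STACK_DEPTH || symbol >= NUM_SYMBOLS)
        return -1;
    p->lines[p->top].symbol = symbol;
    p->lines[p->top].parent_nid = parent_nid;
    return p->top++;
}

/* Runs the block selected by the node's symbol encoding. A symbol with
 * no registered block is treated as succeeding. */
int parser_evaluate(const Parser *p, int nid)
{
    EvalBlock block;
    assert(p != NULL);
    if (nid < 0 || nid >= p->top)
        return 0;
    block = p->eval[p->lines[nid].symbol];
    return block != NULL ? block(nid) : 1;
}

/* Example block that always reports a failed unification. */
int reject_node(int nid)
{
    (void)nid;
    return 0;
}
```

In this model the NID returned by parser_push would also serve as the index of the node's attribute block in memory, mirroring the referencing scheme described above.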

The hardware parser acts like a coprocessor to the RISC microprocessor and has its own control unit to handle the tree traversal functionality. It is programmed initially with the possible tree derivations (syntax rules), along with the microprocessor (offline). During execution, the only data exchanged between the two are requests for attribute evaluation and an encoding of the non-terminal symbols of the nodes for which the evaluation needs to take place. I/O data exchange between the two chips does not impose any communication overhead, since it is handled through internal registers (as in any conventional processor–coprocessor interconnection, e.g., processors–mathematical coprocessors, Intel processor–MMX unit, etc.). Consequently, the whole architecture guarantees fast execution without imposing any additional data transfer requirements that may compromise the implementation’s performance.

By using the proposed approach and the microprocessor’s extended features, apart from enabling declarative-procedural coexistence, we greatly reduce the complexity of the implemented program (no translation to procedural code is needed and the C-AG extension allows straightforward programming of such applications). We also achieve a great improvement in the system’s performance (1000% on average). This is mainly due to the fact that the programmable hardware implementation of the parser relieves the microprocessor of the additional computation time that would be required for performing the tree derivations (logic derivations).

Figure 3 shows that the computational time required for performing such tree derivations, both by a RISC microprocessor and by a RISC microprocessor with pipelining capabilities, constitutes approximately 80% of the total time required to perform the inference process (attribute evaluation computations related to the unification process consist of simple assignment operations and therefore are not computationally intensive). Mapping the tree derivation process to hardware would therefore improve the overall embedded system’s performance.

Figure 3. For various numbers of edges in a representative (path finding) application solved using a logic program, we see that tree derivations (in light color) consume the largest portion of the total computation time, compared to the unification process (in dark color), both in a RISC microprocessor (b) and a RISC microprocessor with pipeline (a).

5. The Compilation Process

As mentioned before, the logic preprocessor receives the initial logic program and presents the resulting AG equivalent one written in C-AG. The logic preprocessor automatically performs the transition from the logic program to the C-AG equivalent representation (Figure 4), based on rules for the logic programming to AG transformation. Since those rules have already been well documented in [12], their description and explanation is beyond the scope of this paper.

The designer can additionally include any other desired procedural-declarative code in the resulting program, written in C-AG. The hybrid program is then received by the C-AG compiler, which translates the C-AG program to the low-level code of the extended microprocessor by adding inline assembly, where needed, to take advantage of the extended RISC microprocessor’s additional features. The overview of this translation mechanism is presented in Figure 5.

The resulting C-AG program is compiled by the “C-AG” compiler. The program written in C-AG is separated into the fragments of code (attribute evaluation rules, procedural code) to be mapped to software and the syntax rules that program the hardware parser (HW/SW Codesign). The syntax rules of the AG are extracted from the program, encoded and loaded to the hardware parser. The C-AG computations for the attribute evaluation rules are also transformed and loaded to the microprocessor’s instruction memory at specific addresses associated with each non-terminal symbol. Conventional C code is finally produced, augmented with inline assembly instructions using the extended instruction set of the proposed implementation. Then, a conventional C compiler is used to produce the final binary executable. The schematic diagram of the “C-AG” compiler is illustrated in Figure 6.

Figure 4. Example representation of the “Successor” example in Section 3 using C-AG. The function void AG_Successor (. . . ) holds the syntax rules along with the attribute evaluation rules (semantic rules). The names of the non-terminal symbols of the “Successor” example have been substituted with single letters (for example, the “Goal” non-terminal is now represented as G). Syntactic alternatives are also represented for simplicity as different rules. The “switchdecl” keyword switches the mode of operation of the microprocessor and points to the function with the attribute evaluation rules. (Further details on the C-AG language can be found in Section 5.1.)

Figure 5. The logic program is initially transformed to its C-AG attribute grammar evaluation equivalent through the logic preprocessor. Additional procedural application-specific code is added by the user if needed. The C-AG compiler transforms the final C-AG program to its executable form for execution in the extended RISC microprocessor.

Figure 6. The initial C-AG code is fed to the C-AG compiler, which separates the syntax rules from the semantic rules and the conventional procedural code. Syntax rules program the hardware parser, while semantic rules and conventional procedural code are programmed to the RISC microprocessor using a conventional C compiler.

5.1. THE EXTENDED C LANGUAGE (C-AG)

To support our proposed implementation, specific language constructs were needed to enable hybrid declarative-procedural computations. For that reason, we had to incorporate a specific syntax for the specification of the syntax and attribute grammar evaluation rules, along with a way to enable the control of tree derivations and perform the required switching mechanism from declarative to procedural code and vice versa. Nowadays, the most well-known programming language syntax for the specification of grammars is YACC [21]. Although YACC is widely accepted as a valuable tool for the specification of syntax rules and parsing, it has been shown that its use is restrictive for AGs [21] (e.g., it does not support inherited attributes). Several other attribute grammar evaluators also exist (following the conventional approach) that define an AG-like syntax, such as ELI [21] and FNC2 [9]. Those systems also allow the specification of blocks of sequential code within the attribute evaluation rules. Such a mechanism increases the expressiveness of AGs, as required for real-life applications. We have chosen to follow an ELI-like syntax for the specification of the AG in the introduced extended C language (C-AG) and performed the required modifications (adaptation of the ELI rules to follow the general syntactic patterns of a conventional C language) to be able to integrate it into conventional procedural code. An overview of the introduced programming template is illustrated in Figure 7.

AGs within the conventional C code are defined as functions. Those functions are distinguished from conventional ones by being prefixed with the string “AG_”. Within the function, syntax rules are defined along with their associated attribute evaluation rules. At every syntax rule we allow a procedural block of code that is executed upon the successful creation of the subtree starting from the non-terminal defined at the left-hand side of the specific syntax rule. The reserved keywords “setflag/clrflag” within the attribute evaluation rules are used to determine the success/failure of a derivation. Finally, in main(), the switch to declarative code is performed through the reserved word “switchdecl”, whose parameter specifies the function defining the AG to be executed. It should be noted that the scope of conventional variables within the program remains intact, and attribute evaluation rules are able to access the same variables that could be accessed in a conventional C program. This allows data sharing between procedural and declarative code.

5.2. SYNTAX RULE ENCODINGS

As previously mentioned, during the compilation process the syntax rules are programmed to the hardware parser. For this purpose a specific encoding is needed. The syntax rules extracted from the initial program are first transformed to the following representation: RULE: A ::= B C END and RULE: A ::= D E END are transformed to A = BC|DE|. where the symbol “|” separates alternative productions and the combination “|.” denotes the end of the rule.

The proposed encoding (see example in Figure 8) for the non-terminal symbols of the grammar uses P = ⌈log2(N − 2)⌉ bits, where (N − 2) is the total number of non-terminal symbols. The additional two symbols have the encoding “000...”, which is used to represent the symbol “|”, and the encoding “11111...”, which is used


. . . Conventional C includes, variable, function declarations

〈return type〉 AG_〈grammar_name〉(. . .)
{
    RULE: 〈NT〉 ::= 〈NT〉 〈NT〉 . . . 〈NT〉 END
    {
        attribute evaluation rules;
        {. . .} procedural code
    }

    RULE: 〈NT〉 ::= 〈NT〉 〈NT〉 . . . 〈NT〉 END
    {
        attribute evaluation rules;
        if (condition(〈NT〉.〈attribute〉)) clrflag/setflag;
    }

    . . . return . . . ;
}

int main(int argc, char *argv[])
{
    〈return type〉 variable;
    . . .
    variable = switchdecl(AG_〈grammar_name〉(. . .));
    exit(0);
}

Figure 7. Abstract syntax of a typical C-AG program. The function prefixed with AG_ holds the syntax and semantic rules of the AG program. Each syntax rule is defined within the RULE: .. END keywords following the ELI AG evaluation syntax. In the main function the switchdecl function performs the transition to declarative execution.

Figure 8. (a) The syntax rules of an example grammar. (b) The encoding of the non-terminal symbols of the grammar in binary and hexadecimal. (c) The memory location of each rule and the representation of the rule within this location (for example, G = S|. is stored in memory address 1, which is the encoding of goal “G”, and within this location 20F are the encodings of symbols S|., respectively).


to represent the symbol “.”. On one hand, such an encoding is the smallest possible, minimizing the number of bits needed to represent the grammar; on the other hand, as will be shown, it provides an easy way of storing the grammar rules in the “Rules Memory” of the hardware parser. Every subgoal (the left-hand side non-terminal symbol in each syntax rule) is assigned a value according to the previously mentioned encoding. We use the encoding of each subgoal as the reference address of the “Rules Memory” location where the right-hand side of the grammatical rule is stored.
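The packing of one rule can be sketched as follows. This is only an illustration under stated assumptions: 4-bit symbol codes (P = 4), symbols packed in reading order into the low bits of a single 32-bit word (which reproduces the “20F” value quoted for G = S|. in Figure 8), and the hypothetical names pack_rule, SYM_ALT and SYM_END. The actual bit placement used by the hardware parser is not specified in the text.

```c
#include <stddef.h>
#include <stdint.h>

#define SYM_BITS 4u    /* assumed P = 4, matching the hexadecimal example */
#define SYM_ALT  0x0u  /* "|" : all zeros */
#define SYM_END  0xFu  /* "." : all ones  */

/* Pack the right-hand side of one rule into a 32-bit word,
 * first symbol in the most significant occupied position. */
uint32_t pack_rule(const uint8_t *symbols, size_t count)
{
    uint32_t word = 0;
    for (size_t i = 0; i < count && i < 32u / SYM_BITS; i++) {
        word = (word << SYM_BITS) | (symbols[i] & SYM_END);
    }
    return word;
}
```

With enc(S) = 2, the symbol sequence S, “|”, “.” packs to 0x20F, the value that would be stored at the address given by enc(G) = 1.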

5.3. AG EVALUATION RULES TRANSFORMATION

As previously stated, the “C-AG” compiler automatically determines when attribute evaluation computations need to be performed, depending on the position (non-terminal symbol) of the computation in a syntax rule, i.e. the position tree derivations have reached so far. For example, an initial attribute evaluation rule of the form B.inh = f(A.inh), associated with the syntax rule A = B|., is automatically transformed to: when a node associated with B is visited, inh[current] = f(inh[father]). Note that the notation 〈attribute name〉[〈related_node〉] expresses the dependency of the attribute 〈attribute name〉 in the currently visited node on attributes in its neighboring nodes in the derived tree. This implies the transformation of the initial attribute evaluation rules, associated with non-terminal symbols in the syntax rules, to semantically equivalent attribute evaluation rules related to node dependencies in the derived tree. In our previous example such a transformation can be expressed as: “when a node related to the non-terminal symbol B in the rule A = B|. is visited, the attribute inh of the node can be calculated from the attribute inh of its father (which is a node associated with the non-terminal A)”.

Those transformations are automatically performed by the “C-AG” compiler, which determines the attribute dependencies expressed in the syntax rules and transforms them to dependencies on the nodes of the tree. Table II demonstrates such transformations for various types of dependencies among non-terminal symbols in a syntax rule. The notation (enc(NT), i) is used to distinguish non-terminal symbols at the right-hand side (RHS) of the syntax rule. For that reason, non-terminal symbols at the RHS are defined by a tuple (x, i), where x is the subgoal’s encoding for the specific syntax rule (LHS non-terminal symbol) and i is the position of the needed non-terminal symbol at the RHS of the syntax rule, starting from 0. For example, in the rule A = BC|., (enc(A), 0) defines B, (enc(A), 1) defines C, (enc(A), 2) defines the “|” and so on (enc(NT) gives the encoding of the non-terminal symbol NT).

The instruction memory is divided into blocks, where each block holds the attribute evaluation rules of a specific syntax rule (referenced by the value of x in the tuple (x, i)). This block is further subdivided into sub-blocks holding the attribute evaluation rules for every position within the rule (indicated by the index i in the tuple (x, i)). Those blocks and sub-blocks can either be organized with fixed sizes


Table II. Possible attribute relations between non-terminal symbols in a syntax rule and a representation of the computation that needs to be performed in order to preserve such relation in the final program

Attribute evaluation rules              Attribute evaluation equivalent (traversal dependent)

A → B C|.                               (enc(A), 2): syn_i[current] = f(syn_j[predsub], syn_k[sub])
A.syn_i = f(B.syn_j, C.syn_k)

A → B C|.                               (enc(A), 0): inh_i[sub] = f(inh_j[current]);
B.inh_i = f(A.inh_j);                   (enc(A), 1): inh_i[sub] = h(inh_k[current]);
C.inh = h(A.inh_k);

A → B_1 B_2 B_3 . . . B_n|.             (enc(A), 1): temp_1[current] = a_1[sub];
A.syn_i =                               (enc(A), 2): temp_2[current] = a_2[sub];
  f(B_1.a_1, B_2.a_2, . . . , B_n.a_n); . . .
                                        (enc(A), n − 1): temp_n[current] = a_(n−1)[sub];
                                        (enc(A), n):
                                          syn_i[current] = f(temp_1[current], temp_2[current], . . . ,
                                                             temp_n[current], a_n[sub])

A → B C|.                               (enc(A), 1): inh_i[sub] = f(inh_j[sub]);
C.inh_i = f(B.inh_j);

A → B C D E|.                           (enc(A), 1): temp[l] = inh_j[sub];
D.inh_i = f(B.inh_j);                   (enc(A), 2): inh_i[sub] = f(temp[current]);

or a mapping table can be used that points to the memory location for every tuple of the form (x, i). The resulting memory organization for the AG evaluation rules for the “successor” example is illustrated in Table III. (Fixed block sizes are used in this example, reserving 256 bytes per block (for every syntax rule) and 32 bytes per sub-block (for every position within the syntax rule).)
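For the fixed-size organization, the address of the sub-block belonging to a tuple (x, i) reduces to simple arithmetic. The sketch below is an illustration, not the authors' implementation: the function name eval_block_address is hypothetical, and it assumes that subgoal encodings start at 1 (enc(G) = 1 in Figure 8) and that enc(S) = 2 and enc(P) = 3, which is consistent with the addresses listed in Table III.

```c
#include <stdint.h>

#define BLOCK_BYTES    256u  /* reserved per syntax rule in the example     */
#define SUBBLOCK_BYTES  32u  /* reserved per position within a syntax rule  */

/* rules_base: address of the first block (Base + 1000 in Table III).
 * x: subgoal encoding, assumed to start at 1.
 * i: position within the right-hand side, starting from 0. */
uint32_t eval_block_address(uint32_t rules_base, uint32_t x, uint32_t i)
{
    return rules_base + (x - 1u) * BLOCK_BYTES + i * SUBBLOCK_BYTES;
}
```

Taking Base = 0, the tuples (1, 0), (2, 0) and (3, 0) map to 1000, 1256 and 1512, the three block starts shown in Table III.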

5.4. COMPILATION DETAILS

In order for the RISC to be able to access attribute instance values of current and neighboring nodes, whenever a tuple (x, i) is dispatched from the parser for attribute evaluation, additional information is needed from the hardware parser indicating the position of the parent, son, sibling and son’s sibling node of the currently visited node. Those can be used in the microprocessor’s attribute evaluation rules for accessing attributes of neighboring nodes. For that reason we have extended the RISC microprocessor’s instruction set to incorporate additional instructions related to attribute referencing. Moreover, additional instructions are needed for implementing the required switching mechanism between procedural and declarative code and for uploading the rules to the parser’s “Rules Memory”. Finally, two additional instructions are used for controlling the meta-variable flag


Table III. Snapshot of an example memory organization of the extended RISC microprocessor. To the left memory addresses of blocks are illustrated and to the right the contents of those blocks are demonstrated

Address                        Content

Base to Base + 999             Sequential code
Base + 1000 to Base + 1031     Attribute evaluation rules (enc(G), 0)

Base + 1096 to Base + 1255     unused
Base + 1256 to Base + 1287     Attribute evaluation rules (enc(S), 0)

Base + 1448 to Base + 1511     unused
Base + 1512 to Base + 1543     Attribute evaluation rules (enc(P), 0)
. . .                          . . .
Base + 1768 to Base + xxxx     Sequential code

Figure 9. The extracted syntax rules of the C-AG program are transformed to an encoding of non-terminal symbols and stored in a 32 bit array. This array is downloaded to the hardware parser.

indicating the success/failure of semantic conditions on attribute instance values. The complete table of the added instructions is illustrated in Table IV. (Implementation issues for the incorporation of this extended instruction set to the RISC microprocessor will be given in Section 6, where hardware implementation issues are explained in detail.)

Initially, the grammar is extracted from the C-AG code and a global array of 32 bit encodings of the syntax rules of the grammar (logic formulas) is inserted, following the encoding presented in Section 5.2 (Figure 9).

The call to the attribute evaluation function through the use of “switchdecl” is transformed to a conventional call to a function (Figure 10(a)) and for every


Table IV. Additional instructions of the extended RISC microprocessor that allow attribute accesses, initialization of the hardware parser’s components and the switch between declarative-procedural code

Instruction                   Description

ldx addr, dest                Load the value at memory location addr + (index depending on
(x = sub, sup, sib,           position in stack) and store it to register dest. (Index positions
ssib, cur)                    refer to current (cur), son (sub), sibling (sib), parent (sup) and
                              son’s sibling node (ssib))

stx addr, value               Store to memory location addr + (index depending on position in
(x = sub, sup, sib,           stack) the value “value”. (Index positions refer to current (cur),
ssib, cur)                    son (sub), sibling (sib), parent (sup) and son’s sibling node (ssib))

prgdecl addr                  Performs the switch between declarative and procedural code.
                              Specifically, the assembly instruction loads to the PC the base
                              memory address where AG evaluation rules are stored and pushes
                              into the stack the return value of the PC (after all logic derivations
                              have taken place and all solutions have been found)

setflag/clrflag               Allows the manipulation of the previously described meta-variable
                              flag, which is dispatched to the hardware parser in order to indicate
                              a successful/unsuccessful derivation

ldrules index, addr           Load the 32 bit value located at memory address “addr” to the
                              “Rules Memory” of the hardware parser at position “index”

initstk                       Assembly instruction used to initialize the parser’s stack contents
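The behaviour of the ldx/stx family can be modelled in software as indexed accesses whose offset is the stack position of the selected neighbouring node. The following is a minimal sketch under assumptions that are not in the source: one memory word per node and attribute, a small fixed-size attribute area, and the hypothetical names attr_machine, node_ref and copy_inherited. It also shows the rule B.inh = f(A.inh) for the special case where f is the identity.

```c
#include <stdint.h>

/* Which neighbouring node an access refers to (cur, sub, sup, sib, ssib). */
enum node_ref { REF_CUR, REF_SUB, REF_SUP, REF_SIB, REF_SSIB, REF_COUNT };

struct attr_machine {
    int32_t  data[64];             /* attribute area of data memory (size assumed)  */
    uint32_t node_pos[REF_COUNT];  /* stack positions dispatched by the parser      */
};

/* ldx: read the attribute stored at addr for the selected node. */
int32_t ldx(const struct attr_machine *m, enum node_ref x, uint32_t addr)
{
    return m->data[addr + m->node_pos[x]];
}

/* stx: write the attribute stored at addr for the selected node. */
void stx(struct attr_machine *m, enum node_ref x, uint32_t addr, int32_t value)
{
    m->data[addr + m->node_pos[x]] = value;
}

/* inh[current] = inh[father], evaluated when the node for B is visited. */
void copy_inherited(struct attr_machine *m, uint32_t inh_base)
{
    stx(m, REF_CUR, inh_base, ldx(m, REF_SUP, inh_base));
}
```

If the current node sits at stack position 3 and its parent at position 1, copy_inherited(&m, 8) copies the word at address 9 to address 11.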

Figure 10. (a) Transformation of the switchdecl function to a conventional call to the AG evaluation function. (b) Attributes in the initial C-AG program are transformed to global variables in the program and space is reserved for them.

attribute used within the attribute evaluation rules, a space is reserved in the global section of the final C program (Figure 10(b)).

The initial attribute evaluation rules, based on syntax rules, are transformed to attribute evaluation rules based on nodes of the constructed tree (Figure 11), following the rules specified in Table II (Section 5.3). (The notation [i] denotes register number i in the register file.)


Figure 11. Assembly language instructions of the attribute evaluation rules presented in the left-hand side block. Those instructions are part of the extended RISC instruction set and realize the transformations presented in Section 5.3 according to the attribute dependencies on the left.

void AG_Successor (char successors[10], char p)
{
    int i;
    i = -1;
    _asm {
        ld [0],0
        ldrules [0],AG_successor_rules + [0];
        . . .
        initstk;
        prgdecl AG_succesor_asm_rules;
    }

Figure 12. Inline assembly code within the attribute evaluation function loads the syntax rule encodings to the hardware parser, initializes the stack of the hardware parser and switches to declarative mode (prgdecl).

Inline assembly is added, at the location of the attribute evaluation rules, which loads the syntax rules, initializes the parser’s stack and initiates the declarative mode. The assembly instruction “prgdecl” pushes into the stack the location of the PC and then jumps to the attribute evaluation function (Figure 12).


Figure 13. Flowchart of the execution of the extended microprocessor and the switching mechanism between declarative and procedural code.

As previously mentioned, the whole purpose of the preliminary compilation stage is to provide a final C program which, when compiled and executed by the microprocessor, handles both the declarative and the procedural code. Specifically, the microprocessor can function in two modes of operation. The procedural mode follows the conventional steps of program execution. When a call to an AG evaluation function is encountered in the program, the processor switches to the declarative mode. The flowchart of the two-mode switching mechanism is illustrated in Figure 13.

During the declarative mode, the processor initially loads the grammar rules into the hardware parser, specifies the base address of the AG evaluation rules, pushes into the stack the return address of the PC (to be used when the declarative mode completes) and waits for a signal from the hardware parser indicating that AG evaluation is required. At each such call, the processor jumps to the AG evaluation block related to the non-terminal associated with the currently parsed node, performs the AG computation and waits for further requests. At the end of the derivation process, the processor pops the return address from the stack and resumes normal, procedural execution. In the next section all hardware implementation issues required for the hardware parser and the extended RISC microprocessor, in order to support such operation, are clearly defined.
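The control flow of the declarative mode described above can be mimicked in software as a simple serve loop. The sketch below is only a behavioural illustration under assumptions: the parser's requests are modelled as an array of (x, i) tuples, the end of the array stands for the end of the derivation process, and the names request, trace, evaluate and run_declarative are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct request { uint32_t x, i; };  /* tuple (x, i) sent by the parser */

struct trace { uint32_t last_x, last_i; size_t served; };

/* Stand-in for jumping to the evaluation block of (x, i). */
static void evaluate(const struct request *r, struct trace *t)
{
    t->last_x = r->x;
    t->last_i = r->i;
    t->served++;
}

/* Serve every evaluation request, then hand back the saved return address. */
uint32_t run_declarative(uint32_t return_pc, const struct request *reqs,
                         size_t count, struct trace *t)
{
    uint32_t stack[1];
    stack[0] = return_pc;                 /* push the return address        */
    for (size_t n = 0; n < count; n++) {  /* one iteration per parser call  */
        evaluate(&reqs[n], t);
    }
    return stack[0];                      /* pop it; procedural code resumes */
}
```

In the real design the loop body would be the jump into the instruction-memory block selected by the tuple, and the wait would be driven by a hardware signal rather than an array index.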

6. Hardware Implementation Issues

6.1. THE HARDWARE PARSER

An overview of the parser’s flowchart is illustrated in Figure 14. The parser uses a stack to hold the nodes of the constructed parse tree. Those nodes are defined as five-tuples of the form (goal_λ, i_λ, sup_λ, sub_λ, pred_λ) in a stack S. Here goal_λ is the non-terminal which is currently tried, λ is the element of the stack (current node) which is currently active, ν is the new empty position in the stack, i_λ is the place in the definition of the non-terminal goal which tree derivations have reached so far, sup_λ is the location in the stack S of the goal’s superior, and sub_λ, pred_λ are the locations in the stack S of the goal’s most recent subordinate and sibling, respectively. The parsing that takes place is degenerate. The parser has been extended to incorporate calls to an attribute evaluator procedure (EVAL) for attribute evaluation, whose results control tree derivations through the use of the meta-variable flag. A more detailed description of a similar software implementation of the extended parser can be found in [12].
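The five-tuple stack lends itself to a compact C representation. The following sketch is one possible reading of the description above, not the algorithm of [12]: the field widths, the capacity STACK_NODES, the use of 0 as a "none" marker, the choice to make the new node the active one, and the names parse_node, parser_stack and push_subgoal are all assumptions made for illustration.

```c
#include <stdint.h>

#define STACK_NODES 300  /* assumed capacity of the stack S */

struct parse_node {
    uint8_t  goal;  /* encoding of the non-terminal being tried        */
    uint8_t  i;     /* position reached in the goal's definition       */
    uint16_t sup;   /* stack position of the superior (parent) node    */
    uint16_t sub;   /* stack position of the most recent subordinate   */
    uint16_t pred;  /* stack position of the most recent sibling       */
};

struct parser_stack {
    struct parse_node s[STACK_NODES];
    uint16_t lambda;  /* currently active node */
    uint16_t nu;      /* next empty position   */
};

/* Open a new subgoal under the active node and make it the active node.
 * Returns 0 on success, -1 if the stack is full. */
int push_subgoal(struct parser_stack *p, uint8_t goal)
{
    if (p->nu >= STACK_NODES) {
        return -1;
    }
    struct parse_node *parent = &p->s[p->lambda];
    struct parse_node *child  = &p->s[p->nu];
    child->goal = goal;
    child->i    = 0;
    child->sup  = p->lambda;
    child->sub  = 0;            /* 0 doubles as "none"; position 0 is the root */
    child->pred = parent->sub;  /* the parent's previous subordinate           */
    parent->sub = p->nu;
    p->lambda   = p->nu++;
    return 0;
}
```

Keeping sup, sub and pred as stack positions rather than pointers mirrors the hardware, where each field is simply an index into the stack memory.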

An overview of the proposed programmable hardware implementation of the parser is illustrated in Figure 15.

The grammatical rules are encoded and stored within the parser in the “Rules Memory”. Another memory element within the parser is used as the stack S that stores the visited nodes of the constructed tree. The stack is a two-input, two-output memory allowing two memory read/write operations to be executed simultaneously. We have chosen to integrate this stack within the hardware parser in an effort to increase performance in the construction of the tree. Two additional registers are used to store the current and new position in the stack: the λ register and the ν register, respectively. Those reside within the hardware parser’s datapath. The hardware implementation of the parser has been carried out in the XILINX ISE 6.0 environment. A finite state machine controller drives the control signals and implements the parsing algorithm. The design is driven by the microprocessor’s clock. Since the operations that occur in a single cycle in the parser are primitive RTL operations, the hardware parser’s computations within a single cycle do not impose any upper limit on the microprocessor’s clock, provided that they are implemented using the same technology used for the implementation of the microprocessor. At any moment, the subgoal’s encoding and the index within the current syntax rule, both held in the current stack line, are used to determine the next symbol to be evaluated (the value at the output port of the “Rules Memory”). The new symbol either is a new non-terminal, causing a new node to be added to the stack, or defines the end of the syntax rule. After the addition of a new node, the hardware parser triggers the microprocessor for attribute evaluation and monitors the value of


Figure 14. The parser’s flowchart that is implemented in hardware. Numbers in circles represent connection points. The rounded rectangle is a special connection point at the beginning of a call for attribute evaluation.


Figure 15. Abstract representation of the hardware parser. Incoming/outgoing lines are connected to the RISC microprocessor. The “Stack” holds the tree derivations while the “Rules Memory” stores the syntax rules of the application.

the flag. In order for the microprocessor to support attribute instance referencing, the parser has been extended to dispatch attribute referencing base addresses of neighboring nodes (their positions in the stack). If the derivation is unsuccessful or a new symbol represents the end of the rule, new alternatives are evaluated. The “Rules Memory” is of k · 32 bit size, where k is its maximum size. The non-terminal symbols of each syntax rule are stored in consecutive memory positions at fixed intervals. Each stack line holds information on a specific node of the constructed parse tree, as described for the software parser. This means that with a P bit encoding of non-terminal symbols and at most k symbols allowed at the RHS of a production, each stack line is of 4 log2 N + log2 k bits. For an efficient and realistic implementation, a typical size for the stack is 1.5 KB and for the “Rules Memory” 2 KB. This means that for an 8 bit encoding of non-terminal symbols, the hardware can support up to 300 nodes in each tree, 254 non-terminals and up to 254 syntax rules with a maximum of 8 non-terminals at the RHS of each rule.
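The sizing arithmetic above can be checked with a few lines of C. This is a sketch under the assumption that log2 N in the stack-line formula equals the encoding width P; the helper names ceil_log2, stack_line_bits and max_nonterminals are hypothetical. It does not attempt to reproduce the quoted figure of 300 nodes, which presumably also depends on how stack lines are aligned in memory.

```c
/* Smallest b such that 2^b >= n, for n >= 1. */
unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;
    while ((1u << bits) < n) {
        bits++;
    }
    return bits;
}

/* Bits per stack line: four P-bit fields plus an index into a
 * right-hand side of at most k symbols. */
unsigned stack_line_bits(unsigned p_bits, unsigned k)
{
    return 4u * p_bits + ceil_log2(k);
}

/* Non-terminals available with P bits once the two codes for
 * "|" and "." are reserved. */
unsigned max_nonterminals(unsigned p_bits)
{
    return (1u << p_bits) - 2u;
}
```

For P = 8 and k = 8 this gives 35 bits per stack line and 254 usable non-terminal codes, the latter matching the number stated in the text.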

6.2. RISC MODIFICATIONS

For our analysis and experimental evaluation we have used the description of the RISC microprocessor presented in [13]. Based on this description we have implemented a soft-core in VHDL of a traditional microprocessor, which we have extended for AG evaluations. The hardware parser takes control of the PC and dispatches the tuple (x, i) to define the attribute evaluation computations to be executed for each node. The modification in the PC’s datapath for such action is illustrated in Figure 16.

If more than one computation instruction is defined for a visited node, those are executed sequentially. Attributes associated with every node in the parse tree are stored in a designated area in the microprocessor’s data memory. We have chosen to use this memory for attribute storage in order to enable other parts of the program’s code (sequential or declarative) to access them. Attributes in this memory area are also organized into blocks. The RISC microprocessor’s instruction set is extended to support such attribute referencing instructions. Those are assigned their own encodings and introduce additional hardware to the microprocessor’s soft core for their implementation (since those instructions are similar to any other indexed memory access instructions, they do not impose any additional delay on the designated microprocessor’s clock cycle time). In Figure 17 additional registers are introduced, storing the NIDs of neighboring nodes, which are accessed upon execution of the additional attribute referencing instructions.

Figure 16. Modified program counter of the RISC microprocessor. The encoded goal along with the position of the non-terminal within the currently evaluated syntax rule are used to compute the new instruction address for the attribute evaluation rules that need to be executed. A multiplexer switches between this mode of operation and the conventional ones (conventional PC increment, jump instructions, etc.).

Figure 17. Additional registers holding the node addresses of the father, son, etc. of the currently processed goal. Those are fed to the ALU and dynamically selected through the extended instructions in order to allow attribute referencing of those nodes. Incoming lines are connected to the hardware parser.


7. Case Study and Benchmarking

The proposed approach was initially verified for its correctness in a testing and evaluation environment that we have implemented in software, using the VISUAL C++ 6.0 environment. The proposed design has been implemented in synthesizable Verilog in the XILINX ISE 6.0 environment and has been simulated for various FPGA technologies at the net-list level. RISC times have been accurately simulated by a similar implementation of a RISC architecture in synthesizable Verilog in the same environment. The actual optimization results have completely verified our initial expectations.

7.1. QUANTITATIVE ANALYSIS

In order to evaluate the general performance of the proposed implementation, we have used a simple path finding logic program and simulated it for various sizes of graphs. The time required for the evaluation of attributes has not been considered. This does not affect the correctness of the results, since in both approaches (proposed and conventional) attribute evaluation occurs in the microprocessor. Therefore, attribute evaluation time is not affected by our proposed implementation (and no improvement in this time is achieved). For various numbers of tree derivations (size of the parse tree constructed), instruction-level simulation with exact execution times has given the results illustrated in Figures 18 and 19, compared to both a conventional RISC implementation and a RISC with a 5-stage pipeline. We have not considered any additional delay for memory I/O references.

Performance using the proposed programmable attribute grammar evaluator has increased by an average of 1000% compared to the purely software approach in a

Figure 18. Comparing the performance of the “Path finding application” when executed in a RISC (solid line), RISC with a pipeline (dashed line) and the proposed extended RISC microprocessor (dotted line).


Figure 19. The increase in performance achieved by using the proposed implementation compared to the RISC (dotted line) and RISC with a pipeline (dashed line) approach for various numbers of edges in the “Path finding application”.

conventional microprocessor. The increase in performance would apparently be even larger if memory I/O delays were taken into consideration.

7.2. QUALITATIVE ANALYSIS

The speed-up and programming simplicity merits of the proposed implementation can be achieved in any AI application that needs to be implemented in an embedded system. In this section, though, our main focus is to demonstrate the use of our proposed implementation in an intelligent control application. Consider the case where an embedded system, equipped with the proposed extended RISC microprocessor, is controlling the movement of a wheel-chair within a room. The stepper-motors on the wheel-chair only support vertical and horizontal movement (no diagonal movements are allowed). This restriction is imposed in order to allow a clearer demonstration of the implementation’s intelligent features. In real life, diagonal movements can also be allowed with appropriate changes in the system’s knowledge base. In that case, the concluding remarks will be analogous to the ones presented here. We also consider the vertical and horizontal movement to be discrete, i.e. every signal sent to one of the stepper motors results in an actual movement of the wheel-chair in the room by a specific distance (d) in the corresponding direction (North–East–South–West). This property allows us to consider the room as being covered by a grid, where each tile is of size d × d. Every tile can be occupied by a stable obstacle (furniture, appliances), by a moving obstacle (humans, animals) or by the wheel-chair itself. A camera located at the ceiling of the room provides such information for every tile.

In order to experiment with such an application, we have formed an analogous example from the well-documented “Wumpus World Game”. Specifically, we have applied some modifications to the Wumpus World’s properties to make it correspond to the required application. In this modification of the game, the Hero is


Table V. Facts of the “Wumpus World” and their interpretation

Fact               Meaning

Pit(x, y)          True if there is a Pit at x, y
Gold(x, y)         True if there is Gold at x, y
Wumpus(x, y)       True if there is the Wumpus at x, y
Empty(x, y)        True if tile x, y is Empty
DontKnow(x, y)     True if the hero has no knowledge for x, y
BeenAt(x, y, l)    Hero has visited x, y, l times
At(x, y)           Hero is at x, y

Table VI. Actions available to the hero at the “Wumpus World” application and the effects they have

Action       Result

Go North     If at x, y go to x, y − 1
Go South     If at x, y go to x, y + 1
Go East      If at x, y go to x + 1, y
Go West      If at x, y go to x − 1, y
Perceive     If at x, y then he knows what there is at x − 1, y, x + 1, y, x, y + 1 and x, y − 1

moving vertically or horizontally within a grid-space in an effort to locate the tile which has the gold. The Wumpus is moving randomly within the world, trying to eat the Hero. One tile in the world has the Gold, while a number of tiles within the world are considered as Pits into which the hero can fall. In a time instant the hero can either move in one of four directions (North–South–East–West) or stay at the same tile and perceive its adjacent tiles (find out whether there is a Pit, the Wumpus or the Gold in an adjacent tile). The only information that can be perceived by the hero without the “perception” action is the fact that the Wumpus is at some adjacent tile. The aforementioned example is analogous to the wheel-chair application if we consider the Hero as the wheel-chair, the Pits as the stable obstacles, the Wumpus as a moving obstacle, and the Gold as the final target location. Therefore, for the rest of this section, we will be demonstrating the features of our proposed implementation for the “Wumpus World” example as presented above.

Initially, we define the knowledge base for the specific problem. It consists of the facts presented in Table V.

The actions the hero can take and their resulting effects on its knowledge base are presented in Table VI.

Finally, the consequences that should be checked by our evaluation system are presented in Table VII.


Table VII. Consequences of the actions of the hero in the “Wumpus World”. Those act as flags which can be either true or false depending on the position of the hero in the world

Event          Meaning

GoldFound      Hero has found the gold
WumpusEats     Wumpus ate hero
PitFall        Hero has fallen into a Pit
Nothing        Nothing has happened

The constructed evaluation environment uses two two-dimensional arrays, one for capturing the properties of each tile in the world (world[i][j]) and another that corresponds to the actual knowledge of the hero (Knowledge[i][j]). The Wumpus is controlled by the evaluation environment and its movement is considered to be completely random (no form of intelligence). The Hero, based on its current knowledge, infers the most appropriate action and performs it. This inference mechanism applies a specific weight to each movement action, ranging from 0 to 1. The movement with the maximum weight is chosen as the most preferable. If the weight of this movement is not above a specific threshold THR, the Hero decides to perform the “Perceive” action. In order to add some randomness to the movement weights (for the case where two or more movements give the same weight), we also apply a “random” value ranging from −0.1 to 0.1, which affects the weight and breaks any ties that may occur. Finally, we have also chosen to integrate a variable named “courage”, ranging from 0 to 0.2, in order to define how daring we want our Hero to be, i.e. how likely the Hero is to move to a location that he does not know. Those weights are calculated bottom up, starting from the facts of the knowledge base. Therefore, we have augmented the already mentioned facts with starting values affecting the final weights.
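The final selection step (maximum weight, with a fall-back to “Perceive” when the threshold is not exceeded) can be sketched in C as follows. This is an illustration only: the enum, the function name choose_action, and the ordering of the four weights (west, north, east, south) are assumptions, and the weights passed in are taken to already include the random perturbation and the courage contribution.

```c
#include <stddef.h>

/* Assumed order of the four movement weights x1..x4. */
enum action { GO_WEST, GO_NORTH, GO_EAST, GO_SOUTH, PERCEIVE };

/* Pick the movement with the largest weight, or PERCEIVE if that
 * weight is not above the threshold thr (THR in the text). */
enum action choose_action(const double weights[4], double thr)
{
    size_t best = 0;
    for (size_t k = 1; k < 4; k++) {
        if (weights[k] > weights[best]) {
            best = k;
        }
    }
    if (weights[best] <= thr) {
        return PERCEIVE;
    }
    return (enum action)best;
}
```

With weights {0.1, 0.9, 0.4, 0.0} and a threshold of 0.5 the hero moves north; with {0.1, 0.2, 0.4, 0.0} no movement exceeds the threshold and the hero perceives instead.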

Upon each action, the evaluation system checks whether any consequence is satisfied and, if so, reacts accordingly, possibly by ending the simulation. Note that the success or failure of the hero to find the gold is time-critical, since a faster inference engine allows faster movements and therefore decreases the chance of the Wumpus finding the hero. Taking into consideration the aforementioned features, the logic program used for the inference process of the hero is presented in Table VIII. (The knowledge base of the hero is initially empty, while each inference process begins with a logic question of the form Action(ch) ←?)

The whole game has been evaluated using both the proposed extended RISC microprocessor and the conventional one. Using the proposed implementation, the addition of new rules to the inference process is straightforward, as is the implementation of its declarative computations. As expected, the speed-up in the inference process achieved by our method allowed the Hero to be more successful in its quest for the gold. This is better illustrated in Figure 20.


Table VIII. Complete logic program used by the hero to complete the inference process and decide its action. On the left column inference rules are presented, while on the right column a brief explanation of how those rules operate is provided

Logic rule                                      Meaning

Action(ch) ← ActionCert(x1, x2, x3, x4),        ActionCert gathers the weights for each movement’s action
  ActionVector(x1, x2, x3, x4, ch)              while ActionVector calculates which action should be
                                                preferred. ch is the encoding of the final action and
                                                x1, x2, x3, x4 correspond to the weights for the four
                                                possible movements

ActionVector(x1, x2, x3, x4, ch)                Always true and calculates the action (ch) to be taken

ActionsCert(x1, x2, x3, x4) ← At(x, y),         It gathers the weights for every possible movement
  GoTo(x − 1, y, x1), GoTo(x, y − 1, x2),
  GoTo(x + 1, y, x3), GoTo(x, y + 1, x4)

GoTo(x, y, z) ← Status(x, y, k),                It calculates the weight for a specific movement.
  F(k, randomness, wumpusaround, z)             Status(x, y, k) applies a value k depending on the
                                                knowledge the Hero has of the specific tile, and F is a
                                                function that evaluates the weight z, taking into
                                                consideration k, wumpusaround (−0.3 when the wumpus is
                                                around, to force a perception, and 0 if not) and
                                                randomness

F(k, randomness, wumpusaround, z) ←             Calculates the weight

Status(x, y, k) ← Pit(x, y, k) |                Finds out the status (weight) of a specific location
  Wumpus(x, y, k) | Gold(x, y, k) |             based on whether a Pit, the Wumpus, the Gold, Nothing,
  Empty(x, y, k) | Questionable(x, y, k)        or don’t know is at the location

Pit(x, y, 0) ←                                  If there is a pit at x, y the weight is always 0

Wumpus(x, y, 0) ←                               If the Wumpus is located at x, y the weight is always
                                                zero

Gold(x, y, 1) ←                                 If gold is at x, y then the weight should be 1

Empty(x, y, k) ← Nothing(x, y, q),              Location x, y is empty with a weight k if x, y has
  BeenAt(x, y, l), G(q, l, k)                   Nothing; BeenAt returns the number l of times it has
                                                been visited and G yields k based on the values of q, l

Nothing(x, y, 0.5) ←                            x, y is empty with a 0.5 weight

G(q, l, k) ←                                    k becomes smaller while l (number of visits) becomes
                                                larger

Questionable(x, y, k) ← DontKnow(x, y, q),      Calculates the possibility of moving to an unknown tile
  H(q, courage, k)

DontKnow(x, y, 0.4) ←                           A 0.4 weight is applied if the hero does not know what
                                                resides in tile x, y

H(q, courage, k) ←                              Calculates the weight k based on the hero’s courage



Figure 20. Outcome of several experiments of the hero’s quest using the proposed implementation (a) and a conventional RISC microprocessor (b). The light gray portion indicates a successful quest, while the dark gray portion represents failure.

Figure 21. When trying to match the performance of the proposed implementation using a conventional RISC microprocessor by increasing the hero’s courage, approximately 50% of the experiments led the hero to fall into a pit (portion in black).

This is justified by the fact that, when the hero is forced to always perceive the world before he moves, the only chance the Wumpus has to find him is by moving onto the tile where the hero resides. With a faster inference process this is less likely to happen.

In our second experiment we wanted to find out in which case the time for finding the gold would be comparable for both implementations. For that reason we kept the same world and increased the hero’s courage for the conventional RISC implementation. For a value of courage = 0.2 (maximum), the performance was similar to the one achieved with the proposed implementation (the inference process in the RISC perceived adjacent tiles very rarely), and the averaged outcome of the experiments is illustrated in Figure 21.

Evidently, for the conventional RISC to match the performance of our proposed implementation, the risks taken must be increased (approximately 50% chance of falling into a Pit, which corresponds, e.g., to hitting a stationary obstacle).

Our main concern in this application was not to present a complete inference engine that achieves the maximum success rate, but rather to show how the speed-up of our proposed implementation qualitatively improves any inference algorithm. Nevertheless, we needed to evaluate whether the inference process we constructed is of sufficient value to justify our qualitative results, i.e., whether it achieves better results than when it is not used at all for the hero’s movement. For that reason, we implemented a second algorithm for the hero’s movement (one that is based completely on luck) and compared its success rate with that of our inference engine. Since in a world where Pits exist such a comparison obviously favors our approach, we experimented in a world with no Pits and took into consideration that, with no inference, the hero moves much faster. The results are illustrated in Figure 22.

Figure 22. Outcome of several experiments comparing the success of the hero when using a completely random movement mechanism (a) and the inference process presented in this section (b).

Figure 23. Number of times each tile is visited in a successful quest with the proposed inference mechanism (a) and a completely random process (b).

Evidently, the hero following the inference process has been more successful. Moreover, even in the cases where the hero has been successful with both approaches, the hero following the inference process avoids repeatedly revisiting tiles that did not hold the gold. This prevents him from wandering around tiles that have already been shown to contain nothing, and makes our approach more goal focused (Figure 23).

As a result, we have shown that with our proposed implementation, on the one hand, the inference process is easier to implement and integrate in an intelligent embedded system (mixing of declarative and procedural code) and, on the other hand, the speed-up in performance we achieve qualitatively improves the outcome of the required actions. In future releases of the microprocessor we are also planning to integrate learning capabilities into the inference process.


8. Conclusions and Future Work

In this paper, we have proposed an extended RISC microprocessor for the implementation of intelligent embedded systems to be used for control applications. Towards this goal we have presented a RISC extension that supports the execution of hybrid combinations of declarative and procedural code, and a C-extended language that allows such programs to be written. We have also proposed a hardware programmable implementation of a parser that is attached to the microprocessor in order to define the execution sequence of attribute evaluation rules, creating a programmable, semantically driven parser to be used for knowledge representation. As a result, we have proposed a complete extended RISC microprocessor which, while supporting all conventional facilities for the execution of procedural programs, is capable of increasing the performance of logic programming computations and allows the design flexibility required in embedded system applications. Where the size and cost of the final system need to be reduced, one possible modification to the presented extension is to move the “Stack Memory” of the hardware parser out of the parser and into the system’s memory. Such an approach reduces the total space required for the implementation, but also has a negative impact on the maximum possible increase in performance.

Another possible modification to the proposed implementation concerns extending the microprocessor with pipelining capabilities. In other words, after each call to an EVAL function for the evaluation of attribute instances, the hardware parser does not block until the attribute instances are evaluated to determine the value of the flag, but continues execution. In that case the hardware parser assumes that the derivation has been successful and continues creating the corresponding derivation tree. If attribute evaluation turns out to be successful, the hardware parser will have already computed a part of the rest of the derivation tree. Otherwise, the computed derivations are flushed and a new alternative is evaluated. Such a pipelining technique is expected to further improve the system’s performance and is one of our main focus points for future releases of the implementation. Our current efforts are also focused on increasing the efficiency of the implemented hardware parser by using different, more efficient parsing algorithms, and on extending the model to support inexact/full theorem proving capabilities and learning.
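The expected benefit of such speculation can be illustrated with a toy cycle-count model. The Python sketch below is not a description of the hardware: the cycle costs (one cycle per derivation step, three per EVAL call), the encoding of each alternative as a list of EVAL outcomes, the assumption that a flush costs no additional cycles, and the function names are all hypothetical choices made for illustration.

```python
# Toy cycle-count model of blocking versus speculative derivation. The cycle
# costs and the list-of-booleans encoding are assumptions, not measured data.

def blocking(alternatives, parse=1, evaluate=3):
    # The parser stalls after every derivation step until EVAL sets the flag.
    cycles = 0
    for index, outcomes in enumerate(alternatives):
        for ok in outcomes:
            cycles += parse + evaluate
            if not ok:
                break                  # derivation failed: try the next alternative
        else:
            return index, cycles       # every EVAL of this alternative succeeded
    return None, cycles

def speculative(alternatives, parse=1, evaluate=3):
    # The parser assumes success and keeps deriving while EVAL runs, so the two
    # units overlap like a two-stage pipeline.
    cycles = 0
    stage = max(parse, evaluate)
    for index, outcomes in enumerate(alternatives):
        failed_at = next((i for i, ok in enumerate(outcomes) if not ok), None)
        checked = len(outcomes) if failed_at is None else failed_at + 1
        if checked:
            cycles += parse + evaluate + (checked - 1) * stage
        if failed_at is None:
            return index, cycles
        # Steps derived ahead of the failing EVAL are flushed here; in this
        # model the flush itself is assumed to cost no additional cycles.
    return None, cycles

alternatives = [[True, False, True], [True, True]]
slow = blocking(alternatives)      # (chosen alternative, cycles) when stalling
fast = speculative(alternatives)   # (chosen alternative, cycles) when speculating
```

Both functions select the same alternative; under the assumed costs the speculative variant needs fewer cycles because derivation steps overlap with attribute evaluation. A real implementation would also have to account for the cost of flushing, which this model ignores.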

References

1. A high performance OR-parallel Prolog system, PhD Thesis, The Royal Institute of Technol-ogy, Stockholm, March 1992.

2. Ait-Kaci, H.: Warren’s Abstract Machine: A Tutorial Reconstruction, MIT Press, New York,1991 (out of print); available at: http://www.isg.sfu.ca/~hak/documents/wam.html.

3. Chen, H. and Chen, X.: Shape recognition using VLSI architecture, Internat. J. PatternRecognition Artificial Intelligence (1993).

4. Chiang, Y. T. and Fu, K. S.: Parallel parsing algorithms and VLSI implementation for syntacticpattern recognition, IEEE Trans. Pattern Anal. Mach. Intelligence 7 (1984).


5. Commun. ACM 36(3) (1993).6. Deransart, P. and Maluszynski, J.: A Grammatical View of Logic Programming, MIT Press,

Cambridge, MA, 1993.7. Floyd, R. W.: The syntax of programming languages – A survey, IEEE Trans. Electr. Comp.

13(4) (1964).8. Gupta, G., Pontelli, E., Ali, K. A. M., Carlsson, M., and Hermenegildo, M. V.: Parallel ex-

ecution of Prolog programs: A survey, J. Programming Languages Systems 23(4) (2001),472–602.

9. Jourdan, M., Parigot, D., Julie, C., Durin, O., and Le Bellec, C.: Design, implementation andevaluation of the FNC-2 attribute grammar system, in: Proc. of the ACM SIGPLAN90 Conf. onProgramming Languages, Design and Implementation, ACM Press, New York, 1990, pp. 209–222.

10. Panayiotopoulos, T., Papakonstantinou, G., and Sgouros, N.: An attribute grammar interpreterfor inexact reasoning, Inform. Software Technology 32(5) (1989).

11. Papakonstantinou, G. and Kontos, J.: Knowledge represantation with attribute grammars,Computer J. 29(3) (1986), 241–245.

12. Papakonstantinou, G., Moraitis, C., and Panayiotopoulos, T.: An attribute grammar interpreteras a knowledge engineering tool, Angewandte Informatik 9(86) (1986), 382–388.

13. Patterson, D. A. and Henessey, J. L.: Computer Organization and Design: The Hard-ware/Software Interface, MK Publishers, 1998.

14. Pavlatos, C., Panagopoulos, I., and Papakonstantinou, G.: Knowledge representation using theEarley’s parsing algorithm, to be presented in: SETN04, Samos, Greece.

15. Pereira, F. C. N. and Warren, D. H. D.: Definite clause grammars for language analysis –A survey of the formalism and comparison with augmented transition networks, J. ArtificialIntelligence 13 (1981), 231–278.

16. Russel, S. and Norvig, P.: Artificial Intelligence, a Modern Approach, Prentice-Hall, EnglewoodCliffs, NJ, 1995.

17. Tokuda, T. and Watanabe, Y.: An attribute evaluation of context-free languages, Inform.Processing Lett. 52(2) (1994), 91–98.

18. Tzafestas, S. G.: Knowledge Based System Diagnosis, Supervision and Control, Plenum, NewYork, 1989.

19. Vahid, F. and Givargis, T.: Embedded System Design: A Unified Hardware/Software Introduc-tion, Wiley, New York, 2002.

20. Voliotis, C., Papakonstantinou, G., and Kontos, J.: Attribute grammar based modeling ofconcurrent contraint logic programming, Internat. J. Artificial Intelligence Tools 4(3) (1996),383–411.

21. Waite, W. M.: Beyond LEX and YACC: How to generate the whole compiler, Technical Report,Boulder, CO, 1993.
