DataLex Developer's Manual · rule-based and example-based inferencing mechanisms, and by the document generator. YSH was developed with the creation of applications to law in mind,

University of New South Wales Law Research Series

AUSTLII'S DATALEX DEVELOPER'S MANUAL (1ST EDITION, JUNE 2019)

ANDREW MOWBRAY, GRAHAM GREENLEAF AND PHILIP CHUNG

[2019] UNSWLRS 65

UNSW Law UNSW Sydney NSW 2052 Australia

E: [email protected] W: http://www.law.unsw.edu.au/research/faculty-publications AustLII: http://www.austlii.edu.au/au/journals/UNSWLRS/ SSRN: http://www.ssrn.com/link/UNSW-LEG.html

mailto:[email protected]://www.law.unsw.edu.au/research/faculty-publicationshttp://www.austlii.edu.au/au/journals/UNSWLRS/http://www.ssrn.com/link/UNSW-LEG.html

Andrew MowbrayAustLIIUniversity of Technology [email protected]

Graham GreenleafAustLIIUNSW [email protected]

Philip ChungAustLIIUNSW [email protected]

AustLII’sDataLex Developer’sManualAustralasian Legal Information Institute (AustLII)

First EditionJUNE 2019

AustLII’sDataLex Developer’sManualAustralasian Legal Information Institute (AustLII)

First EditionJUNE 2019

Copyright © 2019 Andrew Mowbray, Graham Greenleaf and Philip Chung

Andrew Mowbray is Professor Law and Information Technology, University of TechnologySydney and Co-Director, AustLII.

Graham Greenleaf AM is Professor of Law & Information Systems, UNSW Sydney and SeniorResearcher, AustLII.

Philip Chung is Associate Professor of Law, UNSW Sydney and Executive Director, AustLII.

AustLII, Sydney, Australia

http://www.austlii.edu.au/

AustLII is a joint facility of UTS and UNSW Faculties of Law.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is grantedwithout fee provided that copies are not made or distributed for profit or commercial advantage and thatcopies bear this notice and the full citation on the first page. Copyrights for third-party components of thiswork must be honored. For all other uses, contact the owner/author(s).

First Edition, June 2019

http://www.austlii.edu.au/

Contents

1 DataLex legal knowledge-base systems 51.1 Components of AustLII’s DataLex legal

knowledge-base systems . . . . . . . . . . . . . . . . . 51.2 Conventions used in this Manual . . . . . . . . . . . . . 71.3 Theoretical foundations of the DataLex approach . . . . 71.4 Creating a new knowledge-base in the DataLex

Community . . . . . . . . . . . . . . . . . . . . . . . . . 71.5 Creating a new knowledge-base using the

Development Tools . . . . . . . . . . . . . . . . . . . . . 91.6 Updates to this Manual . . . . . . . . . . . . . . . . . . . 10

2 Rule-based inferencing (I): Knowledge-bases and rules 112.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 112.2 Knowledge-bases and rules . . . . . . . . . . . . . . . . 122.3 Content of rules - keywords and descriptors . . . . . . 132.4 Example of a rule – FOI Act s11 . . . . . . . . . . . . . . 142.5 Running and de-bugging a DataLex application . . . . 152.6 Some style guidelines for DataLex applications . . . . . 16

3 Rule-based inferencing (II): Descriptors, keywords andexpressions 193.1 Attributes (or ‘Facts’) . . . . . . . . . . . . . . . . . . . . 193.2 Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.3 Generating questions and explanations . . . . . . . . . 223.4 Named subjects - names of people and things . . . . . . 253.5 Variable attributes [advanced] . . . . . . . . . . . . . . . 273.6 Expressions - the use of operators . . . . . . . . . . . . . 28

4 Rule-based inferencing (III): Rule types, statements andevaluation order 314.1 Order of evaluation of rules, and within rules, in

inferencing . . . . . . . . . . . . . . . . . . . . . . . . . . 314.2 Goals - LISTED rules and GOAL rules . . . . . . . . . . 324.3 Order of evaluation of rules . . . . . . . . . . . . . . . . 324.4 Types of rules . . . . . . . . . . . . . . . . . . . . . . . . 334.5 Statements . . . . . . . . . . . . . . . . . . . . . . . . . . 344.6 Appendix – List of Main Keywords (all types) used by

DataLex . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

4 MOWBRAY, GREENLEAF & CHUNG

5 Integration of DataLex knowledge-bases with AustLII andother LIIs 415.1 Overview - Integration of DataLex knowledge-bases

with their sources . . . . . . . . . . . . . . . . . . . . . . 415.2 Automatic links to AustLII legislation . . . . . . . . . . 415.3 Automatic links to case law . . . . . . . . . . . . . . . . 425.4 Explicit links in a knowledge-base (the LINK ... TO ...

keywords) . . . . . . . . . . . . . . . . . . . . . . . . . . 435.5 Stored searches from DataLex knowledge-bases . . . . 435.6 ‘Co-operative inferencing’ – knowledge-bases in

multiple locations . . . . . . . . . . . . . . . . . . . . . . 43

6 Document assembly using DataLex 456.1 DOCUMENT rules . . . . . . . . . . . . . . . . . . . . . 456.2 Text generation statement types - PARAGRAPH and

TEXT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466.3 ‘Personalising’ documents - embedded attributes . . . . 476.4 Alternative clauses in a document . . . . . . . . . . . . 486.5 Generating successive paragraphs of a document - use

of CALL statements . . . . . . . . . . . . . . . . . . . . . 496.6 Numbering paragraphs . . . . . . . . . . . . . . . . . . 496.7 Integration of inferencing and document generation . . 506.8 Use of other DataLex features with document assembly 506.9 Example - a will generator . . . . . . . . . . . . . . . . . 51

7 Case-based (example-based) inferencing using DataLex 557.1 Example-based reasoning – overview . . . . . . . . . . 557.2 Relationship between examples and rules in DataLex’s

inferencing . . . . . . . . . . . . . . . . . . . . . . . . . . 557.3 Knowledge representation – EXAMPLEs . . . . . . . . 567.4 An example of a case representation by EXAMPLEs . . 577.5 Reports generated by DataLex EXAMPLE reasoning . . 577.6 Principles behind the case-based inferencing component 587.7 Steps in developing an EXAMPLE set . . . . . . . . . . 59

8 DataLex user interface manual 658.1 Relationship to the previous chapters . . . . . . . . . . 658.2 Starting a consultation . . . . . . . . . . . . . . . . . . . 658.3 Choice of goals . . . . . . . . . . . . . . . . . . . . . . . 668.4 Answering questions . . . . . . . . . . . . . . . . . . . . 668.5 Showing facts (What?) . . . . . . . . . . . . . . . . . . . 678.6 Forgetting facts (Forget) . . . . . . . . . . . . . . . . . . 678.7 Obtaining explanations for conclusions (How?) . . . . . 688.8 Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . 688.9 Links to sources . . . . . . . . . . . . . . . . . . . . . . . 698.10 Viewing consultations in verbose mode . . . . . . . . . 69

1DataLex legal knowledge-base systems

1.1 Components of AustLII’s DataLex legal knowledge-basesystems

AustLII’s DataLex inferencing software allows the development ofInternet-based applications combining knowledge-basedinferencing, a limited form of example-based inferencing, andautomated document assembly. Applications are in the form of aplain-text knowledge-base (KB), which is written in a quasi-naturallanguage knowledge representation.

The DataLex legal knowledge-based system has the followingprincipal components:

Figure 1.1: Components of the DataLexsystem.

This Manual describes the development and use of AustLII’sDataLex legal knowledge-base systems as at June 2019.1 1 Parts are derived from A Mowbray,

G Greenleaf, G King & S Cant WyshDeveloper’s Manuals, AustLII, June 1997.

1.1.1 The DataLex inferencing software - YSH

The DataLex inferencing software (inference engine) is also knownas YSH (pronounced ‘why-shell’), and is referred to as such in

6 DATALEX DEVELOPER ’S MANUAL

previous documentation. It uses a quasi-natural languageknowledge representation. Knowledge may be represented asbackward or forward chaining rules, or procedures, or anycombinations of these, for the purposes of rule-based inferencing.Knowledge may also be represented as clusters of fact descriptions,‘cases’, for the purposes of example-based inferencing usingnearest-neighbour discriminant analysis. Facts are shared by therule-based and example-based inferencing mechanisms, and by thedocument generator. YSH was developed with the creation ofapplications to law in mind, but is equally applicable to the creationof other automated interview style applications.

Andrew Mowbray is the author of the DataLex inferencingsoftware / YSH. The version covered in this Manual isDataLex/YSH Ver. 1.2.1.

1.1.2 The DataLex knowledge-base development tools

Development of DataLex applications primarily takes place withinthe DataLex section of the AustLII Communities environment.2 The 2 DataLex section of the AustLII

Communities

development environment is that of a closed wiki which preservesthe correct formatting of the knowledge-base, and provides anumber of tools for checking and correcting syntax. Automatedhypertext links between knowledge-base texts and the source textson AustLII and other LIIs also make it easier to check rules etcagainst the sources on which they are based, during development.There is also a development environment located outside AustLIICommunities which is used for teaching and development of testknowledge-bases.3 Their use is covered later in this Chapter. 3 DataLex knowledge-base

Development Tools

1.1.3 The DataLex user interface

The user interface to the DataLex inferencing software provides aneasy-to-use environment in which end-users conduct aquestion-and-answer dialogue with the application in order toprovide information (‘facts’) to it in order for the system to drawconclusions, and to conclude a user session by producing a report(and in some cases a document). The interface allows users to askWhy questions are being asked and How conclusions have beenreaching, as well as to Forget facts previously provided, and to testhypothetical facts through a What-if facility. The user interface alsouses AustLII’s automated hypertext markup and free text retrievalfacilities to provide automated hypertext links from dialogues,conclusions and reports to the relevant legal sources on AustLII oron other LIIs. This integration between the inferencing componentsand the legal sources located on LIIs is one of the principaldistinctive features of the DataLex approach.

The User Manual for the current4 DataLex user interface is in 4 A previous user manual, referred to insome documentation, was A Mowbray,G Greenleaf, G King & S Cant WyshUser’s Manual - 1997.

http://austlii.community/wiki/DataLex/http://austlii.community/wiki/DataLex/http://www.datalex.org/dev/import/http://www.datalex.org/dev/import/

DATALEX LEGAL KNOWLEDGE-BASE SYSTEMS 7

Chapter 8 of this Manual. The purposes of features of the software,as described in this Developer’s Manual, are sometimes bestunderstood by reference to the User’s Manual. The current interface5 5 The earliest web interface to YSH

(S Cant and G King, 1997) was calledWysh (‘web-ysh’), a Common GatewayInterface (CGI) to YSH written in Perland C. It associated each user sessionwith a session identifier and connectssubsequent requests to the appropriatesession, or process, by means of UNIXsockets. It is referred to in previousdocumentation.

to the DataLex inferencing software (2019) has been developed byPhilip Chung, Andrew Mowbray and AustLII consultants.

1.2 Conventions used in this Manual

The following conventions are used in this Manual to explaincommands or file names:

string Words or symbols in bold indicate the actual words orsymbols used;

string Words or symbols in italics indicate that their content isvariable;

| A vertical bar is used to divide a range of options - don’t typeit.

1.3 Theoretical foundations of the DataLex approach

There are a number of articles explaining and justifying the approachtaken by the DataLex project. The main articles, and bibliography, areas follows:

• A Mowbray, P Chung and G Greenleaf, ‘Utilising AI in the LegalAssistance Sector – Testing a Role for Legal Information Institutes’(29 April 2019), presented at LegalAIIA 2019, co-located withICAIL 2019, 17 June 2019, Montréal, Québec, Canada

• G Greenleaf, A Mowbray, and P Chung, ‘Building Sustainable FreeLegal Advisory Systems: Experiences from the History of AI &Law’ (2018) 34(1) Computer Law & Security Review 314

• G Greenleaf, A Mowbray, and P Chung, ‘The Datalex Project:History and Bibliography’ (3 January 2018). [2018] UNSWLRS 4or

• G Greenleaf, A Mowbray, and P van Dijk, ‘Representing and usinglegal knowledge in integrated decision support systems - DataLexWorkStations’ Artificial Intelligence & Law, Vol 3, 1995, Kluwer,97-142

1.4 Creating a new knowledge-base in the DataLexCommunity

The DataLex Community (part of the AustLII Communities) is acollaborative closed wiki-like platform used in the DataLex system

https://ssrn.com/abstract=3379441https://ssrn.com/abstract=3021452https://ssrn.com/abstract=3021452http://www.austlii.edu.au/au/journals/UNSWLRS/2018/4.htmlhttps://ssrn.com/abstract=3095897https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2183481


for creating and editing knowledge-bases or rule-bases.Log into the DataLex Community site

by clicking on the‘Log in’ button on the top right hand corner.

Figure 1.2: DataLex Community loginprocess.

Once the login process is verified, an extra row of buttons forediting and creating new rule-bases will appear on the page.

Figure 1.3: Editing existing rule-base inDataLex Community.

Click on ‘Edit’ to edit existing rule-bases.To start a new rule-base (or topic), click on the ‘New’ button. Do

so from the DataLex page, or it will be a sub-page from wherever youstart. The following window will appear:

Figure 1.4: Creating new rule-base (ortopic) in DataLex Community.

Enter the ‘Title’ of the rule-base to be constructed. The examplehere is ‘PrivacyKB’ for a privacy law knowledge-base. (For easyidentifiability, it is useful to give knowledge-bases the suffix ‘KB’.) Inthe ‘Template’ section, select ‘DataLexKBTemplate’ (not ‘Default’).Then, click on ‘Submit’.

In the editing screen for the rule-base (with heading ‘Title ofknowledge-base’), start editing your rule-base by deleting ‘ADD

http://austlii.community/wiki/DataLex/

DATALEX LEGAL KNOWLEDGE-BASE SYSTEMS 9

RULES HERE’. One way to start a rule base is to paste in a legislativesection, and start editing it to create rules.

To find your rule-base after you have logged back in, search for thefirst few letters of the name of your rule-base (eg search for ‘FOI’ tofind FOIDocumentOfAnAgencyKB).

1.5 Creating a new knowledge-base using the DevelopmentTools

If you do not yet have an account in AustLII Communities, go to (AustLII’s DataLexKnowledge-base Development Tools page – ‘KB Tools page’). No login isnecessary. This page enables creation and test-runs of smallapplications using the DataLex software. However, it does not allowtest apps to be saved.

Figure 1.5: DataLex knowledge-basedevelopment tools: Importing andEditing rules.

There are two ways to start writing an app:

(i) Simply start writing in the ‘Edit DataLex knowledge-base’ editingscreen, following the instructions in this Manual.

(ii) If you know what section of an Australian Act you would like tostart with, go to the ‘Import legislative section (available onAustLII)’ and enter the Act name, jurisdiction and section, thenstart editing as in (i) above.

Select the ‘Run Consultation’ button when you are ready to testyour KB. Please note:

http://www.datalex.org/dev/import/http://www.austlii.edu.au/


Figure 1.6: DataLex knowledge-base development tools: RunningConsultation and Checking rule-base.

• After running your app (successfully or unsuccessfully) you canuse your browser to come back to the KB Tools page to makefurther edits to your KB.

• If your app does not run as intended, use the ‘Check Fact CrossReferences’ and ‘Check Fact Translations’ buttons to rundiagnostics (see 2.5 below) to identify problems. Edit and runagain.

• If you have spent any significant time developing a test KB, youmight want to save a copy in a word processor or other file, in casethe browser malfunctions, or if you want to use the KB again afteryou have quit the browser.

Refinements:

• To add another section of an Act from AustLII to an existing KB,use the ‘Import and Append’ button after specifying the additionalsection.

• To over-write and erase an existing KB with a new section fromAustLII, use the ‘Import and Replace’ button.

• To clear an existing KB and start again, use ‘Clear knowledge-base’.

1.6 Updates to this Manual

This PDF version of the Manual will be updated periodically. Inbetween PDF editions, incremental updates may be made to the wikiversion of the Manual located on AustLII Communities at .

http://austlii.community/wiki/DataLex/DataLexDeveloperSManualhttp://austlii.community/wiki/DataLex/DataLexDeveloperSManual

2Rule-based inferencing (I): Knowledge-bases and rules

2.1 Introduction

The DataLex inferencing software is an Internet-based expert systemshell for the development of inferencing systems (sometimes called‘expert systems’) in the legal domain. It may be used to developsystems incorporating rule-based inferencing (discussed in thisChapter and the two following chapters), example or case-basedinferencing (Chapter 7), and automated document assembly(Chapter 6). The User Interface Manual is in Chapter 8.

2.1.1 Levels of complexity with DataLex

DataLex is very simple to use to create small practice expert systems,at least for most types of statute-based applications. This is becauseall you have to do, to get a small system up and running, is toparaphrase a section or two of an Act into a somewhat strict logicalform, using logical connectors such as IF, THEN, AND and OR. Theresult is a knowledge-base, expressed in DataLex’s ‘English like’knowledge representation language. The DataLex inference enginethen does the rest, running your knowledge-base to generate adialogue with the user, asking questions and giving answers. Youdon’t write any of the questions or answers – DataLex generatesthem automatically from your knowledge-base.

However, while DataLex can be used easily by relying only on asmall number of its features, the DataLex engine has a very powerfuland complex range of features which can be used as you proceed todevelop more sophisticated applications.

The easiest way to understand how aDataLex knowledge-base is written isto study the examples given in this andthe following Chapters, and then touse this Manual to explain features thatyou don’t understand fully.

2.1.2 Main features

The main features of DataLex are:

• a ’quasi-natural-language’ or English-like syntax, whichencourages isomorphism (similarity between the structure of aknowledge-base and source legal documents), transparency(purpose of rules is relatively obvious) and rapid prototyping(easy to get small systems running);


• rules of any degree of complexity may be written, usingpropositional logic;

• backward and forward chaining rule-based inferencing;

• conventional procedural code including mathematicalcalculations;

• a form of reasoning by analogy, or example-based reasoning; and

• a document generation facility.

DataLex is therefore a fairly versatile tool with which a variety ofinferencing applications may be created.

2.1.3 User commands and the DataLex User Interface Manual

The DataLex User Interface Manual, in Chapter 8 of this Manual,explains the interface to DataLex applications when they arerunning, from a user perspective. It should be read either before orin conjunction with this Chapter.

2.1.4 Developing a DataLex application – Where is the developer’sinterface?

See Chapter 1, sections 1.4 and 1.5 for the two ways to do this.The developer’s interface for DataLex applications is primarily

within the DataLex section of the AustLII Communities environment. There is also adevelopment environment located outside AustLII Communitieswhich is used for teaching and development of testknowledge-bases, the ‘DataLex Knowledge-base DevelopmentTools’ .

2.2 Knowledge-bases and rules

A knowledge-base is a set of declarations, so called because they’declare’ items of knowledge about a subject area. This type ofprogramming is therefore called ’declarative’ programming, incontrast to ’procedural’ programming, which is of the form ’first dothis step; then do this step ....’. The author of a DataLex applicationtherefore creates a ’knowledge-base’ rather than a ’program’.

2.2.1 Knowledge-bases as sets of rules

The most important category of declarations in DataLex is rules (soknowledge-bases are often called rule-bases). A knowledge-base, atits simplest, is therefore a set of rules. When the rule-base is ’run’ byDataLex it attempts to find the truth of a specified fact.

http://austlii.community/wiki/DataLex/http://www.datalex.org/dev/import/

RULE-BASED INFERENCING ( I) : KNOWLEDGE-BASES AND RULES 13

It does this by going to a rule which has that fact as its conclusionand examining each of the premises of that rule to determinewhether the conclusion of the rule is true. In evaluating the premisesof a rule DataLex uses any other rules which have any of thepremises as their conclusion. DataLex repeats this process alongeach branch of reasoning until it reaches a premise for which there isno rule to derive a conclusion. At this point, DataLex interrogatesthe user about the truth of the premise. It does not generally matter,therefore, in which order the rules occur. Rather, DataLex searchesthe knowledge-base for relevant rules in relation to each fact orpremise which it is evaluating.

2.2.2 Form of a simple rule

In its simplest form, a rule contains four elements:

(i) the keyword ‘RULE’, indicating the start of a new rule (in theabsence of any specification otherwise, the rule will be bothbackward chaining and forward chaining);

(ii) the name of the rule (usually just the name of the Act and sectionthat it paraphrases); The name of a rule should differ from that ofany other rule in the rule-base;

(iii) the keyword ‘PROVIDES’, indicating the start of the body of therule; and

(iv) the statement(s) which make up the inferencing content of therule (one or more statements). Statements consist of declarations.One of the simplest forms of a statement is ’IF condition THENconclusion’.

The simplest syntax for a rule is therefore as follows:

RULE name PROVIDES statements

The example below shows a rule with one moderately complex setof statements:

RULE Freedom of Information Act 1982 (Cth) s11 PROVIDES

a person has a legally enforceable right under s11 to obtain

access to a document ONLY IF

s11(a) applies OR

s11(b) applies

2.3 Content of rules - keywords and descriptors

Knowledge-base rules consist of keywords and descriptors. Keywordsare used to join together, in a logical form, a number of descriptors,which are simply terms or phrases used to describe some object, eventetc.


2.3.1 Keywords

Keywords give rules the logical structure used by DataLex to drawinferences. They are written in FULL UPPER CASE so DataLex candistinguish them from their equivalents in ordinary words (whichmay occur in descriptors).

Some examples of important keywords, or sets of keywords are:ONLY IF; IF .... THEN; IF ... THEN .... ELSE; IS; AND; OR; PLUS;MINUS ; PERSON; THING.

These and other keywords have functions in a DataLexknowledge-base which is very similar to their normal linguisticfunction as words. This correspondence is a large part of what givesDataLex a ’quasi natural language’ or ’English like’ syntax.

There is a list of keywords which may be used with DataLex at theend of Chapter 4.

DataLex is very case-sensitive. Itexpects keywords to be in FULLUPPER CASE.

2.3.2 Descriptors

Descriptors may be any sequence of words or symbols but must notcontain keywords (although they can contain the lower case versionsof them). Descriptors are generally written in lower case, with normalcapitalisation. See Chapter 3 for details of how descriptors should bewritten in order to work best in DataLex.

In the example below, some descriptors used are ‘a person has alegally enforceable right under s11 to obtain access to a document’,‘s11(a) applies’ and ‘the document is not an exempt document’. Theseare all attributes.

There are a number of varieties of descriptors, of which the mostimportant are (i) constants, (ii) facts or attributes, (iii) named subjects(a special type of attribute) and (iv) rule names. Each is discussedin detail in the following chapter. A type of attribute used only indocuments (see Chapter 6) is called ‘text’.

First, however, a simple example of a rule, and how to make it run,is given.

2.4 Example of a rule – FOI Act s11

2.4.1 The section

The Freedom of Information Act 1982 (Cth) s11 reads:11. Subject to this Act, every person has a legally enforceable right

to obtain access in accordance with this Act to –(a) a document of an agency, other than an exempt document; or(b) an official document of a Minister, other than an exempt

document.A rule-base of 3 rules consisting solely of this section could read

as follows. The rules have been (over-)simplified, for demonstrationpurposes, by ignoring the words ‘subject to this Act’ in s11.


2.4.2 A rule-base of 3 rules

RULE Freedom of Information Act 1982 (Cth) s11 PROVIDES

a person has a legally enforceable right under s11 to obtain

access to a document ONLY IF

s11(a) applies OR

s11(b) applies

RULE Freedom of Information Act 1982 (Cth) s11(a) PROVIDES

s11(a) applies ONLY IF

the document is a document of an agency AND

the document is not an exempt document

RULE Freedom of Information Act 1982 (Cth) s11(b) PROVIDES

s11(b) applies ONLY IF

the document is an official document of a Minister AND


2.5 Running and de-bugging a DataLex application

Text, such as that above, is all that is needed for a valid knowledge-base. The knowledge-base can be invoked as a DataLex session byselecting the ‘Run Consultation’ button.

If a knowledge-base does not behave as intended, go back to theediting page, edit the rule, and run it again. The main purpose of thetype/paste window on the manual start page is to allow thedeveloper to test minor changes to rules without having to create anew web page each time in order to do so.

2.5.1 Debugging

In addition to the ‘Run Consultation’ button, there are two additionalbuttons which allow you to check for some types of errors in yourknowledge-base, either before you try to run it, or after you so, andit does not perform quite as expected. They are ‘Check Fact CrossReferences’ and ‘Check Fact Translations’.

There is also another debugging tool that can be used while theapplication is running, Verbose Mode (see 8.10).

2.5.2 Check Fact Cross References

Use of similarly named but not identically named attributes is oneof the main causes of errors in YSH knowledge-bases, particularlywhere rules which are supposed to chain do not do so. The ‘ CheckFact Cross References’ button allows you to check for such errors.

The ‘Check Fact Cross References’ button causes eachfact/attribute to be printed (in alphabetical order - except where itbegins with an hypertext link) showing the names of rules which set(*) and rules which use (-) the attribute. Named subjects are alsolisted.

Use of similar but not identicalattributes is one of the main causesof errors in DataLex. The ‘Check FactCross References’ button allows you tocheck for such errors.


2.5.3 Check Fact Translations

Use of the ‘Check Fact Translations’ button enables you to check thatyour attributes are expressed correctly.

For each attribute in the knowledge-base, in the order in whichthey occur, it shows: (i) prompts (questions); (ii) a translation inpositive form; and (iii) a translation in negative form. For example,the interrogative, positive and negative translations of the attribute‘s11(a) applies’ are as follows:

-Does s11(a) apply?

-S11(a) applies.

-S11(a) does not apply.

Use the ‘Check Fact Translations’button to check that your attributes areexpressed correctly.

2.6 Some style guidelines for DataLex applications

Although DataLex is designed to be fairly flexible, it is worth bearingin mind the following guide-lines for developing rule-bases:

2.6.1 Simplicity

Try to aim for simplicity wherever possible. Complicated kludgesand workarounds detract from the readability of the code and canhave unexpected repercussions, particularly when theknowledge-base is later expanded or changed. Don’t use facilitiessimply because they are available.

2.6.2 Isomorphism

Where the knowledge-base represents rules from a legal sourcedocument such as a piece of legislation, try to directly translate thestatutory rules into DataLex rules, observing as far as possible theorder and grouping of the legislative rules, and adding as littleinterpretation as possible. Keep other rules, such as interpretation or’common sense’ rules which do not derive directly from thelegislation, in a separate part of your rule-base.

2.6.3 Small rules

Avoid large and complicated rules. Small rules are easier tounderstand and will assist with automatic explanations.

2.6.4 Attribute Names

Include the legal basis for attributes in their descriptors, as in thelayout is in "material form" as defined in s.5. This will make for moremeaningful explanations. Avoid using unnecessarily longdescriptors. These make for convoluted questions and explanations.Do not use the translation and prompt options unnecessarily. Try


changing the attribute name to get DataLex to handle it properly,first. Avoid use of embedded attributes.

2.6.5 Rule Types

Use only the default rule type unless you have a good reason fordoing otherwise. Forward chaining rules and daemons shouldgenerally only be used to alter the operation of rules encompassingknowledge rather than to embody knowledge themselves.

2.6.6 Declarative Representation

Do not represent knowledge procedurally using DETERMINE andCALL statements except where unavoidable. Avoid being concernedabout the actual operation of knowledge-rich rules and insteadconcentrate on describing the item of knowledge with which you aredealing.

2.6.7 Comments

Avoid relying on comments to understand your code. The codeshould largely be transparent. However, you can use comments toindicate what legislative provisions you have omitted from yourDataLex representation.

3Rule-based inferencing (II): Descriptors, keywords and

expressions

Rules in knowledge-bases consist of keywords and descriptors.Keywords are used to join together, in a logical form, a number ofdescriptors, which are simply terms or phrases used to describe someobject, event etc. Descriptors may be any sequence of words orsymbols but must not contain keywords (although they can containthe lower case versions of them). Descriptors are generally written inlower case, with normal capitalisation. The most important types ofdescriptors, discussed in this Chapter, are (i) constants, (ii) facts orattributes, (iii) named subjects (a special type of attribute) and (iv)rule names.

3.1 Attributes (or ‘Facts’)

An attribute is any descriptor (a sequence of words or symbolswhich does not contain a keyword) which is not a constant (seebelow). The purpose of attributes is to hold values which aredetermined during the evaluation of a knowledge-base. Everysequence of text in a knowledge-base which is not a keyword or aconstant must therefore make sense as something which has a value.(Chapter 6 explains an exception where text follows the keywordTEXT)

Attributes are also referred to as ‘facts’ in the DataLex errormessages.

Attributes and other descriptors have amaximum length of 256 characters.

3.1.1 Consistent naming of attributes

Consistent naming of attributes, including consistency incapitalisation and punctuation, is vital to DataLex’s operation.DataLex does not forgive inconsistency.

Lack of consistency is the principalcause of DataLex applications runningother than as expected. Use the ‘CheckFact Cross References’ button to checkfor possible inconsistencies in namingof attributes.

DataLex cannot tolerateinconsistencies in either capitalisationor punctuation.

3.1.2 Boolean (true/false) attributes and their names

The default attribute type is boolean (that is, true/false). Whennaming boolean attributes, you should choose a name starting with a


subject, then a verb (expressed in the positive or negative) and,optionally, an object.

For example, each of the following is a boolean attribute, correctlyexpressed:

Subject Verb Objectthe claimant satisfies s23(1)the circuit layout is in material formsection 9 appliessection 9 does not apply to bills of exchange

Table 3.1: Examples of recommendedsubject/verb/object form

The purpose of the recommended subject/verb/object form isexplained below in relation to the generation of questions andexplanations (see 3.3 Generating questions and explanations).

3.1.3 Non-boolean attributes - types

DataLex recognises the following attribute types:

Type Values Example

BOOLEAN T/F/U (default type) See aboveINTEGER whole numbers only the number of applicantsREAL fractions accepted the number of degrees toleranceSTRING a string of text the alleged defamatory statementSEX M or F the sex of the claimantDOLLAR dollars and cents the value of the estatDATE a date the date of the intestate’s death

Table 3.2: Attribute types recognised byDataLex

Non-boolean attributes are introduced in one of two ways: (i)automatically by use; or (ii) formally by a declaration.

3.1.4 Automatic type recognition of non-boolean attributes

If the first use of an attribute in a knowledge-base requires DataLexto recognise it as something other than boolean, that type isautomatically associated with it. From then on, you must use theattribute consistently or an error message will result. In other words,DataLex is able to make an ’intelligent guess’ about the type ofnon-boolean attribute that is intended, based on other aspects of theexpression it is first found in. For example, in the expression ’IF thedate of arrival IS GREATER THAN 1 May 1977’, DataLex is able towork out that the attribute ’the date of arrival’ is probably anon-boolean attribute of type DATE, because another date (1 May1977) appears in conjunction with a relational operator.

3.1.5 Formal attribute type declarations for non-boolean attributes

While DataLex is generally accurate in recognising non-booleanattributes, it sometimes makes an error. This may be avoided orcorrected by an explicit declaration of the type of the attribute.

RULE-BASED INFERENCING ( I I) : DESCRIPTORS , KEYWORDS AND EXPRESSIONS 21

The syntax for formal declarations is:

TYPE attribute-name

optionally followed by a list of translations and valid ranges(discussed below).

For example, to declare attributes to be of the types ‘DATE’ and‘DOLLAR’:

DATE the date of the intestate’s death

DOLLAR the value of the estate

Because of these declarations, or because of automatic recognition,DataLex would only accept responses from a user that were of thespecified types.

Attribute declarations should appear outside of rules andprocedures. Otherwise, they can appear anywhere in aknowledge-base, provided they appear somewhere in theknowledge-base prior to where the attribute is first used. It is oftenconvenient to group them all at the start.

3.1.6 Range limitation of attribute values [advanced]

If there was a need to further limit the range of acceptable responsesfrom the user (eg to dates only within a specified period, or toamounts less than a certain maximum), then a RANGE statement isavailable

The syntax is:

RANGE expression [TO expression]

The statement should appear immediately after a fact declaration.It may be used multiple times if there are many valid ranges. Wherethe optional TO expression is used it indicates that the value for thefact should be between the result of the first expression and the resultof the second expression. However, in this case the expressions mustproduce numeric results.

Some examples of RANGE statements are:

STRING the name of the intelligence agency

RANGE "ASIO"

RANGE "ASIS"

RANGE "DSD"

DOLLAR the value of the household chattels

RANGE Ø TO the value of the estate

3.1.7 Attribute names for non-boolean attributes

You must choose an attribute name which can be followed by an ‘is’then a value so that DataLex can correctly provide prompts andtranslations. For example, the non-boolean attribute declarations


given above will correctly result in the following prompts and (whenanswered) translations:

DATE the date of the intestate’s death

What is the date of the intestate’s death ?

The date of the intestate’s death is 1st January 1991.

DOLLAR the value of the estate

What is the value of the estate ?

The value of the estate is $250,000.DataLex cannot tolerate inconsistenciesin either capitalisation or punctuation.

3.2 Constants

Whereas attributes have a variable value which is determinedduring the evaluation of a knowledge-base, a constant has a fixedvalue. DataLex recognises any of the following descriptors asconstants: an integer (eg 1000), a real number (eg 7.15), a dollaramount (eg $950 or $950.00), the words ‘true’ and ‘false’ (booleanconstant) and the words ‘male’ and ‘female’ (sex constant), a date (inany sensible format), and any descriptor placed in double quotes (astring constant).

DataLex is generally able to automatically recognise constants,and to give them the correct type. If DataLex does not recognise adescriptor as being in any of these categories of constant, it assumesthat the descriptor is an attribute.

Constants are used primarily in expressions which use binaryoperators (eg PLUS; EQUALS; IS LESS THAN; IN) and inassignment statements (see below).

3.3 Generating questions and explanations

One of DataLex’s main features is its capacity to automaticallygenerate questions (prompts) by re-parsing the attribute that it isattempting to find a value for, into an interrogative form (ie byre-parsing the part of the rule it is at present evaluating). Similarly, itcan provide explanations by re-parsing rules that it has previouslyevaluated, substituting the values that it has established for thoserules.

3.3.1 Automatic generation of questions (prompts) andexplanations

Provided that boolean attribute names appear in thesubject/verb/object form explained above (see 3.1.2 Boolean(true/false) attributes and their names), or non-boolean attributenames appear in the ’is’ form explained above (see 3.1.3 Attributenames for non-boolean attributes), DataLex will normally be able toaffect sensible translations automatically, for use during problem


sessions. For the above examples, the following automatic promptsand translations would be generated by DataLex:

Does the claimant satisfy s23(1) ?

The claimant satisfies s23(1).

The claimant does not satisfy s23(1).

Is the circuit layout in material form ?

The circuit layout is in material form.

The circuit layout is not in material form.

Does section 9 apply ?

Section 9 applies.

Section 9 does not apply.

What is the date of the intestate’s death ?

The date of the intestate’s death is 1st January 1991.

Use the ‘Check Fact Translations’ button to check whether sensibleprompts and translations are being generated.

3.3.2 Automatic recognition of different forms of the same attribute

DataLex re-parses all boolean attribute names into a consistentpositive form for storage purposes, and so recognises differentgrammatical forms of the same attribute. For example, the followingstatements all refer to the same attribute:

the Act applies

the Act does not apply

the Act does apply

It therefore does not matter which form you use in a rule, asDataLex will normally understand that you are referring to the sameattribute. In other words, different forms of the attribute can be usedin different rules.

3.3.3 Verbs declarations - correcting DataLex’s grammar

While DataLex’s ability to ‘understand’ and generate differentgrammatical forms of the same attribute is reasonably sophisticated,it sometimes makes errors in translating verbs into different tenses.

For boolean attribute names, the most important component ofthe describing sentence is the verb. DataLex only knows about asmall list of very common verbs. For the remainder it simply makesan educated guess (ie it uses a fairly simple set of heuristic rulesconcerning the behaviour of verbs). DataLex has to be able to locatethe verb and transform its plurality and tense.

Where DataLex makes a mistake, its behaviour can be altered by adeclaration specifying that a word is a verb and giving theappropriate forms. The syntax for this is:


VERBS { base~first_person~third_person_singular~past }

as in:

VERBS

ma~ke~kes~de

s~ell~ells~old

This declares that the forms of ‘make’ (which has the base ‘ma’)are ‘make’ (first person), ‘makes’ (third person singular), and ‘made’(past).

Where another form is the same as the base, just leave that formblank and put two tildes consecutively ie ~~ .

This verbs declaration should appear outside of other declarationssuch as rules and procedures. It is sensible to put a list of all verbsat the start of a knowledge-base. However, a verbs declaration mayoccur more than once.

Here is a list of common verbs known to cause problems. If youuse any of them, you should declare that verb. It would be useful toinclude this list in any application, in case any of these verbs are used.

b~ear~ears~ore

se~nd~nds~nt

t~ell~ells~old

s~ell~ells~old

g~ive~ives~ave

t~ake~akes~ook

ma~ke~kes~de

c~ome~omes~ame

f~ind~inds~ound

b~uy~uys~ought

beg~in~ins~an

br~eak~eaks~oke

br~ing~ings~ought

s~ay~ays~aid

Where DataLex cannot recognise a verb, this can sometimes beremedied by putting the word ’will’ in front of the verb in theattribute, because DataLex recognises ’will .......’ as a compoundverb.

3.3.4 Adding your own attribute translations - PROMPT andTRANSLATE [advanced]

One of the main purposes of DataLex’s automatic re-parsing of rulesto produce prompts and explanations is so that there is normally noneed to maintain separate bodies of text for each attribute, with allthe complications this implies for development and maintenance.

However, if the automatic parsing performed by DataLex isinadequate for some reason, it is possible to ‘override’ it and todeclare what the prompt and translation should be for a particularattribute.


For example, the attribute ‘the date of death of the intestate’would normally generate the prompt ‘What is the date of death ofthe intestate?’ and the translation would be ‘The date of death of theintestate is ....’. This can be altered by adding PROMPT andTRANSLATE statements after an attribute type declaration for theattribute. For example:

DATE the date of death of the intestate

PROMPT when did the intestate die

TRANSLATE AS the intestate died on

The use of angle brackets (ie ) without an attribute name causesthe value of the attribute being evaluated to be substituted.

Where an attribute has more than one possible value, differenttranslations for each value may be provided. For example:

INTEGER the number of surviving children

PROMPT how many children survived the intestate

TRANSLATE 0 AS no children survived the intestate

TRANSLATE 1 AS one child survived the intestate

TRANSLATE AS children survived the intestate

Where no value appears (as in the last TRANSLATE statementabove) this is used as the default translation for values which do notmatch any of the other TRANSLATE statements.

Avoid using your own attributeprompts or translations if possible.DataLex knowledge-bases are easier tomaintain if translations are automatic.

3.4 Named subjects - names of people and things

Attribute descriptors often contain references to persons and thingsas their subjects (eg ‘the intestate’, ‘the property’). By default, thegenerated prompts and translations just use these embedded subjectdescriptions literally. If you wish, you can have these automaticallyreplaced with names, pronouns and possessives. Subjects which areto be treated in this way are referred to as named subjects.

The use of named subjects allows you to instantiate the dialoguesthat DataLex generates, making them appear much more responsiveto the answers you have already given.

Use named subjects wherever possible,as they improve communication.

3.4.1 Named subject declarations

Named subjects are a set of special attributes. They are declared inthe same way as attributes, but are given the types PERSON, THINGor PERSONTHING. When an attribute containing a defined subjectis first evaluated, automatic prompts for a subject name and, in thecase of persons, the subjects’ sex, will be issued. Where the type isPERSONTHING, the subject may be either a person or a thing (egwhere either a natural person or a company may be a subject). Aprompt (Is x a natural person ?) will be issued to determine this.


Examples:

PERSON the claimant

THING the agreement

PERSONTHING the first party

PERSON the intestate

Once a named subject is declared, DataLex will recognise it as anamed subject in any subsequent part of the rule-base, without needfor any further identification of it as such. Named subjects referred toin other attributes are recognised automatically, and their values aresubstituted in the other attributes.

For example, where there have been named subject declarationssuch as the ones above, an attribute in a rule such as ’the claimanthas made a statutory declaration concerning the agreement’ wouldgenerate a prompt such as ’Has John Smith made a statutorydeclaration concerning the Contract of Insurance?’.

3.4.2 The automatic attribute declarations [advanced]

When a named subject is declared, it results in up to another threeautomatic attribute declarations. These take the following forms:

the name of subject (set for all types)the sex of subject (set for PERSONS and PERSONTHINGs)subject is a natural person (set only for PERSONTHINGs)

Table 3.3: Examples of automaticattribute declarations.

These automatically declared attributes can be manipulated justlike normal ones. The types are STRING, SEX and BOOLEANrespectively. This allows you to work out whether or not aPERSONTHING is a natural person, or to force gender as in:

RULE the definition of "defacto" PROVIDES

IF the sex of the intestate EQUALS male THEN

the sex of the de facto IS female

ELSE

the sex of the de facto IS male

It also allows you to change the default prompts and translations,as in:

PERSONTHING the claimant

STRING the name of the claimant

PROMPT please enter the claimants’ name

TRANSLATE AS the claimants’ name is

BOOLEAN the claimant is a natural person

TRANSLATE true AS the claimant is a natural person

TRANSLATE false AS the claimant is a company

SEX the sex of the claimant

TRANSLATE male AS the claimant is a man

TRANSLATE female AS the claimant is a woman


3.5 Variable attributes [advanced]

An important aspect of DataLex is that it allows legal knowledge tobe represented in something approaching English (’quasi naturallanguage’ knowledge representation). This is one reason whypropositional logic is used as the form of representation, as opposedto predicate calculus. Predicate logic is, however, more powerful.One of its advantages is that it allows rules where there may be anumber of instances of an attribute which need to be considered inthe one problem session (eg the attribute ’is a child of the intestate’may be satisfied by three children, all of whom may need to beconsidered).

Variable attributes have been introduced into DataLex as anexperimental way of dealing with such problems. However, theydetract from the ’English-like’ nature of the syntax and should onlybe used sparingly.

A variable attribute is allowed to contain one (only) variableelement, which element is represented as . Whenever DataLexencounters this symbol in a rule, it looks for instance of theattribute in other rules which are identical except that they have thevariable element ’filled in’. These instances of the variability are then’read into’ the rule under consideration. In effect, DataLex createsmultiple versions of the rule under consideration, one for eachinstance of the variable element being satisfied. In any expressioncontaining the variable, each instance of the variable will begiven the same value. DataLex then proceeds to process whicheverversion of the rule is satisfied on the facts given. A variable attributeis therefore a shorthand way of writing multiple rules with slightlydifferent wordings.

For example, s32(4) of the Copyright Act 1968 (Cth) specifieswhether a person is a ’qualified person’ in determining whether awork is protected by copyright. Various different timing and otherconditions can satisfy the requirements for a ’qualified person’. Therule below shows that only one rule need be written to capture this.

RULE Copyright Act 1968 s32(4) PROVIDES

the author was a ’qualified person’ under s32(4) ONLY IF

the author was an Australian citizen OR

the author was an Australian protected person OR

the author was a person resident in Australia

If the system needs to determine at any time a value for theattribute “the author was a ’qualified person’ at the time the work wasmade under s32(4)” (emphasis added), in order to process anotherrule, the above rule will cause the following questions to be asked:

Was the author a ’qualified person’

at the time the work was made under s32(4)? [emphasis added]


Was the author an Australian protected person


Was the author a person resident in Australia


If the answer to any of these is ’yes’, the rule will fire and theattribute “the author was a ’qualified person’ at the time the work wasmade under s32(4)” (emphasis added) will obtain a ’true’ value.

Similarly, if the system needs to know a value for the attribute, “theauthor was a ’qualified person’ for a substantial part of the period duringwhich the work was made under s32(4)” (emphasis added), the rule willask the appropriate questions to obtain a value for this attribute.

In other words, one variable rule can be used to obtain values fornumerous similar but not identical attributes which have similarconditions for their satisfaction.

Variable attributes should only be usedsparingly and with considerable care.

3.6 Expressions - the use of operators

An expression consists of attribute and constant references,connected by operators (types of keywords). Expressions are used tobuild more complex statements. Attribute names and constants havealready been discussed. Operators therefore describe relationshipsbetween two attributes (in the case of binary operators), or (in thecase of a Unary operator) transform an existing attribute. Theavailable operators (in order of precedence) are:

3.6.1 (Pre) Unary Operators

NOT boolean NOTDAY extract day from dateMONTH extract month from dateYEAR extract year from date

3.6.2 (Post) Unary Operators

DAYS date days multiplierWEEKS date weeks multiplierMONTHS date months multiplierYEARS date years multiplier

3.6.3 Binary Operators

DIVIDED BY arithmetic divisionTIMES arithmetic multiplicationPLUS arithmetic additionMINUS arithmetic subtractionIN relation in (substring)EQUALS relational equality


NOT EQUALS relational inequalityIS GREATER THAN relational greater thanIS LESS THAN relational less thanIS GREATEREQUAL THAN relational greater equalsIS LESSEQUAL THAN relation less or equalAND boolean conditional ANDOR boolean conditional OR

(The normal AND and OR; AND has higher binding strength than OR;DataLex ceases evaluation of expressions where an ‘AND’ condition failsor an ‘OR’ condition is satisfied, and does not evaluate the other argumentsin the expression)

AND/OR boolean conditional OR (high binding)(A special OR with a higher binding strength than AND; use instead ofBEGIN-END pairs to ensure the order of evaluation)

AND/WITH boolean non-conditional ANDOR/WITH boolean non-conditional ORAND/OR/WITH boolean non-conditional OR (high binding)

(Special AND and OR operators where DataLex continues to evaluate theother arguments in the expression even though an ‘AND’ condition failsor an ‘OR’ condition is satisfied; Used to force DataLex to evaluate allalternatives.)

3.6.4 Examples of the use of expressions

the year in which the layout was made PLUS 10

the date of death PLUS 50 YEARS

YEAR the date of death

the value of the estate IS GREATER THAN 0

4Rule-based inferencing (III): Rule types, statements

and evaluation order

Rule types and statements control the order in which the evaluation ofrules, and attributes within rules, take place in a DataLex inferencingsession.

4.1 Order of evaluation of rules, and within rules, ininferencing

The main elements of the process whereby DataLex uses a rule-baseto draw inferences are as follows:

(i) The evaluation of a particular attribute (eg determining the truth ofa fact) is set as DataLex’s current ’goal’ - see 4.2 below concerning’LISTED’ and ’GOAL’ rules for how such goals are determined.

(ii) DataLex uses the rules in its rule-base to infer a value for thisattribute, principally through the use of backward chaining andforward chaining reasoning. DataLex uses both backward andforward chaining reasoning, in that rules are first invoked in abackward-chaining fashion whenever an attribute needs to beevaluated in order to determine whether a rule will ’fire’.However, whenever a new attribute value becomes known, allrules using that attribute are silently evaluated (a forwardchaining daemon).

However, which rules participate in the backward chainingprocess and which in the forward chaining process, and how theydo so, is determined to some extent by what types of rules theyare declared to be - see below concerning Types of rules.

(iii) Once a rule is being evaluated for purposes of either backward orforward chaining, the order in which the attributes and statementswhich make up the rule’s content are evaluated depends largely onthe order in which they occur.


4.2 Goals - LISTED rules and GOAL rules

DataLex must start its inferencing process by determining what itscurrent goal is. DataLex will attempt to evaluate the first rule in therule-base, if no rule is declared to be either a GOAL or LISTED. Ifonly one rule other than the first is a GOAL RULE or LISTED RULE,DataLex will start to evaluate that rule. Otherwise, DataLex will startby giving the user a choice between all rules in the rule-base whichare declared to be either GOAL or LISTED rules. The chosen rule isthen treated as the current goal.

For present purposes, there is no difference between LISTED rulesand GOAL rules.

4.2.1 Multiple GOALS

More than one rule may be declared to be a GOAL. When DataLex isinvoked it will automatically present the user with a list of the namesof all rules specified as GOALS, and ask the user which one is to beevaluated. Names of rules which are GOALS must therefore besensible enough to appear in a menu of goals.

4.3 Order of evaluation of rules

When the system is inferring a value using backward and/orforward chaining rules, it will evaluate rules in the order in whichthey appear in the knowledge-base. The order of appearance will notnormally have any effect on the outcome of a consultation, but canaffect whether questions of the user are asked in a sensible order.More general rules should be declared before more specific ones,where they relate to the same subject matter. Procedures may bedeclared in any order.

4.3.1 Calling rules [advanced]

All types of rules can be specifically CALLed. The syntax is:

CALL rule/procedure name

The statements for the named rule or procedure will be executedand control will be returned to the next statement after the CALL.This statement should only be used to control knowledge-rich rules.It should not be part of any rule which itself contains knowledge ofany sort.

4.3.2 Rule names

The rule name is used to document what the rule is about and to givea point of reference for calls. Each rule name should be different. Rulenames are essential if a rule is to be a GOAL RULE, because the user

RULE-BASED INFERENCING ( I I I) : RULE TYPES , STATEMENTS AND EVALUATION ORDER 33

must know which rule they are choosing to evaluate. Rule names areoptional but should be used.

Examples of some ways of naming rules:

RULE subsistence of copyright PROVIDES ....

RULE Copyright Act s36(1) PROVIDES ....

RULE Copyright Act s36(2) PROVIDES ....

RULE Copyright Act s36(2) [continuation 1] PROVIDES ....

4.3.3 The ORDER declaration [advanced]

The order of rule evaluation can be controlled by specifying the ruleorder in an ORDER block, with the syntax:

ORDER rule-name {THEN rule-name}

The main purpose of this is to allow rules to be written in theorder in which they appear in legislation, without this necessarilydetermining the order in which they might fire. An order declarationmust appear before the rules named.

4.4 Types of rules

4.4.1 The default type - backward and forward chaining

The default rule type is both backward chaining and aforward-chaining daemon. So, a rule that starts

RULE name of the rule PROVIDES ...

will be both backward and forward chaining, in default of anyother specification.

Use the default form unless there isgood reason not to.

4.4.2 Declaring other types of rules

You can alter this rule behaviour by declaring the type of the rule.The possible types are BACKWARD, DAEMON, DOCUMENT,FORWARD and PROCEDURE. Each is explained below.

To declare that a rule is a particular type, you put the type of therule before the keyword RULE at the start of the rule. Examples:

BACKWARD RULE the name of the rule PROVIDES ...

This rule will only be backward chaining.

FORWARD RULE the name of the rule PROVIDES ...

This rule will only be forward chaining.


4.4.3 Syntax for rule types

The rule declaration syntax is:

[GOAL]PROCEDURE|DAEMON|BACKWARD|FORWARD|RULE[RULE] [name] PROVIDES statements

4.4.4 Backward rules

If a rule is declared to be a BACKWARD RULE it is only ever used forbackward chaining.

4.4.5 Forward rules

FORWARD RULES are only used for forward-chaining. DataLexattempts to evaluate a FORWARD rule when the first attributeneeded to execute the rule becomes known. Where necessary,FORWARD rules will ask the user for any other attribute valuenecessary to evaluate the rule (ie they do not operate ‘silently’ - theyask questions where necessary).

4.4.6 Daemons

DAEMONS are like FORWARD rules but operate silently (ie theynever ask the user for information and will silently fail to file if theyneed to do so).

4.4.7 Procedures

PROCEDURES are not invoked by either forward or backwardchaining. Evaluation of a procedure must be invoked explicitly,either by the procedure being called (see 4.3.1 below concerningcalls), or by the procedure being declared to be a goal and invoked asa goal.

4.4.8 Documents

DOCUMENTS are like procedures but are used to generatedocuments (see later Chapter 6 concerning Documents ).

4.5 Statements

There are several different types of statements. These include:assignments and assertions (using ONLY IF, IS and ASSERT),conditional evaluation of facts (using IF-THEN and IF-THEN-ELSEstatements), conditional looping (using WHILE-DO andREPEAT-UNTIL statements), DETERMINE statements and CALLstatements.


For most purposes, conditional evaluation of facts(IF-THEN-ELSE) and assignments and assertions (using ONLY IF, ISand ASSERT) are the only types that need to be used.

4.5.1 Conditional evaluation of facts (IF-THEN-ELSE statements)

IF-THEN-ELSE statements provide for conditional evaluation ofattributes. The syntax is:

IF expression THEN statement [ ELSE statement ]

expression is evaluated and if true, the statement following theTHEN is executed. If an ELSE statement is provided and expressionevaluates false, then the statement following ELSE will be executed.

The ELSE part of the statement is optional.Examples:

IF it is raining

THEN you should take an umbrella

ELSE you should go out

4.5.2 Inclusive definitions

Where a statutory definition is only inclusive (ie not exhaustive), theIF-THEN form is appropriate. For example, the definition of’dramatic work’ in the Copyright Act 1968 (Cth) can be representedin part as

IF the work is a choreographic work or other dumb show OR the work

is a scenario for a script for a cinematograph film

THEN the work is a dramatic work

There is no ELSE because many other undefined types of dramamay qualify as dramatic works.

One rule can include a number of IF-THEN statements insuccession.

4.5.3 Assigning values - Assignments and Assertions (IS, ONLYIF and ASSERT)

Values may be assigned to attributes by use of the IS operator (or theequivalent ONLY IF operator) or (in the case of boolean attributes) byassertion.

4.5.4 Assertions

An assertion is used to state that an attribute has a true or false value(ie to assert that it is true or false). Assertions can therefore only beused with boolean (true or false) attributes. An assertion statementsimply consists of a boolean attribute name expressed in the positiveor negative form, optionally preceded by the keyword ASSERT orAND.


For example:

the Act applies

is the same as

ASSERT the Act applies

The following are also the same:

the corporation is an overseas corporation AND the Act does not apply

ASSERT the corporation is an overseas corporation AND the Act does not apply

The ASSERT or AND keyword should only be used where it isnecessary to separate multiple assignments and assertions, or toseparate an assignment or assertion from a previous expression. Forexample:

the circuit layout is in material form AND

the circuit layout is an eligible layout

4.5.5 Assignments

IS and ONLY IF are used to assert that two attributes have identicalvalues (but not that either are true/false), or that an attribute isidentical to a constant. They can therefore be used in either of twoways:

attribute IS constantattribute1 (unknown) IS attribute2 (known)

There is no difference between the IS and ONLY IF operators, butnormally the use of IS will yield more natural English statements inrelation to valued attributes (dates, numbers etc) where ONLY IF ismore appropriate in the case of booleans (true/false).

4.5.6 Syntax for assignments and assertions

The syntax for assignments and assertions is:

[AND|ASSERT] attribute IS expression

or

[AND|ASSERT] attribute ONLY IF expression

Where an ELSE statement is merely the negation of a THENstatement, this is exactly the same as an ONLY IF statement (which ispreferable as it is more understandable). For example,

IF it is raining

THEN you should take an umbrella

ELSE you should not take an umbrella

would be better expressed as

you should take an umbrella ONLY IF it is raining


4.5.7 DETERMINE Statement

The DETERMINE statement allows for control over attributeevaluation. The syntax is:

DETERMINE [IF] attribute

The effect is to cause DataLex to determine a value for theattribute by first evaluating any relevant backward chaining rules(commencing with any which have attribute as a conclusion), andthen, if necessary, prompt the end-user for a value.

The DETERMINE statement is sometimes useful as part of aGOAL RULE. It allows the user to simply specify that DataLexshould attempt to evaluate a particular attribute. For example, theFOI example given earlier could commence with a rule including thestatement:

DETERMINE IF a person has a legally enforceable right under

s11 to obtain access to a document

However, this procedural approach will defeat the purpose of adeclarative rule base if mis-used. In the above example, it wouldprovide no advantages.

Avoid the use of DETERMINEstatements.

4.5.8 CALL Statement [advanced]

The CALL statement allows rules and procedures to be invokedexplicitly. The syntax is:

CALL procedure-name

The statements for the named rule or procedure will be executedand control will be returned to the next statement after the CALL.This statement should only be used to control knowledge-rich rules.It should not be part of any rule encompassing knowledge of anysort. CALLs are a procedural device which detracts from thedeclarative nature of the knowledge-base. They are valuable mainlyfor document generation, which is inherently procedural (seeChapter 6).

Use of CALLs should generally beavoided (except in DOCUMENT rules).

4.5.9 WHILE-DO and REPEAT-UNTIL Statements [advanced]

The WHILE-DO and REPEAT-UNTIL statement pairs, provide forconditional looping. The syntax is:

WHILE expression DO statement

and

REPEAT statements UNTIL expression


4.5.10 Use of BEGIN - END pairs

Multiple statements can be grouped by use of a BEGIN-END pair.This is the same as using parentheses to group statements.

Example:

IF the Act does not apply THEN BEGIN

the claimant fails AND

there is nothing more to do

END

The use of BEGIN-END pairs is largelyunnecessary due to the AND/ORoperator (see below).


4.6 Appendix – List of Main Keywords (all types) used by DataLex

4.6.1 Rule types

RULE

GOAL

PROCEDURE

BACKWARD

FORWARD

DAEMON

DOCUMENT

EXAMPLE

4.6.2 Attribute types

BOOLEAN

INTEGER

REAL

DOLLAR

SEX

STRING

DATE

4.6.3 Document types

(Documents only)

PARAGRAPH

LINE

TEXT

4.6.4 Named subject types

THING

PERSONTHING

4.6.5 Translation operators

TRANSLATE - AS

PROMPT

4.6.6 Statement operators

ASSERT

DETERMINE

IF - THEN - ELSE

ONLY IF

IS

WHILE - DO

REPEAT - UNTIL

CALL

BEGIN - END

4.6.7 Expression operators

(Pre) Unary Operators

NOT

MINUS

PLUS

DAY

MONTH

YEAR

(Post) Unary Operators

DAYS

WEEKS

MONTHS

YEARS

Binary Operators

DIVIDED BY

TIMES

PLUS

MINUS

IN

EQUALS

NOT EQUALS

IS GREATER THAN

IS LESS THAN

IS GREATEREQUAL THAN

IS LESSEQUAL THAN

AND/OR

AND/OR/WITH

AND

AND/WITH

OR

OR/WITH

4.6.8 Layout operators

(Documents only)

NUMBERED

LEVEL

4.6.9 Miscellaneous keywords

RANGE - TO

VERBS

PROVIDES

ORDER

5Integration of DataLex knowledge-bases with AustLII

and other LIIs

5.1 Overview - Integration of DataLex knowledge-bases withtheir sources

DataLex has five principle features which enable it to be integratedinto the web context, and, in particular, into AustLII and AustLIICommunities:

1. Automated addition of links to AustLII legislation;

2. Automated addition of links to case law on AustLII or anycollaborating legal information institute (LII), or with a citationtable in the LawCite citator;

3. Explicit links to any other web resources;

4. Explicit links to searches over AustLII (or other search engine); and

5. Cooperative inferencing using knowledge-bases from multiplepages or sites.

Further forms of integration which are not yet available are theinclusion of links from AustLII primary materials to DataLexknowledge-bases, and the inclusion of knowledge-bases in AustLIIsearch results.

See the articles listed in Chapter 1 for the theoretical advantages ofvarious types of integration discussed in this chapter.

5.2 Automatic links to AustLII legislation

Links to names of Acts (and sections within Acts) that are located onAustLII can be added automatically to your knowledge-base, withoutthe need to create explicit links to those Acts or sections.

To effectively create links to AustLII legislation, observe thefollowing guidelines:

• Each time an Act or section is referred to in the body of a rule, putthe full name of the Act and section (for example ‘Privacy Act 1988


section 6D’). If the Act name is not included, the mark-up softwaremight not be able to determine in which Act the section is to befound.

• Reference to ‘section 5’ or ‘s.5’ or ‘s5’ or ‘s5(3) or ‘subsection 5(3)’are effective, but ‘paragraph 5(3)’ is not – change ‘paragraph’ to‘section’.

• Automatic links are not created to words defined in Acts.However, as shown below, explicit links can be created to suchdefinitions.

• Automatic links are not (as yet) provided to legislation injurisdictions outside Australia, but explicit links may be created tosuch legislation (see below).

5.3 Automatic links to case law

Where a decision in a case is properly cited (either by a neutralcitation or proprietary citation) in the name of a rule, or in the bodyof the rule, this will result in the automatic creation of a hypertextlink to either (i) the text of the decision, if the decision is included inAustLII or another collaborating LII (eg NZLII, BAILII, HKLII,PacLII, SafLII, CanLII), or (ii) the LawCite citator, if the decision hasa citation table there. The LawCite record for a decision can also beaccessed from that decision.

Links to these cases are available in relevant reports andexplanations, and to provide assistance when the user is answeringquestions relevant to a case. For example, in the Finder KBapplication, when the user is asked about the finder of a chattel ‘Washe the occupier of the premises ?’, and responds ‘Why?’, the systemreplies ‘This will help determine whether or not the situation issimilar to Armory v Delamirie [1722] EWHC KB J94.’, with a link tothe LawCite citator entry.

As discussed in Chapter 7, with EXAMPLE rules based ondecisions in particular cases, it is particularly important that a fulltitle and citation for the decision be included in the title of theEXAMPLE. Automatic links to cases in Reports means that the usercan go to the cases cited in the Report, in order to assess whetherthey agree with the suggestions for following and distinguishingparticular cases given in the Report. In making such a decision theycan inspect not only the text of the suggested cases, but also theLawCite record for each of the suggested cases in order to determinewhether there are subsequent cases that have a bearing on thesuggested cases (and may have been decided after theknowledge-base was written). For discussion of the value of suchfacilities, see the article ‘Utilising AI in the Legal Assistance Sector –Testing a Role for Legal Information Institutes’ cited in Chapter 1.

http://www.bailii.org/ew/cases/EWHC/KB/1722/J94.html

INTEGRATION OF DATALEX KNOWLEDGE-BASES WITH AUSTLI I AND OTHER LI IS 43

5.4 Explicit links in a knowledge-base (the LINK ... TO ...keywords)

In addition to automatic links to AustLII, specific links can bespecified in the knowledge-base. The keywords LINK and TO areused to specify in a knowledge-base that a particular word or phraseis always to appear as a hypertext link to a particular URL. This isvery useful for creating links to definitions or cases.

LINK ...TO ... can be used to create links from a knowledge-base toanywhere on the world-wide-web, not just to AustLII.

5.4.1 Example

LINK document of an agency TO

http://www.austlii.edu.au/au/legis/cth/consol_act/foia1982222/

s4.html#document_of_an_agency

RULE Freedom of Information Act 1982 (Cth) s11(a) PROVIDES

s11(a) applies ONLY IF

the document is a document of an agency AND


5.5 Stored searches from DataLex knowledge-bases

It is also possible to use LINK ...TO ... to create links from aknowledge-base to a stored search over AustLII, or over any otherweb-based search engine.

5.5.1 Example

To link to a search over AustLII for the phrase ‘official document of aMinister’:

LINK official document of a Minister TO

http://www.austlii.edu.au/cgi-bin/sinosrch.cgi?

method=auto&query=%22official+document+of+a+Minister%22

RULE Freedom of Information Act 1982 (Cth) s11(b) PROVIDES

s11(b) applies ONLY IF

the document is an official document of a Minister AND


5.6 ‘Co-operative inferencing’ – knowledge-bases in multiplelocations

‘Co-operative inferencing’, as we are tentatively calling it, is aninnovative aspect of DataLex. It allows different knowledge-basedevelopers to place knowledge-bases on any web page anywhere incccse other knowledge-bases located elsewhere on the web whichthey specify are to be ‘included’. In this sense, knowledge-base


development becomes a ‘co-operative’ activity where developers cancontribute their small (or not so small) knowledge-bases to a largerenterprise.

5.6.1 The INCLUDE keyword

The use of the keyword INCLUDE in a knowledge-base, followed bythe URL of another page containing a DataLex knowledge-base, willcause the second knowledge-base to be loaded with the firstknowledge-base, and the two run together.

More than two knowledge-bases can be declared to be INCLUDEd.There is no limit on the number.

It does not matter if an INCLUDEd knowledge-base INCLUDEsthe knowledge-base that INCLUDEd it - ie DataLex does not go intoan endless loop loading the same knowledge-bases.

It is useful to make the URLs of INCLUDEd knowledge-bases livelinks, so that users of a knowledge-base can conveniently view allknowledge-bases which are to be included in a consultation. See the‘FOI s11 (start here)’ knowledge-base for examples.

5.6.2 Example

To include a KB ‘DefinitionOfDocument’ in the evaluation of thisfreedom of information KB, so that the attribute ‘the item requestedis a document’ will be evaluated:

INCLUDE http://austlii.community/wiki/DataLex/DefinitionOfDocument

GOAL RULE Access to documents under Freedom of Information Act 1982 (Cth)

s11 PROVIDES

the person applying does have a legally enforceable right under

s11 of the Freedom of Information Act 1982 to obtain access

to the document requested ONLY IF

the item requested is a document AND

Freedom of Information Act 1982 s11 (1)(a) applies AND/OR

Freedom of Information Act 1982 s11 (1)(b) applies

5.6.3 At least one GOAL rule must be specified if INCLUDE is used

As in the example above, you must specify which rule is the GOALRULE that is to start the consultation, because the operation ofINCLUDE means that you cannot be certain which rule DataLex willconsider is the first one appearing in your knowledge-base.

If more than one GOAL RULE is specified in a set of ‘co-operative’knowledge-bases, the user will be given a choice of which rule is tostart the consultation. GOAL RULEs may be declared in anyknowledge-base.

6Document assembly using DataLex

DataLex also includes an automated document generation (or’assembly’) component. This aspect is not yet developed fully. Thefeatures described below are sufficient to generate simpledocuments.

6.1 DOCUMENT rules

Documents may be generated by declaring rules of typeDOCUMENT. Normally, one DOCUMENT rule will generate oneparagraph of a document, and a group of rules can be used togenerate all the clauses of a legal document. DOCUMENT rulesdiffer from other types of rules only in that the statementsPARAGRAPH and TEXT are available to write paragraphs todocuments. The syntax is discussed below under the headingStatements.

Document rules are only ever effective if they are declared a GOALrule or if explicitly called (via the CALL statement) from other rules.

6.1.1 DOCUMENT rules as goals

If the GOAL rule is a DOCUMENT rule, the usual report generatedby a consultation is replaced by the generated document (ie no reportis generated).

6.1.2 Example - a clause of a will

The following example is a DOCUMENT rule for one clause of a will,with two alternative conditional forms of the clause. The elements ofthe example are explained below.

DOCUMENT Revocation PROVIDES

IF all former testamentary disposition are to be revoked THEN

NUMBERED PARAGRAPH I revoke all former testamentary dispositions.

ELSE

NUMBERED PARAGRAPH I revoke all former testamentary dispositions

except clause(s) of my testamentary disposition dated

which clause(s) I hereby confirm.


6.2 Text generation statement types - PARAGRAPH andTEXT

The two statement types PARAGRAPH and TEXT allow text to beadded to documents from rules of type DOCUMENT. They have noeffect in non-document rules, and should not be used in such rules.The syntax for these special types of document statements is:

[NUMBERED] [LEVEL number] [PARAGRAPH|TEXT] text

6.2.1 The text argument and embedded attributes

The text argument is a piece of text (of any length) to be generated aspart of the document being assembled if the conditions of the rule aresatisfied. A text argument may include embedded attributes, but isnot in itself an attribute, so DataLex does not attempt to obtain a value(true/false) for it. DataLex recognises that a piece of text is text, not anattribute, because it is preceded by the statement type PARAGRAPHor TEXT.

However, DataLex does attempt to obtain a value for any attributesembedded within the text, provided that those attributes are enclosedin angle brackets (ie ).

For example, the following statement would cause all of the textafter ‘PARAGRAPH’ to be printed in a new paragraph. The values ofthe embedded attributes (the attributes within angle brackets ie )will be obtained from the user in a dialogue (see below).

PARAGRAPH I revoke all former testamentary dispositions

except clause(s) of my testamentary disposition dated


The example given above and on the previous page will generatethe following dialogue:

1) Are all former testamentary disposition to be revoked ?

** n

2) What is list of clauses from the old will which are to be saved?

** 1, 5 and 17

3) What is the date of the old will ?

** 1 May 1993

REPORT

1. I revoke all former testatmentary dispositions except clause(s)

1, 5 And 17 of my testamentary disposition dated 1 May 1993 which

clause(s) I hereby confirm.

DOCUMENT ASSEMBLY USING DATALEX 47

6.2.2 Differences between PARAGRAPH and TEXT

The difference between the two types of statements is simply one oflayout: the PARAGRAPH statement places an HTML paragraphmarker before the text (ie a carriage return and a blank line) andTEXT just inserts a space (ie no new line).

For example, the statements:

PARAGRAPH I revoke all former testamentary dispositions.

I give all my property to my husband.

will be generated as:

I revoke all former testamentary dispositions. I give all my property to my husband.

The correct code to cause the second sentence to be a newparagraph is:

PARAGRAPH I revoke all former testamentary dispositions.

PARAGRAPH I give all my property to my husband.

PARAGRAPH must be used to cause anew paragraph of text to be included ina document. It is insufficient to simplyplace new paragraphs or lines in thetext argument, as DataLex will ignorethese when it generates the document.

6.3 ‘Personalising’ documents - embedded attributes

Where a document contains variable information (eg the name of thetestator, the value of property, the date of death), this variableinformation (an attribute) can be included in the text of a documentstatement by embedding the attribute in the text. In the exampleabove, the embedded attribute ‘’ will cause the user to be prompted to listthose clause numbers, and the numbers will then be included in thegenerated document. The embedded attribute ’‘ will cause the user to be prompted for the value of thatattribute.

6.3.1 Relationship between named subjects and embedded attributes

DataLex will not recognise that an attribute is embedded in text justbecause it has been declared to be a named subject. For example, thedeclarations

DATE the date of the old will

STRING list of clauses from the old will which are to be saved

will not cause DataLex to ask the user for values in a rule where anglebrackets have been omitted, such as

PARAGRAPH I revoke all former testamentary dispositions

except clause(s) list of clauses from the old will which are to be

saved of my testamentary disposition dated the date of the old will



It is necessary to put attributes in anglebrackets (< >); merely making themnamed subjects is insufficient.

However, merely putting an attribute in angle brackets does notgive it a type - to do so it is necessary to declare it as a named subjectas well. For example, in the dialogue above, an answer ’a few weeksago’ to the question ’What is the date of the old will ?’ will beaccepted. In contrast, if the declaration ’DATE the date of the oldwill’ had been made, the following dialogue will occur:

3) What is the date of the old will ?

** a few weeks ago

Please respond with a date.

It is preferable to declare all embedded attributes as namedsubjects, as well as embedding them in angle brackets, so as toensure that the user always gives the correct type of answer (eg adate).

6.3.2 Prompts for embedded attributes

The prompt for an embedded attribute is always ’what is .....?’, evenif the embedded attribute is a named subject of type PERSON.Therefore, it is necessary to either tolerate questions such as ’What isthe testator’s spouse?’ , or to embed attributes in the form ’’ so as to get a question ’What is thename of the testator’s spouse?’. However, if this second approach istaken, the embedded attribute will not match the named subject ’thetestator’s spouse’ and this may cause other problems when thatattribute is used elsewhere. In some cases it may be better to toleratethe awkwardly phrased question.

6.4 Alternative clauses in a document

An important element in document assembly is to allow alternativeversions of a clause or paragraph or sentence to be generated,depending on the user’s circumstances. For example, the structure ofthe example given above for a clause of a will is as follows:

DOCUMENT Revocation PROVIDES

IF all former testamentary disposition are to be revoked THEN

NUMBERED PARAGRAPH .......Alternative text (1).......

ELSE

NUMBERED PARAGRAPH ........Alternative text (2)........

Because of the use of the IF-THEN-ELSE statement, which versionof the clause is generated depends upon the value of the attribute ‘allformer testamentary disposition are to be revoked’. The user will beprompted for a value for this attribute by being asked ‘Are all formertestamentary disposition to be revoked?’. If the user answers ‘yes’,then text (1) will be generated, but otherwise (ELSE) text (2) will begenerated.

DOCUMENT ASSEMBLY USING DATALEX 49

By the use of IF-THEN-ELSE statements, and any other conditionalstatements used in DataLex, templates for complex documents maybe created.

6.5 Generating successive paragraphs of a document - use ofCALL statements

The discussion above concentrates on the generation of singleparagraphs of documents. To assemble a whole document it isusually necessary to create a GOAL rule which provides an overallprocedural order for the creation of the document. For example, inthe Will Generator example below, the following GOAL rule is used:

GOAL DOCUMENT Last Will & Testament PROVIDES

the date of execution of the will IS today

CALL Preamble

CALL Revocation

CALL Contemplation of Marriage

By use of the CALL statement, this rule calls three other rules insuccession, those with the names ‘Preamble’, ‘Revocation’ and‘Contemplation of Marriage