Capturing and Analyzing Legal Requirements


Lionel Briand

Interdisciplinary Centre for Security, Reliability and Trust (SnT) University of Luxembourg

MoDRE - August 2015

Acknowledgments

• Mehrdad Sabetzadeh

• Ghanem Soltana

• Nicolas Sannier

• Morayo Adedjouma

• Rajwinder Panesar-Walawege

• Domenico Bianculli

• Wei Dou

SnT Centre and SVV Lab

•  SnT Centre, est. 2009: interdisciplinary, ICT security-reliability-trust

•  250 scientists and Ph.D. candidates, 25 industry partners

•  SVV Lab: Established January 2012, www.svv.lu

•  25 scientists (Research scientists, associates, and PhD candidates)

•  Industry-relevant research on system dependability: security, safety, reliability

•  Partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec …

Mode of Collaboration

•  Tight, long-term industrial collaborations

•  Well-defined problems in context

•  Strong emphasis on high-impact research

Context

• Many organizations and systems need to comply with laws, regulations, prescribed business processes, standards …

• Such legal provisions are complex: large documents, many concepts and cross-dependencies

• They can be more prescriptive or declarative in nature

• They often require interpretation in context

• The interpretation and formalization of such provisions lead to legal requirements

Outline

• Different challenges related to legal requirements:

• Modeling, simulating, analyzing, and verifying prescriptive laws

• Dealing with dependencies among legal provisions

• Managing compliance with standards (e.g., safety)

• Checking compliance with prescribed business processes

• Reflections

Modeling, Simulating, and Analyzing Prescriptive Laws

Problem definition

•  CTIE: Government Computing Centre, Model-Driven Engineering

• How can we ensure that e-Government systems comply with the law?

• How does the evolution of law and systems impact compliance?

• What is the impact of changes in the law on societies and administrations?

Challenges

• Laws are often abstract. They need to be interpreted in context

• Multiple stakeholders are involved. A shared understanding of legal requirements is necessary

• Compliance verification and evolution analysis need to be done on a large scale

Modeling legal requirements

• Model legal requirements in an intuitive and yet precise way

• A model is an analyzable interpretation of the natural language provisions in legal texts

• Enable scalable analysis of legal requirements

• Reliance on open modeling standards: access to mature training and tools, reduced licensing fees

• Tax law is used as a concrete case study

• But the solutions are generalizable

Applications

[Diagram. Node labels: Models of legal requirements; Software system; Simulation data; Impact of legal decisions. Edge labels: Traces to; Generates; Results match?]

Example 1: Domain model

[Class diagram. Classes: TaxPayer, ResidentTaxPayer, NonResidentTaxPayer, Income, LocalIncome, ForeignIncome, Address (attribute country: Country). Enumeration: Country (LU, ...). Association ends: addresses, taxpayer, incomes, taxpayers, with multiplicities *, 1, and 1..*]

• Useful for capturing the main concepts in the law and the relationships between them (here, the tax law)
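To make the idea of a domain model concrete, the following is a minimal Python sketch of the classes named on the slide. The inheritance relations, the `amount` attribute, and the `taxable_incomes` helper are assumptions added for illustration; the rule encoded in the helper follows Art. 2 as quoted later in this deck (residents are taxed on local and foreign income, non-residents only on local income).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Country(Enum):
    LU = "LU"
    OTHER = "OTHER"  # placeholder: the slide elides the remaining literals


@dataclass
class Address:
    country: Country


@dataclass
class Income:
    amount: float  # hypothetical attribute, not shown on the slide


@dataclass
class LocalIncome(Income):
    pass


@dataclass
class ForeignIncome(Income):
    pass


@dataclass
class TaxPayer:
    addresses: List[Address] = field(default_factory=list)
    incomes: List[Income] = field(default_factory=list)


@dataclass
class ResidentTaxPayer(TaxPayer):
    pass


@dataclass
class NonResidentTaxPayer(TaxPayer):
    pass


def taxable_incomes(taxpayer: TaxPayer) -> List[Income]:
    """Incomes subject to tax under Art. 2 (illustrative reading only)."""
    if isinstance(taxpayer, NonResidentTaxPayer):
        # Non-residents are taxed only on their local income.
        return [i for i in taxpayer.incomes if isinstance(i, LocalIncome)]
    # Residents are taxed on both local and foreign income.
    return list(taxpayer.incomes)
```

In the approach described here the domain model is a UML class diagram; the Python classes only mirror its structure.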

Example 2: Policy model


•  Activity diagram capturing administrative legal policies

•  Profile: Simulatable model with traceability to the law, agents, and tax data

Benefits

• Models capture knowledge in an explicit way

• Models provide a communication bridge between stakeholders

• Models are key to automation and scalability

[Diagram labels: System/Software Analyst; Legal Expert; Models]

Methodology for modeling legal policies

[Process diagram. Input: Relevant legal texts. Activities: Model the domain; Model legal policies; Model validation. Artifacts: Domain model; Custom semantic concepts for policies; Policy models. Loop label: Feedback]

Case study

•  Complete set of models for calculation of withholding taxes

•  Tax class categorization

•  All deductions and credits associated with withholding

•  Commuting Expenses, Misc. Expenses, Spousal Expenses, Special Expenses, Extraordinary Expenses, Employment Credit, Pension Credit, Single Parent Credit

•  Models validated with legal experts for understandability and accuracy

Policy simulation

[Diagram. Node labels: Models of legal requirements; Software system; Simulation data; Impact of legal decisions. Edge labels: Generates; Results match?]

Simulation of withholding taxes

[Diagram. Labels: Gross salary; Expenses; Taxable income; Tax class category; Taxes due; two function nodes labelled f. Annotations: "Requires input data"; "Requires executable code"]

Simulation process

[Flowchart. Inputs: Domain model; Policy models. Decision: Is data available? Yes: Use existing data (Instance model from existing data). No: Generate data (Instance model from generated data). Further steps: Generate code (Executable code); Run analysis (Results report)]

Model-based simulator generator

Simulation data generator

• Data generation is necessary when

• Access to real data is hard (for example, due to privacy)

• Real data has not been collected (for example, due to costs)

• Simulation of future contingencies


Data generation strategy

1.  Capture characteristics of simulation population using probabilities

2.  Attach probabilistic information to domain model

[Diagram labels: Expert estimates; Census data; Domain model]

Example probabilistic annotation

Source: STATEC, Luxembourg

60% of income types are Employment, 20% are Pension, and the remaining 20% are Other

[Class diagram: abstract class Income with specializations Employment («probabilistic type» {frequency: 0.6}), Pension, and Other]
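As an illustration of how such an annotation can drive data generation, here is a small Python sketch that draws income types according to the stated relative frequencies. It is not the authors' generator; the function names and the use of Python's `random` module are assumptions.

```python
import random
from collections import Counter

# Relative frequencies taken from the slide (source: STATEC).
INCOME_TYPE_FREQUENCIES = {"Employment": 0.6, "Pension": 0.2, "Other": 0.2}


def sample_income_types(n, rng):
    """Draw n income types, weighted by their relative frequencies."""
    types = list(INCOME_TYPE_FREQUENCIES)
    weights = list(INCOME_TYPE_FREQUENCIES.values())
    return rng.choices(types, weights=weights, k=n)


def relative_frequencies(samples):
    """Observed share of each income type in the generated sample."""
    counts = Counter(samples)
    return {t: counts[t] / len(samples) for t in INCOME_TYPE_FREQUENCIES}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so that runs are repeatable
    print(relative_frequencies(sample_income_types(10_000, rng)))
```

With a large enough sample, the observed shares should approach the annotated frequencies.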


Example output from simulator

[Charts: current commuting expenses (CE) compared with a hypothetical scenario using changed CE]

Changed CE:

•  Min. distance: 4 → 10

•  Flat rate: 99 → 50

•  Max. flat rate (2574) dropped

Case study

•  Annotated domain model for withholding taxes with information from STATEC

•  Automatically generated populations of up to 10,000 tax payers

•  Simulated various combinations of deductions and credits

•  Highlights:

•  Data generation and simulation are highly scalable

•  Statistical properties of the real population could be reproduced with > 2,000 generated tax payers

are constraints (not to be confused with classes). References to Fig. 5 for the stereotypes and Fig. 6 for the examples are not repeated throughout the section.

• «probabilistic type» extends the Class and EnumerationLiteral metaclasses with relative frequencies. For example, «probabilistic type» is applied to the specializations of Income, stating that 60% of income types are Employment, 20% are Pension, and the remaining 20% are Other. In this example, the relative frequencies for the specializations of Income add up to 1. This means that no residual frequency is left for instantiating Income (the parent class). Here, instantiating an Income is not possible as Income is an abstract class. One could nevertheless have situations where the parent class is also instantiable. In such situations, the relative frequency of a parent class is the residual frequency from its (immediate) subclasses. An example of «probabilistic type» applied to enumeration literals can be found in the (truncated) Disability enumeration class. Here, we are stating that 90% of the population does not have any disability, while 7.5% has vision problems.
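The residual-frequency rule described above can be sketched in a few lines of Python. The function name and the tolerance are assumptions made for illustration; the second usage example uses made-up subclass names.

```python
def residual_frequency(subclass_frequencies, tolerance=1e-9):
    """Frequency left for instantiating the parent class itself.

    subclass_frequencies maps each immediate subclass to its relative
    frequency. The residual is 1 minus their sum; it is 0 when the
    subclasses already cover the whole population.
    """
    total = sum(subclass_frequencies.values())
    if total > 1 + tolerance:
        raise ValueError("relative frequencies must not exceed 1")
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - total)


if __name__ == "__main__":
    # Income example from the text: nothing is left for the parent class.
    print(residual_frequency({"Employment": 0.6, "Pension": 0.2, "Other": 0.2}))
    # Hypothetical instantiable parent with two subclasses.
    print(residual_frequency({"SubA": 0.5, "SubB": 0.3}))
```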

• «probabilistic value» extends the Property and Constraint metaclasses. Extending the Property metaclass is aimed at augmenting class attributes with probabilistic information. As for the Constraint metaclass, the extension is aimed at providing a container for expressing probabilistic information used by two other stereotypes, «multiplicity» and «dependency» (discussed later). The «probabilistic value» stereotype has an attribute, precision, to specify decimal-point precision, and an attribute, usesOCL, to state whether any of the attributes of the stereotype’s subtypes uses OCL to retrieve a value from an instance of the domain model. A «probabilistic value» can be: (1) a «fixed value», (2) «from chart», which could in turn be a bar or a histogram, or (3) «from distribution» of a particular type, e.g., normal or triangular. The names and values of distribution parameters are specified using the parameterNames and parameterValues attributes, respectively. The index positions of parameterNames match those of the corresponding parameterValues. The same goes with the index positions of items/bins and frequencies in «from chart».
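The following Python sketch shows one way to draw values for «from histogram» and for a «from distribution» of type Uniform Range, using the bins, frequencies, and bounds that appear in Fig. 6. How the actual generator picks a value inside a bin is not stated in this excerpt; drawing uniformly within the selected bin is an assumption, as are the function names.

```python
import random

# disabilityRate annotation from Fig. 6: bins and matching frequencies.
BINS = [(0.0, 0.2), (0.21, 0.5), (0.51, 0.7), (0.71, 1.0)]
FREQUENCIES = [0.4, 0.3, 0.2, 0.1]


def from_histogram(bins, frequencies, rng, precision=2):
    """Pick a bin by relative frequency, then a value inside that bin."""
    low, high = rng.choices(bins, weights=frequencies, k=1)[0]
    # Assumption: values are uniformly distributed within a bin.
    return round(rng.uniform(low, high), precision)


def from_uniform_range(lower, upper, rng, precision=2):
    """Value drawn from a Uniform Range distribution."""
    return round(rng.uniform(lower, upper), precision)


if __name__ == "__main__":
    rng = random.Random(7)
    print(from_histogram(BINS, FREQUENCIES, rng))
    # Expense amount from Fig. 6: between 50 and half of the gross income.
    gross_value = 4000.0  # hypothetical income
    print(from_uniform_range(50, gross_value * 0.5, rng))
```

The `precision` argument mirrors the stereotype attribute of the same name; index positions of `BINS` and `FREQUENCIES` match, as the text requires.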

[Figure residue: UML profile diagram. Elements recoverable from the extraction:

- Metaclasses: Class, EnumerationLiteral, Property, Constraint, Association
- «probabilistic type»: frequency: Real [1]
- «probabilistic value»: precision: Integer [0..1]; usesOCL: Boolean [1]
- «fixed value»: value: String [1]
- «from chart»: frequencies: Real [1..*]
- «from histogram»: bins: String [1..*]
- «from barchart»: items: String [1..*]
- «from distribution»: type: DistributionType [1]; parameterNames: String [*]; parameterValues: String [*]
- «use existing»: objectPoolQueries: String [1..*]; reuseProbabilities: Real [1..*]
- «multiplicity», «dependency», «value dependency», «type dependency», with association ends context, targetMember, constraint, and OCLtrigger
- Enumeration DistributionType: Uniform Range, NormalDistribution, TriangularDistribution, BetaDistribution, GammaDistribution, ...]

Fig. 5. Profile for Expressing Probabilistic Characteristics of a Population

[Figure residue: annotated class diagram. Elements recoverable from the extraction:

- Classes: TaxPayer (abstract), Income (abstract), Employment, Pension, Other, Expense, TaxCard, FromLaw, FromAgent; enumeration Disability
- TaxPayer attributes: disabilityRate: Real [1], disabilityType: Disability [1], birthYear: Integer [1], ...
- disabilityRate: «value dependency» {context: TaxPayer; OCLtrigger: [rate for no disability, rate for vision disability]} and «from histogram» {usesOCL: false; precision: 2; bins: [[0..0.2], [0.21..0.5], [0.51..0.7], [0.71..1]]; frequencies: [0.4, 0.3, 0.2, 0.1]}
- Constraint "rate for no disability": «fixed value» {value: 0; usesOCL: false}
- Constraint "rate for vision disability": «from histogram» {usesOCL: false; bins: [[0.5..0.7], [0.71..0.9], [0.91..1]]; frequencies: [0.7, 0.2, 0.1]}
- Constraint "income mult": «from barchart» {usesOCL: false; items: [1, 2, 3, 4]; frequencies: [0.8, 0.15, 0.045, 0.005]}
- Constraint "income types based on age": «from barchart» {usesOCL: false; items: [Pension, Employment, Other]; frequencies: [0.85, 0.1, 0.05]}
- Conditions shown: self.disabilityType = Disability::None; self.disabilityType = Disability::Vision; self.getAge() >= 60
- Association TaxPayer to Income (incomes 1..*, taxpayer 1): «multiplicity» {targetMember: Income; constraint: [income mult]} and «type dependency» {context: TaxPayer; OCLtrigger: [income types based on age]}
- Disability literals: None («probabilistic type» {frequency: 0.9}), Vision («probabilistic type» {frequency: 0.075}), ...
- Income attributes: gross_value: Real [1], prorata_period: Real [1]; association ends: expenses *, income 1, taxCard 0..1
- Expense attribute amount: Real [1] with «from distribution» {usesOCL: true; precision: 2; context: Income; type: Uniform Range; parameterNames: [lowerBound, upperBound]; parameterValues: [50, self.gross_value * 0.5]}; association ends: allowances *, beneficiary 1 with «use existing» {context: Expense; objectPoolQueries: [self.getOwner().getHouseholdMembers()]; reuseProbabilities: [0.7]}
- TaxCard attribute invalidity: Real [1] with «fixed value» {value: 0; usesOCL: false}
- FromLaw operation: + {static} invalidityFlatRate (in disabilityRate: Real): Real
- FromAgent attribute: {static} TAX_YEAR: Integer [1] = 2015]

Fig. 6. Partial Domain Model of Luxembourg’s Income Tax Law Annotated with Probabilistic Information

To answer the RQs, we ran the simulator (automatically derived from the six policy models) over simulation data (automatically generated by Alg. 1). We discuss the results below. All the results were obtained on a computer with a 3.0GHz dual-core processor and 16GB of memory.

RQ1. The execution times of the data generator and the simulator are influenced mainly by two factors: the size of the data to produce (here, the number of tax cases) and the number and complexity of the policy models to simulate. Note that the data generator instantiates only the slice model that is relevant to the policies of interest and not the entire domain model. This is why the selected policy models have an influence on the execution time of the data generator.

To answer RQ1, we measured the execution times of the data generator and the simulator with respect to the above two factors. Specifically, we picked a random permutation of the six policies (ID, CIS, PE, FD, LD, CIP) and generated 10,000 tax cases, in increments of 1,000, first for ID, then for ID combined with CIS, and so on. When all the six policies are considered, a generated tax case has an average of ≈24 objects. We then ran the simulation for different numbers of tax cases and the different combinations of policy models considered. Since the data generation process is probabilistic, we ran the process (and the simulations) five times. In Figs. 9(a) and (b), we respectively show the execution times (average of the five runs) for the data generator and the simulator.

As suggested by Fig. 9(a), the execution time of the data generator increases linearly with the number of tax cases. We further observed a linear increase in the execution times of the data generator as the size of the slice model increased (details are not shown). In particular, FD and LD introduced several new concepts into the slice model, resulting in a proportional increase in the slope of the fourth and fifth curves (from the bottom) in Fig. 9(a). We note that as more policies are included, the slice model will eventually saturate, as the largest possible slice model is the full domain model.

With regard to simulation, the execution times partly depend on the complexity of the workflows in the underlying policies (e.g., the nesting of loops), and partly on the OCL queries that supply the input parameters to the policies. The latter factor deserves attention when simulation is run over a large instance model. Particularly, OCL queries containing iterative operations may take longer to run as the instance model grows. The non-linear complexity seen in the fifth and sixth curves (from the bottom) in Fig. 9(b) is due to an OCL allInstances() call in LD, which can be avoided by changing the domain model and optimizing the query. This would cause the fifth and sixth curves to follow the same linear trend seen in the other curves. Since the measured execution times are already small and reasonable, such optimization is warranted only when the execution times need to be further reduced.

As suggested by Figs. 9(a) and (b), our data generator and simulator are highly scalable: Generating 10,000 tax cases covering all six policies took ≈30 minutes. Simulating the policies over 10,000 tax cases took ≈24 minutes.

[Figure residue: three plots. (a) Execution time (in minutes) versus number of tax cases for data generation, with one curve per policy combination: ID; ID + CIS; ID + CIS + PE; ID + CIS + PE + FD; ID + CIS + PE + FD + LD; ID + CIS + PE + FD + LD + CIP. (b) Execution time (in minutes) versus number of tax cases for simulation, with the same six curves. (c) Euclidean distance versus number of generated tax cases (100 to 10k) for the age histograms, the income histograms, the income type histograms, and the aggregation of histograms.]

Fig. 9. Execution Times for Data Generation (a) & Simulation (b); Euclidean Distances between Generated Data & Real Population Characteristics (c)

RQ2. To answer RQ2, we compare information from STATEC for age, income and income type, all represented as histograms, against histograms built over generated data of various sizes. Similar to RQ1, we ran the data generator five times and took the average for analysis. Among alternative ways to compare histograms, we use Euclidean distance which is widely used for this purpose [9]. Fig. 9(c) presents Euclidean distances for the age, income, and income type histograms as well as the Euclidean distance for the normalized aggregation of the three. As indicated by the figure, the Euclidean distance for the aggregation falls below 0.05 for 2000 or more tax cases produced by our data generator. This suggests a close alignment between the generated data and Luxembourg’s real population across the three criteria considered.

The above analysis provides confidence about the quality of the data produced by our data generator. The analysis further establishes a lower-bound for the number of tax cases to generate (2,000) to reach a high level of data quality.
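For readers unfamiliar with the measure, the histogram comparison used for RQ2 can be sketched as follows. The counts in the usage example are invented for illustration and are not results from the study; normalizing counts to relative frequencies before comparing is also an assumption about the setup.

```python
import math


def normalize(counts):
    """Turn raw bin counts into relative frequencies that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean_distance(hist_a, hist_b):
    """Euclidean distance between two histograms with identical bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hist_a, hist_b)))


if __name__ == "__main__":
    reference = [0.6, 0.2, 0.2]              # Employment, Pension, Other
    generated = normalize([1230, 390, 380])  # hypothetical 2,000 tax cases
    print(euclidean_distance(reference, generated))
```

A distance close to 0 indicates that the generated population reproduces the reference shares.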

RQ3. We answer RQ3 using the Kolmogorov-Smirnov (KS) test [10], a non-parametric test to compare the cumulative frequency distributions of two samples and determine whether they are likely to be derived from the same population. This test yields two values: (1) D, representing the maximum distance observed between the cumulative distributions of the samples. The smaller D is, the more likely the samples are to be derived from the same population; and (2) p-value, representing the probability that the two cumulative sample distributions would be as far apart as observed if they were derived from the same population. If the p-value is small (< 0.05), one can conclude that the two samples are from different populations.
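The D statistic described above is simple to compute directly. The sketch below uses only the Python standard library and does not compute the p-value; in practice a statistics package (for example, SciPy's `ks_2samp`) would typically be used for both values. The function name is an assumption.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum distance D between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is
    # enough to compare them at every value seen in either sample.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```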

To check the consistency of data produced across different runs of our data generator, we ran the generator five times, each time generating a sample of 5000 tax cases. We then performed pairwise KS tests for the age, income, and income type information from the samples. Table I shows the results,

Other Applications of Simulation

• Detection of undesirable situations, e.g., sharp increases in tax rates for certain categories of tax payers

• Optimization of parameters, e.g., thresholds and rates for achieving revenue targets


Legal Text Analysis

Motivation

• Laws are complex natural language documents

•  Interdependencies between legal provisions cause interdependencies between legal requirements

• Automated extraction and analysis of interdependencies will significantly reduce compliance costs

[Diagram: chain of Law, Legal Requirements, System Requirements, Design, and Test Plan, annotated with "dependencies"]

Interdependencies in the law


Art. 2 Individuals are considered resident taxpayers if they have either their fiscal or habitual residence in the Grand Duchy. Individuals are considered non-resident taxpayers if they neither have their fiscal nor their habitual residence in the Grand Duchy and if they have local income within the meaning of Art. 156. Resident taxpayers are subject to income tax because of their income, both local and foreign. Non-resident taxpayers are subject to income tax only because of their local income within the meaning of Art.156.

R1: The system shall levy taxes on non-residents’ local income as per the annual tax scale.

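[Annotations: R1 is "derived from" Art. 2; "Art. 156" is marked as a cross reference to another article]

Requirements dependencies

[Diagram: R1 is derived from Article 2; Article 2 has a Cross Reference (CR) to Article 156; R2, R3, and R4 are derived from Article 156; potential dependencies are shown between the requirements]

R2: For non-residents, rental and lease income earned in the Grand Duchy shall be considered as local income.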

Automatic resolution of legal cross references

Other important applications

• Generation of navigable legal portals

•  Identifying related legal texts that need to be reviewed for consistency after a change occurs

•  Identifying governmental web pages and online forms that need revision after a change in the law

• Checking for potentially-problematic situations

• For example, cyclic cross references
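Once cross references are resolved, checking for cyclic references amounts to cycle detection in a directed graph. The sketch below is one standard way to do this (depth-first search); it is not taken from the authors' tooling, and the article names in the cyclic example are hypothetical.

```python
def find_cycle(cross_refs):
    """Return one cycle of provisions as a list, or None if there is none.

    cross_refs maps a provision to the provisions it refers to.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}

    def visit(node, path):
        color[node] = GREY
        path.append(node)
        for target in cross_refs.get(node, ()):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: a cycle closes here.
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target, path)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(cross_refs):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle({"Art. 2": ["Art. 156"], "Art. 156": []}))
    print(find_cycle({"Art. A": ["Art. B"], "Art. B": ["Art. C"], "Art. C": ["Art. A"]}))
```

The recursion is adequate for a sketch; an iterative traversal would be preferable for very deep reference chains.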

Approach

Based on Natural Language Processing

Defining the Schema of a Text

•  Schema of a legal text: structure and navigation flow of the provisions of a legal text

•  Guidelines such as “Traité de Légistique Formelle” provide a generic structure for legislative texts in Luxembourg

•  Deviations are nevertheless frequent!

•  Tailoring the schema is often necessary in order to handle the evolution of drafting practices over time

Generic Text Schema

Income Tax Law Schema vs. Generic Schema

•  The structural elements inside the dashed boundary are either new in the tax law or have revised associations with other grouping concepts

Resolution of Cross References

•  All CREs are interpretable except delegating ones: “un règlement grand-ducal pourra déroger à …”, and unspecific ones: “une disposition de la loi du …”

•  Interpretation rules are based on pre-defined patterns and the text segmentation to precisely determine the context of the cross reference
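To give a flavour of pattern-based detection, the sketch below finds explicit article references of the form "Art. 156" in the English rendering of Art. 2 quoted earlier. It is a deliberately simplified stand-in: the approach described here relies on NLP, a text schema, and interpretation rules over the original legal texts, none of which is reproduced. The regular expression and function name are assumptions.

```python
import re

# Simplified pattern: "Art." followed by an article number.
ARTICLE_REF = re.compile(r"\bArt\.\s*(\d+)")


def find_article_refs(text, current_article):
    """Return (source, target, offset) for each reference to another article."""
    refs = []
    for match in ARTICLE_REF.finditer(text):
        target = int(match.group(1))
        if target != current_article:  # skip the article's own heading
            refs.append((current_article, target, match.start()))
    return refs


if __name__ == "__main__":
    art2 = (
        "Art. 2 Individuals are considered non-resident taxpayers if they "
        "have local income within the meaning of Art. 156. Non-resident "
        "taxpayers are subject to income tax only because of their local "
        "income within the meaning of Art.156."
    )
    for source, target, offset in find_article_refs(art2, current_article=2):
        print(f"Art. {source} -> Art. {target} (offset {offset})")
```

Knowing which article the text belongs to (here passed in as `current_article`) is the kind of context that the text segmentation provides.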

Results and Next Steps

•  Results: Highly accurate detection and resolution of cross-references across many legal texts

•  Investigate whether it is possible to retrieve semantic information about cross references: compliance, constraint, definition, delegation, exception, refinement

•  Enhance analysis capabilities: Change impact analysis, circularity analysis, compliance requirements specification …

Managing Compliance with Standards

Compliance is Complex and Expensive


Standards, laws and regulations are textual. They need to be interpreted and adapted to context

Multiple stakeholders are involved in the regulation, compliance, and auditing chain

The volume of evidence required for demonstrating compliance is very large

Compliance arguments need to be assessed in a credible manner and based on evidence

There are trade-offs between different mechanisms for achieving compliance

Different Facets of Compliance

[Diagram: "Managing compliance" with the following facets]

•  Creating explicit interpretations of compliance requirements

•  Aligning requirements to industrial and organizational practices

•  Performing compliance analysis

•  Managing evidence and traceability to different requirements

•  Managing change

Compliance Analysis for IEC 61508

•  IEC 61508

•  Specifies functional safety requirements for safety-related control systems

•  Widely-used safety standard for control systems

•  7 parts; approx. 500 pages

•  Understanding and operationalizing the standard is a daunting task!

•  Certification of safety-critical control systems in maritime & energy

•  Industry partners

Evolution from mechanical operation to computer-based operation

Safety Certification: Confirmation by examination and provision of convincing evidence that the specified safety requirements have been adequately fulfilled.

Det Norske Veritas

Industrial Context

Models as an Enabler for Compliance

•  Models provide …

•  A precise and analyzable interpretation of the (textual) content of the standards, laws and regulations

•  Core concepts, relationships, processes, …

•  Means for traceability between legal texts and systems

•  A basis for automation

Expressing the Interpretation of IEC 61508 as a Conceptual Model

•  A conceptual model is a map of important concepts, their attributes and relationships

•  Methodology to go from standard text to class diagrams

[Diagram: expert interpretation of the standard. Concept Hazard (Hazardous Element, Initiating Mechanism) and concept Risk (Likelihood, Consequence), connected by the relationship "Poses"]

The IEC 61508 Conceptual Model

•  Large and complex, organized into packages

Overview of Solution

•  Devise class diagrams for creating a domain model of the application and a conceptual model of the safety standard

•  Create UML profile (with constraints) of the safety standard based on its conceptual model

•  Apply the stereotypes of this profile to a domain model of the system that is undergoing certification

[Figure residue: four-step methodology. 1: Construct Conceptual Model of Standard; 2: Define UML Profile based on Conceptual Model; 3: Elaborate Domain Model for Compliance; 4: Create Instance for Specific Certification. Roles: Standard / Certification Expert; Application Domain Expert. Input: Domain Model of System]

Figure 1: Methodology for the Creation of Evidence of a Safety Standard.

Steps 1 and 2 of the approach require input from experts familiar with the certification process (including the standard used for certification) but not necessarily the application domain. Fulfilling step 3 requires expertise in both certification and the application domain; whereas, step 4 only requires knowledge of the application domain. In the remainder of this section, we present detailed descriptions of the four steps in our approach.

3.1. Step 1: Conceptual Model of a Safety Standard

In Section 2, we noted the need for having an explicit interpretation of the underlying safety standard. We achieve this in the first step of our approach through the creation of a conceptual model. A conceptual model is a formal description of some aspect of the physical and social world around us for the purpose of understanding and communicating amongst humans [21]. It employs some formal notation which is a combination of diagrammatic and linguistic constructs and serves as a point of common agreement amongst a team of people and can also be used as a means of forwarding this understanding to newcomers joining the team.

A conceptual model of a safety standard should thus capture the main concepts and relationships in the evidence information required for showing compliance to the standard. We use UML class diagrams [24] for conceptual modeling of safety standards. In UML class diagrams, concepts are represented as classes and concept attributes as class attributes. Relationships are represented by associations. Generalization associations are used to derive more specific concepts from abstract ones. When an attribute assumes a value from a predefined set of possible values, we use enumerations. Finally, the package notation is used to make groupings of concepts and thus better manage complexity.

Our choice of UML is based on the fact that it is a well-recognized and standardized notation and that the UML class diagram notation adequately fulfils our needs. From a practical standpoint, it is in general useful to ensure that the notation being employed is already accepted in industry and at the same time easy to learn for practitioners.

Creating a conceptual model of a standard requires a careful analysis of the standard’s text to identify the salient concepts and relationships mentioned in the text. To

Some Applications of the Model

[Diagram. Centre: IEC 61508 Conceptual Model. Roles: Certifier; Supplier. Applications: aid to understanding and communicating IEC 61508; means for collaboration between suppliers and certifiers; creation of safety evidence repositories; automatic checking of compliance. Related artifacts and tools: Recommended Practice; specialized checklists, plans, progress measures, agreements, etc.; Safety profile (UML); Schema for safety evidence; Safety evidence repository; Safety Collaboration Tool; Compliance Checking Tool; Diagnostics. Edge label: basis for]


Compliance Risk Assessment

•  Analyzing the risks of a compliance breach (e.g., compromises)

•  An assurance case built to:

•  argue about compliance based on existing evidence

•  show due diligence

•  assess risks

•  Three-tiered structure:

•  Claims about compliance

•  Arguments

•  Evidence

[Diagram: Evidence supports Arguments (Decomposed Goals), which support the Claim (Top Goal)]

Goal Modeling

[Diagram labels: Overall Safety Goal; Goal Decomposition; Evidence]

Foundations

Expert Elicitation

• Origin: Cognitive Psychology, Nuclear Safety Engineering

• Used in our framework for: eliciting likelihoods for satisfaction of goals

Goal Modeling

• Origin: Requirements Engineering

• Used in our framework for: decomposition of high-level objectives, propagation of probabilistic values from sub-goals to overall safety goals

Monte Carlo Simulation

• Origins: Physics Simulation, Risk Management, Cost Estimation

• Used in our framework for: computing a probability distribution for the satisfaction of overall safety goals
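The combination of the three foundations can be illustrated with a toy Monte Carlo run. Everything specific in this sketch is an assumption made for illustration: the leaf goals and their (low, mode, high) estimates are invented, expert estimates are modelled as triangular distributions, and the top goal is treated as an AND-decomposition of independent leaf goals, so that its satisfaction probability is the product of the leaf probabilities. The slides do not state the propagation rules actually used.

```python
import random
import statistics

# Hypothetical expert estimates: (low, most likely, high) likelihood
# that each leaf goal is satisfied.
LEAF_ESTIMATES = {
    "hazards identified": (0.80, 0.90, 0.99),
    "tests cover safety requirements": (0.70, 0.85, 0.95),
}


def simulate_top_goal(leaf_estimates, trials, rng):
    """Sample the satisfaction probability of an AND-decomposed top goal."""
    samples = []
    for _ in range(trials):
        probability = 1.0
        for low, mode, high in leaf_estimates.values():
            # random.triangular takes (low, high, mode) in that order.
            probability *= rng.triangular(low, high, mode)
        samples.append(probability)
    return samples


if __name__ == "__main__":
    rng = random.Random(2015)
    samples = sorted(simulate_top_goal(LEAF_ESTIMATES, 10_000, rng))
    print("mean:", round(statistics.mean(samples), 3))
    print("5th percentile:", round(samples[len(samples) // 20], 3))
```

The output is a distribution rather than a single number, which is what allows statements about the risk of not meeting the overall safety goal.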

MODUS Tool

•  Combine goal modeling, simulation, and expert elicitation

•  Interfaces with commercial risk assessment tools for Monte Carlo and sensitivity analysis

•  Used in new technology qualification in the energy and maritime domains

•  Features

•  Goal modeling environment

•  Implementation of the expert elicitation process

•  Risk analysis

•  Reporting and project management facilities

Compliance with Business Processes

Context

•  eGovernment services are expressed and developed as business processes

•  System development is outsourced

•  Common usage of business process models

•  Systems deployed are typically buggy, no full validation is usually possible before deployment

•  Off-line and run-time verification of compliance


Why Run-time Verification?

• To detect when a (critical) failure occurs

• To collect data for debugging

• To determine corrective actions

• Some information available only at run time

• Behavior may depend on the (evolving) environment


Requirements

• Solution adoptable in contexts where MDE is already a development practice

• High-level domain-specific language to allow business analysts to express the properties to be checked

• Solution based on standard and stable MDE technology

• Performance of the solution comparable to the state of the art


Our Vision


The Three Main Ingredients for Run-time Verification

•  Language

•  Algorithm

•  Run-time architecture

Requirements Specification Language

Property Checking

Requirements Specification Language

[Diagram labels: Usability; Expressiveness; Efficiency]

OCLR

•  Temporal logic not an option

•  Extension of OCL for run-time verification

•  Facilitate the expression of temporal properties and real-time constraints based on a trace conceptual model

•  Templates based on Dwyer’s classification

•  Facilitate (optimal) translation to OCL properties and their verification using a constraint checking engine


Dwyer’s Pattern System

•  Scope: Globally, Before, After, Between-and, After-until

•  Pattern: Universality, Existence, Absence, Response, Precedence

Extensions

• Extensions (new constructs) identified based on analyzing actual properties in a large case study

• Specific occurrence of an event: “A password reset email will be sent to the user after the third incorrect login attempt”

• Time distance from a boundary: “A speaker should be ready within 10 minutes after the second keynote.”


Case Study

•  Identity card management system

• 47 properties analyzed and translated into OCLR

• Properties checked on synthesized traces

• Focus on scalability

• Comparison with MonPoly


Identity Card Management

[Diagram labels: Request; Production; Delivery; Expiration; Loss]

Request

temporal R1:
  let r : Request in
  before becomesTrue(r.card.state = CardState::InProduction)
  isCalled(notifyApproved(r.applicant))
  responding at most 3 tu
  becomesTrue(r.state = RequestState::Approved)

Scope: Before
Pattern: Response

“Once a card request is approved, the applicant is notified within three days; this notification has to occur before the production of the card is started”
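To show what checking such a property over a trace involves, here is a simplified Python sketch for the R1 requirement above. It is not OCLR-Check and does not implement OCLR semantics: the trace format, the event names, and the treatment of traces in which production never starts (the property is considered satisfied because the "before" scope is empty) are all assumptions.

```python
def check_r1(trace, max_delay=3):
    """Check R1 on a trace for a single card request.

    trace is a list of (time, event) pairs sorted by time. Every
    'approved' event that occurs before production starts must be
    followed by a 'notified' event within max_delay time units, and
    that notification must itself occur before production starts.
    """
    production_times = [t for t, event in trace if event == "production_started"]
    if not production_times:
        return True  # assumption: empty 'before' scope, nothing to check
    scope_end = production_times[0]
    in_scope = [(t, event) for t, event in trace if t < scope_end]
    for t_approved, event in in_scope:
        if event != "approved":
            continue
        notified_in_time = any(
            other == "notified" and t_approved <= t <= t_approved + max_delay
            for t, other in in_scope
        )
        if not notified_in_time:
            return False
    return True


if __name__ == "__main__":
    ok = [(0, "approved"), (2, "notified"), (5, "production_started")]
    late = [(0, "approved"), (4, "notified"), (5, "production_started")]
    print(check_r1(ok), check_r1(late))
```

The scan is linear in the trace length for a fixed number of approvals, which is in the spirit of the scalability goal stated for the case study.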

Properties with “globally”

[Plot: average check time (ms, axis 0 to 2000) versus trace size (up to 1,100K) for properties P1 to P12]

Comparison: t(MonPoly) / t(OCLR-Check)

[Plot: ratio of check time (axis 0 to 10) versus trace size (up to 1,100K) for properties P1 to P12]

Insights

•  All the properties (trace size <= 1M) can be checked within 2 seconds

•  OCLR-Check scales linearly with respect to the size of the trace

•  The performance of OCLR-Check is as good as or better than that of the state-of-the-art tool (MonPoly)

79

Reflections

Capturing Legal Requirements

•  The language must facilitate the modeling of targeted concepts (expressiveness)

•  Events and timing, Procedural steps and conditions/constraints

•  Obligations, permissions, prohibitions

•  The language is an interface among all stakeholders and must account for their professional background, domain expertise and practice

•  The language must support the intended analysis or enable translation to a language that does (usually the latter)

•  Methodology and tool support, rather than formal semantics …

81

Capturing Legal Requirements

•  Example: Income tax law project

•  Conditions and procedural steps

•  Traceability to legal provisions, officers, data model

•  Stakeholders include both IT analysts and legal experts

•  Application: Simulation, search, optimization

•  Rely on augmented activity and class diagrams (profile) with additional information including traceability to sources of requirements and descriptive statistics

82

Capturing Legal Requirements

•  Example: Uncovering dependencies among legal requirements

•  Analysis of cross-references in legal texts

•  Must be configurable to many legal texts

•  Impact analysis of changes in the law

•  Analysis of potential problems, e.g., dangling and circular references

•  Modeling of structure and organization of legal documents
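Once cross-references have been resolved, dangling and circular references reduce to simple graph checks. The sketch below assumes the resolved references are available as a dictionary from a provision identifier to the identifiers it cites; the article labels are hypothetical and the functions are an illustration, not the tool described in the talk.

```python
def find_dangling(references):
    """Return (source, target) pairs whose target is not a known provision."""
    return sorted(
        (src, dst)
        for src, targets in references.items()
        for dst in targets
        if dst not in references
    )


def find_cycle(references):
    """Return one circular chain of references as a list, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in references}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in references.get(node, ()):
            if nxt not in color:
                continue  # dangling target, reported by find_dangling
            if color[nxt] == GREY:
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        color[node] = BLACK
        path.pop()
        return None

    for node in references:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical provisions and citations.
refs = {
    "Art. 1": ["Art. 2"],
    "Art. 2": ["Art. 3"],
    "Art. 3": ["Art. 1", "Art. 9"],
    "Art. 4": [],
}
print(find_dangling(refs))
print(find_cycle(refs))
```

The recursive traversal is adequate for a sketch; a very large reference graph would call for an iterative version to avoid Python's recursion limit.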

83

Capturing Legal Requirements

•  Example: Safety standard (IEC 61508)

•  Application: compliance and risk assessment

•  Conceptual model of the standard, supplier organization and processes, and their mapping

•  Mapping among stakeholders’ concepts (interpretation and certification) and between standards (specialization and overlapping standards)

•  Evidence requirements mandated by the standard

•  Analyzable arguments for satisfaction of safety requirements

84

Capturing Legal Requirements

•  Example: Business Processes (eGovernment)

•  Business Process Models used as specification as part of an MDE methodology

•  BPM Properties often capture legal and organizational regulations

•  Temporal properties and real-time constraints

•  Temporal logic not an option

•  OCL extension: OCLR

•  Templates to model properties according to scope and pattern (Dwyer)

•  Trace conceptual models to define an operational mapping of properties

85

Analysis of Legal Requirements

• What analysis mechanisms are suited?

• Logical reasoning (theorem proving)

• Simulation

• Natural Language Processing

• Search and optimization

• Constraint checking

86


• This depends on the intended applications

• Identify inconsistency or incompleteness

• Find loopholes and undesirable cases

• Check compliance

• This also partly depends on the scale of the legal requirements to be considered

87


•  Example: Income tax law

•  Laws tend to be large, and the tax law is no exception

•  This leads to many legal requirements (in our case large models)

•  Some of our applications require simulation (overall impact of policy changes, compliance testing)

•  But even for identifying undesirable situations we rely on simulation

•  NLP of legal text to uncover potential dependencies among requirements

88

Analysis of Legal Requirements

•  Example: Safety Standard

•  Support shared understanding between suppliers and certifier about what evidence needs to be collected and how it is going to be used: questionnaire derived from the conceptual model of the standard, explicit and analyzable arguments

•  Compliance analysis: making sure all evidence items have been collected (constraint checking)

•  Quantitative goal satisfaction analysis based on evidence and expert judgment
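The compliance check on evidence completeness can be pictured as a set comparison between what is required and what has been collected. The sketch below is a simplified illustration: the requirement and evidence names are hypothetical placeholders and are not taken from IEC 61508 or from the tooling described in the talk.

```python
def missing_evidence(required, collected):
    """Map each requirement to the sorted evidence items still missing for it.

    required:  dict mapping a requirement id to the evidence items it mandates
    collected: iterable of evidence items gathered so far
    """
    collected = set(collected)
    gaps = {}
    for requirement, items in required.items():
        missing = sorted(set(items) - collected)
        if missing:
            gaps[requirement] = missing
    return gaps


# Hypothetical requirements and evidence items.
required = {
    "REQ-1": ["hazard_log", "test_report"],
    "REQ-2": ["design_review"],
}
collected = ["hazard_log", "design_review"]

print(missing_evidence(required, collected))
```

An empty result means every mandated evidence item has been collected; it says nothing about the quality of the evidence, which is where the quantitative goal satisfaction analysis comes in.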

89


•  Example: Run-time compliance analysis of business processes

•  Performance is a challenge, large traces

•  Reliance on standard (OCL) and mature constraint checking technology

•  Optimized mapping to OCL properties on trace conceptual model

•  Generated OCL properties are highly complex, but need not be read by any analyst or domain expert

90

Summary

91

[Figure: summary diagram. Labels: Legal texts; Language to model legal requirements; Language enabling their analysis; Type of analysis; Scalability of analysis; Ease of translation; Background and practices of stakeholders; Alignment with domain concepts to be expressed; Configurable, automated support (with traceability)]

General Conclusions

•  Solutions for capturing and analyzing legal requirements are strongly driven by contextual factors

•  They include, prominently, human factors, e.g., background, practices

•  Modeling (profiles) and domain-specific languages are typically an important part of these solutions

•  Usability studies of domain-specific languages

•  Applications typically require interdisciplinary solutions

•  Scale is an essential consideration

•  Realistic and well-reported case studies are needed regarding usability and scalability

92

General Conclusions

•  Legal requirements engineering is clearly an important topic

•  But do we have a clear understanding of the needs in various legal domains?

•  Is there a good match between research and practice?

•  Do we need more interdisciplinary work?

•  Do we make, as researchers, a sufficient attempt to achieve practicality and simplicity?

•  There is no alternative to reality in an engineering research discipline

93

References

Legal Modeling and Textual Analysis

•  [MODELS 2015] G. Soltana et al., A Model-Based Framework for Probabilistic Simulation of Legal Policies, ACM/IEEE Int. Conference on Model Driven Engineering Languages and Systems, 2015

•  [MODELS 2014] G. Soltana et al., UML for Modeling Procedural Legal Rules: Approach and a Study of Luxembourg’s Tax Law, ACM/IEEE Int. Conference on Model Driven Engineering Languages and Systems, 2014

•  [RE 2014] M. Adedjouma et al., Automated Detection and Resolution of Legal Cross References, IEEE Int. Conference on Requirements Engineering, 2014

Compliance with Business Process Models

•  [SAM 2014] W. Dou et al., Revisiting Model-driven Engineering for Run-time Verification of Business Processes, System Analysis and Modeling Conference (SAM 2014)

•  [ECMFA 2014] W. Dou et al., OCLR: a More Expressive, Pattern-based Temporal Extension of OCL, European Conference on Modelling Foundations and Applications (ECMFA 2014)

•  [TR 2015] W. Dou et al., A Model-Driven Approach to Offline Trace Checking of Temporal Properties with OCL. SnT Technical Report, Submitted to a journal

95

Compliance with Safety Standards

•  [IST 2013] R. Panesar-Walawege et al., Supporting the Verification of Compliance to Safety Standards via Model-Driven Engineering: Approach, Tool-Support and Empirical Validation, Information and Software Technology. 55(5):836-864 (2013)

•  [IEEE SW 2012] D. Falessi et al., Planning for Safety Evidence Collection: A Tool-Supported Approach Based on Modeling of Standards Compliance Information, IEEE Software, Volume 29, Number 3, May-June 2012

•  [ISSRE 2011] R. Panesar-Walawege et al., A Model-Driven Engineering Approach to Support the Verification of Compliance to Safety Standards. 22nd IEEE International Symposium on Software Reliability Engineering, 2011.

•  [ER 2011] R. Panesar-Walawege et al., Using UML Profiles for Sector-Specific Tailoring of Safety Evidence Information. 30th International Conference on Conceptual Modeling, 2011.

•  [ICST 2010] R. Panesar-Walawege et al., Characterizing the Chain of Evidence for Software Safety Cases: A Conceptual Model Based on the IEC 61508 Standard. 3rd IEEE International Conference on Software Testing, Verification, and Validation, 2010.

96
