Capturing and Analyzing Legal Requirements
Lionel Briand
Interdisciplinary Centre for Security, Reliability and Trust (SnT) University of Luxembourg
MoDRE - August 2015
Acknowledgments
• Mehrdad Sabetzadeh
• Ghanem Soltana
• Nicolas Sannier
• Morayo Adedjouma
• Rajwinder Panesar-Walawege
• Domenico Bianculli
• Wei Dou
SnT Centre and SVV Lab
• SnT Centre, est. 2009: interdisciplinary, ICT security-reliability-trust
• 250 scientists and Ph.D. candidates, 25 industry partners
• SVV Lab: established January 2012, www.svv.lu
• 25 scientists (research scientists, associates, and Ph.D. candidates)
• Industry-relevant research on system dependability: security, safety, reliability
• Partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec …
Mode of Collaboration
• Tight, long-term industrial collaborations
• Well-defined problems in context
• Strong emphasis on high-impact research
Context
• Many organizations and systems need to comply with laws, regulations, prescribed business processes, standards …
• Such legal provisions are complex: large documents, many concepts and cross-dependencies
• They can be more prescriptive or declarative in nature
• They often require interpretation in context
• The interpretation and formalization of such provisions lead to legal requirements
Outline
• Different challenges related to legal requirements:
• Modeling, simulating, analyzing, and verifying prescriptive laws
• Dealing with dependencies among legal provisions
• Managing compliance with standards (e.g., safety)
• Checking compliance with prescribed business processes
• Reflections
Modeling, Simulating, and Analyzing Prescriptive Laws
Problem definition
• CTIE: Government Computing Centre, Model-Driven Engineering
• How can we ensure that e-Government systems comply with the law?
• How does the evolution of the law and of systems impact compliance?
• What is the impact of changes in the law on societies and administrations?
Challenges
• Laws are often abstract. They need to be interpreted in context
• Multiple stakeholders are involved. A shared understanding of legal requirements is necessary
• Compliance verification and evolution analysis need to be done on a large scale
Modeling legal requirements
• Model legal requirements in an intuitive and yet precise way
• A model is an analyzable interpretation of the natural language provisions in legal texts
• Enable scalable analysis of legal requirements
• Reliance on open modeling standards: access to mature training and tools, reduced licensing fees
• Tax law is used as a concrete case study
• But the solutions are generalizable
Applications
[Diagram labels: Models of legal requirements; Generates; Simulation data; Traces to; Software system; Results match?; Impact of legal decisions]
Example 1: Domain model
[Class diagram. Classes: TaxPayer, ResidentTaxPayer, NonResidentTaxPayer, Address (country: Country), Income, LocalIncome, ForeignIncome. Enumeration: Country (LU, ...). Association ends: addresses, taxpayer, incomes, taxpayers, with multiplicities *, 1..*, and 1]
• Useful for capturing the main concepts in the law and the relationships between them (here, the tax law)
Example 2: Policy model
• Activity diagram capturing administrative legal policies
• Profile: Simulatable model with traceability to the law, agents, and tax data
Benefits
• Models capture knowledge in an explicit way
• Models provide a communication bridge between stakeholders
• Models are key to automation and scalability
[Diagram labels: System/Software Analyst; Legal Expert; Models]
Methodology for modeling legal policies
[Process diagram labels: Relevant legal texts; Model the domain; Domain model; Custom semantic concepts for policies; Model legal policies; Policy models; Model validation; Feedback]
Case study
• Complete set of models for calculation of withholding taxes
• Tax class categorization
• All deductions and credits associated with withholding
• Commuting Expenses, Misc. Expenses, Spousal Expenses, Special Expenses, Extraordinary Expenses, Employment Credit, Pension Credit, Single Parent Credit
• Models validated with legal experts for understandability and accuracy
Policy simulation
[Diagram labels: Models of legal requirements; Generates; Simulation data; Software system; Results match?; Impact of legal decisions]
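Simulation of withholding taxes
[Diagram: Taxable income is computed by a function f from Gross salary and Expenses; Taxes due is computed by a function f from Taxable income and Tax class category. Annotations: requires input data; requires executable code]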
Simulation process
[Process diagram labels: Relevant legal texts; Model legal policies; Domain model; Policy models; Is data available? (Yes: Use existing data, Instance model from existing data; No: Generate data, Instance model from generated data); Generate code; Executable code; Run analysis; Results report]
Model-based simulator generator
Simulation data generator
• Data generation is necessary when
• Access to real data is hard (for example, due to privacy)
• Real data has not been collected (for example, due to costs)
• Future contingencies need to be simulated
Data generation strategy
1. Capture characteristics of the simulation population using probabilities
2. Attach probabilistic information to the domain model
[Diagram labels: Expert estimates; Census data; Domain model]
Example probabilistic annotation
Source: STATEC, Luxembourg
60% of income types are Employment, 20% are Pension, and the remaining 20% are Other
[Class diagram: Income (abstract) with subclasses Employment «probabilistic type» {frequency: 0.6}, Pension «probabilistic type» {frequency: 0.2}, and Other «probabilistic type» {frequency: 0.2}]
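The annotation above can be read as a sampling rule. The following is a minimal sketch, not the authors' generator: it assumes that a «probabilistic type» frequency is used as a weight when choosing which subclass to instantiate, and that an instantiable parent class would receive the residual frequency. The names `sample_type` and `generate_income_types` are hypothetical.

```python
import random

# Relative frequencies taken from the slide (source: STATEC).
INCOME_TYPES = {"Employment": 0.6, "Pension": 0.2, "Other": 0.2}


def sample_type(frequencies, rng, parent=None):
    """Pick a class name according to its relative frequency.

    Assumption: if the subclass frequencies sum to less than 1 and an
    instantiable `parent` is named, the parent gets the residual frequency.
    """
    names = list(frequencies)
    weights = [frequencies[name] for name in names]
    residual = 1.0 - sum(weights)
    if parent is not None and residual > 1e-9:
        names.append(parent)
        weights.append(residual)
    return rng.choices(names, weights=weights, k=1)[0]


def generate_income_types(n, seed=0):
    """Generate n income types for a simulated population."""
    rng = random.Random(seed)
    return [sample_type(INCOME_TYPES, rng) for _ in range(n)]


if __name__ == "__main__":
    sample = generate_income_types(10_000)
    for name in INCOME_TYPES:
        print(name, sample.count(name) / len(sample))
```

Because Income is abstract and the three frequencies add up to 1, no `parent` is passed here and Income itself is never instantiated.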
Example output from simulator
[Chart comparing current commuting expenses (CE) with a hypothetical scenario]
Hypothetical scenario, changed CE:
• Min. distance: 4 → 10
• Flat rate: 99 → 50
• Max. flat rate (2574) dropped
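To make the scenario comparison concrete, here is a small sketch. The slide gives only the three parameters; the deduction rule used below (a flat rate per distance unit beyond the minimum distance, optionally capped) is an assumption for illustration and not the wording of the law, and the function names are hypothetical.

```python
def commuting_deduction(distance_units, min_distance, flat_rate, cap=None):
    """Assumed rule: flat_rate per distance unit beyond min_distance, capped if cap is set."""
    deductible_units = max(0, distance_units - min_distance)
    amount = deductible_units * flat_rate
    return amount if cap is None else min(amount, cap)


def current_ce(distance_units):
    # Parameters of the current policy as listed on the slide.
    return commuting_deduction(distance_units, min_distance=4, flat_rate=99, cap=2574)


def changed_ce(distance_units):
    # Hypothetical scenario from the slide: higher minimum, lower rate, no cap.
    return commuting_deduction(distance_units, min_distance=10, flat_rate=50)


if __name__ == "__main__":
    for d in (3, 5, 12, 40):
        print(d, current_ce(d), changed_ce(d))
```

Running both functions over a generated population is one way to estimate the impact of such a change before it is enacted.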
Case study
• Annotated domain model for withholding taxes with information from STATEC
• Automatically generated populations of up to 10,000 tax payers
• Simulated various combinations of deductions and credits
• Highlights:
• Data generation and simulation are highly scalable
• Statistical properties of the real population could be reproduced with > 2,000 generated tax payers
are constraints (not to be confused with classes). References to Fig. 5 for the stereotypes and Fig. 6 for the examples are not repeated throughout the section.
• «probabilistic type» extends the Class and EnumerationLiteral metaclasses with relative frequencies. For example, «probabilistic type» is applied to the specializations of Income, stating that 60% of income types are Employment, 20% are Pension, and the remaining 20% are Other. In this example, the relative frequencies for the specializations of Income add up to 1. This means that no residual frequency is left for instantiating Income (the parent class). Here, instantiating an Income is not possible as Income is an abstract class. One could nevertheless have situations where the parent class is also instantiable. In such situations, the relative frequency of a parent class is the residual frequency from its (immediate) subclasses. An example of «probabilistic type» applied to enumeration literals can be found in the (truncated) Disability enumeration class. Here, we are stating that 90% of the population does not have any disability, while 7.5% has vision problems.
• «probabilistic value» extends the Property and Constraint metaclasses. Extending the Property metaclass is aimed at augmenting class attributes with probabilistic information. As for the Constraint metaclass, the extension is aimed at providing a container for expressing probabilistic information used by two other stereotypes, «multiplicity» and «dependency» (discussed later). The «probabilistic value» stereotype has an attribute, precision, to specify decimal-point precision, and an attribute, usesOCL, to state whether any of the attributes of the stereotype’s subtypes uses OCL to retrieve a value from an instance of the domain model. A «probabilistic value» can be: (1) a «fixed value», (2) «from chart», which could in turn be a bar or a histogram, or (3) «from distribution» of a particular type, e.g., normal or triangular. The names and values of distribution parameters are specified using the parameterNames and parameterValues attributes, respectively. The index positions of parameterNames match those of the corresponding parameterValues. The same goes with the index positions of items/bins and frequencies in «from chart».
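The three kinds of «probabilistic value» can be sketched as sampling functions. This is an illustration only, assuming that a histogram bin is sampled uniformly within its bounds and that bar-chart items are drawn with their frequencies as weights; the function names are hypothetical, and the example parameters are borrowed from the Fig. 6 annotations (the gross value is made up).

```python
import random


def fixed_value(value):
    """«fixed value»: always returns the same value."""
    return value


def from_barchart(items, frequencies, rng):
    """«from barchart»: pick one item, using the aligned frequencies as weights."""
    return rng.choices(items, weights=frequencies, k=1)[0]


def from_histogram(bins, frequencies, rng, precision=2):
    """«from histogram»: pick a bin by frequency, then a value inside it.

    Assumption: values are uniformly distributed within a bin.
    """
    low, high = rng.choices(bins, weights=frequencies, k=1)[0]
    return round(rng.uniform(low, high), precision)


def from_uniform_range(lower, upper, rng, precision=2):
    """«from distribution» with type Uniform Range."""
    return round(rng.uniform(lower, upper), precision)


if __name__ == "__main__":
    rng = random.Random(42)
    disability_rate = from_histogram(
        [(0.0, 0.2), (0.21, 0.5), (0.51, 0.7), (0.71, 1.0)],
        [0.4, 0.3, 0.2, 0.1],
        rng,
    )
    income_count = from_barchart([1, 2, 3, 4], [0.8, 0.15, 0.045, 0.005], rng)
    gross_value = 4000.0  # hypothetical income, stands in for self.gross_value
    expense = from_uniform_range(50, gross_value * 0.5, rng)
    print(disability_rate, income_count, expense, fixed_value(0))
```

The index alignment between `items`/`bins` and `frequencies` mirrors the alignment the text requires between parameterNames and parameterValues.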
[Fig. 5. Profile for Expressing Probabilistic Characteristics of a Population. Content recoverable from the diagram:
• «probabilistic type» (extends the Class and EnumerationLiteral metaclasses): frequency: Real [1]
• «probabilistic value» (extends the Property and Constraint metaclasses): precision: Integer [0..1]; usesOCL: Boolean [1]
• «fixed value»: value: String [1]
• «from chart»: frequencies: Real [1..*]; «from histogram»: bins: String [1..*]; «from barchart»: items: String [1..*]
• «from distribution»: type: DistributionType [1]; parameterNames: String [*]; parameterValues: String [*]
• «multiplicity»: targetMember, constraint
• «dependency» (with «value dependency» and «type dependency»): context, OCLtrigger [1..*]
• «use existing»: objectPoolQueries: String [1..*]; reuseProbabilities: Real [1..*]
• DistributionType «enumeration»: Uniform Range, NormalDistribution, TriangularDistribution, BetaDistribution, GammaDistribution, ...]
[Fig. 6. Partial Domain Model of Luxembourg’s Income Tax Law Annotated with Probabilistic Information. Content recoverable from the diagram:
• TaxPayer (abstract): disabilityRate: Real [1] «value dependency» {context: TaxPayer; OCLtrigger: [rate for no disability, rate for vision disability]} «from histogram» {usesOCL: false; precision: 2; bins: [[0..0.2], [0.21..0.5], [0.51..0.7], [0.71..1]]; frequencies: [0.4, 0.3, 0.2, 0.1]}; disabilityType: Disability [1]; birthYear: Integer [1]; ...
• Constraint "rate for no disability": «fixed value» {value: 0; usesOCL: false}; condition: self.disabilityType = Disability::None
• Constraint "rate for vision disability": «from histogram» {usesOCL: false; bins: [[0.5..0.7], [0.71..0.9], [0.91..1]]; frequencies: [0.7, 0.2, 0.1]}; condition: self.disabilityType = Disability::Vision
• Disability «enumeration»: None «probabilistic type» {frequency: 0.9}; Vision «probabilistic type» {frequency: 0.075}; ...
• Income (abstract): gross_value: Real [1]; prorata_period: Real [1]; subclasses Employment {frequency: 0.6}, Pension {frequency: 0.2}, Other {frequency: 0.2}
• Association TaxPayer (1 taxpayer) to Income (incomes 1..*): «multiplicity» {targetMember: Income; constraint: [income mult]}; «type dependency» {context: TaxPayer; OCLtrigger: [income types based on age]}
• Constraint "income mult": «from barchart» {usesOCL: false; items: [1, 2, 3, 4]; frequencies: [0.8, 0.15, 0.045, 0.005]}
• Constraint "income types based on age": «from barchart» {usesOCL: false; items: [Pension, Employment, Other]; frequencies: [0.85, 0.1, 0.05]}; condition: self.getAge() >= 60
• Expense: amount: Real [1] «from distribution» {usesOCL: true; precision: 2; context: Income; type: Uniform Range; parameterNames: [lowerBound, upperBound]; parameterValues: [50, self.gross_value * 0.5]}
• «use existing» {context: Expense; objectPoolQueries: [self.getOwner().getHouseholdMembers()]; reuseProbabilities: [0.7]} (association ends: allowances *, 1 beneficiary)
• TaxCard (0..1 taxCard, 1 income): invalidity: Real [1] «fixed value» {value: 0; usesOCL: false}
• FromLaw: + {static} invalidityFlatRate (in disabilityRate: Real): Real
• FromAgent: - {static} TAX_YEAR: Integer [1] = 2015]
To answer the RQs, we ran the simulator (automatically derived from the six policy models) over simulation data (automatically generated by Alg. 1). We discuss the results below. All the results were obtained on a computer with a 3.0GHz dual-core processor and 16GB of memory.
RQ1. The execution times of the data generator and the simulator are influenced mainly by two factors: the size of the data to produce (here, the number of tax cases) and the number and complexity of the policy models to simulate. Note that the data generator instantiates only the slice model that is relevant to the policies of interest and not the entire domain model. This is why the selected policy models have an influence on the execution time of the data generator.
To answer RQ1, we measured the execution times of the data generator and the simulator with respect to the above two factors. Specifically, we picked a random permutation of the six policies (ID, CIS, PE, FD, LD, CIP) and generated 10,000 tax cases, in increments of 1,000, first for ID, then for ID combined with CIS, and so on. When all the six policies are considered, a generated tax case has an average of ≈24 objects. We then ran the simulation for different numbers of tax cases and the different combinations of policy models considered. Since the data generation process is probabilistic, we ran the process (and the simulations) five times. In Figs. 9(a) and (b), we respectively show the execution times (average of the five runs) for the data generator and the simulator.
As suggested by Fig. 9(a), the execution time of the data generator increases linearly with the number of tax cases. We further observed a linear increase in the execution times of the data generator as the size of the slice model increased (details are not shown). In particular, FD and LD introduced several new concepts into the slice model, resulting in a proportional increase in the slope of the fourth and fifth curves (from the bottom) in Fig. 9(a). We note that as more policies are included, the slice model will eventually saturate, as the largest possible slice model is the full domain model.
With regard to simulation, the execution times partly depend on the complexity of the workflows in the underlying policies (e.g., the nesting of loops), and partly on the OCL queries that supply the input parameters to the policies. The latter factor deserves attention when simulation is run over a large instance model. Particularly, OCL queries containing iterative operations may take longer to run as the instance model grows. The non-linear complexity seen in the fifth and sixth curves (from the bottom) in Fig. 9(b) is due to an OCL allInstances() call in LD, which can be avoided by changing the domain model and optimizing the query. This would cause the fifth and sixth curves to follow the same linear trend seen in the other curves. Since the measured execution times are already small and reasonable, such optimization is warranted only when the execution times need to be further reduced.
As suggested by Figs. 9(a) and (b), our data generator and simulator are highly scalable: Generating 10,000 tax cases covering all six policies took ≈30 minutes. Simulating the policies over 10,000 tax cases took ≈24 minutes.
[Fig. 9. Execution Times for Data Generation (a) & Simulation (b); Euclidean Distances between Generated Data & Real Population Characteristics (c). Panels (a) and (b): x-axis, number of generated tax cases; y-axis, execution time (in minutes); one curve per policy combination (ID; ID + CIS; ID + CIS + PE; ID + CIS + PE + FD; ID + CIS + PE + FD + LD; ID + CIS + PE + FD + LD + CIP). Panel (c): x-axis, number of generated tax cases; y-axis, Euclidean distance; curves for the age, income, and income type histograms and for their aggregation.]
RQ2. To answer RQ2, we compare information from STATEC for age, income and income type, all represented as histograms, against histograms built over generated data of various sizes. Similar to RQ1, we ran the data generator five times and took the average for analysis. Among alternative ways to compare histograms, we use Euclidean distance which is widely used for this purpose [9]. Fig. 9(c) presents Euclidean distances for the age, income, and income type histograms as well as the Euclidean distance for the normalized aggregation of the three. As indicated by the figure, the Euclidean distance for the aggregation falls below 0.05 for 2000 or more tax cases produced by our data generator. This suggests a close alignment between the generated data and Luxembourg’s real population across the three criteria considered.
The above analysis provides confidence about the quality of the data produced by our data generator. The analysis further establishes a lower-bound for the number of tax cases to generate (2,000) to reach a high level of data quality.
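The histogram comparison used for RQ2 can be sketched as follows. This is a minimal illustration, assuming that both histograms share the same bins and are normalized to relative frequencies before the distance is taken; the generated counts are made up for the example and are not results from the study.

```python
import math


def normalize(counts):
    """Turn raw bin counts into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean_distance(hist_a, hist_b):
    """Euclidean distance between two histograms defined over the same bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hist_a, hist_b)))


if __name__ == "__main__":
    # Reference shares for income type (Employment, Pension, Other).
    reference = [0.6, 0.2, 0.2]
    # Hypothetical counts from 2,000 generated tax cases.
    generated = normalize([1230, 390, 380])
    print(round(euclidean_distance(reference, generated), 4))
```

A distance close to 0 means the generated sample reproduces the reference shares; the text uses 0.05 as the level below which the aggregated distance falls from 2,000 tax cases onward.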
RQ3. We answer RQ3 using the Kolmogorov-Smirnov (KS) test [10], a non-parametric test to compare the cumulative frequency distributions of two samples and determine whether they are likely to be derived from the same population. This test yields two values: (1) D, representing the maximum distance observed between the cumulative distributions of the samples. The smaller D is, the more likely the samples are to be derived from the same population; and (2) p-value, representing the probability that the two cumulative sample distributions would be as far apart as observed if they were derived from the same population. If the p-value is small (< 0.05), one can conclude that the two samples are from different populations.
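The D statistic described above can be computed directly from two samples. The sketch below is a plain-Python illustration of that definition only; it does not compute the p-value, for which a statistics library (for example SciPy's `scipy.stats.ks_2samp`) would normally be used. The function name `ks_statistic` is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum distance D between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


if __name__ == "__main__":
    print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))  # identical samples
    print(ks_statistic([1, 2], [3, 4]))              # fully separated samples
```

In a consistency check, D would be computed pairwise between runs of the data generator for each characteristic of interest.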
To check the consistency of data produced across different runs of our data generator, we ran the generator five times, each time generating a sample of 5000 tax cases. We then performed pairwise KS tests for the age, income, and income type information from the samples. Table I shows the results,
Other Applications of Simulation
• Detection of undesirable situations, e.g., sharp increases in tax rates for certain categories of tax payers
• Optimization of parameters, e.g., thresholds and rates for achieving revenue targets
Legal Text Analysis
Motivation
• Laws are complex natural language documents
• Interdependencies between legal provisions cause interdependencies between legal requirements
• Automated extraction and analysis of interdependencies will significantly reduce compliance costs
[Diagram labels: Law; Legal Requirements; System Requirements; Design; Test Plan; dependencies]
Interdependencies in the law
Art. 2 Individuals are considered resident taxpayers if they have either their fiscal or habitual residence in the Grand Duchy. Individuals are considered non-resident taxpayers if they neither have their fiscal nor their habitual residence in the Grand Duchy and if they have local income within the meaning of Art. 156. Resident taxpayers are subject to income tax because of their income, both local and foreign. Non-resident taxpayers are subject to income tax only because of their local income within the meaning of Art. 156.
R1: The system shall levy taxes on non-residents’ local income as per the annual tax scale.
[Diagram annotations: R1 is derived from Art. 2; Art. 2 contains a cross reference to another article (Art. 156)]
Requirements dependencies
[Diagram: R1 is derived from Article 2; Article 2 has a cross reference (CR) to Article 156; R2, R3, and R4 are derived from Article 156; the cross reference induces potential dependencies between R1 and R2, R3, R4]
R2: For non-residents, rental and lease income earned in the Grand Duchy shall be considered as local income.
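The link between cross references and requirement dependencies can be sketched in a few lines. This is a deliberately simplified illustration: a single regular expression stands in for the cross-reference detection (the approach presented in this talk is based on natural language processing), the text of Art. 156 is not shown in the slides and is left empty, and the function names are hypothetical.

```python
import re

# Simplified pattern: only explicit references of the form "Art. 156" or "Art.156".
CROSS_REF = re.compile(r"\bArt\.\s*(\d+)")


def cross_references(articles):
    """Map each article number to the sorted list of other articles it cites."""
    refs = {}
    for number, text in articles.items():
        cited = {int(n) for n in CROSS_REF.findall(text)}
        cited.discard(number)
        refs[number] = sorted(cited)
    return refs


def potential_dependencies(derived_from, refs):
    """Pair requirements whose source articles are linked by a cross reference."""
    pairs = []
    for req_a, art_a in sorted(derived_from.items()):
        for req_b, art_b in sorted(derived_from.items()):
            if art_b in refs.get(art_a, []):
                pairs.append((req_a, req_b))
    return pairs


ARTICLES = {
    2: "Non-resident taxpayers are subject to income tax only because of "
       "their local income within the meaning of Art. 156.",
    156: "",  # text not shown in the slides
}
DERIVED_FROM = {"R1": 2, "R2": 156, "R3": 156, "R4": 156}

if __name__ == "__main__":
    refs = cross_references(ARTICLES)
    print(refs)
    print(potential_dependencies(DERIVED_FROM, refs))
```

The resulting pairs are only candidates: an analyst still has to confirm whether a potential dependency is a real one.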
Automatic resolution of legal cross references
Other important applications
• Generation of navigable legal portals
• Identifying related legal texts that need to be reviewed for consistency after a change occurs
• Identifying governmental web pages and online forms that need revision after a change in the law
• Checking for potentially problematic situations
• For example, cyclic cross references
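Checking for cyclic cross references amounts to cycle detection in a directed graph whose nodes are provisions and whose edges are resolved cross references. The sketch below uses a standard depth-first search; the graph representation and the article labels in the example are hypothetical.

```python
def find_cycle(refs):
    """Return one cycle of cross references as a list of provisions, or None.

    `refs` maps a provision to the list of provisions it cites.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for target in refs.get(node, []):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: a cycle is closed
                return path[path.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(refs):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    cyclic = {"Art. 1": ["Art. 2"], "Art. 2": ["Art. 3"], "Art. 3": ["Art. 1"]}
    print(find_cycle(cyclic))
    print(find_cycle({"Art. 2": ["Art. 156"], "Art. 156": []}))
```

The recursive form keeps the sketch short; for very large texts an iterative traversal would avoid recursion limits.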
Approach
Based on Natural Language Processing
Defining the Schema of a Text
• Schema of a legal text: structure and navigation flow of the provisions of a legal text
• Guidelines such as “Traité de Légistique Formelle” provide a generic structure for legislative texts in Luxembourg
• Deviations are nevertheless frequent!
• Tailoring the schema is often necessary in order to handle the evolution of drafting practices over time
Generic Text Schema
Income Tax Law Schema vs. Generic Schema
• The structural elements inside the dashed boundary are either new in the tax law or have revised associations with other grouping concepts
Resolution of Cross References
• All cross reference expressions (CREs) are interpretable except delegating ones: “un règlement grand-ducal pourra déroger à …” ("a Grand-Ducal regulation may derogate from …"), and unspecific ones: “une disposition de la loi du …” ("a provision of the law of …")
• Interpretation rules are based on pre-defined patterns and the text segmentation to precisely determine the context of the cross reference
Results and Next Steps
• Results: Highly accurate detection and resolution of cross-references across many legal texts
• Investigate whether it is possible to retrieve semantic information about cross references: compliance, constraint, definition, delegation, exception, refinement
• Enhance analysis capabilities: change impact analysis, circularity analysis, compliance requirements specification …
Managing Compliance with Standards
Compliance is Complex and Expensive
Standards, laws and regulations are textual. They need to be interpreted and adapted to context
Multiple stakeholders are involved in the regulation, compliance, and auditing chain
The volume of evidence required for demonstrating compliance is very large
Compliance arguments need to be assessed in a credible manner and based on evidence
There are trade-offs between different mechanisms for achieving compliance
Different Facets of Compliance
Managing compliance:
• Creating explicit interpretations of compliance requirements
• Aligning requirements to industrial and organizational practices
• Performing compliance analysis
• Managing evidence and traceability to different requirements
• Managing change
Compliance Analysis for IEC 61508
• IEC 61508
• Specifies functional safety requirements for safety-related control systems
• Widely used safety standard for control systems
• 7 parts; approx. 500 pages
• Understanding and operationalizing the standard is a daunting task!
• Certification of safety-critical control systems in maritime & energy
• Industry partners
Industrial Context
Evolution from mechanical operation to computer-based operation
Safety Certification: Confirmation by examination and provision of convincing evidence that the specified safety requirements have been adequately fulfilled.
Det Norske Veritas
Models as an Enabler for Compliance
• Models provide …
• A precise and analyzable interpretation of the (textual) content of the standards, laws and regulations
• Core concepts, relationships, processes, …
• Means for traceability between legal texts and systems
• A basis for automation
Expressing the Interpretation of IEC 61508 as a Conceptual Model
• A conceptual model is a map of important concepts, their attributes and relationships
• Methodology to go from standard text to class diagrams
[Diagram: expert interpretation of the standard; class Hazard (Hazardous Element, Initiating Mechanism) poses class Risk (Likelihood, Consequence)]
The IEC 61508 Conceptual Model
• Large and complex, organized into packages
Overview of Solution
• Devise class diagrams for creating a domain model of the application and a conceptual model of the safety standard
• Create UML profile (with constraints) of the safety standard based on its conceptual model
• Apply the stereotypes of this profile to a domain model of the system that is undergoing certification
[Figure 1: Methodology for the Creation of Evidence of a Safety Standard. Steps: (1) Construct Conceptual Model of Standard; (2) Define UML Profile based on Conceptual Model; (3) Elaborate Domain Model for Compliance; (4) Create Instance for Specific Certification. Other labels: Standard / Certification Expert; Application Domain Expert; Domain Model of System]
Steps 1 and 2 of the approach require input from experts familiar with the certification process (including the standard used for certification) but not necessarily the application domain. Fulfilling step 3 requires expertise in both certification and the application domain; whereas, step 4 only requires knowledge of the application domain. In the remainder of this section, we present detailed descriptions of the four steps in our approach.
3.1. Step 1: Conceptual Model of a Safety Standard
In Section 2, we noted the need for having an explicit interpretation of the underlying safety standard. We achieve this in the first step of our approach through the creation of a conceptual model. A conceptual model is a formal description of some aspect of the physical and social world around us for the purpose of understanding and communicating amongst humans [21]. It employs some formal notation which is a combination of diagrammatic and linguistic constructs and serves as a point of common agreement amongst a team of people and can also be used as a means of forwarding this understanding to newcomers joining the team.
A conceptual model of a safety standard should thus capture the main concepts and relationships in the evidence information required for showing compliance to the standard. We use UML class diagrams [24] for conceptual modeling of safety standards. In UML class diagrams, concepts are represented as classes and concept attributes as class attributes. Relationships are represented by associations. Generalization associations are used to derive more specific concepts from abstract ones. When an attribute assumes a value from a predefined set of possible values, we use enumerations. Finally, the package notation is used to make groupings of concepts and thus better manage complexity.
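As a rough illustration of these modeling conventions, the Hazard and Risk concepts shown on the conceptual-model slide can be written down as plain data classes. This is only a sketch in Python rather than UML: the attribute types, the multiplicity of the "poses" association, and the three likelihood levels are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Likelihood(Enum):
    """Enumeration for an attribute with a predefined value set (levels are hypothetical)."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class Risk:
    """Concept 'Risk' with its two attributes from the slide."""
    likelihood: Likelihood
    consequence: str


@dataclass
class Hazard:
    """Concept 'Hazard'; the 'poses' association is modeled as a list of risks."""
    hazardous_element: str
    initiating_mechanism: str
    poses: List[Risk] = field(default_factory=list)


if __name__ == "__main__":
    # Example values are invented for illustration.
    hazard = Hazard("pressurised tank", "valve failure")
    hazard.poses.append(Risk(Likelihood.LOW, "loss of containment"))
    print(hazard)
```

Concepts map to classes, concept attributes to fields, the relationship to a field holding references, and the closed value set to an enumeration, which mirrors the four conventions listed in the text.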
Our choice of UML is based on the fact that it is a well-recognized and standardized notation and that the UML class diagram notation adequately fulfils our needs. From a practical standpoint, it is in general useful to ensure that the notation being employed is already accepted in industry and at the same time easy to learn for practitioners.
Creating a conceptual model of a standard requires a careful analysis of the standard’s text to identify the salient concepts and relationships mentioned in the text. To
Some Applications of the Model
[Diagram: the IEC 61508 Conceptual Model as a basis for:
• Aid to understanding and communicating IEC 61508
• Means for collaboration between suppliers and certifiers (Safety Collaboration Tool; Recommended Practice; specialized checklists, plans, progress measures, agreements, etc.; Certifier, Supplier)
• Safety profile (UML)
• Creation of safety evidence repositories (schema for safety evidence; safety evidence repository)
• Automatic checking of compliance (Compliance Checking Tool; diagnostics)]
Compliance Risk Assessment
• Analyzing the risks of a compliance breach (e.g., compromises)
• An assurance case built to:
• argue about compliance based on existing evidence
• show due diligence
• assess risks
• Three-tiered structure:
• Claims about compliance
• Arguments
• Evidence
[Diagram: Evidence supports Arguments (Decomposed Goals), which support the Claim (Top Goal)]
Goal Modeling
Overall Safety Goal
Goal Decomposition
Evidence
Foundations
Expert Elicitation
• Origin: Cognitive Psychology, Nuclear Safety Engineering
• Used in our framework for: eliciting likelihoods for satisfaction of goals
Goal Modeling
• Origin: Requirements Engineering
• Used in our framework for: decomposition of high-level objectives, propagation of probabilistic values from sub-goals to overall safety goals
Monte Carlo Simulation
• Origins: Physics Simulation, Risk Management, Cost Estimation
• Used in our framework for: computing a probability distribution for the satisfaction of overall safety goals
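How the three foundations fit together can be illustrated with a toy simulation. The sketch below is not the framework's algorithm: it assumes that experts give a (low, most likely, high) estimate for the satisfaction likelihood of each leaf goal, that each estimate is modeled as a triangular distribution, and that an overall goal decomposed into independent sub-goals is satisfied with the product of their likelihoods. Goal names and numbers are invented.

```python
import random
import statistics


def simulate_top_goal(leaf_estimates, runs=10_000, seed=0):
    """Monte Carlo distribution of the top-goal satisfaction likelihood.

    `leaf_estimates` maps a leaf goal to an elicited (low, mode, high) triple.
    Assumption: AND-decomposition with independent sub-goals.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        p_top = 1.0
        for low, mode, high in leaf_estimates.values():
            # random.triangular takes (low, high, mode) in that order
            p_top *= rng.triangular(low, high, mode)
        results.append(p_top)
    return results


if __name__ == "__main__":
    estimates = {
        "hazards identified": (0.80, 0.90, 1.00),   # hypothetical expert estimates
        "design verified": (0.70, 0.85, 0.95),
    }
    outcomes = sorted(simulate_top_goal(estimates))
    print("mean:", round(statistics.mean(outcomes), 3))
    print("5th percentile:", round(outcomes[int(0.05 * len(outcomes))], 3))
```

The output is a distribution rather than a single number, which is what allows percentiles and sensitivity analyses to be reported for the overall safety goal.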
MODUS Tool
• Combines goal modeling, simulation, and expert elicitation
• Interfaces with commercial risk assessment tools for Monte Carlo and sensitivity analysis
• Used in new technology qualification in the energy and maritime domains
• Features
• Goal modeling environment
• Implementation of the expert elicitation process
• Risk analysis
• Reporting and project management facilities
Compliance with Business Processes
Context
• eGovernment services are expressed and developed as business processes
• System development is outsourced
• Common usage of business process models
• Deployed systems are typically buggy; full validation is usually not possible before deployment
• Off-line and run-time verification of compliance
Why Run-time Verification?
• To detect when a (critical) failure occurs
• To collect data for debugging
• To determine corrective actions
• Some information available only at run time
• Behavior may depend on the (evolving) environment
Requirements
• Solution adoptable in contexts where MDE is already a development practice
• High-level domain-specific language to allow business analysts to express the properties to be checked
• Solution based on standard and stable MDE technology
• Performance of the solution comparable to the state of the art
Our Vision
The Three Main Ingredients for Run-time Verification
• Language
• Algorithm
• Run-time architecture
Requirements Specification Language
Property Checking
Requirements Specification Language
• Usability
• Expressiveness
• Efficiency
OCLR
• Temporal logic not an option
• Extension of OCL for run-time verification
• Facilitate the expression of temporal properties and real-time constraints based on a trace conceptual model
• Templates based on Dwyer’s classification
• Facilitate (optimal) translation to OCL properties and their verification using a constraint checking engine
Dwyer’s Pattern System
• Scope: Globally, Before, After, Between-and, After-until
• Pattern: Universality, Existence, Absence, Response, Precedence
Extensions
• Extensions (new constructs) identified based on analyzing actual properties in a large case study
• Specific occurrence of an event: “A password reset email will be sent to the user after the third incorrect login attempt”
• Time distance from a boundary: “A speaker should be ready within 10 minutes after the second keynote.”
Case Study
• Identity card management system
• 47 properties analyzed and translated into OCLR
• Properties checked on synthesized traces
• Focus on scalability
• Comparison with MonPoly
Identity Card Management
[Diagram labels: Request; Production; Delivery; Expiration; Loss]
76
temporal R1:
  let r : Request in
  before becomesTrue(r.card.state = CardState::InProduction)
  isCalled(notifyApproved(r.applicant))
  responding at most 3 tu
  becomesTrue(r.state = RequestState::Approved)
Scope: Before. Pattern: Response.
“Once a card request is approved, the applicant is notified within three days; this notification has to occur before the production of the card is started”
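To make the combination of a "before" scope and a "response" pattern concrete, here is a rough Python sketch of how R1 could be checked on a recorded trace for a single request. The event names, the tuple-based trace format, and the vacuous-truth convention when production never starts are assumptions of this sketch; it approximates the intent of the property and does not reproduce OCLR's formal semantics or its translation to OCL.

```python
from typing import List, Tuple

# Assumed trace format: (timestamp in time units, event name), sorted by time.
Event = Tuple[int, str]


def check_r1(trace: List[Event], bound: int = 3) -> bool:
    """Check R1: every approval is followed by a notification within `bound`
    time units, and that notification happens before card production starts."""
    # Scope "before": keep only the events preceding the first production start.
    scope: List[Event] = []
    for t, name in trace:
        if name == "card_in_production":
            break
        scope.append((t, name))
    else:
        # Assumption: if the scope boundary never occurs, the property
        # holds vacuously.
        return True

    # Pattern "response": each approval needs a timely notification in scope.
    for i, (t, name) in enumerate(scope):
        if name != "request_approved":
            continue
        answered = any(
            later_name == "notify_approved" and later_t - t <= bound
            for later_t, later_name in scope[i + 1:]
        )
        if not answered:
            return False
    return True


on_time = [(0, "request_approved"), (2, "notify_approved"), (5, "card_in_production")]
too_late = [(0, "request_approved"), (4, "notify_approved"), (5, "card_in_production")]
print(check_r1(on_time), check_r1(too_late))
```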
Properties with “globally”
[Figure: average check time (ms, axis from 0 to 2000) versus trace size (axis up to 1,100K) for properties P1 to P12]
Comparison: t(MonPoly) / t(OCLR-Check)
[Figure: ratio of check times (axis from 0 to 10) versus trace size (axis up to 1,100K) for properties P1 to P12]
Insights
• All the properties (trace size <= 1M) can be checked within 2 seconds
• OCLR-Check scales linearly with respect to the size of the trace
• The performance of OCLR-Check is as good as or better than that of the state-of-the-art tool (MonPoly)
Reflections
Capturing Legal Requirements
• The language must facilitate the modeling of targeted concepts (expressiveness)
• Events and timing, Procedural steps and conditions/constraints
• Obligations, permissions, prohibitions
• The language is an interface among all stakeholders and must account for their professional background, domain expertise and practice
• The language must support the intended analysis or enable translation to a language that does (usually the latter)
• Methodology and tool support, rather than formal semantics …
Capturing Legal Requirements
• Example: Income tax law project
• Conditions and procedural steps
• Traceability to legal provisions, officers, data model
• Stakeholders include both IT analysts and legal experts
• Application: Simulation, search, optimization
• Rely on augmented activity and class diagrams (profile) with additional information including traceability to sources of requirements and descriptive statistics
Capturing Legal Requirements
• Example: Uncovering dependencies among legal requirements
• Analysis of cross-references in legal texts
• Must be configurable to many legal texts
• Impact analysis of changes in the law
• Analysis of potential problems, e.g., dangling and circular references
• Modeling of structure and organization of legal documents
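The analysis of dangling and circular references mentioned above can be illustrated with a small graph sketch in Python. It assumes the cross-references have already been extracted into a dictionary from each provision to the provisions it cites; the provision labels and function names are hypothetical, and the extraction from legal text itself is not shown.

```python
from typing import Dict, List, Set, Tuple


def dangling_references(refs: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Return (source, target) pairs whose target is not a known provision."""
    return {(src, dst)
            for src, targets in refs.items()
            for dst in targets
            if dst not in refs}


def find_cycle(refs: Dict[str, List[str]]) -> List[str]:
    """Return one circular chain of references, or [] if there is none.

    Depth-first search with three states per provision: unvisited,
    on the current path, and fully explored.
    """
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = {node: UNVISITED for node in refs}
    path: List[str] = []

    def visit(node: str) -> List[str]:
        state[node] = ON_PATH
        path.append(node)
        for nxt in refs[node]:
            if nxt not in state:
                continue  # dangling target, reported separately
            if state[nxt] == ON_PATH:
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNVISITED:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return []

    for node in refs:
        if state[node] == UNVISITED:
            cycle = visit(node)
            if cycle:
                return cycle
    return []


# Hypothetical cross-reference graph.
refs = {
    "Art. 1": ["Art. 2"],
    "Art. 2": ["Art. 3", "Art. 9"],  # Art. 9 does not exist
    "Art. 3": ["Art. 1"],            # closes a circular chain
    "Art. 4": [],
}
print(dangling_references(refs))
print(find_cycle(refs))
```

The same graph also supports a simple form of impact analysis: the provisions potentially affected by a change are those that reach the changed provision through reference edges.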
Capturing Legal Requirements
• Example: Safety standard (IEC 61508)
• Application: compliance and risk assessment
• Conceptual model of the standard, supplier organization and processes, and their mapping
• Mapping among stakeholders’ concepts (interpretation and certification) and between standards (specialization and overlapping standards)
• Evidence requirements mandated by the standard
• Analyzable arguments for satisfaction of safety requirements
Capturing Legal Requirements
• Example: Business Processes (eGovernment)
• Business Process Models used as specification as part of an MDE methodology
• BPM Properties often capture legal and organizational regulations
• Temporal properties and real-time constraints
• Temporal logic not an option
• OCL extension: OCLR
• Templates to model properties according to scope and pattern (Dwyer)
• Trace conceptual models to define an operational mapping of properties
Analysis of Legal Requirements
• What analysis mechanisms are suited?
• Logical reasoning (theorem proving)
• Simulation
• Natural Language Processing
• Search and optimization
• Constraint checking
Analysis of Legal Requirements
• This depends on the intended applications
• Identify inconsistency or incompleteness
• Find loopholes and undesirable cases
• Check compliance
• This also partly depends on the scale of the legal requirements to be considered
Analysis of Legal Requirements
• Example: Income tax law
• Laws tend to be large, and the tax law is no exception
• This leads to many legal requirements (in our case large models)
• Some of our applications require simulation (overall impact of policy changes, compliance testing)
• But even for identifying undesirable situations we rely on simulation
• NLP of legal text to uncover potential dependencies among requirements
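As a toy illustration of using simulation to estimate the overall impact of a policy change, the Python sketch below compares the revenue produced by two variants of a tax rule over a synthetic population. The flat-rate rule, the deduction amounts, and the income distribution are all invented for this example; they do not come from Luxembourg's tax law or from the project's models.

```python
import random
from typing import Callable, List

TaxRule = Callable[[float], float]


def flat_tax_with_deduction(rate: float, deduction: float) -> TaxRule:
    """Build a hypothetical rule: tax = rate * max(0, income - deduction)."""
    def rule(income: float) -> float:
        return rate * max(0.0, income - deduction)
    return rule


def simulate_revenue(rule: TaxRule, incomes: List[float]) -> float:
    """Total tax collected when `rule` is applied to every simulated taxpayer."""
    return sum(rule(income) for income in incomes)


# Synthetic population (assumed uniform incomes, fixed seed for repeatability).
rng = random.Random(42)
incomes = [rng.uniform(20_000, 120_000) for _ in range(10_000)]

current = flat_tax_with_deduction(rate=0.20, deduction=10_000)
proposed = flat_tax_with_deduction(rate=0.20, deduction=12_000)

delta = simulate_revenue(proposed, incomes) - simulate_revenue(current, incomes)
print(f"Estimated revenue change: {delta:,.0f}")
```

In a realistic setting the rule would be derived from the legal requirements model and the population would be sampled from descriptive statistics rather than from a uniform distribution.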
Analysis of Legal Requirements
• Example: Safety Standard
• Support shared understanding between suppliers and certifier about what evidence needs to be collected and how it is going to be used: questionnaire derived from the conceptual model of the standard, explicit and analyzable arguments
• Compliance analysis: making sure all evidence items have been collected (constraint checking)
• Quantitative goal satisfaction analysis based on evidence and expert judgment
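The compliance analysis bullet above (checking that all evidence items have been collected) amounts to a completeness constraint. A minimal Python sketch follows, assuming the evidence requirements are available as a mapping from requirement identifiers to required evidence items; the identifiers and item names are hypothetical and are not taken from IEC 61508.

```python
from typing import Dict, List, Set


def missing_evidence(required: Dict[str, Set[str]],
                     collected: Set[str]) -> Dict[str, List[str]]:
    """Map each requirement to the evidence items that are still missing.

    Requirements whose evidence is complete are omitted from the result,
    so an empty result means the completeness constraint holds.
    """
    gaps: Dict[str, List[str]] = {}
    for requirement, items in required.items():
        absent = sorted(items - collected)
        if absent:
            gaps[requirement] = absent
    return gaps


# Hypothetical evidence requirements and collected items.
required = {
    "REQ-1": {"hazard_log", "design_review_report"},
    "REQ-2": {"test_report"},
}
collected = {"hazard_log", "test_report"}

print(missing_evidence(required, collected))
```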
Analysis of Legal Requirements
• Example: Run-time compliance analysis of business processes
• Performance is a challenge because traces are large
• Reliance on standard (OCL) and mature constraint checking technology
• Optimized mapping to OCL properties on trace conceptual model
• Generated OCL properties are highly complex, but need not be read by any analyst or domain expert
Summary
[Figure: from legal texts, to a language to model legal requirements, to a language enabling their analysis, with configurable, automated support (with traceability). Factors shown: type of analysis, scalability of analysis, ease of translation, background and practices of stakeholders, alignment with domain concepts to be expressed.]
General Conclusions
• Solutions for capturing and analyzing legal requirements are strongly driven by contextual factors
• They include, prominently, human factors, e.g., background, practices
• Modeling (profiles) and domain-specific languages are typically an important part of these solutions
• Usability studies of domain-specific languages
• Applications typically require interdisciplinary solutions
• Scale is an essential consideration
• Realistic and well-reported case studies are needed regarding usability and scalability
General Conclusions
• Legal requirements engineering is clearly an important topic
• But do we have a clear understanding of the needs in various legal domains?
• Is there a good match between research and practice?
• Do we need more interdisciplinary work?
• Do we make, as researchers, a sufficient attempt to achieve practicality and simplicity?
• There is no alternative to reality in an engineering research discipline