Testing Interactive Software: A Challenge for Usability and Reliability
Philippe Palanque, LIIHS-IRIT, University Toulouse 3, 31062 Toulouse, France
Regina Bernhaupt, ICT&S-Center, Universität Salzburg, 5020 Salzburg, Austria
Ronald Boring, Idaho National Laboratory, Idaho Falls 83415, Idaho, [email protected]
Chris Johnson, Dept. of Computing Science, University of Glasgow, Glasgow, G12 8QQ, Scotland
Sandra Basnyat, LIIHS-IRIT, University Toulouse 3, 31062 Toulouse, France
Special Interest Group – CHI 2006 – Montréal – 22nd April 2006
Outline of the SIG
• Short introduction about the SIG (10 min)
• Short presentations (20 min)
  • Software engineering testing for reliability (Philippe)
  • Human reliability for interactive systems testing (Ron)
  • Incident and accident analysis and reporting for testing (Sandra)
  • HCI testing for usability (Regina)
• Gathering feedback from audience (10 min)
• Presentation of some case studies (20 min)
• Listing of issues and solutions for interactive systems testing (20 min)
• Discussion and summary (10 min)
Introduction
• What are interactive applications?
• What is interactive applications testing?
  • Coverage testing
  • Non-regression testing
• Usability versus reliability
  • What about usability testing of a non-reliable interactive application?
  • What about reliable applications with poor usability?
Interactive Systems
A paradigm switch
• Control flow is in the hands of the user
• The interactive application is idle, waiting for input from the users
• Code is sliced
• Execution is influenced by internal and external states
• Nothing new but …
Classical Behavior
[Figure: flowcharts of a classical program that loops over Read Input, Process Input and an Exit? test until it reaches End.]
Event-based Functioning
[Figure: interaction between the Application and the Window Manager. At startup: Register Event Handlers (EH Registration of Event Handler 1, Event Handler 2, … Event Handler n), then Call Window Manager. At runtime: Get next event from the Event Queue, Dispatch event, Ack received, Wait for next event, until Finished. The application side also holds States.]
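The event-based functioning shown in the figure can be sketched in a few lines. This is a minimal illustration and not the architecture of any real toolkit: the `WindowManager` class, its `register`, `post` and `run` methods, and the event names are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[str], None]


class WindowManager:
    """Toy stand-in for a window manager: owns the event queue and the handlers."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Handler]] = {}
        self.queue: Deque[Tuple[str, str]] = deque()

    def register(self, event_type: str, handler: Handler) -> None:
        # Startup phase: the application registers its event handlers.
        self.handlers.setdefault(event_type, []).append(handler)

    def post(self, event_type: str, payload: str) -> None:
        # Events come from the user, in an order the program does not control.
        self.queue.append((event_type, payload))

    def run(self) -> None:
        # Runtime phase: get the next event, dispatch it, repeat until the queue is empty.
        while self.queue:
            event_type, payload = self.queue.popleft()
            for handler in self.handlers.get(event_type, []):
                handler(payload)


log: List[str] = []
wm = WindowManager()
wm.register("click", lambda p: log.append(f"clicked {p}"))
wm.register("key", lambda p: log.append(f"typed {p}"))
wm.post("key", "a")
wm.post("click", "OK")
wm.post("scroll", "down")  # no handler registered, so the event is dropped
wm.run()
print(log)
```

The code is sliced into handlers, and the order in which they run is decided by the posted events rather than by the program text, which is what makes coverage harder to reason about than in the classical loop.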
Safety Critical Interactive Systems
Safety Critical Systems
• Software engineers
• System centered
• Reliability
• Safety requirements (certification)
• Formal specification
• Verification / proof
• Waterfall model / structured
• Archaic interaction techniques
Interactive Systems
• Usability experts
• User centered
• Usability
• Human factors
• Task analysis & modeling
• Evaluation
• Iterative process / prototyping
• Novel interaction techniques
Some Well-known Examples (1/2)
Some Well-known Examples
The Shift from Reliability to Fault-Tolerance
• Failures will occur
• Mitigate failures
• Reduce the impact of a failure
A small demo …
Informal Description of a Civil Cockpit application
• The working mode
• The tilt selection mode: AUTO or MANUAL (AUTO)
  • The CTRL push-button swaps between the two modes
• The stabilization mode: ON or OFF
  • The CTRL push-button swaps between the two modes
  • Access to the button is forbidden when in AUTO tilt selection mode
• The tilt angle: a numeric edit box is used to select a value in the range [-15°; 15°]
  • Modifications are forbidden when in AUTO tilt selection mode
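The informal description above can be read as a small state machine, which is the kind of model a test suite can be derived from. The sketch below is one possible reading, not the application's actual code: the class name `TiltPanel`, the method names and all initial values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TiltPanel:
    """Hypothetical model of the tilt controls described on the slide."""

    tilt_mode: str = "AUTO"      # initial value is an assumption
    stabilization: str = "OFF"   # initial value is an assumption
    tilt_angle: float = 0.0      # degrees; initial value is an assumption

    def press_tilt_ctrl(self) -> None:
        # The CTRL push-button swaps between AUTO and MANUAL.
        self.tilt_mode = "MANUAL" if self.tilt_mode == "AUTO" else "AUTO"

    def press_stabilization_ctrl(self) -> bool:
        # The button is not accessible in AUTO tilt selection mode.
        if self.tilt_mode == "AUTO":
            return False
        self.stabilization = "OFF" if self.stabilization == "ON" else "ON"
        return True

    def set_tilt_angle(self, angle: float) -> bool:
        # Edits are refused in AUTO mode and outside [-15, 15] degrees.
        if self.tilt_mode == "AUTO" or not -15.0 <= angle <= 15.0:
            return False
        self.tilt_angle = angle
        return True


panel = TiltPanel()
print(panel.set_tilt_angle(5.0))   # attempted while still in AUTO
panel.press_tilt_ctrl()
print(panel.set_tilt_angle(5.0))   # attempted in MANUAL
print(panel.set_tilt_angle(20.0))  # attempted with an out-of-range value
```

Each refusal returns `False` instead of raising, so a test can check both that the forbidden action had no effect and that the interface reported it.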
Various perspectives of this Special Interest Group
• Software engineering testing for reliability
• Human reliability testing
• Incident and accident analysis and reporting for testing
• HCI testing for usability
Consequence: Inconvenience
Consequence: Danger
What do we mean by human error?
Conceptualizing error
• Humans are natural “error emitters”
  • On average we make around 5-6 errors every hour
  • Under stress and fatigue that rate can increase dramatically
• Most errors are inconsequential or mitigated
  • No consequences or impact from many mistakes made
  • Where there may be consequences, many times defenses and recovery mechanisms prevent serious accidents
Human Reliability Analysis (HRA)
Classic Definition
• The use of systems engineering and human factors methods in order to render a complete description of the human contribution to risk and to identify ways to reduce that risk
What’s Missing
• HRA can be used to predict human performance issues and to identify human contributions to incidents before they occur
• Can be used to design safe and reliable systems
Performance Shaping Factors (PSFs)
• Are environmental, personal, or task-oriented factors that influence the probability of human error
• Are an integral part of error modeling and characterization
• Are evaluated and used during quantification to obtain a human error rate applicable to a particular set of circumstances
  • Specifically, the basic human error probabilities obtained for generic circumstances are modified (adjusted) per the specific situation
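The adjustment described in the last bullet can be sketched as a multiplication of a nominal probability by one multiplier per PSF. This is a simplified illustration: the function name, the PSF names and all numeric values are made up for the example, and capping the result at 1.0 is an assumption rather than a rule taken from the slides.

```python
from math import prod
from typing import Dict


def adjusted_hep(nominal_hep: float, psf_multipliers: Dict[str, float]) -> float:
    """Scale a nominal human error probability by PSF multipliers, capped at 1.0."""
    if not 0.0 <= nominal_hep <= 1.0:
        raise ValueError("nominal_hep must be a probability")
    return min(1.0, nominal_hep * prod(psf_multipliers.values()))


# Illustrative values only; they are not taken from the slides.
psfs = {"stress": 2.0, "available_time": 1.0, "ergonomics": 10.0}
hep = adjusted_hep(0.001, psfs)
print(f"{hep:.3f}")
```

A multiplier above 1 makes an error more likely than in the generic situation, a multiplier below 1 makes it less likely, and a multiplier of 1 leaves the nominal value unchanged.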
Example: SPAR-H PSFs
Maximizing Human Reliability
• Increasingly, human reliability needs to go beyond being a diagnostic tool to become a prescriptive tool
  • NRC and nuclear industry are looking at new designs for control rooms and want plants designed with human reliability in mind, not simply verified after the design is completed
  • NASA has issued strict Human-Rating Requirements (NPR 8705.2) that all space systems designed to come in contact with humans must demonstrate that they impose minimal risk, they are safe for humans, and they maximize human reliability in the operation of that system
• How do we make reliable human systems?
  • Design and Test: “classic” human factors
  • Model: human reliability analysis
Best Achievable Practices for HR
The Human Reliability Design Triptych
Concluding Thoughts
• Human error is ubiquitous
• Pressing need to design ways to prevent human error
• Impetus comes from safety-critical systems
  • Lessons learned from safety-critical systems potentially apply across the board, even including designing consumer software that is usable
• Designing for human reliability requires merger of two fields
  • Human factors/HCI for design and testing
  • Human reliability for modeling
Incidents and Accidents as a Support for Testing
• Aim: contribute to a design method for safer safety-critical interactive systems
• Inform a formal system model
• Ultimate goals
  • Embedding reliability, usability, efficiency and error tolerance within the end product
  • While ensuring consistency between models
The Approach (1/2)
• Address the issue of system redesign after the occurrence of an incident or accident
• 2 Techniques
  • Events and Causal Factors Analysis
  • Marking Graphs extracted from a system model
• 2 Purposes
  • Ensure current system model accurately models the sequence of events that led to the accident
  • Reveal further scenarios that could eventually lead to similar adverse outcomes
The Approach (2/2)
[Figure: simplified process diagram split into an incident & accident investigation part and a system design part. Elements: Accident Report; Safety-Case Analysis; ECF Analysis; Model The System; Re-model The System; Formal ICO System Model Including Erroneous Events; Marking Graph Analysis; Extraction of Relevant Scenarios; Re-Design System Model to make Accident Tolerant. Key: Document, Procedure, Data, Decision, Modelling, Part of the whole process.]
ECFA Chart of the Accident
Timed events:
• 12:45pm: Supervisor directed Worker to switch the pumps from North to South
• 12:46pm: Worker goes to containment area where pumps are located
• 12:48pm: Worker shuts down North system pumps & starts South system pumps
• 12:48pm: Pump motors increase speed in order to achieve pressure of 70 PSI
• 12:51pm: F-System sends signal to PLC
• 12:58pm: Supervisor observes pipe entering south grinder begin to shake & vibrate
Other events and conditions shown in the chart:
• Seal on North grinder overheated
• Worker checks valves in containment area are in correct position to make the switch
• South pump idle for 3 days
• South pump grinder motor replaced due to bad seal
• Fuel does not flow to kilns
• Low pressure in the pipes
• Air in the pipes
• Blockage caused by a clog
• Blockage caused by a closed valve
• Kiln op. radios supervisor to inform that fuel is not getting to kilns
• Supervisor is informed that fuel is not getting to kilns
• Kiln op. monitors system through fuel line sensors in control room
• Decision made to switch from North to South DS
• Supervisor and Kiln Operator discuss situation
• Kiln Operator notices seal on North grinder overheated
• Kiln Operator notices fuel does not flow to kilns
• Supervisor bleeds air from ¾” ball valves of fuel pipe system
• Supervisor believes it is an air problem
• Worker bleeds air at south pumps
• Worker unaware of manufacturer’s guidelines against bleeding air while pumps in operation
• Supervisor unaware of manufacturer’s guidelines against bleeding air while pumps in operation
• Worker radios kiln op. to find out if fuel has started to go to kilns
• Kiln op. tells Worker fuel is not going through
• Kiln Operator observes fuel is still not reaching the kiln area
• Low pressure in the kiln area
• Blockage in pipe abruptly removed
• Water-hammer effect
• Grinder exploded & propelled off its base
• Air continues to block the monyo grinder fuel piping
• Supervisor bleeds air of valve south 330 pump while in operation
• Fuel sprayed from grinder base covering Supervisor and Worker
• Fuel ignites
• Kiln area pressure sensor senses low pressure
• Kiln area pressure sensor sends signal to pump motors to activate the ‘Step-Increase’ program
• Fuel starts flowing in the piping
• Max PSIG of 334 per motor can be achieved
• Kiln area pressure sensor sends signal to F-system
• PLC does not respond
• PLC not connected to F-System
• Supervisor unaware of pump manufacturer advice not to bleed air while pumps in operation
• Follow-up maintenance checks were not performed after new PLC was installed
• New PLC was installed 3 months prior to accident
• FOXBRO was still connected to old PLC system
• Manufacturer's warning to not bleed the lines of air while the pumps were operating
• Manufacturer’s warning not included in training
• PLC did not activate the auto shut-down of pumps (should occur if <60PSI not sensed after 3 mins of startup)
• Fire suppression sensors mounted 40' above floor
• Fireball occurred approx. 20' above floor
• Fire suppression system did not activate
• Kiln control operator saw the fire from the control room and radioed for help
• Control room supervisor activated the manual emergency shut down on the pumps
• Worker and Supervisor’s clothes catch fire
• Supervisor runs outside kiln area
• Supervisor uses extinguisher to extinguish himself
• Worker runs outside
• Supervisor tries to extinguish worker
Marking Trees & Graphs
• Marking Tree: identifies the entire set of reachable states
• Is a form of state transition diagram
• Analysis support tools available
• However, can impose considerable overheads when considering complex systems such as those in case study
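To make the idea of a marking graph concrete, here is a minimal sketch that enumerates the reachable markings of a plain place/transition Petri net by breadth-first search. It assumes a simple bounded net rather than the ICO formalism used in the study, and the two-place pump net, the function name `marking_graph` and the transition names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Marking = Tuple[int, ...]
# A transition is a pair (tokens consumed per place, tokens produced per place).
Transition = Tuple[Marking, Marking]


def marking_graph(
    initial: Marking, transitions: Dict[str, Transition]
) -> Tuple[Set[Marking], List[Tuple[Marking, str, Marking]]]:
    """Breadth-first exploration of every marking reachable from `initial`."""
    edges: List[Tuple[Marking, str, Marking]] = []
    seen: Set[Marking] = {initial}
    frontier = deque([initial])
    while frontier:
        marking = frontier.popleft()
        for name, (consume, produce) in transitions.items():
            # A transition is enabled when every input place holds enough tokens.
            if all(m >= c for m, c in zip(marking, consume)):
                successor = tuple(
                    m - c + p for m, c, p in zip(marking, consume, produce)
                )
                edges.append((marking, name, successor))
                if successor not in seen:
                    seen.add(successor)
                    frontier.append(successor)
    return seen, edges


# Hypothetical two-place net: place 0 = "pump stopped", place 1 = "pump running".
net = {
    "start": ((1, 0), (0, 1)),
    "stop": ((0, 1), (1, 0)),
}
states, edges = marking_graph((1, 0), net)
print(sorted(states))
print(len(edges))
```

The search only terminates for bounded nets, and the number of markings grows quickly with the number of places and tokens, which is the overhead the slide warns about for complex systems.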
The Approach Not Simplified
[Figure: full process diagram split into an INCIDENT AND ACCIDENT INVESTIGATION PART and a SYSTEM DESIGN PART. Elements: Safety-Cases; Safety-Case Analysis; Accident Report; ECF Analysis; Accident Scenarios; Model The System; Re-model The System; Marking Graph Analysis; All Possible Scenarios of System Model; HERT Data; Formal ICO System Model; Formal ICO System Model Including Erroneous Events; Extraction of Relevant Scenarios; Re-Design System Model to make Accident Tolerant; Re-Designed Model; Check New Model; Simulate Scenarios; Relevant Runnable Scenarios; System Model Problems; decision outcomes OK and Not OK; Finish. Key: Document, Modelling, model, Decision, Procedure, Data.]
Usability Evaluation Methods (UEM)
• UEMs conducted by experts
  • Usability Inspection Methods, Guideline Reviews, …
  • Any type of interactive systems
• UEMs involving the user
  • Empirical evaluation, Observations, …
  • Any type of interactive systems (from low-fi prototypes to deployed applications)
Usability Evaluation Methods (UEM)
• Computer supported UEMs
  • Automatic testing based on guidelines, …
  • Task models-based evaluations, metrics-based evaluation, …
  • Applications with standardized interaction techniques (Web, WIMP)
Issues of Reliability and Usability
• Testing the usability of a non-reliable system?
• Constructing reliable systems without considering usability?
• Possible ways to enhance, extend, enlarge UEMs to address these needs?
Gathering feedback from the audience through case studies
• Do we need to integrate methods OR develop new methods?
• In favor of integration
  • Joint meetings (including software developers) through brainstorming + rapid prototyping (more problems of non-usable reliable systems)
• Problems
  • Some issues are also related to system reliability (ATMs)
  • Problem of testing a prototype versus testing the system
  • Issues of development time rather than application type
  • Application type has an impact on the processes selected for development
  • Don’t know how to build a reliable interactive system … whatever time we have
  • How can reliability-oriented methods support usability-oriented methods?
Gathering feedback from the audience through case studies
• How to design for testability (both the reliability of the software and the usability)
• Is testing enough or do we need proof?
• Usability testing is at a higher level of abstraction (goal oriented) while software testing is at a lower level (function oriented)
• Is there an issue with interaction techniques (do we need a precise description of interaction techniques and is it useful for usability testing?)
• Automated testing through user-events simulation (how to understand how the user can react to that?)
• Issue of reliability according to the intention of the user, and not only the reliability of the system per se
• Beyond one instance of use but on reproducing the use many times
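One way to picture automated testing through user-events simulation is a seeded random driver that fires events at a model and checks an invariant after each one. The sketch below is only an illustration of that idea: the two handlers, the state dictionary and the invariant are hypothetical and loosely modelled on a mode button and a bounded numeric field.

```python
import random
from typing import Dict, List

State = Dict[str, object]


def press_ctrl(state: State) -> None:
    state["mode"] = "MANUAL" if state["mode"] == "AUTO" else "AUTO"


def type_angle(state: State, angle: int) -> None:
    # Hypothetical handler: edits are ignored in AUTO mode or when out of range.
    if state["mode"] == "MANUAL" and -15 <= angle <= 15:
        state["angle"] = angle


def invariant(state: State) -> bool:
    return state["mode"] in ("AUTO", "MANUAL") and -15 <= state["angle"] <= 15


def simulate(n_events: int, seed: int) -> List[str]:
    """Fire n_events random user events and return the trace; raise on a violation."""
    rng = random.Random(seed)  # seeded so that a failing run can be replayed
    state: State = {"mode": "AUTO", "angle": 0}
    trace: List[str] = []
    for _ in range(n_events):
        if rng.random() < 0.3:
            press_ctrl(state)
            trace.append("ctrl")
        else:
            angle = rng.randint(-30, 30)
            type_angle(state, angle)
            trace.append(f"angle={angle}")
        if not invariant(state):
            raise AssertionError(f"invariant broken after {trace}")
    return trace


trace = simulate(n_events=1000, seed=2006)
print(len(trace))
```

Such a driver addresses the point about reproducing use many times, but it says nothing about how a real user would react to what the system does, which is the open question raised in the list.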
Gathering feedback from the audience and case studies
• Control Room (Ron)
• Home/Mobile: testing in non-traditional environments (Regina)
• Mining case study (Sandra)
First Case Study: Control Room
Advanced Control Room Design
Transitioning to new domains of Human System Interaction
Problem: Next generation nuclear power plants coupled with advanced instrumentation and controls (I&C), increased levels of automation and onboard intelligence all coupled with large-scale hydrogen production present unique operational challenges.
PBMR Conceptual design
Typical Design
Hybrid Controls
Example
Software Interface with:
• Cumbersome dialog box
• No discernible exits
• Good shortcuts
Example
Values shown in the matrix: 10, 1, 1, 1, 10, 0.1, 1, 1, 1, 0.1
UCC = 0.1 × 2 = 0.2
Second Case Study: Mobile interfaces
Testing Mobile Interfaces
• Lab or field
• Method selection
• Data gathering/analysis
• Problematic Area: Testing in non-traditional environment
Non Traditional Environments
• Combine and balance different UEMs according to usability/reliability issues
• Combine Lab and Field
• Select UEMs according to development phase
Third Case Study: Mining Accident
Reminder
Events & Causal Factors Analysis (ECFA)
• Provides scenario of events and causal factors that contributed to the accident
  • Chronologically sequential representation
  • Provides overall picture
  • Relation between factors
• Gain overall perspective of
  • Causal factors such as conditions (pressure, temperature…), evolution of system states
Analysing the accident
• Fatal mining accident involving human operators, piping system & control system
  • Decided to switch from North to South
  • Fuel didn’t arrive to plant kilns
  • Bled pipes while motors in operation
  • Motor speed auto-increase due to low pressure
  • Fuel hammer effect
  • Grinder exploded
ECFA Chart of the Accident
Listing of issues and solutions for interactive systems testing
• Hybrid methods (Heuristic evaluation refined (prioritisation of Heuristics))
• Remote usability testing
• Task analysis + system modelling
• Cognitive walkthrough (as is)
Towards Solutions
Formal models for supporting usability testing
Formal models for incidents and accidents analysis
Usability and human reliability analysis
Usability Heuristics
• Heuristics are key factors that comprise a usable interface (Nielsen & Molich, 1990)
• Useful in identifying usability problems
• Obvious cost savings for developers
• 9 heuristics identified for use in the present study
• In our framework, these usability heuristics are used as “performance shaping factors” to constitute a usability error probability (UEP)
Heuristic Evaluation and HRA
“Standard” heuristic evaluation
HRA-based heuristic evaluation
Heuristic Evaluation Matrix
Steps
• Determine level of heuristic
• Determine product of heuristic multipliers
• Multiply product by nominal error rate
Consequence Determination
• Strict consequence assignment in PRA/HRA, part of cut sets approach
• More molar approach taken in the present study
  • “Likely effect of usability problem on usage”
  • Not literal consequence model
• Results in usability consequence coefficient (UCC)
• Four consequence levels assigned
  • high, medium, low, and none
Usability Consequence Matrix
Steps
• Determine level of usability consequence
• Multiply UEP by consequence Multiplier
• Usability Consequence Coefficient determines priority of fix
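The two sets of steps (heuristic evaluation matrix, then usability consequence matrix) amount to a short calculation, sketched below. The nine multipliers and the consequence multiplier of 2 are read from the deck's example slide, on the interpretation that its first nine values are the heuristic multipliers. The nominal error rate of 0.01 is an assumption that does not appear on the slides; it is chosen here only because it reproduces the UEP of 0.1 used in that example. The function names are hypothetical.

```python
from math import prod
from typing import Sequence


def usability_error_probability(
    multipliers: Sequence[float], nominal_rate: float
) -> float:
    """Product of the heuristic multipliers times the nominal error rate."""
    return prod(multipliers) * nominal_rate


def usability_consequence_coefficient(
    uep: float, consequence_multiplier: float
) -> float:
    """UEP weighted by the consequence multiplier; a higher UCC means fix sooner."""
    return uep * consequence_multiplier


heuristic_multipliers = [10, 1, 1, 1, 10, 0.1, 1, 1, 1]  # nine heuristics
NOMINAL_RATE = 0.01  # assumption, not stated on the slides

uep = usability_error_probability(heuristic_multipliers, NOMINAL_RATE)
ucc = usability_consequence_coefficient(uep, consequence_multiplier=2)
print(round(uep, 6), round(ucc, 6))
```

In this reading, a multiplier of 10 marks a heuristic that is badly violated, 0.1 marks one that is well supported, and 1 is neutral; problems can then be ranked for fixing by their UCC.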
Example
Software Interface with:
• Cumbersome dialog box
• No discernible exits
• Good shortcuts
Example
Values shown in the matrix: 10, 1, 1, 1, 10, 0.1, 1, 1, 1, 0.1
UCC = 0.1 × 2 = 0.2
Listing of issues and solutions for new interaction techniques testing
Roadmap on Testing Interactive Systems
[Figure: roadmap running from 2006 (today) through 2009 to 2020, ending with the questions "No more usability problems? No more bugs?". Three tracks: Target Applications, Domains - context; Software Engineering Issues, Notations and Tools; User Interface Interaction Technique. Target applications: Automated autonomous Real-Time Systems (VAL, TCAS); Command and Control Systems; Business Applications; Web Applications; Web systems; Mobile phones; Mobile systems; Gaming; All Types of Applications. Notations and tools: B (Atelier B), Z, …; UML, E/R, …; full concurrency, dynamic instantiation, hardware/software, infinite number of states, tool support, advanced analysis techniques. Interaction techniques: No Interaction Technique; WIMP - hierarchical; Direct Manipulation; Multimodal Interaction; Augmented Reality; Tangible User Interface; Embodied UI.]
Future Plans and Announcements
• Future plans
  • Web site is set up and will be populated (slides, list of attendees, topics, …): http://liihs.irit.fr/palanque/SIGchi2006.html
• Further work
  • IFIP WG 13.5 on Human Error Safety and System Development [email protected]
  • NoE ResIST (Resilience for IST) www.resist-noe.org
  • Workshop on Testing in Non-Traditional Environments at CHI 2006
  • MAUSE: www.cost-294.org
• Announcements
  • DSVIS 2006, HCI Aero, HESSD next year
Best Achievable Practices for HR
The Human Reliability Design Triptych
Best Practices for Design
• Compliance with applicable standards and best practices documents
  • Where applicable, ANSI, ASME, IEEE, ISO, or other discipline-specific standards and best practices should be followed
• Consideration of system usability and human factors
  • System should be designed according to usability and human factors standards such as NASA-STD-3000, MIL-STD-1472, or ISO
• Iterative design-test-redesign-retest cycle
• Tractability of design decisions
  • Where decisions have been made that could affect the functions of the system, these decisions should be clearly documented
• Verified reliability of design solutions
  • Reliability of systems should be documented through vendor data, cross-reference to the operational history of similar existing systems, and/or test results
  • It is especially important to project system reliability throughout the system lifecycle, including considerations for maintenance once the system has been deployed
  • It is also important to incorporate the estimated mean time before failure into the estimated life of the system
Best Practices for Testing
• Controlled studies that avoid confounds or experimental artifacts
  • Testing may include hardware reliability testing, human-system interaction usability evaluation, and software debugging
• Use of maximally realistic and representative scenarios, users, and/or conditions
  • Testing scenarios and conditions should reflect the range of actions the system will experience in actual use, including possible worst-case situations
• Use of humans-in-the-loop testing
  • A system that will be used by humans should always be tested by humans
• Use of valid metrics such as statistically significant results for acceptance criteria
  • Where feasible, the metrics should reflect system or user performance across the entire range of expected circumstances
  • In many cases, testing will involve use of a statistical sample evaluated against a pre-defined acceptance (e.g., alpha) level for “passing” the test
• Documented test design, hypothesis, manipulations, metrics, and acceptance criteria
  • Should include the test design, hypothesis (or hypotheses), manipulations, metrics, and acceptance criteria
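As an illustration of evaluating a statistical sample against a pre-defined alpha level, the sketch below applies a one-sided exact binomial test to a task-completion count. The choice of test, the required completion rate of 0.7, the sample of 20 participants and the function names are all assumptions made for the example; the slides do not prescribe a particular statistic.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p0: float) -> float:
    """One-sided exact p-value: P(X >= successes) when the true rate is p0."""
    return sum(
        comb(trials, k) * p0**k * (1 - p0) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def passes(successes: int, trials: int, p0: float, alpha: float = 0.05) -> bool:
    # The test is passed when the data would be unlikely if the rate were only p0.
    return binomial_p_value(successes, trials, p0) < alpha


# Hypothetical study: 19 of 20 participants complete the task; required rate 0.7.
p = binomial_p_value(19, 20, 0.7)
print(f"p = {p:.4f}, pass = {passes(19, 20, 0.7)}")
```

Fixing `p0` and `alpha` before the study is what makes this an acceptance criterion rather than a post-hoc interpretation of the results.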
Best Practices for Modeling
• Compliance with applicable standards and best practices documents
  • E.g., NASA NPR 8705.5, Probabilistic Risk Assessment (PRA) Procedures for NASA Programs and Projects or NRC NUREG-1792, Good Practices for Implementing Human Reliability Analysis
• Use of established modeling techniques
  • It is better to use an existing, vetted method than to make use of novel techniques and methods that have not been established
• Validation of models to available operational data
  • To ensure a realistic modeling representation, models must be baselined to data obtained from empirical testing or actual operational data
  • Such validation increases the veracity of model extrapolations to novel domains
• Completeness of modeling scenarios at the correct level of granularity
  • A thorough task analysis, a review of relevant past operating experience, and a review by subject matter experts help to ensure the completeness of the model
  • The appropriate level of task decomposition or granularity should be determined according to the modeling method’s requirement, the fidelity required to model success and failure outcomes, and specific requirements of the system that is being designed
• Realistic model end states
  • End states should reflect reasonable and realistic outcomes across the range of operating scenarios