Automation to overcome human error: true or illusion? A Case Study: TCAS Alerts. Towards higher levels of automation in ATM. HALA! Summer School, Cursos de Verano, Universidad Politécnica de Madrid, La Granja, July 2011. Guest Lecturer: José L. Garcia-Chico, CRIDA, [email protected]



Description: Automation is not the solution for eliminating the ill-defined term "human error". Incident investigation provides the contextual situation needed to understand errors. A case study of TCAS is presented. Presented at the UPM summer courses.


Page 1: Automation to overcome human error: true or illusion?

Automation to overcome human error: true or illusion?

A Case Study: TCAS Alerts

Towards higher levels of automation in ATM. HALA! Summer School, Cursos de Verano, Universidad Politécnica de Madrid, La Granja, July 2011

Guest Lecturer: José L. Garcia-Chico, CRIDA, [email protected]

Page 2: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 3: Automation to overcome human error: true or illusion?

• Automation (= technology)
  - Any mechanical or electronic replacement of human labor (either physical or mental)
  - Sensing environmental variables
  - Data processing and decision making (by computers)
  - Action, either on the environment or by communicating information

• Aviation systems are socio-technical systems
  - Computers/machines
  - Software
  - Humans
  - Procedures and organizational processes

Use of automation is pervasive in safety-critical domains

Page 4: Automation to overcome human error: true or illusion?

Why automate? The most commonly given reasons (Hollnagel, 2004):

• Improve system performance: fast and consistent (reliable); multichannel data sensing; fatigue free
• Reduce human error: fast and accurate (reduced variability); power to process huge amounts of data; free of emotions and stress
• Reduce costs: fast; cheap computing readily available
• Enhance human abilities: offload skill-based tasks (easy and repetitive); compensate for human limitations (e.g., memory, slowness, variability)
• The cool factor: available computing power; the technological imperative (cutting edge)

Page 5: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 6: Automation to overcome human error: true or illusion?

Some definitions of human error, all referring to expectations and context:

Error will be taken as a generic term to encompass all those occasions in which a planned sequence of mental or physical activities fails to achieve its intended outcome, and when these failures cannot be attributed to the intervention of some chance agency. (Reason, 1990)

An inappropriate action, or intention to act, given a goal and the context in which one is trying to reach that goal. (Ramon, 1995)

Actions by human operators can fail to achieve their goal in two different ways: the actions can go as planned, but the plan can be inadequate, or the plan can be satisfactory, but the performance can still be deficient. (Hollnagel, 1993)

Page 7: Automation to overcome human error: true or illusion?

What is important to know about human error?

“To err is human…” (Cicero, 1st century BC)

“…to understand the reasons why humans err is science” (Hollnagel, 1993)

• Erroneous acts are inevitable
  - It is in our nature
  - They can happen to anyone, at any time, in multiple contexts
  - Some are preventable: effort should go into designing for error tolerance and error recovery

• Human error is what happens
  - It is the "what" but not the "why"; we need to understand the whole system
  - Not all errors have disastrous consequences; errors and accidents are only remotely related
  - It takes many factors acting jointly to lead a system to failure

• Human error cannot be the aim of investigations
  - It cannot be used as a tool for blaming the operator at the sharp end (Fitts' and Chapanis' early work on design error)
  - Studying human error may increase understanding of the system: a source of lessons learnt
  - Identification of errors is influenced by many factors (e.g., investigator biases, external pressures)

Page 8: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 9: Automation to overcome human error: true or illusion?

Human error taxonomies

Cognitive science helps to classify errors:
• Definition of taxonomic systems
• Interpretation of underlying psychological mechanisms
• Operator focused

Three major taxonomies:
• Based on schema activation theory (Norman, 1981)
• Based on cognitive control (SRK theory; Rasmussen, 1986)
• Based on generic error modeling (Reason, 1990)

Page 10: Automation to overcome human error: true or illusion?

Norman (1981) classification of errors based on schema activation

• Schemas (Neisser, 1976)
  - Schemas are sensory-motor knowledge structures stored in memory and used to guide behavior: efficient and low-energy
  - A hierarchy of schemas is triggered when particular conditions are satisfied

• Mental models and information guide behavior
  - Knowledge of the world directs behavior and the search for information
  - Pieces of information are sampled in order to act
  - Information serves to update internal cognitive schemas

Page 11: Automation to overcome human error: true or illusion?

Norman's (1981) classification of errors is well suited to describing skilled behavior

• Error in the formation of intention
  - Misinterpretation of the situation → wrong schema activation
  - e.g., mode error

• Error in the faulty activation of a schema
  - Due to similar trigger conditions → wrong schema activation
  - e.g., similar sequences of actions, external data-driven wrong activation

• Error in the faulty triggering of schemas
  - Activation too early or too late
  - e.g., timing in execution

Page 12: Automation to overcome human error: true or illusion?

Rasmussen's (1986) model of behaviour accounts for experience, skill, and familiarity

Page 13: Automation to overcome human error: true or illusion?

Characteristics of SRK behavior


Page 14: Automation to overcome human error: true or illusion?

Rasmussen's (1986) error categories account for experience, skill, and familiarity

• Skill-based errors
  - Attentional failure to monitor progress
  - Forgetfulness
  - Misrecognition of events (perceptual)

• Rule-based errors
  - Misapplication of good rules (tendency to apply rules after pattern matching)
  - Application of bad rules (use of inadequate shortcuts)

• Knowledge-based errors
  - Lack of knowledge and high memory load
  - Incomplete/incorrect mental models of the problem
  - Confirmation bias (people tend to seek information that confirms the chosen course of action and to avoid tests that would disconfirm the choice)

Page 15: Automation to overcome human error: true or illusion?

Reason's (1990) generic error modelling

• Skill-based level
  - Slip: an action not in accord with your intentions; a good plan but poor execution
  - Lapse: failure to carry out any action at all, tied to a failure of memory

• Rule- and knowledge-based levels
  - Mistake: the action goes as planned, but the plan itself is wrong. This is a planning failure; these are errors of judgment, inference, and the like

Page 16: Automation to overcome human error: true or illusion?

Error distribution according to Reason

• Humans are prone to slips and lapses with familiar tasks
  - 61% of errors are skill-based
  - Increased skill does not guarantee error-free performance, just different types of errors

• Humans are prone to mistakes when tasks become difficult
  - 28% of errors are rule-based
  - 11% of errors are knowledge-based, requiring novel reasoning from first principles

Approximate data obtained by averaging three studies (Reason, 1990)

Page 17: Automation to overcome human error: true or illusion?

Humans are error-prone, but… is that all?

• Operators at the sharp end are not responsible for system disasters simply because they are the last and most visible link.

• Distinction between:
  - Active errors: errors associated with the performance of front-line operators, i.e., pilots, air traffic controllers, control-room crews, etc.
  - Latent errors: related to activities removed in time and space from the direct control interface, i.e., designers, managers, maintenance, supervisors

(Reason, 1997)

Page 18: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 19: Automation to overcome human error: true or illusion?

Treatment of human error depends on the accident model in use and on the biases of the analyst

In searching for a cause, investigators tend to:
• focus on the sequence of events in the accident (accident model)
• explain why operators missed actions to prevent the accident (hindsight)
• stop at the most unreliable and least understood part (e.g., the human)
• be guided by confirmation bias

(Hollnagel 2004; Dekker 2006)

Typical reasoning pattern (figure): assume the system is basically safe and that the major contributor is the human → analyze where humans were involved → find one error (or performance variability) and assign it as the cause.

Page 20: Automation to overcome human error: true or illusion?

Three main types of accident models (Hollnagel 2004; Dekker 2006):

• Sequential model (simple linear)
  - System decomposable into parts (probability of parts; fault trees)
  - Chain of events (domino effect); Domino Theory (Heinrich, 1931)
  - Humans are just another link in the chain
  - Root cause

• Epidemiological model (complex linear)
  - System decomposable into parts
  - Latent failures (management and/or design); Swiss cheese model (Reason, 1997)
  - Pathogens activated by other factors/errors
  - Degradation of barriers/defenses

• Systemic model
  - Decomposition does not work for socio-technical systems → emergent properties
  - Socio-technical systems are non-linear
  - Accidents result from unexpected combinations (resonance) of normal performance variability; FRAM (Hollnagel, 2000)
  - Safety requires a constant ability to anticipate future events, avoiding resonance

Page 21: Automation to overcome human error: true or illusion?

Systemic view of socio-technical systems highlights humans as assets

• Systems are too complex
  - Not all situations are predictable and specified
  - Accidents are due to unexpected combinations of performance (coincidence rather than chain)
  - Safety is built through a constant ability to anticipate future events (monitoring and damping)

• Performance variability
  - Humans add variability, which is the essence of systems' success (adaptability to the unknown), but also of their failure (expectations built into the design)

Page 22: Automation to overcome human error: true or illusion?

Learning from past accidents/incidents

• A great source of lessons to be learnt… not of facts to blame

• Careful considerations to keep in mind:
  - Most people involved in accidents are neither stupid nor reckless; they do what makes sense to them at the time.
  - Be aware of possible influencing situational factors
  - Be aware of concurrences of factors (variability of performance)
  - Be aware of the hindsight bias of the retrospective analyst

Hindsight bias: possession of outcome knowledge profoundly influences the way we analyze and judge past events. It may impose on the observer a deterministic logic about the unfolding events that the individual at the time of the incident would not have had.

Page 23: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 24: Automation to overcome human error: true or illusion?

Mid-air collision at Überlingen

• Überlingen (2002): a B757 and a Tu-154 collided
  - German airspace, under Zurich control; 71 people were killed.

• Only one controller was in charge of two positions during a night shift
  - Two separate displays
  - Telephone and STCA under maintenance

• ATC clearance in opposition to what TCAS indicated, and the pilot followed it
  - ATC detected the conflict between the two aircraft late and instructed the Tu-154 to descend.
  - The TCAS units on board the Tu-154 and the B757 instructed the pilots to climb and descend, respectively.
  - The Tu-154 pilot opted to obey the controller's instruction and began a descent to FL 350, where it collided with the B757, which had followed its own TCAS advisory to descend.

Page 25: Automation to overcome human error: true or illusion?

Motivation for the case study (operational errors co-occurring with TCAS RAs)

• Classify operational errors and contextual factors in ATC in search of trends and consistency in the classification.
• Assumption: classification of errors provides understanding of system performance and organizational context → a better understanding of the possible limits of work variability
• Caution: the concept of cause, as raised by Dekker (2004), is not addressed
• Focus on one circumstance associated with OEs: the presence of a TCAS RA
• TCAS is an effective safety system, but…
  - It might disrupt the controller's SA (Brooker, 2004; Wickens et al., 1998), amplified by the fact that it changes the flight level (the number in the data block)
  - It might create inconsistent pilot and controller responses (Rome et al., 2006; Wickens et al., 1998)
• Goal: understand the procedural and informational context of OEs co-occurring with TCAS RAs.

Page 26: Automation to overcome human error: true or illusion?

TCAS – expected behavior

• For TCAS to work as designed, an immediate and correct crew response to TCAS advisories is essential.
• Regulation of TCAS: operational procedures and practices (FAA AC 120-55B)

Pilots:
• Should follow the TCAS RA, unless doing so would jeopardize the safety of the operation.
• During an RA, should not maneuver contrary to the RA based solely upon ATC instructions.
• Must report any deviation from the ATC clearance as soon as practicable after responding to the RA, and resume the previous clearance after "clear of conflict".

Controllers:
• Will not knowingly issue instructions that are contrary to RA guidance when they are aware that a TCAS maneuver is in progress.
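To make these expectations concrete, here is a minimal illustrative sketch (not part of the lecture or of any operational system; the function name and categories are invented) that checks a proposed ATC vertical instruction against the sense of an active RA, in the spirit of the rules above:

```python
# Illustrative sketch only: encodes the expectations described above as a simple
# compatibility check. Function name and categories are hypothetical, not from
# the original study or any operational system.
from typing import Optional

def clearance_vs_ra(ra_sense: str, atc_vertical_instruction: Optional[str]) -> str:
    """Classify a proposed ATC vertical instruction against an active RA.

    ra_sense                 -- 'climb' or 'descend' (direction commanded by TCAS)
    atc_vertical_instruction -- 'climb', 'descend', or None (no vertical instruction)
    """
    if atc_vertical_instruction is None:
        return "no vertical instruction"   # e.g., traffic information only
    if atc_vertical_instruction == ra_sense:
        return "compatible with RA"
    # Controllers should not knowingly issue this, and pilots should not follow
    # it in preference to the RA.
    return "contrary to RA"

# Example: the RA commands a climb while the controller issues a descent clearance
print(clearance_vs_ra("climb", "descend"))   # -> contrary to RA
```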

Page 27: Automation to overcome human error: true or illusion?

TCAS events timeline ("desired")… a hint that humans have to adapt to technology

Adapted from: Brooker, P. (2004). Thinking about downlink of resolution advisories from airborne collision avoidance systems. Human Factors and Aerospace Safety 4 (1), 49-65.

Timeline of the "desired" sequence (figure): RA issued → pilot follows the RA and deviates from the clearance (ATC SA is impaired; chances to receive an ATC clearance in opposition to the RA) → pilot notifies the deviation → ATC becomes aware of the deviation (controller provides traffic information if workload permits; the controller is not responsible for providing separation) → clear of conflict → pilot notifies the return to the clearance.

Page 28: Automation to overcome human error: true or illusion?

Methods

• Exploratory study: mapping relationships in the data
  - Tentative results, pending larger studies.

• Analysis of errors based on preliminary and final air traffic controller reports (FAA Operational Error Detection Program)

• Comprising two studies/datasets:
  - Taxonomic study: classification of initial OE incident reports (Jan-Jun 2004 period: 480 OE reports)
    - Classification of OEs based on the FAA investigation.
    - Relevance of coordination, training, proximity, time on position.
  - Case study: OEs with presence of TCAS RAs; final reports (Jan-Jun 2004 & 2005: 62 reports)
    - Use of the same classification.
    - Characterization of the TCAS RA events and the human response.

Page 29: Automation to overcome human error: true or illusion?

Results Study 1: Taxonomic Study

                  Operational Error Reports    Operational Error Classifications
Terminal Radar    162 (33.96%)                 250 (30.9%)
ARTCC             318 (66.04%)                 560 (69.1%)
TOTAL             480                          810

Page 30: Automation to overcome human error: true or illusion?

No effect of operator experience on proximity

• No statistical support to claim that proximity is higher with developmental controllers (i.e., trainees)

Page 31: Automation to overcome human error: true or illusion?

Error severity and frequency by time on shift

• No statistical significance found in the distribution of error frequencies by time on position (60 min.) (chi-square χ²(11, N = 388) = 6.575, p = 0.832); not able to claim that errors are more likely after a change of shift.
• No evidence that errors were more severe in the first minutes after taking over control (χ²(10, N = 373) = 7.27, p = 0.700)
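For readers unfamiliar with the statistic quoted above, the following sketch runs the same kind of chi-square test of independence on an invented severity-by-time-on-position table; the counts are made up for illustration and do not reproduce the study's data:

```python
# Illustrative only: a chi-square test of independence, as in the results above
# (severity vs. time on position). The counts below are invented and do NOT
# reproduce the study's data.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: severity bands; columns: 10-minute bins of time on position (example shape only)
observed = np.array([
    [12, 10,  9, 11,  8, 10],   # low severity
    [ 7,  6,  8,  5,  7,  6],   # moderate severity
    [ 3,  4,  2,  3,  4,  3],   # high severity
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")
# A large p-value, as reported in the study, gives no grounds to claim that
# error severity depends on how long the controller had been on position.
```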

Page 32: Automation to overcome human error: true or illusion?

OEs co-occurring with TCAS

                  Jan-Jun 04      Jan-Jun 05
Terminal Radar    8 (30.8%)       34 (43.6%)
ARTCC             18 (69.2%)      44 (56.4%)
TOTAL             26              78

Page 33: Automation to overcome human error: true or illusion?


Controller communication in TCAS situations varies from what is “expected”

Page 34: Automation to overcome human error: true or illusion?

Controllers' commands in TCAS situations were given in the vertical plane

Page 35: Automation to overcome human error: true or illusion?


ATC vertical commands after RA and flight deck report

Page 36: Automation to overcome human error: true or illusion?


Deviations from “expected” controller behavior

Incomplete = missing the pilot's message entirely, missing the callsign or the TCAS direction, or excessive delay. "Before" and "after" refer to the controller's action in relation to the TCAS RA event. "Traffic", "heading", or "altitude" mean that the ATCO gave traffic information, a heading change, or an altitude change.
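As a hypothetical illustration of how a report might be coded with the categories defined above (the record fields and function are invented and are not the study's actual coding instrument):

```python
# Hypothetical illustration of the coding scheme described above. Field names
# and logic are invented; this is not the study's actual coding instrument.

def code_pilot_report(report: dict) -> str:
    """Code a pilot's RA report as 'complete', 'incomplete', or 'none'."""
    if not report.get("reported"):
        return "none"                      # no pilot message at all
    required = ("callsign", "tcas_direction")
    missing = [field for field in required if not report.get(field)]
    if missing or report.get("excessive_delay"):
        return "incomplete"                # missing callsign/TCAS direction or excessive delay
    return "complete"

# Example: the pilot reported, but omitted the RA direction
example = {"reported": True, "callsign": "ABC123",
           "tcas_direction": None, "excessive_delay": False}
print(code_pilot_report(example))          # -> incomplete
```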

Page 37: Automation to overcome human error: true or illusion?

Highlights on the chain of events during TCAS RA encounters in OE reports

• Controllers issued vertical-plane clearances after the TCAS RA in 13 situations (21%).
• Controllers received incomplete information in 26 situations (43.5%) and no information in 3 (5%): opportunities for wrong decisions.
• Controllers issued vertical clearances after the TCAS RA and an incomplete pilot report in 12 situations (19.4%).
  - Opposite altitude clearance in 3 reports (4.8%)
  - Pilot reports were all late, coming after the TCAS RA and the controller clearance
• The data suggest that an opposite clearance is more likely when the controller receives incomplete pilot information.

Page 38: Automation to overcome human error: true or illusion?

Conclusions of the study

• Value of a systematic characterization of errors
  - OE classification would allow prioritization of actions
  - Helps to understand system behavior (together with the humans in it)

• Contextual factors:
  - Potentially a staffing/organizational issue (planner controller)

• Error reports concurrent with a TCAS RA:
  - OEs show patterns similar to the full dataset
  - Inconsistent pilot-controller behavior (deficient information/actions) → variability of performance
  - Incomplete/late information increases the chances of vertical clearances incompatible with the RA direction

Page 39: Automation to overcome human error: true or illusion?

Potential actions being considered

• Increase training, recreating TCAS RA situations
  - Under stress, abnormal events trigger the more familiar responses (i.e., issuing a vertical clearance). The traditional solution.

• Revisit downlinking of RAs (more automation)
  - Not an obvious solution, with important implications
  - May draw too much controller attention
  - The TCAS RA is not the most relevant piece of information; the pilot's deviation from the clearance is
  - Controller responsibility and liability implications

• Aircraft following RAs without pilot intervention (more automation)
  - Not an obvious solution either: potential for mode errors

Page 40: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 41: Automation to overcome human error: true or illusion?

Automation is not bad per se, but potential issues with automation may lead to…

• A changed role for the operators in the system: new opportunities for error
• Sources of error that are distributed/changed, but not eliminated
  - Increased operator demands (faster and more complex system, workload, memory)
  - Reliance on capabilities humans are not good at (e.g., monitoring without being in the loop)
  - Deskilling, because the opportunities to practice decrease
• Sometimes ill-adapted solutions
  - Require new skills and more knowledge from the operator
  - Remove the operator, so that unexpected situations (e.g., adapting procedures) become difficult or impossible to cope with; the operator is skillful at judging when and how to adapt performance and procedures

Page 42: Automation to overcome human error: true or illusion?

Automation of socio-technical systems

What's human error?

Taxonomies of human error

A case study: OEs co-occurring with TCAS RAs

Accident models

Take home messages

References and readings

Page 43: Automation to overcome human error: true or illusion?

Readings and references (1)


• Besnard, D., Greathead, D., & Baxter, G. (2004). When mental models go wrong: Co-occurrences in dynamic, critical systems. International Journal of Human-Computer Studies, 60, 117-128.
• BFU (2004). Uberlingen midair collision. Investigation Report AX001-1.2/02. Braunschweig, Germany: German Federal Bureau of Aircraft Accidents Investigation.
• Brooker, P. (2004). Thinking about downlink of resolution advisories from airborne collision avoidance systems. Human Factors and Aerospace Safety, 4(1), 49-65.
• Brooker, P. (2005). STCA, TCAS, airproxes and collision risk. The Journal of Navigation, 58, 389-404.
• Dekker, S. W. A. (2002). Reconstructing human contributions to accidents: The new view on error and performance. Journal of Safety Research, 33, 371-385.
• Dekker, S. (2006). The field guide to understanding human error. Brookfield, VT: Ashgate.
• Endsley, M. R., & Rodgers, M. D. (1997). Distribution of attention, situation awareness, and workload in a passive air traffic control task: Implications for operational errors and automation. DOT/FAA/AM-97/13.
• Eurocontrol (2003). Review of ACAS RA downlink: An assessment of the technical feasibility and operational usefulness of providing ACAS RA awareness on the CWP. Brussels, Belgium.
• FAA (2000). Introduction to TCAS II, Version 7. Washington, DC: US Department of Transportation.
• Garcia-Chico, J. L. (2006). A human factors analysis of operational errors in ATC: The TCAS case study. Master's thesis, San Jose State University, CA.

Page 44: Automation to overcome human error: true or illusion?

Readings and references (2)


• Hollnagel, E. (1993). The phenotype of erroneous actions. International Journal of Man-Machine Studies, 39, 1-32.
• Hollnagel, E. (2004). Barriers and accident prevention. Brookfield, VT: Ashgate.
• Neisser, U. (1976). Cognition and reality: Principles and implications of cognitive psychology. San Francisco: Freeman.
• Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88(1), 1-15.
• Nunes, A., & Laursen, T. (2004). Identifying the factors that led to the Ueberlingen mid-air collision: Implications for overall system safety. Proceedings of the 48th Annual Meeting of the Human Factors and Ergonomics Society (pp. 20-24). New Orleans, LA.
• Parasuraman, R. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230-253.
• Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, 30(3), 286-297.
• Pounds, J., & Ferrante, A. S. (2005). FAA strategies for reducing operational error causal factors. In B. Kirwan, M. Rodgers, & D. Schafer (Eds.), Human factors impact in air traffic management (pp. 89-105). Aldershot, UK: Ashgate.
• Pritchett, A. R. (2001). Reviewing the role of cockpit alerting systems. Human Factors and Aerospace Safety, 1(1), 5-38.
• Rasmussen, J. (1982). Human errors: A taxonomy for describing human malfunction in industrial installations. Journal of Occupational Accidents, 4, 311-333.
• Rasmussen, J. (1986). Information processing and human-machine interaction. Amsterdam: North-Holland.

Page 45: Automation to overcome human error: true or illusion?

Readings and references (3)


• Reason, J. T. (1990). Human error. Cambridge, UK: Cambridge University Press.
• Reason, J. T. (1997). Managing the risks of organizational accidents. Aldershot, England: Ashgate.
• Shorrock, S. T., & Kirwan, B. (1998). The development of TRACEr: A technique for the retrospective analysis of cognitive errors in ATM. 2nd Conference on Engineering Psychology and Cognitive Ergonomics (pp. 28-30), Oxford.
• Wickens, C. D., & Hollands, J. G. (2000). Engineering psychology and human performance (3rd ed.). Upper Saddle River, NJ: Prentice Hall.
• Wickens, C. D., Mavor, A. S., Parasuraman, R., & McGee, J. P. (1998). The future of air traffic control: Human operators and automation. Washington, DC: National Academy Press.
• Wiegmann, D. A., & Shappell, S. A. (2003). A human error approach to aviation accident analysis: The human factors analysis and classification system. Aldershot, UK: Ashgate.
• Woods, D. D., & Cook, R. I. (2002). Nine steps to move forward from error. Cognition, Technology & Work, 4, 137-144.

Page 46: Automation to overcome human error: true or illusion?

Centro de Referencia I+D+i ATM
