
Int. J. Man-Machine Studies (1987) 27, 463-470

Cognitive aids in process environments: prostheses or tools?

JAMES REASON

Department of Psychology, University of Manchester, Manchester M13 9PL, U.K.

Human fallibility in one form or another is the major contributor to catastrophic failures in complex and hazardous process environments. Few would disagree with this assertion, especially in the aftermath of Chernobyl. Nor would many quarrel with the claim that human operators need more help in operating such systems, particularly during disturbances. Where opinion divides, however, is on such questions as:

* Why this help is needed,
* Who should have it, and
* What forms it should take.

1. Two ways of looking at cognitive aids

1.1. THE SANGUINE VIEW

The optimists believe that the problem of operator error in process environments will ultimately have a technical solution. The same exponential growth in computer technology that made centralized, supervisory control possible in the first place can also provide the "cognitive tools"--felicitous extensions of normal brain-power--that will enable operators to deal successfully with its current problems of complexity and opacity. Such optimists, particularly if they are systems designers, might also wish to claim that, in any case, human involvement in such systems (and hence the attendant risk of dangerous operator errors) will gradually decline as the possible scenarios of failure are better understood and further non-human methods of coping with them are devised.

1.2. A DARKER VIEW

A more pessimistic view, and the one espoused here, is that most operator errors arise from a mismatch between the properties of the system as a whole and the characteristics of human information processing. System designers have unwittingly created a work situation in which many of the normally adaptive characteristics of human cognition are transformed into dangerous liabilities. And--the argument continues--since the problem is fundamental to the design of such systems, and since (short of closure) these installations are likely to be with us for many years to come, the only immediate remedy is to provide the human participants, both operators and maintenance personnel, with cognitive prostheses (or, in plainer English, mental "crutches") that will help to compensate for some of these artificially-enhanced error tendencies.

2. Analysing the human error problem

This section will focus upon analysing the nature of the human error problem in process environments. I will look briefly at three issues: (1) fundamental difficulties associated with human supervisory control, (2) the way in which complex systems fail, and (3) the distinction between active and passive error forms. Subsequently, I will discuss the implications of this analysis for planning our future efforts at error reduction and containment.

2.1. THE "CATCH-22" OF HUMAN SUPERVISORY CONTROL

Human cognition is supremely good at modelling the regularities of its previous transactions with specific environments, and at using these stored representations as a basis for the automatic control of subsequent perception and action. It does this in the reasonable expectation that the recurrences of the past provide a fair guide to the likelihoods of the future. And, in the course of "normal" everyday life, that is generally the case. But it is manifestly not true for the operators of complex process systems.

Despite some claims and many appearances to the contrary, the basic task of the process controller is to cope with emergencies. Operators are there because system designers cannot foresee all possible scenarios of failure, and hence are not able to provide engineered safety measures for every contingency.

In addition to their cosmetic value, human beings owe their inclusion in hazardous systems to their unique, knowledge-based ability to carry out "on-line" problem-solving in novel situations. Ironically--and notwithstanding the Apollo 13 astronauts and other exceptionally talented "jury-riggers"--they are not especially good at it; at least not in the conditions usually prevailing during system emergencies. One reason for this is that stressed human beings are strongly disposed to employ the effortless, parallel, pre-programmed operations of highly specialized low-level processors and their associated heuristics. These stored routines are shaped by personal history and reflect the recurring patterns of past experience.

The first part of the "catch" is thus revealed: Why do we have operators in complex systems? To cope with emergencies. What will they actually employ to deal with these problems? Stored routines based on previous interactions with a specific environment. What, for the most part, is their experience within the control room? Tweaking the plant while it is operating within safe limits.

One apparent solution to this problem would be to spend a large part of an operator's shift time drilling him in the diagnostic and recovery lessons of previous system emergencies. And this brings us to the second part of the "catch": It is in the nature of complex, tightly-coupled, highly interactive and partially understood process systems to spring nasty surprises. Even if it were possible to build up--through simulation or game-playing--an extensive repertoire of recovery routines within operating crews, there is no guarantee that they would be relevant, other than in a very general sense, to some future event. As case studies repeatedly show, accidents may begin in a conventional way, but they rarely proceed along predictable lines. Each incident is a truly novel event in which past experience counts for little, and where the plant is returned to a safe state by a mixture of good luck and laborious, resource-limited, knowledge-based processing. Error is inevitable. Whereas in the more forgiving circumstances of "normal" life, learning from one's mistakes is usually a beneficial process, in the control rooms of chemical or nuclear power plants, such educative experiences can have unacceptably catastrophic consequences.


2.2. HOW COMPLEX SYSTEMS FAIL

Perhaps the most important lesson to be learned from past accidents is that the principal cause tends to be neither the isolated malfunctioning of a major component nor a single gross blunder, but the unanticipated and largely unforeseeable concatenation of several small failures, both mechanical and human (see Bignall, Peters & Pym, 1977; Turner, 1978; Rolt, 1978; Perrow, 1984; Kasputin, 1986). Each failure alone could probably be tolerated by the system. What produces the calamitous outcome is their unnoticed and often mysterious interaction.

Small failures of either human or mechanical origin are omnipresent within complex man-made systems in much the same way as the human body always contains within it a variety of toxic substances and pathogenic agencies. For the most part, they are either tolerated or kept in check by protective measures. But every now and again a set of circumstances occurs which permits these "resident pathogens" to thwart the defences, thus making the system vulnerable to threats that could otherwise have been withstood. The more complex, centralized, interactive, tightly-coupled and opaque the system, the more liable it is to what Perrow (1984) has called "normal accidents" and Wagenaar (1987) has termed "impossible accidents".

2.3. ACTIVE VERSUS PASSIVE HUMAN FAILURES

To accommodate human error types within this "pathogen" view of systems failure, it is useful to follow Rasmussen and Pedersen's (1982) distinction between active and passive failures. The former are operator errors that initiate a major system breakdown, or are committed during attempts to recover from such a breakdown. Passive failures are errors in design, construction, installation, planning or maintenance which create some latent condition, or "resident pathogen", within the system.

A recent survey of the nuclear industry (Rasmussen, 1980) revealed: (a) that passive errors are far more common than active errors; and (b) that, of these, omissions of isolated actions in maintenance-related activities comprise the largest single category. There is no reason to suppose that these findings are unique to nuclear power plant incidents. Although control room operators are the "stars" of system emergencies, they are often merely the inheritors of problems created "off-stage" by maintenance, installation and construction personnel. These people generally have a much greater opportunity for subverting safety devices, and for seeding the system with the minor failures that subsequently merge into dangerous combinations.

3. Some conclusions and implications

1. The "Catch-22" of human supervisory control in current installations makes active errors inevitable during attempts to recover from major plant disturbances. This conclusion is amply borne out by the case study evidence (see Reason, 1986a). The implications of this for the design of cognitive aids are two-fold:

(a) In the short term, we must provide operators of existing systems with aids that will improve their chances of detecting and recovering from these unavoidable errors.


(b) In the longer term, it is necessary to go back to the drawing board and create a new generation of systems with the basic properties of human cognition firmly in mind from the outset. For this, system designers need to be provided with a set of "working approximations" (see Card, Moran & Newell, 1983) concerning the basic properties of human cognition.

2. The history of complex system failures indicates that they arise from the catastrophic combination of trivial failures. This poses two distinct problems for the reliability community:

(a) It must develop a way of gauging the likelihood of such an accident within any given plant. Being a complex function of several interwoven factors (e.g. miscommunication, inadequate training, poor design, faulty installation, botched maintenance, etc), these "impossible" accidents are not amenable to conventional probabilistic risk assessments.

(b) It must also find a way of identifying both the major sources of these pathogens and the often unlikely channels along which they breed. Effective preventive action can only be based upon an appreciation of the true nature of complex system pathology. And this, it is believed, will derive more from holistic analyses of human-system interaction than from the dubious combination of failure probabilities for individual components or actions.

3. The error surveys show that passive failures--the "resident pathogens" in the system--constitute a greater risk to the safety of complex systems than the active errors committed by operators. Further, the nature of the most prevalent of these passive errors--simple omissions--suggests that our current emphasis on intelligent decision support systems may be misplaced.

Leaving aside the issue of whether such devices are feasible, or even desirable, we must not allow the lure of longer-term "high-tech" solutions to blind us to what can be achieved immediately by the application of simple, well-understood, "off-the-shelf" remedies (e.g. the proper application of old-fashioned ergonomics). The best cognitive aids--good visibility, intelligible instruments, shopping lists, diaries, spreadsheets, calculators, etc.--are more noted for their utility and availability than for their technological "sex appeal". And, in the case of a large class of passive errors, one obvious counter-measure is both "low-tech" and readily available: Provide mechanical maintenance personnel with memory aids. The nature of such memory aids and the manner in which they might be implemented on lapheld computers has been discussed at length elsewhere (Reason, 1985).
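To make the "low-tech" point concrete, the sketch below shows the kind of omission-catching checklist such a memory aid might implement. It is a minimal illustration only, assuming an invented reassembly procedure and invented function names; it is not drawn from Reason (1985).

```python
# Illustrative sketch of a step-tracking memory aid for a maintenance task.
# The procedure steps and names are invented for the example.

REASSEMBLY_STEPS = [
    "refit pump impeller",
    "replace gasket",
    "torque casing bolts",
    "remove blanking plate",
    "restore isolation valve to open",
]

def outstanding_steps(completed):
    """Return required steps not yet ticked off, in procedure order."""
    done = set(completed)
    return [step for step in REASSEMBLY_STEPS if step not in done]

def sign_off(completed):
    """Refuse sign-off while any step is still outstanding."""
    missing = outstanding_steps(completed)
    if missing:
        return False, "Omitted steps: " + "; ".join(missing)
    return True, "All steps recorded. Task may be signed off."

if __name__ == "__main__":
    ticked = ["refit pump impeller", "replace gasket", "torque casing bolts"]
    ok, message = sign_off(ticked)
    print(message)   # flags the two omitted restoration steps
```

Nothing here goes beyond a shopping list: the value lies simply in having the outstanding restoration steps displayed at sign-off, the point at which omissions are most likely to slip through.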

4. A way forward

In the preceding analysis, I distinguished (a) between active and passive errors; and (b) between short-term and long-term strategies for tackling the problem of human error in process environments. It would be convenient at this point to try to encapsulate these distinctions in a tentative prescription for future action.

4.1. SHORT-TERM MEASURES

In dealing with human error in existing process plants, the weight of evidence strongly suggests that we would be better advised to concentrate our efforts upon minimizing passive rather than active errors. This means focusing upon maintenance personnel rather than on control room operators, and providing memory aids now rather than awaiting the development of intelligent decision support systems at some time in the future. Aside from anything else, it behoves us to avoid the "Star Wars" (SDI) trap of putting most of our eggs into an untried and uncertain technological basket, no matter how intellectually and commercially attractive that basket may appear.

Where the current generation of control room crews most need help is in deciding whether one or more of their number has made a slip, or whether they are currently pursuing some mistaken plan of action. In short, they need detection and recovery aids. And it is possible that this end might best be served by providing operators with a better appreciation of (a) the failure characteristics of complex systems; and (b) basic human error tendencies. This could be achieved by using a combination of simple instruction and realistic simulation (e.g. DYLAM), embodying not only the dynamic properties of the plant but also some approximate model of the human operator.

The evidence both from actual and from simulated nuclear power plant (NPP) emergencies indicates that while crews are reasonably accurate in their initial-state diagnoses, they frequently fail to appreciate how this starting condition changes during the course of attempted recovery (Woods, 1984). Instead of adjusting their assessments to match the changing conditions of the plant, they tend to remain "fixated" upon their original hypothesis. When finally forced to abandon this initial view, their subsequent behaviour is characterized by a succession of narrow concerns in which their limited attention is captured by one "hot issue" after another. Help is clearly needed in alerting crews both to their natural human tendency to stick with early diagnoses (confirmation bias), and to the dynamic properties of the destabilized plant.
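The fixation pattern described above can be pictured as an asymmetry in how disconfirming evidence is weighted. The toy calculation below is a deliberately simplified sketch with invented numbers (it is not DYLAM, nor any published operator model); it shows how even a modest discounting of evidence against the initial diagnosis keeps confidence in that diagnosis high long after an unbiased assessment would have abandoned it.

```python
# Toy illustration of diagnostic "fixation": an assessor who discounts
# evidence against the initial hypothesis updates far more slowly than an
# unbiased one. All numbers and the discount factor are invented.

def update(prior, likelihood_h, likelihood_alt, discount=1.0):
    """One Bayesian step for hypothesis H versus its alternative.
    A `discount` below 1 weakens evidence that favours the alternative."""
    if likelihood_alt > likelihood_h:   # disconfirming evidence is down-weighted
        likelihood_alt = likelihood_h + discount * (likelihood_alt - likelihood_h)
    return prior * likelihood_h / (
        prior * likelihood_h + (1 - prior) * likelihood_alt)

# An evidence stream that in fact favours the alternative diagnosis.
evidence = [(0.2, 0.8)] * 6    # (P(datum | H), P(datum | alternative))

unbiased = fixated = 0.9       # both start confident in the initial diagnosis
for lh, la in evidence:
    unbiased = update(unbiased, lh, la)
    fixated = update(fixated, lh, la, discount=0.1)

print(f"unbiased belief in the initial diagnosis: {unbiased:.3f}")  # ~0.002: abandoned
print(f"fixated belief in the initial diagnosis:  {fixated:.3f}")   # ~0.651: still dominant
```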

4.2. LONGER-TERM MEASURES

In order to combat the active errors, derived in large measure from the human-system mismatch within existing installations, it is necessary to provide designers with a better appreciation of the strengths and weaknesses of human cognition. Perhaps the best known attempt along these lines was Fitts List (Fitts, 1951). Since the issue is central to the present paper, it would be instructive to devote a brief space to recalling what Fitts was trying to achieve, and why the venture failed.

Fitts produced a two-column list in which one column was headed "man" and the other "machine", and where the properties of each were cross-compared with regard to speed, power, consistency, memory, computational power, and so on. By comparing the relative strengths and weaknesses of men and machines, Fitts sought to provide a rational basis for allocating functions between them in the increasingly complex man-machine systems of the post-war years. At the time, it was seen as providing an elegant solution to long-standing design problems: first identify the functions of a proposed system, then assign to the human and mechanical components only those tasks for which each was best suited. Several textbooks of the new discipline of human engineering gave it top billing, and were quick to offer extensions upon its basic theme. But then disenchantment set in.


Nehemiah Jordan (1968) has indicated some of the reasons why Fitts List fell from favour. The problem was not with the facts of the list--they were correct enough within the limits of their period--but with the underlying notion of comparing humans and machines. Men and machines, Jordan argued, are not comparable, they are complementary: ". . . if we try to abstract the underlying commonalities (of the items in Fitts List) . . . we find that they really make one point and only one point. Men are flexible but cannot be relied upon to perform in a consistent manner, whereas machines can be depended upon to perform consistently but they have no flexibility whatsoever. This can be summarized simply, and seemingly tritely, by saying that men are good at doing what machines are not good at doing and machines are good at doing that which men are not good at doing" (Jordan, 1968). Thus, the credibility of Fitts List foundered on a simple paradox: If a task could be described exactly (i.e. in mathematical terms), then a machine should perform it; if not, it could only be tackled using the ill-defined flexibility of a human being. This, then, was the design philosophy that governed the conception of many of the systems still operating today, and lies at the heart of the "Catch-22" described earlier.

Much has changed in the intervening period. Machines are clearly a great deal smarter now, and perform many activities that were hitherto exclusively the province of humans. We also know much more about the basic features of cognition. Taken together, these developments provide grounds for re-evaluating the positions taken both by Fitts and by his principal critics.

While it is still true that not even the smartest of present-day "intelligent" devices can outstrip human beings in adaptable problem solving, cognitive research over the past decade has demonstrated that our much-vaunted capacity for flexible reasoning is a flawed instrument. Given the time and encouragement to explore all aspects of a problem space, we often come up with satisfactory answers--but not before we have made a variety of fairly predictable mistakes.

Where real progress has been made is in our improved understanding of how these errors originate. They, like other varieties of recurrent error, are rooted in a tendency to over-utilize what is probably the most conspicuous achievement of human cognition: its ability to simplify complex informational tasks by resorting to pre-established routines, heuristics and short-cuts. This is dependent upon our remarkable ability to match stored representations to environmental "calling conditions", and to resolve conflicts between partially matched structures using a simple "most used, most likely" rule. The natural bias towards favouring the most frequent of the contextually appropriate contenders is a consequence of the cognitive system's automatic facility for keeping a rough running tally of how often a particular event or object has been encountered in the past.

The same Nehemiah Jordan also made the observation that whereas machines botch up, humans degrade gracefully. Human cognition is extremely good at "making do" with incomplete or inadequate information. This under-specification of mental operations can arise for many different reasons: incomplete or "noisy" sensory inputs, patchy knowledge, insufficient attention to ongoing action, "spillage" from working memory, fragmentary retrieval cues and stressors that pre-empt higher-level control mechanisms (see Reason, 1986b). Notwithstanding these possible varieties of under-specification, their consequences are remarkably consistent: what emerge are perceptions, plans, diagnoses, thoughts, recollections and actions that were commonplace in previous dealings with that particular environment. In short, the system gambles in favour of the most frequent past outcome. This "frequency-gambling" heuristic, together with "similarity-matching" (relating cues to stored events on the basis of shared features), constitute the primitives of the cognitive system. Add to these the powerful and pervasive "confirmation bias", and we are in a position to predict the qualitative forms of most human errors.
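These two primitives lend themselves to a very compact computational statement. The sketch below is a minimal illustration with an invented miniature knowledge base: candidate schemata are scored by the overlap between their calling conditions and the current cues (similarity-matching), and ties on similarity are resolved in favour of the most frequently encountered candidate (frequency-gambling, the "most used, most likely" rule).

```python
# Minimal sketch of the two retrieval primitives described above.
# Each stored "schema": name -> (set of calling conditions, encounter count).
KNOWLEDGE_BASE = {
    "loss of feedwater": ({"low level", "alarm", "pump trip"}, 40),
    "tube rupture":      ({"low level", "alarm", "radiation"},  3),
    "instrument fault":  ({"alarm"},                           120),
}

def retrieve(cues):
    """Return the schema whose calling conditions best match the cues,
    breaking ties by frequency of past encounter."""
    cues = set(cues)
    best = None
    for name, (conditions, frequency) in KNOWLEDGE_BASE.items():
        overlap = len(cues & conditions)   # similarity-matching
        score = (overlap, frequency)       # frequency-gambling breaks ties
        if best is None or score > best[0]:
            best = (score, name)
    return best[1]

print(retrieve({"low level", "alarm"}))
# Both plant faults match the cues equally well; the more frequently
# encountered "loss of feedwater" wins the gamble.
```

With cues that fit two plant faults equally well, the more familiar diagnosis wins the gamble: precisely the bias that is adaptive under normal conditions and becomes a liability during a rare emergency.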

This brief review of the current cognitive scene leads to an obvious question. Are we now in a position to do any better than Paul Fitts and his disciples did in the post-war years? Can we "give away" to designers a more comprehensive and accurate account of human cognition than was then available? The answer is a qualified "yes". It is qualified because there is not yet a single agreed theoretical framework for cognition (though there are broad areas of agreement). It is affirmative because, even without this consensus, many of its fundamental properties have now been identified. And, as Norman (1985) and others (see Hinton & Anderson, 1981; Anderson, 1983; Baars, 1983; McClelland & Rumelhart, 1985) have pointed out, the picture that emerges is of a cognitive system which, though extremely good at internalizing (as stored knowledge) the complexity of the world it inhabits, is, in essence, driven by a limited number of relatively simple computational principles.

References

ANDERSON, J. R. (1983). The Architecture of Cognition. Cambridge, MA: Harvard University Press.

BAARS, B. J. (1983). Conscious contents provide the nervous system with coherent global information. In: DAVIDSON, R. J., SCHWARTZ, G. E. & SHAPIRO, D. Eds, Consciousness and Self-Regulation, Vol. 3. New York: Plenum.

BIGNALL, V., PETERS, G. & PYM, C. (1977). Catastrophic Failures. Milton Keynes: Open University Press.

CARD, S. K., MORAN, T. P. & NEWELL, A. (1983). The Psychology of Human-Computer Interaction. Hillsdale, NJ: Erlbaum Associates.

FITTS, P. M. (1951). Human engineering for an effective air navigation and traffic control system. Washington, D.C.: National Research Council.

HINTON, G. E. & ANDERSON, J. A. (1981). Parallel Models of Associative Memory. Hillsdale, NJ: Erlbaum Associates.

JORDAN, N. (1968). Themes in Speculative Psychology, p. 203. London: Tavistock.

KASPUTIN, R. (1986). Application of Team Concept/Systems Approach to Investigation of Major Mishaps. Pre-workshop draft. NATO Advanced Research Workshop, Bad Windsheim, 18-22 August.

McCLELLAND, J. L. & RUMELHART, D. E. (1985). Distributed memory and the representation of general and specific information. Journal of Experimental Psychology: General, 114, 159-188.

NORMAN, D. A. (1985). New views of information processing: Implications for intelligent decision support systems. In: HOLLNAGEL, E., MANCINI, G. & WOODS, D. Eds. Intelligent Decision Aids in Process Environments. San Miniato, Italy: NATO Advanced Study Institute preprints.

PERROW, C. (1984). Normal Accidents: Living With High-Risk Technologies. New York: Basic Books.


RASMUSSEN, J. (1980). What can be learned from human error reports? In: DUNCAN, K., GRUNEBERG, M. & WALLIS, D. Eds. Changes in Working Life. London: John Wiley.

RASMUSSEN, J. & PEDERSEN, O. M. (1982). Formalized search strategies for human risk contributions: a framework for further development. Riso-M-2351. Roskilde, Denmark: Riso National Laboratory.

REASON, J. T. (1986a). Catastrophic combinations of trivial errors. In: Cox, Ed., The Psychology of Occupational Safety and Accidents. London: Taylor & Francis. In press.

REASON, J. T. (1986b). Cognitive under-specification: its varieties and consequences. In: BAARS, B. Ed., The Psychology of Error: A Window on the Mind. New York: Plenum. In press.

REASON, J. T. (1985). Maintenance-related omissions: a major source of performance problems in nuclear power plant operations. Unpublished paper.

ROLT, L. T. C. (1978). Red For Danger. London: Pan Books.

TURNER, B. A. (1978). Man-Made Disasters. London: Wykeham Publications.

WAGENAAR, W. & GROENEWEG, J. (1987). Accidents at sea: Multiple causes and impossible consequences. International Journal of Man-Machine Studies, 27, 586-597.

WOODS, D. D. (1984). Some results on operator performance in emergency events. Institute of Chemical Engineers Symposium Series No. 90: 21-13.