109
Technical Report UMTRI-93-12 Measures and Methods Used to Assess the Safety and Usability of Driver Information Systems Paul Green April, 1993 (revised March, 1994, August, 1994) FHWA-RD-94-088

Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Technical Report UMTRI-93-12

Measures and Methods Used to Assess the Safety and Usability

of Driver Information Systems

Paul Green

April, 1993(revised March, 1994,

August, 1994)FHWA-RD-94-088

Page 2: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

FOREWORD

NOTICE

This document is disseminated under the sponsorship of the Department ofTransportation in the interest of information exchange. The United States Governmentassumes no liability for its contents or use thereof. This report does not constitute astandard, specification, or regulation.

The United States Government does not endorse products or manufacturers.Trademarks or manufacturers' names appear herein only because they are consideredessential to the object of the document.

The contents of this report reflect the views of the author, who is responsible for thefacts and accuracy of the data presented herein. The contents do not reflect the officialpolicy of the Department of Transportation.

Page 3: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Paul Green

1. Report No. 2. Government Accession No. 3. Recipient's Catalog No.

5. Report Date

6. Performing Organization Code

8. Performing Organization Report No.

10. Work Unit No. (TRAIS)

11. Contract or Grant No.

14. Sponsoring Agency Code

13. Type of Report and Period Covered

19. Security Classif. (of this report) 20. Security Classif. (of this page) 21. No. of Pages 22. Price

18. Distribution Statement17. Key Words

4. Title and Subtitle

7. Author(s)

9. Performing Organization Name and Address

12. Sponsoring Agency Name and Address

15. Supplementary Notes

16. Abstract

Technical Report Documentation Page

The University of Michigan Transportation Research Institute2901 Baxter Road, Ann Arbor, Michigan 48109-2150

UMTRI-93-12

unclassified

account 080066

MEASURES AND METHODS USED TO ASSESS THE SAFETY AND USABILITY OF DRIVER INFORMATION SYSTEMS

IVHS, human factors engineering, ergonomics, navigation, driverinformation systems, test methods

DTFH61-89-C-00044

This report concerns in-car systems that may be used to present navigation, hazard warning, vehicle monitoring, traffic, and other information to drivers in cars of the future. It describes in detail measurements researchers have made to determine if those systems are safe and easy to use. In particular, it summarizes previous reviews by DRIVE task forces, Zaidel, and Robertson and Southall, as well as several key research efforts on this topic (work by Senders, Wierwille, Godthelp, Zwahlen, Quimby, Noy, Allen, and others). Measures that appear most promising for safety and usability tests of driver information systems include the standard deviation of lane position, speed, speed variance, and the mean and frequency of driver eye fixations to displays and mirrors. In some cases, laboratory measures (errors, etc.) may also be useful. Also of interest are time-to-collision and time-to-line crossing, although hardware for readily measuring them in real time is not available. Of lesser utility are workload estimates (SWAT, TLX). Secondary task measures and physiological measures are very weak predictors of safety and usability. To assess usability, application-specific measures (e.g., the number of wrong turns made in using a navigation system) should also be collected.

i

115

No restrictions. This document is available to the public through the National Technical Information Service, Springfield, Virginia 22161.

Final ReportSeptember 1991- May 1993

Office of Safety and Traffic Operations R&DFederal Highway Administration6300 Georgetown PikeMcLean, Virginia 22101-2296

This research was jointly funded by the Federal Highway Administrationand the National Highway Traffic Safety Administration. Contracting Officer's Technical Representative (COTR): Nazemeh Sobhi, HSR-30.

FHWA-RD-94-058

unclassified

Page 4: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

ii

PREFACE

PROJECT OVERVIEW

The United States Department of Transportation (DOT), through its Intelligent Vehicle-Highway Systems (IVHS) program, is aiming to develop solutions to the most pressingproblems of highway travel. The goal is to reduce congestion and improve trafficoperations, reduce accidents, and reduce air pollution from vehicles by applyingcomputer and communications technology to highway transportation. If these systemsare to succeed in solving the nation's transportation problems, they must be safe andeasy to use, with features that enhance the experience of driving. The University ofMichigan Transportation Research Institute (UMTRI), under contract to DOT, carried outa project to help develop IVHS-related driver information systems for cars of the future.This project concerns the driver interface, the controls and displays that the driverinteracts with, as well as their presentation logic and sequencing.

The project had three objectives:

• Provide human factors guidelines for the design of in-vehicle information systems.

• Provide methods for testing the safety and ease of use of those systems.

• Develop a model that predicts driver performance in using those systems.

Although only passenger cars were considered in the study, the results apply to lighttrucks, minivans, and vans as well because the driver population and likely use aresimilar to cars. Another significant constraint was that only able-bodied drivers wereconsidered. Disabled and impaired drivers are likely to be the focus of future DOTresearch.

A complete list of the driver interface project reports and other publications is included inthe final overview report, 1 of 16 reports that document the project.[1] (See also Green,Serafin, Williams, and Paelke, 1991 for an overview.)[2] To put this report in context, theproject began with a literature review and focus groups examining driver reactions toadvanced instrumentation.[3,4,5] Subsequently, the extent to which various driverinformation systems might reduce accidents, improve traffic operations, and satisfydriver needs and wants, was analyzed.[6,7] That analysis led to the selection of twosystems for detailed examination (traffic information and car phones) and contractualrequirements stipulated three others (route guidance, road hazard warning, and vehiclemonitoring) likely to appear in future vehicles.

Each of the five systems selected was examined separately in a sequence ofexperiments. In a typical sequence, patrons at a local driver licensing office wereshown mockups of interfaces, and driver understanding of the interfaces andpreferences for them was investigated. Interface alternatives were then compared inlaboratory experiments involving response time, performance on driving simulators, andpart-task simulations. The results for each system are described in a separate report.(See references 8, 910111213through 14.) To check the validity of those results, several on-

Page 5: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

iii

road experiments were conducted in which performance and preference data for thevarious interface designs were obtained.[15,16]

Concurrently, UMTRI developed test methods and evaluation protocols, UMTRI andBolt Beranek and Newman (BBN) developed design guidelines, and BBN worked on thedevelopment of a model to predict driver performance while using in-vehicle informationsystems. (See references 171819 through 20.)

Many of the reports from this project were originally dated May, 1993, the contractualend date of the project whereby reports were to be delivered. However, the reportswere actually drafted when the research was conducted, more than two years earlier forthe literature review and feature evaluation, and a year earlier for the laboratoryresearch and methodological evaluations. While some effort was made to reflectknowledge gained as part of this project, the contract plan did not call for rewritingreports to reflect recent findings.

THIS REPORT

This report is one of two concerning the testing of the safety and ease of use of driverinterfaces. It also touches upon issues relating to comfort, convenience, andconfidence, but there is very little information in the literature on those issues as theyrelate to driver interfaces.

The bulk of this report is devoted to a review of the methods and measures forassessing safety and ease of use of IVHS-related driver information systems. Becauseof dissemination constraints, it is quite likely that coverage of the DRIVE andPROMETHEUS programs is incomplete. Very little is known in the U.S. about work inJapan. Participants in programs in Europe and Japan are encouraged to send theauthor copies of reports and papers that are pertinent to this review.

This report is written for scientists conducting automotive human factors research,though some practitioners interested in evaluation may find this report to be of interest.Those scientists may be working in academia, industry, or for government agencies.Within the Department of Transportation, both the Federal Highway Administration(FHWA) and the National Highway Traffic Safety Administration (NHTSA) are interestedin this research, with NHTSA having expressed the greatest interest. Accordingly, thisreport emphasizes safety issues.

A secondary audience is human factors scientists with expertise in defenseapplications, but little knowledge of automotive applications. Their interest is the resultof federal policy decisions to foster defense conversion to civilian applications. Toprovide context for defense scientists, reviews of key studies have been included in thisreport, as well as a detailed tabular summary of all the studies on navigation systemsand related topics.

Serving as a companion to this report is a subsequent report that describes suggestedassessment protocols (Green, 1993).[18] That subsequent report is written primarily forpractitioners interested in conducting assessments of driver interfaces, many of whomwill be automotive human factors engineers. It is likely that human factors scientists will

Page 6: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

iv

have a keen interest in that report as well. These two reports were produced asseparate documents because the audiences were different and to expedite release ofthe material.

Page 7: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

v

insert metric page here

Page 8: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

vi

TABLE OF CONTENTS

INTRODUCTION .................................................................................1BACKGROUND..................................................................................................1PURPOSE AND SCOPE OF THIS REPORT................................................1

HOW DO PEOPLE DRIVE CARS? .........................................................5

PREVIOUS REVIEWS SIMILAR TO THIS ONE.........................................9DRIVE SAFETY TASK FORCE (1991) ..........................................................9DRIVE 2 HARDIE PROJECT (STEVENS AND PAUZIE, 1992) ...............11ZAIDEL (1991)...................................................................................................12ROBERTSON AND SOUTHALL (1992)........................................................14SUMMARY.........................................................................................................14

WHAT ARE SOME OF THE KEY STUDIES OF DRIVER INFORMATION SYSTEMS? .....................................................................................17

SENDERS' OCCLUSION STUDIES ..............................................................17HICKS AND WIERWILLE (1979) - COMPARISON OF WORKLOAD MEASURES.....................................................................................................19GODTHELP'S RESEARCH ON TIME-TO-LINE CROSSING....................20ZWHALEN'S WORK ON OCCLUSION.........................................................22QUIMBY (1988) - UNSAFE DRIVER ACTIONS ..........................................35VAN DER HORST AND GODTHELP (1989) - TIME-TO-COLLISION.....36WIERWILLE AND HIS COLLEAGUES - IN-VEHICLE STUDIES OF NAVIGATION............................................................................................37BOS, GREEN, AND KERST (1988) - LABORATORY ASSESSMENTOF DISPLAYS ...................................................................................................42NOY'S (1987) REVIEW OF SECONDARY TASK METHODS..................42NOY (1989, 1990) - ALTERNATIVE METHODS FOR ASSESSING WORKLOAD....................................................................................................46HARDEE, JOHNSON, KUIPER, AND THOMAS (1990).............................49ALLEN, STEIN, ROSENTHAL, ZIEDMAN, TORRES, AND HALATI (1991) - LABORATORY SIMULATION...............................50VERWEY (1991)................................................................................................51DINGUS, HULSE, KRAGE, SZCZUBLEWSKI, AND BERRY (1991).......56TRAVTEK EVALUATION.................................................................................57CLOSING COMMENT ON MEASURES AND METHODS.........................58

TABULAR SUMMARIES OF METHODS AND MEASURES .......................59

WHICH MEASURES HAVE BEEN USED TO ASSESS DRIVER INFORMATION SYSTEMS? ...............................................................81

Page 9: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

vii

TABLE OF CONTENTS (continued)

WHICH MEASURES SHOULD BE CONSIDERED FOR FUTURE ASSESSMENT PROTOCOLS? ...........................................................83

SELECTION CRITERIA ...................................................................................83OUTPUT MEASURES - LATERAL CONTROL............................................85OUTPUT MEASURES - SPEED CONTROL................................................86OUTPUT MEASURES - SECONDARY TASKS...........................................86OUTPUT MEASURES - OVERALL................................................................87INPUT MEASURES - LATERAL CONTROL................................................89INPUT MEASURES - SPEED CONTROL ....................................................90INPUT MEASURES - SECONDARY TASKS...............................................90INPUT MEASURES - DRIVER VISION.........................................................91

CONCLUSIONS.................................................................................93

REFERENCES...................................................................................95

Page 10: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

viii

LIST OF FIGURES

1. Simplified conceptual model of driving. .............................................................62. Relationship between occlusion time and speed for two drivers..................183. Speed vs. occlusion time for the race track. ....................................................194. Standard deviations of vehicle displacements for no steering

and no visual input. ..............................................................................................235. Standard deviations versus distance traveled.................................................256. Comparison of measured and predicted standard deviations.......................267. Uncorrected standard deviation of lateral lane position.[44] .........................288. Direct looks made by drivers to the radio and climate control. .....................299. Proposed design guide for in-vehicle displays. ...............................................3010. Standard deviation of lateral position and task completion times for the

Plymouth Turismo. ...............................................................................................3211. Standard deviation of lateral position and task completion times for the

Pontiac Wagon. ....................................................................................................3312. Glance time data. .................................................................................................3913. Time-to-Line Crossing (s) as a Function of Road Curvature Radius (m)....4714. Visual/attentional performance as a function of road curvature....................4815. Three-frame sequence of the pedestrian detection task................................4916. Secondary task performance for various tasks. ..............................................5317. Number of mirror glances for various tasks. ....................................................5418. Number of display glances for various tasks. ..................................................5419. SAR and secondary task.....................................................................................55

Page 11: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

ix

LIST OF TABLES

1. Test environments................................................................................................102. Task force's appraisal of various test environments.......................................103. Possible experimental measures.......................................................................114. Zaidel’s list of potential measures. ....................................................................135. Values of occlusion and time-to-line-crossing .................................................216. Recomputed standard deviations. .....................................................................277. Standard deviation of lateral position................................................................348. Number and severity of different UDA's. ..........................................................369. Total display glance time for each task.............................................................3810. Number of lane exceedences and mean duration. .........................................4011. Secondary task list...............................................................................................4312. Auxiliary task decrements in dual task conditions...........................................4713. Tasks examined. ..................................................................................................5114. Navigation task completion times. .....................................................................5715. Methodological studies........................................................................................6016. Information gathering studies.............................................................................6717. Interface evaluation studies................................................................................7318. Input measures of driving behavior and performance....................................8119. Output measures of driving behavior and performance.................................82

Page 12: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

1

INTRODUCTION

BACKGROUND

The car of the future could be quite different from that on the road today. While it willstill have a steering wheel, brake, and accelerator, and controls for secondary functions(radio, climate, etc.), a host of new systems will either be introduced or see expandeduse. These systems include navigation, traffic information, collision avoidance, etc.

In the recent past, electronic technologies have been added to cars in the unrealizedhope that such technologies would see widespread use [electronic displays in general,voice output, HUDs (Head-Up Displays, etc.)]. In implementing these technologies,significant human factors problems have arisen, in addition to problems with cost andreliability. If Intelligent Vehicle-Highway Systems (IVHS) are to succeed, those systemsmust be safe and easy to use, and provide useful information.

PURPOSE AND SCOPE OF THIS REPORT

To develop safe and easy to use systems one needs:

1. A set of research-based human factors design guidelines/requirements.

2. Analytical and simulation procedures that can be used to predict driverperformance with alternative designs.

3. Methods for testing and evaluating alternatives.

This report explores the third point, test methods. The human factors test that is mostappropriate depends upon the intended use of the desired information. Potential usesinclude:

• Identifying the strengths and weaknesses (problems) associated with adesign.

• Exploring alternative designs.

• Determining how problems might be solved.

• Determining how common and severe the problems are.

• Determining if a system is fit (safe) for use.

The perception is that the DOT, and especially NHTSA, has traditionally emphasizedsafety and system effectiveness, but given less attention to ease of use. Evaluationsshould assess both the current level of performance and problems that need to becorrected. It is important to provide incentives to improve systems, not just determineminimum acceptability. In recognition of those broader needs, the contract called forquantifying “the influence of in-vehicle systems on driver safety...the effectiveness withwhich information is transferred...to drivers,” and “assess[ing] driver comfort,convenience, and confidence when using these systems.” A further goal was to selectmeasures and test procedures “to assess the safety of drivers’ performance while using

Page 13: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

2

in-vehicle systems.” Further, it was desired to determine which levels of performanceare unsafe. Establishing these levels is extremely difficult to do.

Therefore, this report examines the following questions:

1. How have basic and applied studies been conducted that relate to the safetyand usability of driver information systems?

2. What are some of the key studies and what has been learned from them?

3. Which measures have been used to assess the safety and usability of driverinformation systems?

4. Which measures should be considered for future assessment protocols?

Selecting the appropriate test protocol and integrating measures of interest into testprotocols is covered in a subsequent report.[18] Details of how measurements are to becollected, because they depend on the protocol selected, are also covered in thatreport.

To address these four questions, a section has been included describing how peopledrive cars and how that process has been modeled as a part of this project. Use ofmodels is one method for identifying key measures and their relationships. Followingthat section are reviews of previous reviews. Several are quite insightful and provideschemes for grouping measures and approaches.

Most of the report is an in-depth review of the literature, in particular 15 key papers,reports, and programs. While many human factors scientists will be familiar with someof them, detailed knowledge of all of them is unlikely. Finally, to provide breadth to thereview, a larger set of references is summarized in several tables. The focus is ongeneral methodological studies and specific interface evaluations.

This report ends with a summary emphasizing those measures that reflect safety andease of use.

For the most part, the discussion of measures is fairly general. However, it isrecognized that a key application of this report is to five functions that have beenchosen for further evaluation— (1) navigation, (2) traffic information, (3) cellular phone,(4) vehicle monitoring, and (5) the In-Vehicle Safety Advisory Warning System(IVSAWS), with navigation receiving greater attention that the other functions. InIVSAWS, radio transmitters are attached to road hazards (vehicles involved in anaccident, police cars in a chase, etc.). Drivers nearby receive either visual or auditorywarnings from an in-vehicle receiver.

This review of existing work does not specifically discuss other driver informationsystems [e.g., in-car signing, motorist services, and entertainment (radio, CD, cassettetape player, TV)], though many of the ideas are germane to the evaluation of thosesystems. Furthermore, systems for vehicle control (braking, steering, headway/speedmaintenance, performance limits, such as rollover and traction, collision avoidance

Page 14: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

3

(back up, blind spot, long range obstacle detection, etc.), and driver performancemonitoring are not considered, as they are outside of the scope of this project. Thosesystems all have interfaces that will communicate information to drivers, and, hence,their development and evaluation should benefit from the ideas presented here.

Lastly, readers are reminded that the ideas presented here focus on humanperformance, not systems safety, crash biomechanics, or other topics. So, the failure ofa central traffic control computer, or the consequences of electrical shorts, or problemsassociated with occupant impact during a crash are beyond the bounds of this research.

Page 15: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific
Page 16: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

5

HOW DO PEOPLE DRIVE CARS?

To evaluate the effectiveness of driver information systems, one must understand thecontext in which they operate. Considering driving’s significant impact on society (thetypical adult probably spends an hour driving daily; motor vehicles are the leading killerof young adults), how people actually drive is not well understood. Most of the researchhas focused on what happens to people when they are involved in accidents and othermatters pertaining to crashworthiness, not what happens beforehand (pre-crash).Further, very little is known about what behavior constitutes normal driving.

Driving consists of a set of tasks and activities requiring perception, cognition, motorresponse, planning, and task selection. The latter activity is particularly important, asthe driver must often choose between attending to the roadway cues needed for vehiclecontrol and other information sources competing for visual attention (e.g., rearviewmirror, climate control, advanced in-vehicle display).

These activities are organized in a hierarchical manner as shown in the simplifiedconceptual model of the driving task presented in figure 1.[19] The top-level activityconsists of setting overall goals (e.g., drive from point A to point B in the shortest time).A variety of subgoals, or "maneuvers,” are formulated and satisfied over time in order toachieve the top-level goals. Maneuvers relating to automobile control include therelatively high-level tasks that determine the intended path of the automobile (e.g., pullinto traffic, drive in the current lane, change lanes, turn right at the next intersection). Ingeneral, maneuver selection is a rule-based process, whereas maneuver execution isskill-based.

Having defined the maneuvers to be performed, the driver must then select the lower-level task to be attended to at any given instant. The task that is always competing forattention is that of vehicle control. It consists of maintaining lateral position and eitherspeed or headway in a manner that allows the intended maneuver to be carried outsafely and efficiently. Other tasks, which may or may not be adjuncts to the vehiclecontrol task, will compete on an intermittent basis at frequencies that vary widely fromtask to task.

In general, each low-level task includes perceptual ("obtain information"), cognitive("process and plan"), and motor ("execute response") components. These processesmay be considered and performed concurrently for some tasks—especially the task ofcontinuous vehicle control. For other tasks, such as reading a message on anadvanced in-vehicle monitor and then turning it off once read, the perceptual andresponse activities are separated in time sufficiently to be considered sequentialactivities.

At the very least, task selection involves determining which low-level tasks needattention. For tasks that are interruptable (such as talking on a phone), task selectionmay also include the selection of the appropriate process (perception, cognition, orexecution) as determined by the status of the task when last attended.

Page 17: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

6

Set Goals

Select Maneuver

Select Task

ObtainInformation

ProcessInformation

ObtainInformation

ObtainInformation

ProcessInformation

ProcessInformation

ExecuteResponse

ExecuteResponse

ExecuteResponse

VehicleControl

AuxiliaryTask

AuxiliaryTask

Figure 1. Simplified conceptual model of driving.

The task to be selected at any given instant is presumed to be based on a number offactors, including the perceived criticality to safety, the times to complete the variouscompeting tasks, the penalties for not immediately attending to the tasks, driverpreferences, and so on. The perceived priorities of the competing tasks will vary withtime.

It is often assumed that because most tasks compete for visual attention, they can onlybe attended to sequentially. There is some empirical evidence, backed by the multiple-resource theory of Wickens, that certain tasks can be performed concurrently.21 Thecompetition among two or more tasks has the potential to cause performancedegradation in one or more tasks, either because (1) performance of one or more tasksis delayed, (2) one or more tasks are dropped from the task queue, or (3) cognitiveresources must be shared among tasks performed concurrently.

Nevertheless, when competing tasks cause an unacceptable degradation inperformance—in particular, when safety is noticeably compromised—the driver isconsidered "overloaded.” Whether or not task overload occurs depends on the number

Page 18: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

7

of tasks competing for attention, the nature of these tasks, and the instantaneous stateof the world. The driver may be able to deal with a number of items competing forattention if the driving task is relatively benign and if other tasks are easily rescheduledor are otherwise relatively undemanding. On the other hand, a single task in addition tothe driving task may impose an unacceptable workload if the driving task is inherentlydifficult (e.g., driving a mountain road, driving in poor visibility) and if the auxiliary task isrelatively frequent, of long duration, and not easily rescheduled. It is important toremember that given sufficient time, drivers make rational decisions. Drivers considerthe extent to which attending to competing tasks and task components compromisesafety.

This discussion makes several key points about driving that are critical to the analysis ofthe safety and ease of use of in-vehicle information systems. First, the extent to whicha task interferes with driving depends primarily on the extent to which it is visual,because driving is primarily a visual task, though the motor demands are significant.Interference can also be caused by competition for cognitive and motor resources aswell, or by to their aggregate effect. Hence, interference depends on the degree towhich a multiplicity of conflicts occur. This also complicates the measurement processas assessing the extent of interference will require multiple measures.

Second, the management of driving is intelligent; simply because there is an in-vehicledemand doesn't necessarily mean it will interfere with driving. However, the addition ofa task may load some drivers to their limit, with additional tasks leading to a degradationof driver performance. Thus, in some situations, measurements of driver performancemay show no effects of adding tasks, even though they are present (e.g., the eliminationof reserve capacity).

Third, because the tasks are managed, tasks can be delayed (resulting in longerresponse times) or completed, but not as well as normally (resulting in increased errorrates). It is difficult to know which outcome might occur (or which measure to collect) inadvance, complicating the measurement process.

Fourth, the task management strategy generally adopted results in graceful degradationin the face of overload. Hence, identifying a single point at which a human-machinesystem transitions from safe to unsafe behavior will not be obvious.

Fifth, while safety and ease of use are connected, there may be instances where asystem could have a minor impact on safety but is difficult to use and ineffective. Forexample, in heavy traffic drivers might forgo the task of paying attention to a particularnavigation system if its tasks demands are high.

Thus, assessing the safety and ease of use of an in-vehicle information system will bedifficult because one is considering a complex, adaptive system responding to externaldemands that vary as a function of time. Multiple measures of driver performance willbe required, with the appropriate measures varying with the external task. Theparticular measures that will show degradation will depend on the in-vehicle task, withdegradation likely to be gradual.

Page 19: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

8

The point to be made is that there is no well understood, general model of drivingbehavior. To effectively select methods and measures to evaluate the specific impact ofAdvanced Traveler Information System (ATIS) technologies on driver safety andperformance, the known factors that influence the driving task should be understoodand should be explainable by a general model. This document is an important step incollecting that information and providing insight in to what methods have beensuccessful in the research arena. Unfortunately, the connections of driving behaviorand performance to an explicit conceptual model of the driving task are not strong asthe model is still being developed.

Page 20: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

9

PREVIOUS REVIEWS SIMILAR TO THIS ONE

This report is not the first to review evaluation methods and measures for in-vehicleinformation systems. The three key efforts, the DRIVE task forces, Zaidel's review, andthe work of Robertson and Southall are covered in the section that follows.[22,23,24] In aprevious report completed as part of this project methods and measures are alsodiscussed, though that topic is not the central theme.[3] For related information on thistopic also see reference 25. The purpose of the section that follows is to give a sense ofthe types of measures that have been used.

DRIVE SAFETY TASK FORCE (1991)

This report was written by a committee, by correspondence after a short meeting.[26,27]

The work was coordinated by the DRIVE Central Office. Contributing were severalindividuals well known for their contributions to driving research: Tom Buckley and PeterJesty (University of Leeds Computer Studies Department), Oliver Carsten (University ofLeeds Institute of Transport Studies), Bill Fincham (University of London), Stig Franzen(Saab), Chris Hyden (University of Lund), Andrew Parkes (HUSAT-Loughborough),Bunter Reichart (BMW), and Talib Rothengatter (Traffic Research Center, Groningen).This report does not contain a listing of experimental measures, experimental protocolsfor assessing safety or ease of use, or, as alluded to by its title, design guidelines.Rather, it provides an overview of safety and human factors issues, and of the likelyconsequences if those issues are ignored.

The scope of the report is apparently much broader than driver information systems asit concerns all aspects of what are referred to as Advanced Transport Telematics (ATT).While ATT is not formally defined, its domain is apparent from the examples given.

The DRIVE report considers driver performance, safety, and drivercomfort/convenience. Safety and human factors matters are classified into threecategories: systems safety, man-machine interaction, and traffic system. The TaskForce defines systems safety as problems that “arise as a result of a design fault orsystem malfunction."[22] For example, a traffic signal might show green to both roads atan intersection. The emphasis appears to be on approaches (e.g., fault tree analyses)in which there are a limited number of actions and consequences, and where theirprobabilities are readily identified. Man-machine interaction concerns “the usability of asystem.” Here the emphasis is on perceiving and understanding information andimmediate reactions to it. Demanding secondary tasks associated with using in-vehicleinformation systems can lead to overload and distraction from the primary task, controlof the vehicle. Traffic safety is a more global category which includes adversebehavioral changes in drivers. For example, if the automobile is equipped with an icy-road warning system, the absence of a warning might induce drivers to drive faster thanthey otherwise would have, under the assumption that a warning would be present ifconditions were hazardous.

Primarily, those who will benefit from reading the DRIVE report are administrators whoneed a very general overview of what the issues are (particularly those of systemssafety), and newcomers to the field who need to know about some of the likely

Page 21: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

10

problems. Those actively engaged in IVHS human factors research may find the reportlacking in detail.

The DRIVE report contains several thought-provoking figures and tables in the man-machine interaction section. The first (table 1) concerns the context in which data are tobe collected, which should be quite familiar to those engaged in research on this topic.Noteworthy is the distinction between real road test trials at a micro and macro level.The micro level refers to tests of single vehicles, the macro level to fleets. In the UnitedStates, macro tests are often referred to as operational field trials.

Table 1. Test environments.[22]

Table 2 shows how well suited the Task Force believes each test context is for variousevaluation dimensions. The check marks are never explained, but it appears they implya higher level of suitability than the x's. The evaluation criteria are only generallydefined in the DRIVE report.

• Real Road Field Trials (Macro)• Real Road Test Trials (Micro)• Test Track Studies• Dynamic Vehicle Simulations• Static Vehicle Simulations• Part Task Evaluations

Increasingcontrol ofvariablesandreplication

Increasingconfidencethat datacorrespondto realphonomena

Page 22: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

11

Table 2. Task force's appraisal of various test environments.[22]

TEST ENVIRONMENTEVALUATIONCRITERIA

Real RoadFieldTrials

RealRoadTestTrials

TestTracks

DynamicSimulators

StaticSimulators

Part TaskEvaluations

SystemPerformance

√ √ x x x x

UserWorkload

√ √ √ √ √ x

UserAcceptance

√ √ x x x x

Adaptation √ √ x x x x

SystemEffects

√ x x x x x

TaskMatch

√ √ x x x x

Ease ofLearning

√ √ √ √ √ √

Easeof Use

√ √ √ √ √ √

Table 3 lists some of the possible experimental measures. Notice that the list is quiteextensive. Although not stated in the report, the task level distinctions are based onwork by Michon, the behavior distinctions from work of Rasmussen. Strategic drivingtasks are those associated with planning a complete trip, such as determining the routefrom Detroit to Ann Arbor. Tactical tasks are at a lower level and have a shorter timeframe, and for example, concern the rules for changing lanes on an expressway.Operational tasks are associated with moment to moment control of the vehicle, such asthe feedback-control loop for keeping the vehicle in the lane. Finally, reactive tasks areresponses of the driver to higher level tasks. Reactive tasks are not voluntary, as areother classes of tasks.

Readers interested in evaluation should examine the original report for further details.

Page 23: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

12

Table 3. Possible experimental measures.[22]

Driving TaskLevel

DriverBehavior

Tools Measurement orMeasurement Category

Strategic KnowledgeBased

Interviews

Verbal Protocols

Questionnaire/Survey

StructuredUnstructuredPost HocConcurrent

Tactical Rule Based Vehicle Dynamics

Visual Behavior

Observation

SpeedAccelerationDecelerationPatternsGlance durationGlance frequencyConflict studiesTraffic flows critical incidents

Operational Skill Based Control Actions Steering wheel rotation/reversalPedal actuationGear selection

Reactive Autonomic Psychophysiological GSR, EKG, EEG,EOG, EMG, HR,HR variability, Adrenalinesecretion

DRIVE 2 HARDIE PROJECT (STEVENS AND PAUZIE, 1992)

This project, which began in January 1992, is a 3-year program to developrecommendations for the presentation of messages to drivers based on(1) understandability, usability, and safety while driving; (2) harmonization of text andsymbols; (3) compatibility of different systems; and (4) harmonization with externallypresented information.[28] Partners in this effort include the Transport ResearchLaboratory (UK), INRETS (France), Transportokonomisk Institutt (Norway), HUSAT(UK), Universidad Politecnica de Madrid (Spain), British Aerospace (UK), and TUV-Bayern eV (Germany), organizations well know for their contributions to transportationresearch. Tasks include reviewing existing standards, developing a theoreticalframework and experimental methods for assessing in-vehicle driver interfaces,performing evaluations of demonstrator interfaces, and developing recommendationsfor information presentation.

Stevens and Pauzie found that most standards concern information presentation inoffices, not in motor vehicles. In terms of European standards (their focus), the authorsidentify constraints from UK Construction Standard 109 and the Spanish telephoneregulations. The 109 standard allows the use of video only for presentation of

Page 24: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

13

information about the state of the vehicle and its equipment, its location, route anddestination, and information to help see the road. Systems such as electronic yellowpages may fall outside these limits. Spanish telephone regulations ban the use of hand-held phones while driving.

Further details will emerge from this project as it progresses.

ZAIDEL (1991)

This is certainly the most comprehensive review of research concerning methods forevaluating human factors and advanced driver information systems to date.[29] Thepurpose of the report is to identify critical, generic human factors issues relating toadvanced driver information systems and to propose a research agenda. This includesreviewing the literature relevant to Advanced Driver Information System (ADIS)evaluation; developing an evaluation framework; identifying relevant road, task, andmeasurement variables; discussing research needs; and describing example studies.The report is exhaustive in its coverage of the issues, including considerable work fromDRIVE and PROMETHEUS, and is thoughtful in interpreting the results. It is lesscomprehensive in proposing specific research. It is clearly a report those involved withdriver information systems must read. Zaidel's review of the literature will not bereiterated here. However, his ideas concerning evaluation methods are summarized inthe section that follows.

With regard to research emphasis, Zaidel states the following: “There are presentlyneither conceptual reasons nor methodologic solutions to justify investing the largeeffort needed to simulate emergency situations.”...[The] “navigation task appears to be agood overall cover-task for an evaluation scenario, even if the ADIS [Advanced DriverInformation System] is not directly concerned with navigation."[29]

While describing many types of evaluation procedures (using rapid prototyping,videotape-based tasks, simple driving simulators, etc.), Zaidel favors carrying out on-the-road studies, with traffic level as a variable. He views traffic negotiation as beingparticularly important. In particular, he proposes the idea of using the subject’s ownvehicle and adding to it both the ADIS system and the instrumentation. The advantageof this approach is that the subject is already accustomed to his or her own vehicle andmay behave more naturally. While this is an extremely interesting idea, there aresignificant challenges in terms of power and cooling requirements, and of interfacing theinstrumentation to the vehicle (steering wheel, brake, accelerator). If the set ofmeasures was limited, this may be possible, though the instrumentation would certainlynot be inconspicuous. He realizes this approach is ideal but not always achievable.“The engineering development stage of the interface device, personal andorganizational research preferences, and practical consideration influence the choice ofthe evaluation method as much as theoretical considerations.”[29]

Zaidel comments that measures commonly fall into three categories: those that reflectsimple vehicle control and guidance aspects of driving (traditional measures of drivingperformance such a speed and lane position), those that reveal time-sharing of ADISwith other driving functions (such as video records and verbal reports), and those thatrelate to the quality of the driving and the driving safety envelope (judgments obtained

Page 25: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

14

from driving instructors). Table 4 shows a more complete listing of these and othermeasures.

Table 4. Zaidel’s list of potential measures.[29]

Information TypeInformation Source Driver State Task Processing Quality of DrivingSensors Physiological

Control InputEye movementSpeed varianceTask performanceHeadway

Lane keepingTLCObstacle avoidance

Driver Report Self-evaluationTLXSWAT

Interface designPrioritiesDifficultiesSequencing

Error recoveryDistractionIncidentsSituation awareness

Expert Observer StressLoadInattention

Control errorsProcedural errorsTask sharingStrategies

Safety envelopeError recoveryAnticipationSituation awareness

Traffic Data Gap acceptanceSpeed distribution

Accidents

Note: TLX is the NASA Task Loading Index. SWAT is the Subjective WorkloadEvaluation Technique.

Of these measures, Zaidel believes that lane exceedance, glance frequency andduration, verbal comments, SWAT or NASA TLX scores, task completion times anderrors, steering wheel motions (as indicators of attention), and expert judgments of thequality of driving, are the key measures to collect.

For those unfamiliar with the literature, SWAT is a method for rating workload on threedimensions: time load, mental effort load, and psychological stress.[30,31] To assessworkload, a scale is developed in which operators provide an assessment of theirperception of workload at three levels on each of the scales. They then rate theworkload of specific tasks on each scale, from which numerical scores are developed.

TLX is a weighted score derived from subjective ratings on six scales (mental demand,physical demand, temporal demand, effort, performance, and frustration level).[32] Eachscale has 20 equal intervals and verbal anchors (e.g., low and high). Scale weights arederived from paired comparisons of scales in which subjects indicate which of each paircontributed more to workload. The overall workload score is derived from the individualscale scores and the scale weights.[33]

Zaidel proposes some interesting ideas concerning development of research methods.In particular, he identifies the need for research on (1) methods to assess safetyenvelopes, (2) a driving-specific TLX scale, (3) rules for grouping traffic situations, (4)measures of driving workload (both momentary and overall), (5) the measurement of the

Page 26: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

15

quality of driving, and (6) protocol analysis of task processing. These topics are worthyof further investigation.

ROBERTSON AND SOUTHALL (1992)

This report was commissioned by the Department of Transport (UK) to determine thefeasibility of developing a Code of Practice for the designers, manufacturers, and usersof in-vehicle driver information systems.[24] This project was carried out under contractto the Institute for Consumer Ergonomics (ICE). To identify the relevant literature, theauthors searched several computer data bases, then made inquiries to several researchorganizations. The reference list and bibliography are comprehensive and containmany reports of which U.S. researchers may not be aware.

Those organizations were also asked questions regarding a Code of Practice. Whilethe effort was apparently broad in attempting to get a wide range of opinions andperspectives, a statistical summary of those responses was not provided. From thoseefforts, ICE concluded that there was currently not sufficient data to set objective safetyperformance standards (e.g., eyes-off-the-road time), though a code of good practicecould be developed based on current knowledge. Accordingly, protocols for developingperformance standards should be given priority. Among the research organizations,there was almost unanimous support for a Code, but it should have a European andpossibly global application with the flexibility to consider future research. It was clearthe Code should be legally enforceable. The evaluation procedure should considersimulator versus road trials, individual differences (age, sex, vision, hearing, experiencewith displays), and the driving context (traffic, vehicle speed, lighting, weather).

SUMMARY

Thus, of the previous reviews, the work of Zaidel is the most insightful and the mostcomplete. He also identifies the key measures to collect: lane exceedance, glancefrequency and duration, verbal comments, SWAT or NASA TLX scores, task completiontimes and errors, steering wheel motions (as indicators of attention), and expertjudgments of the quality of driving. However, in considering that recommendation,readers should bear in mind that the measure that is appropriate may depend upon thetype of task for which the IVHS device is to serve as an aid: strategic, tactical, oroperational.

Page 27: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific
Page 28: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

17

WHAT ARE SOME OF THE KEY STUDIES OF DRIVER INFORMATION SYSTEMS?

While the previous section gives a sense of the types of measures one might use, it isimportant to have a detailed understanding of the measures that have been used.Following is a description of several key experiments pertaining to driver interfaceevaluation. Studies are listed in chronological order. (A more extensive listing appearsin the tabular summaries in the section following this one.)

This section is intended primarily as a tutorial for human factors scientists withbackgrounds in aviation or computing and as a refresher to scientists engaged inautomotive human factors. While more general research on attention, time-sharing, andsecondary task procedures is interesting background information, it is not reviewedhere.[34,35] Human factors scientists should be familiar with that material.

SENDERS' OCCLUSION STUDIES

A particularly interesting method for assessing the attentional demand of driving wasdeveloped by BBN.[36] (See also reference 37 for a nearly identical technical report, andreference 38 for a more detailed, earlier report.) Their method limited the amount of timethe driver could look at the road. A special helmet was devised consisting of amechanism to lower a translucent face shield (the type used on protective helmets) thatoccluded the driver's view of the road. There were two experimental conditions: (1) theexperimenter controlled lowering of the shield while the driver controlled his speed and(2) the driver controlled the shield while the speed was constant. The purpose of theresearch was to develop an empirical model relating the percentage of time the drivercould look at the road to speed and other measures. (See reference 37 for details ofthe model.)

The research program consisted of two experiments, one conducted on an unusedsection of I-594 in Massachusetts with few curves, the second on a closed circuit sportscar racing course. The 2.6 km (1.6 mi) track had 10 turns ranging from virtually straightto hairpin. In each case, there were two experimental conditions; either the occlusion(1.0 - 9.0 s) and viewing time (0.25, 0.5, 1.0 s) were fixed and the driver varied thespeed, or the speed and viewing time were fixed and the driver varied how often theroad was viewed. The test vehicle was a modified 1965 Dodge Polara sedan.

Five people served as subjects, including two authors of the occlusion report. Only oneperson served as a subject in all four experiments.

Figure 2 shows the relationship between occlusion time and terminal speed for twosubjects. The term “calculated” in the figure refers to predictions of a model of driverbehavior. (See reference 36 for details.) The data are for look times of 0.50 and 0.25 s,respectively. Both look times led to functions of the same general shape. There waslittle difference in the function when look times were extended to 1.0 s. The results areindeed remarkable. For example, using the data from the left panel in figure 2, onesubject drove at roughly 97 km/h (60 mi/h) looking for 0.5 s every 2 s (for situationswhen there was no traffic).

Page 29: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

18

Figure 2. Relationship between occlusion time and speed for two drivers.[37]

Figure 3 shows speed vs. occlusion times for a look time of 0.5 s. Notice that individualdifferences were large. Further complicating matters is that occlusion times also variedwith the radius of curves (larger radii had longer occlusion times) and their angularchange (larger changes had shorter occlusion times).

Figure 3. Speed vs. occlusion time for the race track.[37]

The Senders work makes two key contributions. First, it proposes and demonstratesthe utility of the visual occlusion method as a means for examining driver workload.Useful data can be obtained by either fixing look time, occlusion time, or driving speed.

Second, this work presents a rigorous quantitative expression (not discussed here) thatcan be used to assess attentional demands. While other researchers have used theocclusion method, there have been few applications of the expression. Limited use ofthis approach is most likely due to the risk it presents to experiment participants andother motorists. Limited use the expression has occurred because it is difficult to relateto physical attributes of the road, traffic, or the driver.

HICKS AND WIERWILLE (1979) - COMPARISON OF WORKLOAD MEASURES

This study is one of the few to examine the merits of alternative methods of assessingthe workload of driving based on the physical attributes of the road.[39] The paper alsoincludes a reasonably comprehensive review of workload.

In this experiment, 30 college students drove the Virginia Tech Driving Simulator. It hada 6 degree-of-freedom, single channel visual display, a 4 degree-of-freedom motionbase and 4-channel audio. The cab was open and the visual scene was austere. Whiledriving at a simulated 89 km/h (55 mi/h), the effects of simulated crosswinds wereimposed on the driving task. There were three levels of workload—low, medium, andhigh—achieved by varying the distance between the center of pressure of thecrosswinds and the vehicle center of gravity. After considerable practice, data werecollected for 90-s test trials. The three dependent measures were lateral deviation, yawangle deviation, and the number of two-degree steering wheel reversals/unit time.

Five sets of secondary task measures were collected from subjects. In the digit-readingtask, participants were shown a random sequence of digits by a seven-segment LED

Page 30: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

19

mounted in front of them. Initially, baseline data were collected to determine thepresentation rate at which each participant could read 80 percent of the digits showncorrectly (1.5 to 4.25 presentations/second) when the digit reading task was done alone.Subsequently, that task was performed concurrently with driving. In the heart-ratecondition, an ear-mounted plythysmograph was used to measure heart rate, from whichheart-rate variability was computed. In the rating scale condition, participants ratedcrosswind effects for each session on two 11-point scales (A = extremely harsh andtroublesome, K = extremely small or imperceptible) and attentional demand(A = extremely high attention needed, K = extremely low attention needed). In theocclusion condition, each time the driver said "now," the road scene was presented for200 ms.

The primary analysis of the data was an Analysis of Variance (ANOVA) of workloadwhere, for each of the three dependent measures, workload was defined as:

100 * (1 - ( dual task performance ) ) (1)

single task performance

In terms of the primary task measures, workload variation led to significant differencesin all three dependent measures. These differences were slightly more pronounced insteering wheel reversals than in yaw deviation, but more so than in lateral deviation.This suggests that steering wheel reversal is the most sensitive measure of drivingworkload of those examined.

In terms of the secondary tasks, only rating scale data reflected the differences inworkload (crosswinds location). There were no differences between the two ratingsobtained. Digit reading performance, heart-rate variability, and occlusion times were notaffected by workload. As Hicks and Wierwille explain, there were many possiblereasons why these measures did not show significant effects, such as lack ofexperience with the equipment, small sample sizes, etc. For example, when theocclusion method was used, participants were willing to traverse more of the lane thanthey might have for a real road.

From these results, Hicks and Wierwille recommend "primary task measures and ratingscale measures (as constructed here) should be used in assessing driver workload,particularly if it is of a psychomotor nature."[39] Here the primary measures weresteering wheel reversals, yaw deviation, and lateral deviation, and the rating scalemeasures were of attentional demand. The particular measures selected for eachcontext require some thought. In this experiment, people were driving on basically astraight road and their course was perturbed by wind gusts. The immediate effect ofsuch is to cause the vehicle to yaw suddenly, not move laterally, and hence yaw angleshould reflect workload, as it did in the experiment. Further, the effect of crosswinds indriving are felt (as a torque on the steering wheel) before they are seen, but it is unclearwhat force feedback cues were provided in the simulation.

GODTHELP'S RESEARCH ON TIME-TO-LINE CROSSING

Godthelp, Milgram, and Blaauw (1984) describe an evaluation of TLC or time-to-line-crossing, a measure of driving strategy.[40] In reviewing control models of driving,

Page 31: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

20

Godthelp, et al. note that most models assume drivers behave as error-correctingmechanisms who continually attend to the steering task. In contrast, Godthelp believesdrivers behave as intermittent controllers, sampling the road scene, making corrections,and then not sampling for a while. Godthelp and others suggest that in makingsampling decisions, the time-to-line-crossing is a critical factor. The time-to-line-crossing is how long it would take a vehicle to reach either lane edge if the steeringwheel is not moved. At each moment the vehicle is assumed to have some headingerror and may not be in the center of the lane. In some sense, this represents a marginof safety that drivers maintain.

Godthelp, et al. had six drivers steer an instrumented car on a 2-km (1.2-mi) section ofan unused section of a 4-lane divided highway. While driving, they wore a bicyclehelmet with a translucent visor that could be raised for 0.55 s looks whenever the hornbutton was struck. For each run the speed [20 to 120 km/h (12 to 75 mi/h)] was heldfixed by a cruise-control-like device. Steering wheel angle, yaw rate, and lateralposition were sampled at 4 Hz.

Godthelp, et al. found that the time-to-line-crossing (TLC) decreased as speedincreased, with the 15 percent TLC being about 0.3-0.4 s greater than the occlusiontimes over the range of speeds examined. (See table 5.) Further, the data show thatover the range of speeds examined, the time-to-line-crossing at the end (TLCe) of eachocclusion period was about 1.57 times the occlusion duration. This suggests a constantrelative safety margin (constant fraction of time) maintained by the drivers. Sincelooking away from the road has the same effect as occluding vision of the road ahead, itseems reasonable to propose that TLCe values (divided by 1.57) for roads mightprovide estimates of the time available to view in-vehicle displays.

Table 5. Values of occlusion and time-to-line-crossing.[40]

Speed(km/h)

Speed(mi/h)

T Occlusion(s)

15% TLC TLCe(s)

TLCe/T Occlusion

20 12 5.32 6.7 8.88 1.6740 25 4.23 4.5 6.33 1.4960 37 3.45 3.9 5.32 1.5480 50 3.15 3.5 4.77 1.51100 62 2.67 3.1 4.35 1.63120 75 2.38 2.9 3.74 1.57

Godthelp (1988) describes another experiment to strengthen the concept of alternatingbetween open- and closed-loop driving.[41] The same instrumented vehicle, andprobably the same highway [with 3.5-m (11.5-ft) lanes] from the previous experimentwas used. The dependent measures were also the same; so, too, was the number ofdrivers. Only 3 speeds were examined—20, 60, and 100 km/h.

Participants were instructed to drive the car normally until a tone was presented. At thattime, they were to ignore the path error and stop steering until the vehicle reached thepoint at which it could just be comfortable correcting the heading to avoid reaching thelane boundary.

Page 32: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

21

From these data, two measures of time-to-line-crossing were obtained. TLC's, the valuefor when the steering wheel was turned, and TLCmin, the smallest value of TLCobtained just after the correction began. The mean values of both of these measuresremained fairly constant with speed (TLCs=1.3 s, TLCmin=1.1 s) and were the same forboth lane boundaries (left and right). Also remaining constant was the minimum lateraldistance to the lane boundary (about 15 cm). Other dependent measures varied withforward speed: the lateral distance at which corrections were made increased, as didthe mean lateral speed.

Thus, in this experiment, drivers resumed steering when there was a constant amountof time left before the vehicle reached the lane boundary, implying a temporal safetymargin of just over 1 s.

ZWHALEN'S WORK ON OCCLUSION

Zwahlen and his colleagues have carried out several studies that are relevant to thesafety evaluations of in-vehicle displays. Zwahlen and Balasubramanian (1974)examined steering behavior when there was no visual input.[42] The rationale for thisresearch is that drivers looking at an in-vehicle display are obviously not looking at (orpaying attention to) the road ahead. For all practical purposes, not allowing drivers tolook at the road and keeping their eyes closed should lead to identical drivingperformance since road-related visual input is required to steer. The purpose ofZwahlen's research was to determine acceptable eye fixation behavior, namely howlong and how often drivers could look away from the road, and still adequately maintainlateral position.

Two 23-year old students drove a 1965 Volkswagen sedan and a 1971 AMCAmbassador down an airport runway. A container of liquid dye attached to the rearbumper, dripped regularly onto the pavement when the car was driven. Subjects droveat 16 to 64 km/h (10 to 40 mi/h). When they reached a starting point, they closed theireyes until they had traveled either 153 m (500 ft) or had reached the side of therunway. There were 47 runs with no visual input and 13 with no steering control. (Thesubjects took their hands off the wheel.) Path deviations were recorded every 4.6 m(15 ft) to the nearest 1.2 cm (1/2 in).

Originally, Zwahlen proposed that uncertainty of the lateral position of the vehicle (thestandard deviation in lateral position) could be derived from the work of Senders andothers, as shown below.

sy = k *V2 * T1.5 (2)

Substituting V = D/T

sy = k * D2/T.5 (3)

where: sy = standard deviation to vehicle displacementfrom the centerline

Page 33: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

22

k = constantV = velocityT = occlusion timeD = distance traveled at constant speed while occluded

Using the experimental data for 74 km/h (46 mi/h) and 64 m (210 ft), k = 0.0004. Plotsof occlusion distance versus standard deviation of road position suggest that for thisvalue of k, the standard deviation at each distance is greater at higher speeds, which isthe opposite of what Zwahlen’s data show.

A second analysis, based on the steering model of Wier and McRuer described in theirpaper, assumes the initial heading error is negligible. (See the original paper fordetails.) Based on that analysis Zwahlen proposes the following expressions of thestandard deviation of lane position:

sy = k * D * T0.5 (4)

A summary of the experimental results for one person and one car appears in figure 4.From those data, Zwahlen and Balasubramanian proposed k = 0.025 (0.683 in metricunits) for steering with no visual input. As a check, using the data in that figure, a valueof 0.0272 was computed for D = 110m (360 ft), close to the reported value. For thatsame distance for no steering control, k was calculated to be 0.0467 from the figure.

Figure 4. Standard deviations of vehicle displacements forno steering and no visual input.[42]

The actual standard deviation as a function of time is larger than computed using theseequations which are for the zero-corrected case, this assumes drivers started in themiddle of the road. In fact, their initial positions were 13 to 18 cm (5 to 7 in) off-centerwith a standard deviation of 11 cm (4.5 in). No heading error was assumed.

To apply this approach to safety assessments, additional data on driver, vehicle androad surface differences are needed to compute the appropriate values of k. With thatinformation, safety criteria for inattention (at least for straight-road driving) could becomputed.

Zwahlen and DeBald (1986)

Zwahlen and DeBald had 12 people drive either a large or small car, again recordinglane error using liquid dye.[43] Mounted midway in the center console were either articleclippings or sections of a road map. Participants drove at 48 km/h (30 mi/h) and at thezero point either drove normally, with their eyes closed, or began reading text inside of

Page 34: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

23

the car. How drivers were to divide their attention in the reading task is not described,though it is assumed they were not to look at the road.

There were no significant differences in the variance of lateral deviation between carsfor the first 15 m (50 ft). Interestingly, there was also no difference in the lateralvariance between the eyes closed and reading conditions for distances up to 69 m(225 ft) (5 seconds at 30 mi/h). This suggests that for moderate distances, closingone's eyes has an identical effect on driving as looking inside the vehicle for information.Figure 5 shows the standard deviation of lateral deviation for the conditions examined.

Using the second model described previously, Zwahlen and DeBald estimate k = 0.076(averaging across car sizes) with a mean absolute error of 27 cm (10.7 in) for drivingwith eyes closed. (From the graphics presented, a value of 0.073 is estimated.) Thisestimate of k is about triple the earlier estimate. For reading text k = 0.041 with a meanabsolute error of 11 cm (4.4 in). (A check from the figures provided gives an estimate ofk of 0.042—quite close.) The k for reading text is about 50 percent of that for eyesclosed.

Figure 5. Standard deviations versus distance traveled.[43]

Figure 6 shows a comparison of the measured and estimated values for reading text.Notice the difference is a maximum at about 400 feet (about 25 cm (10 in) less than the140 cm (55 in) predicted). For engineering estimates, this 20-percent difference is a bitlarge. In applying the model, it is important to consider the critical range of the lanevariation when computing k. Also, the estimate for k is much lower [close to 0.03 fordistances something less than 92 m (300 ft)] is the initial mean error [20 cm (7.7 in)] isincluded in the calculation (that is, sy = k * D * T0.5 + 7.7).

Figure 6. Comparison of measured and predicted standard deviations.[43]

To put these numbers in context, Zwahlen gives the example of a 1.8-m (6-ft)-wide carin a 3.6-m (12-ft)-wide lane. A lateral deviation of 1 m (3 ft) would put the driver at theedge of the lane. At 48 mk/h (30 mi/h) for a reading task of 2, 4, and 6 s, respectively,the chance of leaving the lane (100 * probability) are shown in table 6. The chancescomputed are for two one-tailed tests, since lane departure can occur in either of twodirections. However, since there is an initial bias, the departure is most likely in thedirection of the bias, though probabilities must be calculated in both directions. The

Page 35: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

24

values reported in the paper do not agree with the author’s calculations, either using thereported value of k, or a somewhat smaller value that provides a better fit to standarddeviations at shorter distances. Clearly, the probability of departing from a lane (andconcern about claims of risk) are very dependent upon the estimates of k.

Table 6. Recomputed standard deviations.[43]

Lane W(ft)

DistanceD (ft)

TimeT (sec)

Sy (in)k=0.041

Sy (in)k=0.030

Chance ofDeparture

(%)(author,k=0.041)

Chance ofDeparture

(%)(author,k=0.030)

Chance ofDeparture

(%)(Zwahlen,

p. 260)10 88. 2 5.1 3.7 0.07 0.01 1.2510 176. 4 14.4 10.6 15.31 6.31 6.3010 264. 6 26.5 19.4 38.99 25.10 18.4112 88. 2 5.1 3.7 <0.01 <0.01 0.0412 176. 4 14.4 10.6 2.60 0.39 1.1012 264. 6 26.5 19.4 19.41 8.44 8.69

Note: Sy = k * D * T(.5)bias = 19 cm (7.7 in)

Zwahlen, Adams, and DeBald (1988)

This is the best known of Zwahlen’s recent studies.[44] He presented eight young driverswith a paper mockup of the CRT touch screen in the 1986 Buick Riviera. It wasmounted in a 1981 Oldsmobile Cutlass in a position comparable to the Riviera’s centerconsole or in a lower position. The "airport runway strip chart technique" was againused to record lane position. Drivers accelerated to 64 km/h (40 mi/h) and then usedthe simulated touch panel to turn the radio on, adjust the volume, etc. Drivers eitherlooked directly and continuously at the simulated panel until the task was completed, orcould look at the road ahead as necessary. Drivers' visual behavior was observed byan experimenter seated next to the driver. Given the description of fixation behaviorprovided in the original document, readers are left with the impression that "looks" wereclassified as either inside or outside of the vehicle. Hence, each look could consist ofmultiple fixations.

Figure 7 shows the uncorrected standard deviations of lateral lane position. Someaspects of this figure make sense, while others do not. In general, the standarddeviation of lane position error increases with time in the beginning of the run and then,as the in-vehicle task is completed, decreases. For all conditions drivers had completedtheir tasks after they had driven 244 m (800 ft). Notice that overall the standarddeviation for Condition B (high, not looking) was approaching 0.6 m (2 ft) in mid-run,but, for Condition D (low, not looking) was close to 0.3 m (1 ft). From a practicalperspective, a 0.3 m (1 ft) difference in deviation matters. But what is unexpected isthat the data for the looking conditions (A and C) were between B and D. They shouldbe less than B and D. Zwahlen, et al. reported that subjects usually glanced outside thevehicle after a page change (from radio to climate). It could be that because external

Page 36: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

25

demands were low (there was no other traffic) drivers attached a lower priority tosteering and it was no longer treated as a "protected" task.

Figure 7. Uncorrected standard deviation of lateral lane position.[44]

With regard to the in-vehicle task times, Zwahlen, et al. report they are normal withmeans and standard deviations of 5.02/0.98 and 8.39/1.63 s for the radio and climatecontrol tasks, respectively.

Figure 8 shows the number of looks to the displays for each task. There were nosignificant differences due to display location (2.79 looks for high, 2.67 for low for theradio, 1.42 for high vs. 1.46 for low for the climate control). There were four instanceswhere drivers did not look outside while operating the radio and three instances for theclimate control.

Figure 8. Direct looks made by drivers to the radio and climate control.[44]

Combining the data for all four conditions for the entire run [0 to 270 m (885 ft)],Zwahlen calculates a standard deviation of 42.2 cm (16.62 in). Using that value, heestimates that under ideal conditions (dry pavement, daytime, etc.) at 64 km/h (40 mi/h),there is a 3-percent chance a driver would deviate from a 3.6 m(12-ft) lane whileoperating a touch panel in a 1.8-m (6-ft) wide vehicle. (Note: On each side of thevehicle is a 1-m (3-ft or 36-in) clearance. Hence, 36/16.62 = 2.166 = z. According tothe normal distribution tables, the 1-tailed p is 0.015, so for the 2-tailed case theprobability is 3 percent.) For a 3-m (10-ft) lane, the probability of lane exceedence is 15percent.

What is wrong, then, with these data? In brief, the problem is with the task demands.Because there was no traffic, it is likely steering was given a lower priority than normal.Drivers gave priority to the internal task over lane maintenance, something they wouldnot do while driving on the highway. As described elsewhere, research from Wierwilleand his associates shows that drivers alter their attention allocation based on externaldemands, both anticipated and unanticipated. That does not mean, however, that thesedata should be discarded.

From these and other data, Zwahlen et al. propose the design guide shown in figure 9.The constraint on the number of fixations comes from his work on pavement markings,

Page 37: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

26

and from Senders’ classic study of the attentional demands of driving.[36,37] Specificallythey say the following:

The areas...are based partially on the conceptual model...which suggest[s] thatthe number of consecutive looks required to obtain a specific chunk ofinformation...while driving along a straight path needs to be limited to aboutthree, partially based on the results of ...[his work on edge markings, Senders etal.]...which indicated out of view times (rear view mirror, speedometer, etc.) orocclusion times in the range of 0.5 to 2.0 seconds for tangent sections, 0.32 to0.34 second for curve sections and partially upon experimental results from...[references 42 and 43]...and this study which indicate that the lateral laneposition standard deviations and lane exceedence values reach unacceptablevalues after 2 to 4 seconds of occlusion of visual road/traffic input or whenworking...inside a vehicle.[44]

2.0

1.8

1.6

1.4

1.2

1.0

0.8

0.6

0.4

0.2

0

1 2 3 4

REGION

REGION

ACCEPTABLE

GRAY

AREA

AV

ER

AG

E D

UR

AT

ION

PE

R L

OO

K IN

SID

EC

AR

AT

DIS

PLA

Y O

R C

ON

TR

OL

TOTAL SEQUENCE OF LOOKS REQUIRED INSIDECAR TO OBTAIN A SPECIFIC CHUNK OF

INFORMATION FROM A DISPLAY OR TO OPERATEA SPECIFIC CONTROL

No.of

Looks

secs.

VERY SIMPLE OR QUALITATIVE INFORMATIONNOT FEASIBLE

UNACCEPTABLE

Figure 9. Proposed design guide for in-vehicle displays.[44]

More specifically, here is the evidence. Extrapolating the data from Senders' famoushelmet study described earlier, to 105 km/h (65 mi/h) (the highest posted speed on U.S.roads) for straight roads, an occlusion time of 1.0 s is associated with a viewing time of0.5 s. This suggests that, for the easiest of driving contexts (a straight road in daylightwith no traffic and an experienced person), distractions from the road 1 s (with 1/2 s inbetween) is the most drivers will accept. (Or, this is the least the three drivers testedwould accept.)

Page 38: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

27

Zwahlen, Adams and Schwartz (1988)

In two related experiments, a total of 20 young drivers steered a car equipped with anautomatic transmission at 64 km/h (40 mi/h) down an airport runway instrumented in amanner identical to the previous studies.[45] A standard pushbutton phone (simulating acellular phone) was installed in either a 1982 Pontiac station wagon (part 1) or a 1985Plymouth Turismo (part 2). It is not clear what feedback each keypress provided, if thedialed number was displayed, and if there was a send/dial key to be pressed. Therewere four conditions (car phone mounted low/high on the dash vs. driver allowed/notallowed to look at the road while dialing). The 11-digit number to be dialed was on apiece of paper near the phone. Drivers did not correct dialing errors (as they would witha real cellular phone).

Figures 10 and 11 show standard deviations of lateral position and dialing taskcompletion times. These lane deviation data make more sense than those in theprevious study. In general, in terms of increasing lane variance, the order was look/highon the panel, look/low, no-look/high, no-look/low. Statistical tests of these differencesare not presented. Initial lateral standard deviations ranged from 13 to 18 cm (5.2 to 7.1in), with most of the deviations being close to 14 cm (5.5 in). Table 7 shows thestandard deviation of lateral deviation when dialing was completed. Notice that bothpositioning and look/no look affect these values. For 6 of the 8 cases shown, themaximum standard deviation was close to 0.6 m (2 ft), which would put the laneexceedence probability at 30 percent, a large value.

Page 39: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

28

Figure 10. Standard deviation of lateral position and taskcompletion times for the Plymouth Turismo.[45]

Figure 11. Standard deviation of lateral position and taskcompletion times for the Pontiac Wagon.[45]

Table 7. Standard deviation of lateral position.[45]

Look/High No-Look/High Look/Low No-Look/LowDialing Completedwagon 8.8 20.7 11.1 25.2small car 7.0 17.5 15.0 20.8Maximumwagon 16.8 27.7 23.0 38.1small car 16.6 27.2 25.3 40.7

With regard to the dialing task, the distributions appear normal, with an overall mean of9.1 s (or 0.83 s/digit) and a standard deviation of 1.72 s. Looking at the road increaseddialing times slightly but panel location had no effect (look/high = 9.5 s, no look/high =8.5, look/low = 9.8, no look/low = 8.6). For the wagon, the number of outside looks was2.9 times/dialing sequence (standard deviation of 1.1 looks) for the high location and 2.2times with a standard deviation of 1.4 looks for the low location. Dialing errors were fewand did not differentiate between conditions. There were three errors for each of threeconditions and two in the other.

Averaging across both vehicles, the standard deviation for lateral position was 39 cm(15.4 in) for the 197-m (675-ft) run. For a 3.7-m (12-ft) lane, Zwahlen claims that 1.9percent of the dialing tasks would result in a lane excursion under ideal conditions(daytime, dry pavement, straight road, etc.) at 65 km/h (40 mi/h). For a 3-m (10-ft) lanethe associated quantity is 11.9 percent. Zwahlen considers these amounts to beunacceptable to driver safety and the use of design enhancements to prevent phonedialing in curves or heavy traffic, the use of voice recognition input, and othermodifications. He also argues that this evidence can be used to support his design

Page 40: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

29

guide, though it is not clear how the results here are linked to the specific recommendedvalues in the design guide.

As with his previous studies, the main criticism of this research is the nature of the taskto be performed. Because there was no traffic or other significant external demands,steering was not a protected task and hence lane variation was much larger than itshould be while driving. It does, however, represent a worst case scenario, such as avery important phone call to which the driver devotes more attention than is reasonable.Wierwille's research, described elsewhere in this section, argues that drivers behave ina manner different from this experiment when in traffic.

Zwahlen’s experiments are covered in detail because they serve as the basis for hisdesign guide, the only quantitative one in the literature at the moment. Those data havebeen used as a basis for arguing against including phones, navigation systems, andother driver information systems in cars. "Based upon the results of this study, thedevelopment and introduction of sophisticated in-vehicle displays and/or touch panelsrequiring consecutive eye fixations of several seconds inside of the automobile shouldbe halted."[43] "The introduction of in-vehicle CRT touch panel controls which require anumber of consecutive eye fixations and a fairly stringent eye-hand-finger coordinationand touch accuracy for several seconds inside a moving vehicle should be reconsideredand delayed."[44] "The results of these two studies, especially the lane exceedenceprobabilities, appear to indicate that the use of cellular pushbutton telephones inautomobiles requiring finger touch inputs of relatively long number strings while driving,should be reconsidered and may be unacceptable from a driver safety point of view."[45]

QUIMBY (1988) - UNSAFE DRIVER ACTIONS

This experiment developed a scheme for classifying unsafe driving actions andexamined its application.[46] As noted earlier, it is an approach that Zaidel findsattractive. Forty-eight young and middle-aged Australians participated. After driving thetest vehicle (Mitsubishi Sigma) around the test facility grounds, participants drove a60 km (37 mi) route that took 90 to 100 minutes to complete. The route included majorand minor roads, local streets, and intersections with traffic signals and roundabouts.Subjects were guided by an experimenter in the front seat. Quimby, seated in the rear,identified unsafe driver actions as they occurred. While driving, subjects wereencouraged to comment on aspects that might be dangerous either to the driver orothers. Comments were recorded on audio tape. No other instrumentation forrecording driving was present.

For the purpose of this research, an unsafe driving action (UDA) was considered to be,“any action or lack of action on the part of the driver that increased their risk of anaccident”[46] Poor driving practices, e.g., failing to signal, were considered UDAs only ifother road users or vehicles were involved. There were 28 types of UDAs (e.g., too fastfor conditions, following too closely) combined into 15 categories.

Table 8 shows the number and severity levels of different types of UDA's. UDA's wereclassified as slight or serious, and as being conflict or nonconflict types. The associatedpercentages are shown in the table. Conflicts were situations "where the subject, or

Page 41: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

30

another driver, pedestrian or cyclist, had to take action, such as braking or steering, toavoid an accident occuring."[46]

Participants made a total of 2016 UDA's (roughly 42 per driver, 1 every 2 min, or 1every 0.7 km (0.4 mi) of driving). The number per driver ranged from 10 to 113. Mostcommon were following too closely, driving too fast for conditions, and positioning whileturning (e.g., being too far in the intersection at the start of a turn). Notice that therelative frequency with which particular UDA's lead to serious conflicts varies betweenUDA's.

These data were examined in considerable detail with regard to age, sex, personality,maneuver type, etc. The intent was to compare these data with accidents at severalintersections along the route. There was little correlation between the two sets of data,though the number of accidents used in the correlation was small. It was thought thatpart of the problem may have been that other behaviors should be recorded as well.

Thus, while this research provides an interesting insight into the development of themethod for certifying safety, further work is needed before it can be applied to theevaluation of in-vehicle systems. Some of that work is being carried out as part of theDRIVE program. Validation of the method is also required.

Table 8. Number and severity of different UDA's.[46]

Type of UDA #Committed

%Serious

%Conflicts

% SeriousConflicts

Too fast for conditions 328 10.4 4.0 0.6Following too closely 414 27.8 8.7 2.4Emerging into traffic 176 30.1 70.5 28.4Emerging when stopping 16 43.8 62.5 31.3Turning across approaching traffic 32 34.4 40.6 15.6Late through traffic signal 18 22.2 0.0 0.0Overtaking/passing vehicles 70 10.0 15.7 5.7Positioning going ahead 156 5.8 8.3 0.6Positioning while turning 351 5.4 4.8 0.0Observation/anticipation 178 10.7 33.7 6.7Erratic maneuver 44 9.1 27.3 6.8Giving/taking priority 72 13.9 27.8 4.2Rear observation 34 5.9 2.9 4.4Signaling 68 2.9 4.4 1.5Steering/hitting curb 59 1.7 0.0 0.0Total 2016 14.7 16.5 5.8

Page 42: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

31

VAN DER HORST AND GODTHELP (1989) - TIME-TO-COLLISION

Van Der Horst and Godthelp (1989) describe two techniques for collecting data ondriver behavior: an instrumented vehicle and video recording of traffic scenes.[47] Thecar contains potientiometers for recording steering wheel and throttle angle, switch torecord shift lever position, a pushbutton unit an experimenter can use to code driveractions, sensors for speed and distance travelled, and a lane tracker to determinelateral position. In addition, attentional demands can be assessed using a device toocclude the drivers vision momentarily, and physiological responses (heart rate,respiration rate, galvanic skin response, etc.) can be measured. This vehicle is used forevaluating experimental visual and auditory driver interfaces.

To record traffic scenes, the scientists at the Institute for Perception (TNO) mounted avideo camera about 4 m (13 ft) above the road surface, generally on a lamppost,balcony, or building close to a location of interest. In some situations there may havebeen more than one camera. Video frames were recorded on a VCR and time coded.

To analyze the data, tapes are played back frame by frame and the position of keyobjects is digitized manually. This is believed to be a very labor-intensive and time-consuming task. The on-screen coordinates are then transformed into road coordinatesvia computer software. This information is then used to compute predictions of vehiclevelocities and time-to-collision (TTC). TTC is defined as the time for two vehicles tocollide if they were to continue at their present speed and on the same path. Theoperational rationale behind this measure is to encourage drivers to attempt to maintaina protected space around their vehicles so that if an unexpected event should occur (forexample, a lead driver suddenly braking), they will generally have adequate time toreact and avoid a collision.

Van Der Horst and Godthelp, 1989, p. 79 note, “In general, only interactions with aminimum of 1.5 seconds are considered critical.[47] Trained observers are able toconsistently apply this threshold value.” However, Van Der Horst and Godthelp did notexplore the ability of trained observers to identify a range of values, something thatwould clearly be difficult to do on a continuous basis. Hence, digitized data is stillrequired. While the evidence on TTC is far from complete, Van Der Horst believes thatdrivers use this parameter in making maneuvering decisions.

Page 43: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

32

WIERWILLE AND HIS COLLEAGUES - IN-VEHICLE STUDIES OF NAVIGATION

Wierwille and students working with him have carried out several experiments thatprovide quantitative data on the usability of navigation systems and that suggest amethod for evaluating them. (See reference 3 for a more complete description.) Allexperiments involved an instrumented 1985 Cadillac Sedan DeVille fitted with an ETAKnavigator (with the smaller screen). The car was equipped with a camera to record theforward scene and a second aimed at the driver to determine the direction of gaze.Steering wheel movements, speed, and foot control use were recorded by a computer.Lane excursions were recorded manually. Experiments typically involved 20 to 30people varying in age. Test routes involved streets, two-lane State roads, andexpressways, taking about 20 min to drive. Prior to testing, drivers were givenextensive training in the use of the navigation system.

The first experiment, Tom Dingus’ dissertation, concerned the attentional demand of theETAK Navigator (a moving map display).[48] (See also references 49, and 50, 51.) Whiledriving, participants were verbally cued to perform certain tasks, such as reading thespeedometer, reading the time, adjusting the fanand reading the name of the next crossstreet. Table 9 shows the total glance times, which equal the sum of the individualglance times. Notice the long times associated with tasks associated with thenavigation system. Clearly, looking away from the road scene for an extended period oftime increases the risk of driving.

Page 44: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

33

Table 9. Total display glance time for each task.[48]

Task Mean Time Standard (s) Deviation (s)

under 1.0 sSpeed 0.78 0.65Following Traffic 0.98 0.60

1.0 to 2.5 sTime 1.04 0.56Vent 1.13 0.99Destination Direction 1.57 0.94Remaining Fuel 1.58 0.95Tone Controls 1.59 1.03Info. Lights 1.75 0.93Destination Distance 1.83 1.09Fan 1.95 1.29Balance Volume 2.23 1.50Sentinel 2.38 1.71Defrost 2.86 1.59Fuel Economy 2.87 1.09Correct Direction 2.96 1.86

2.5 to 4.0 sFuel Range 3.00 1.43Temperature 3.50 1.73Cassette Tape 3.23 1.55Heading 3.58 2.23

4.0 to 8.0 sZoom Level 4.00 2.17Cruise Control 4.82 3.80Power Mirror 5.71 2.78Tune Radio 7.60 3.41

over 8 sCross Street 8.63 4.86Roadway Distance 8.84 5.20Roadway Name 10.63 5.80

Those data have been replotted in figure 12. It is reasonable to argue that controls anddisplays should not be added to the vehicle that are worse than those that are providednow. Using that rationale, the mean glance durations should be less than 1.2 s.Further, the author believes that drivers find the number of glances required to manuallytune a radio, use of the power mirror, the cruise control, and possibly handling thecassette tape to be unacceptable. (There is no quantitative data to support this, only hisexpertise.) If that judgment is accepted, then the number of glances required should befour or less.

Page 45: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

34

76543210.6

0.8

1.0

1.2

1.4

1.6

1.8

# of Glances

Mean GlanceLength (sec)

Cross Street

Road Name

Road Distance

Tune Radio

Power Mirror

Cruise ControlCassette Tape

Zoom Level

Correct Direction

Temperature

HeadingDestinationDirection

Fuel Range

Defrost, Fuel EconomyFan

SentinelFuel Remaining

ToneBalance/Volume

Info LightsTime

Following Traffic

VentSpeed

DestinationDistance

Note: The labels in bold are navigation system functions.

Figure 12. Glance time data.[48]

Another perspective is that using controls and displays should not cause drivers to driftfrom the lane in which they are driving. Table 10 shows the lane excursion data fromthat experiment, a somewhat insensitive measure of safe driving behavior. Driversshould never leave the lane, but it is not clear whether that is an achievable criterion ina practical context.

Page 46: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

35

Table 10. Number of lane exceedences and mean duration.[48]

Task Total Numberof Lane

Exceedences(All Drivers)

Mean Duration(s)

Following traffic 0Time 0Speed 0Vent 0Destination distance 0Destination direction 0Turn signal 0Fan 1 0.46Remaining fuel 1 0.95Tone controls 1 0.97Correct direction 1 1.00Sentinel 2 0.28Balance 2 0.55Defrost 3 0.67Heading 3 0.62Info. Lights 3 0.83Fuel economy 3 2.25Zoom level 4 0.94Fuel range 5 0.84Temperature 8 0.65Cross street 8 0.93Roadway name 8 1.38Roadway distance 9 1.17Tune radio 10 1.86Cassette tape 13 0.99Power mirror 21 1.10

Also important was Hulse’s thesis, which examined how anticipated increases forattentional demand related to the steering task affected use of a navigation display.[52](See also references 53, 54, 55, 56, 57.) To examine demand, an expression wasdeveloped to estimate workload, also referred to as attentional demand (Q) fromroadway geometry. Q has a range of 0 to 100. The calculation is shown on thefollowing page. Details concerning the derivations of the workload calculations are notgiven in the open literature. (For example, why the sight distance factor is a function ofthe log of the inverse of the sight distance?)

Page 47: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

36

Q = 0.4A + 0.3B + 0.2C + 0.1D (5)

where:

A = 20 log2(500/Sd) (Sight Distance Factor) (6)

where Sd = sight distance (m)if Sd > 500, then A = 0if Sd < 15.6, then A=100

B = (100*Rmax) / R (Curvature Factor) (7)

where R = radius of curvatureRmax = maximum value of the radius of curvature

(set to 18.52 m (60.7 ft)the turn radius for a city street)

note: R = 360X / (2πa)

X = arc length along the curve (m)a = change in direction (degrees)

C = -40So + 100 (Lane Restriction Factor) (8)

where: So = distance of closest obstruction to road (m)(phone pole, fence, ditch, etc.)if So > 2.5, then C=0

D = -36.5W + 267 (Road Width Factor) (9)

where: W = road width for 2 lanes (m)if W > 7.3 (24 ft, 12 ft lanes), then D = 0if W < 4.57 (15 ft, 7.5 ft lanes), then D = 100

To calibrate these expressions, five graduate students studying human factorsengineering drove several road sections twice and then rated them. In the ratings,1 corresponded to being able to look away from the road for long periods (4 s or more),5 was for being able to look away for periods of 1 to 1.5 s, and 9 corresponded to notbeing able to look away at all. Traffic on the test roads varied from light to heavy.

The correlation between the subjective ratings of workload and the calculations of Qwas 0.72, with sight distance being the best predictor of the subjective rating.Correlations of those ratings with the percentage of time fixating objects related todriving and the percentage of time fixating the navigation display were low, typically0.10 to 0.20. This may be because of the detailed analysis (short road segments).

An interesting result of this series of studies is where drivers looked as a function oftraffic load and when incidents occurred. (“Incidents” included certain kinds of traffic atintersections, vehicles ahead changing lanes, etc.) In response to increasing traffic,

Page 48: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

37

drivers were more likely to look at the road center and for longer periods of time (andless likely to look at the navigation display). The same behavior was observed forincidents. Thus, drivers adapted to the imposed demands placed upon them in asensible manner.

BOS, GREEN, AND KERST (1988) - LABORATORY ASSESSMENT OF DISPLAYS

In contrast to these field studies are laboratory studies conducted by UMTRI to assessdisplay legibility.[58] (See reference 59 for a review of the literature.) After conductingseveral pilot experiments to fine tune the test methods, eight young drivers participatedin a response time experiment.

The basic task involved showing drivers (seated in a vehicle mockup) slides ofinstrument clusters in the location they would normally appear. Speedometers varied intheir size, contrast, illumination level, and location on the instrument panel. The drivers'task was to find the numeric speedometer and indicate, by pressing a button, if thespeed was over the limit [89 km/h (55 mi/h)]. Drivers either responded to speedometersalone or in concert with one of two other tasks (responding to arrows, driving asimulator). In the arrows condition, drivers fixated at a screen well in front of the vehicleon which an arrow might appear. If it did, they pressed one of two keys (left or right).For trials on which an arrow slide was not presented, drivers looked from the far screento the instrument panel and responded to the panel slide. In the third condition,subjects drove a simulator and, at random times, responded to slides that appeared onthe instrument panel. The pattern of results from the driving and arrows conditions werequite similar. Response times when there was no additional task tended to significantlyunderpredict response time in other task combinations when the viewing conditionswere poor (e.g., low contrast). Further, the variability in the response times tospeedometer slides within designs in the arrows condition was less than in the drivingcondition, so the more sensitive arrows condition was used for subsequent studies ofinstrumentation.[60]

Thus, what emerged from this research was a laboratory method for testing displaylegibility. This research highlighted the importance of driver accommodation to the roadscene as a factor that influenced the time to read instrumentation. Thus, if instrumentpanel display legibility is to be assessed, an external task must be provided. In addition,visual search for the relevant display was an important factor. All of the speedometerstested met minimum legibility standards. However, increasing the size of speedometerdigits to several times the minimum led to large reductions in response time, since theseincreases made it easy for drivers to find the speedometer.

NOY'S (1987) REVIEW OF SECONDARY TASK METHODS

This report is an exhaustive review of secondary-task methods as they apply toevaluating intelligent automotive display systems.[61] The reasoning behind thesecondary (or subsidiary) task method is that carrying out a task or collection of tasksrequires some fraction of one's attentional capacity. To assess how much is required,one is asked to complete another task (of less importance) in concert with the existingor primary task. Ideally, the secondary task should not affect performance of the

Page 49: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

38

primary task. The score on the secondary task is indicative of the amount of processingresources available to carry out additional work.

Shown in table 11 is an abridged version of Noy's tabular summary of the literature. Increating this table, the original sources cited by Noy, which have additional details, werenot retrieved. Studies were primarily concerned with driver workload and driver fatigue.Some of the experiments did not employ tracking or steering-like activity as a primarytask. These data do not suggest a single ideal secondary task. Popular tasks includedvisual detection, response time (RT) to tones, response time to lights on the instrumentpanel (IP), various short-term memory tasks (such as digit shadowing), cognitive tasksinvolving mental arithmetic, and, in one case, antonym naming.

Table 11. Secondary task list.[61]

Study Secondary Task Issue Effect onDriving

Effect onSecondary Task

Boadle, 1976 detect light on IP vigilance interference unknownBrown &Poulton,1961

auditory digitshadowing

generalmethodology

unaffected sensitive to trafficdensity

mental addition unaffected sensitive to trafficdensity

Brown, 1965 auditory digitmonitoring(odd-evensequences)

nature ofsecondary task

intrusive more sensitivethan memory task

detect repeatedletter in series

intrusive

Brown, et al.,1967

voice intervalproduction task

compare 2 tasks slight intrusion not stated

visual detection not stated performanceimproved with time

Brown, et al.,1969

auditorygrammaticaltransforms

interference withtelephone use

little effect oncontrol skills,affected gapjudgment

interference withperception, notmotor skills

Dobbins,1963

visual detection vigilance no effect no effect

Drory, 1985 visual choice RT vigilance andrest periods

unclear unclear

readspeedometer

improvedperformance

unclear

Fagerstron &Lisper, 1977

RT to tone vigilance not stated affected by carradio listening,experience, etc.

Harms, 1986 mentalsubtraction ofauditory digits

generalmethodology

RT inverselycorrelated withspeed

RT increased inhigh accidentareas

Page 50: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

39

Table 11. Secondary task list (continued).

Study Secondary Task Issue Effect onDriving

Effect onSecondary Task

Heimstra,1970

visual detectionRT

effect of stressfrom electricshocks

not stated not stated

detect meterdeflections

not stated not stated

detect brightnesschange

not stated not stated

Hicks andWierwille,1979

read randomnumbers

comparison ofworkloadtechniques

no intrusion not sensitive

Hoffman &Joubert,1966

auditory digitshadowing

effects of vehicledynamics

not stated correlated withnumber of coneshit

Johnston andCole, 1976

visual detection effect ofdistractions(advertising)

not examined affected by signs

Laurell &Lisper, 1978

auditory RT generalmethodology

not stated RT correlated withobstacle detection

Lisper, et al.,1986

auditory RT fatigue not stated RT increased overtime

Lisper, et al.,1979

auditory RT diurnal variation not stated slight change withtime of day

Lisper, et al.,1973

auditory RT physiologicalindicators

not stated RT better thanphysiologicalmeasures

McDonald &Ellis, 1975

visual digitshadowing

attentionaldemands-curves& speed

not stated unclear

Quenalt,1977

mental addition(auditorypresentation)

careless vs.normal drivers

not examined road conditionsaffectedperformance

antonym naming road conditionsaffectedperformance

count tones road conditionsaffectedperformance

Riemersma,et al., 1977

report changesevery 20 km

sensitive to fatigue

detect coloredlight change

fatigue not stated sensitive to fatigue

Sanders &Noble, 1975

count targets generalmethodology

interference affected by # oftargets

Page 51: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

40

Table 11. Secondary task list (continued).

Study Secondary Task Issue Effect onDriving

Effect onSecondary Task

Snyder &Monty, 1986

operate radio, tripcomputer, climatecontrols

effect ofprogrammabledisplays

degraded lanekeeping andspeed control

not stated

StephensandMichaels,1964

identify targetword on sign

interferencebetween searchand tracking

degraded bysecondary task

not stated

Wetherell,1981

mental addition compare 6secondary tasks

secondarytasks intrudedfor women only

no single taskoutstanding

grammaticaltransformsauditory detectionof digit in streamrecall of routeinstructionsgenerate randomdigitsSternbergauditory memorytask

Wierwille etal., 1977

read randomdigits

vehicle handlingparameters

not stated sensitive to somesteeringparameters

Wierwille &Guttman,1978

read randomdigits

generalmethodology

intrusion less sensitive thanprimary taskperformance

Zeitlin &Finkelman,1975

random digitgeneration

generalmethodology

no interference sensitive to taskdifficulty

digit shadowing no interference not sensitive totask difficulty

Zwahlen,1986

reading text reading text andocclusion

interference reading text andocclusiondegradedperformanceequally

Page 52: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

41

NOY (1989, 1990) - ALTERNATIVE METHODS FOR ASSESSING WORKLOAD

Noy’s recent work on methods for assessing driving workload was summarized in apaper and a technical report.[62.63] This very comprehensive effort builds upon theliterature review just described. Experiments were carried out using a moving-baseDC-8 flight simulator at the University of Toronto. The rear of the cab was configured torepresent a compact car. Near the center console was a 30 cm (12-in) monitor forpresenting an auxiliary task. Drivers wore head tracking transducers andElectrooculogram (EOG) electrodes to record head movements and eye fixations.

Drivers looked at a 30 by 40 degree scene with low detail (300 polygons) that updatedat 20 to 30 Hz. A two-lane winding road with a dashed centerline was shown. Theirtask was to maintain a constant headway from a lead truck and minimize lateral drift.Periodically the truck decelerated rapidly, in response to which the driver stopped. Atvarious times short vertical lines (6.5 mm) appeared on the in-vehicle, auxiliary displayin the presence of 8 mm vertical distracters. The task difficulty was varied by alteringthe number of distracters (1, 4, 8, or 12). The driver’s task was to identify whether ashorter line was present. At other times drivers were presented with a Sternbergmemory task for set sizes of 2, 3, 4, and 5 letters with 3 probes. For a set of four, thefollowing would occur. “W, T, F, R”, ... followed by a delay... “Yes or no, were any of thefollowing letters presented, A, F, E?” For both auxiliary tasks, response time was thedependent measure and the yes/no probabilities were equal.

In addition, there were six dependent measures related to driving performance(standard deviation of lane position, lane exceedence ratio, time-to-line crossing,headway, speed, and standard deviation of speed), three related to attentional demand(dwell time, look frequency, auxiliary display to road scene viewing ratio), and seven(time load index, mental demand, physical demand, temporal demand, effort,performance, and frustration) associated with TLX, the NASA Task Load Index, asubjective measure of workload.

Twenty students served as subjects. They were given 4 h of training in single and dualtask conditions prior to the experiment. The main experiment consisted of two 2-hsessions separated by a 30-min break.

Analysis of the data showed there were no significant differences in driving performancebetween the perception (line discrimination) and the memory tasks, so further analysisfocused on the perception task. The presence of the auxiliary task significantly affectedtime-to-line crossing, standard deviation of lane position, headway, and standarddeviation of speed. Table 12 shows task decrements ranked from left to right. Exceptfor headway, the level of task loading had no effect on driving performance.

Page 53: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

42

Table 12. Auxiliary task decrements in dual task conditions.[51]

Condition Std. Dev. ofLane Position

(m)

Std. Dev. ofSpeed(m/s)

Time-to-LineCrossing (s)

Headway(m)

LaneExceedence (%)

Driving 0.2 0.79 3.47 53.5 0.0Dual Task 0.25 0.94 2.90 56.7 0.02% Change +26. +19. -16. +6. +2.

Of the driving performance measures, only time-to-line crossing and standard deviationof lane position were significantly affected by the difficulty of the driving task. Figure 13shows the effect of road curvature on time-to-line crossing.

Note: Difficulty refers to the secondary perception task present.Baseline refers to the condition where a secondary task was not present.Str. represents the straight road condition (no curvature).

Figure 13. Time-to-Line Crossing (s) as a Function of Road Curvature Radius (m).[51]

With regard to the visual/attentional measures, both dwell time and look frequency wereaffected by the curve radius. (See figure 14.) In response to greater external demands,drivers looked at the auxiliary display less often with the viewing ratio display (thepercentage of run time spent looking at the in-vehicle display) changing from 50percent, in the low driving load conditions, to 20 percent, in the high load conditions.

Note: Mem and Perc refer to the memory and perceptual secondary tasks.

Figure 14. Visual/attentional performance as a function of road curvature.[51]

One of the more interesting results concerns partitioning the driving performancemeasures based on where the driver was looking at any moment. In fact, standarddeviation of lane position, standard deviation of speed, and exceedence ratio were alllarger, and time-to-line crossing was smaller, when the driver was looking inside ratherthan outside. That is, drivers tended to look at the inside display only when it wasrelatively safe to do so.

Page 54: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

43

Results concerning TLX appear only in the technical report. Only two of the ratingscales (mental demand and physical demand), and to a lesser degree, the weightedcomposite TLX rating, were significantly linked with driving task difficulty.

Thus, these data show that driving task difficulty (here curve radius) had a larger effecton performance and attention than auxiliary task difficulty. According to Noy, the visualattentional variables were more sensitive to experimental manipulations than primarydriving variables. Noy believes that drivers “were able to maintain primary taskperformance within the desired bounds by judiciously modulating their scanningbehavior in response to changing task requirements.”[53] Of the driving performancemeasures, standard deviation of lane position, standard deviation of speed, and time-to-line crossing were most noticeably affected by the addition of a secondary task andshould be examined in future driver performance studies. In some cases, time-to-linecrossing may be difficult to obtain.

With regard to the auxiliary tasks, the presence of any of them significantly degradeddriving task performance and, to the extent they mimic future in-vehicle displays, that isa matter of concern. Potentially of greater concern is not their effect on the meanperformance level, but their effect on increased performance variance. Also important,however, is the extent to which drivers adapt to the addition of secondary tasks, and thestrategy they choose to execute these tasks (for example, only when drivingperformance is relatively good).

HARDEE, JOHNSON, KUIPER, AND THOMAS (1990)

This paper describes an approach that General Motors has used to evaluate instrumentpanel controls.[64] Detailed performance data are not provided. Individual driversseated in a mockup carried out three tasks concurrently. In the speed regulation task,drivers monitor an analog speedometer perturbed by a random signal. Drivers maintaina constant speed using the accelerator.

In the pedestrian detection task, drivers are shown a video scene of a single lane roadlined with pedestrians. (See figure 15.) At random times, a pedestrian enters the roadfor 50 ms. The drivers task is to indicate if that occurs from the left or right.

Figure 15. Three-frame sequence of the pedestrian detection task.[64]

The third task performed involves operation of instrument panel controls. Auditorycommands to operate controls (wiper, lights, radio, etc.) are given, and driver behavioris videotaped.

In a typical experiment drivers are given practice in the speed regulation and pedestriandetection tasks both independently and together. Following is practice in the instrument

Page 55: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

44

panel task until a preset criterion is reached. The test conditions involve several 5-minblocks in which all tasks are performed concurrently. Dependent measures include themean speed, the number of pedestrians detected, the task completion times for eachcontrol, hands-off-the-wheel time, and the frequency and types of errors associated witheach control.

Hardee et al. describe two methods for analyzing the data. In the video method, tapesof the experiment are played back frame-by-frame for analysis, so times are accurate tothe nearest 1/30 of a second. Experiments of this type tend to be easy to set up andcounterbalance. Analysis, however, can be very time consuming. In the automatedmethod the subject wears a conductive wrist strap interfaced to a computer, so the timehis hand leaves the steering wheel can be determined. In addition, all of the controlsmust be interfaced to a computer, which takes time to accomplish. Also, in this method,a quad splitter is used to show a time-synchronized view of the pedestrian detectionscene, the subject’s eye movements, and his interaction with the controls. Based ondiscussions with GM personnel, both of these approaches are regularly used to analyzecontrol designs.

ALLEN, STEIN, ROSENTHAL, ZIEDMAN, TORRES, AND HALATI (1991) -LABORATORY SIMULATION

This method simulates driver behavior while trying to avoid nonrecurring congestion bydiverting with the aid of an in-vehicle traffic information/navigation system.[65] Driverswere shown slides of road scenes (Golden State Freeway in Orange County,California), which included an instrument panel image showing speed and time. At thesame time, they were shown a simulated traffic information display on a CRT.Computer generated auditory feedback of engine sounds, wind, and road noise wasalso provided. Across scenarios, the speed shown and the level of traffic congestionvaried. The manner in which the slides were produced was quite clever. Scenes ofnearly vacant freeways were shot. Separately, pictures of vehicles were shot and thensuperimposed in the scene in the proper location, so the scenes could be photographedagain. This resulted in a series of scenes that were identical except for the number ofvehicles shown. Producing these slides was painstaking work.[66]

Four different in-vehicle units were examined. They included a static map systemsimilar to ETAK (no congestion or guidance information), a static map with congestionlevel information, a dynamic map with a highlighted alternative route and auditorymessages on traffic, and an arrow-based route guidance system. Further details on thedesign of the interfaces is contained in reference 66.

Each participant utilized only one of the interface designs, but for three different delaylevels. All subjects were given training in the operation of the navigation system. In testtrials, slides were shown at an unknown rate. When subjects wanted to divert, in mostcases they keyed in the numbers of the nodes (shown on a special map) through whichtheir vehicles would pass.

There were significant differences between display types in terms of how early in a tripdrivers would divert. Drivers with route guidance interfaces tended to divert first,followed by those with the dynamic map, followed by others.

Page 56: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

45

What is interesting about this method is that it provided useful behavioral informationthat could serve as input to a traffic simulation model. It did so in a way that providedcontrol over the traffic congestion level and had considerable fidelity with real driverdecisions. Developing the task scenarios was nontrivial, especially the photographicwork.

VERWEY (1991)

This experiment examined the factors that contribute to driving workload.[67] The resultscould be applied to design a dialogue controller for the driver-vehicle interface thatavoids overload.

The driving task involved a 3.5-km (2.2-mi) four-lane motorway and 1.4 km (0.9 mi) ofrural road in the Netherlands. Participants drove at normal speeds. For the 10-mindrive, six segments were analyzed. (See table 13.) Segments were 120 to 800 m long.Trips began at specific times in the morning and afternoon, corresponding to times oflow, moderate, and high density traffic.

Table 13. Tasks examined.[67]

Road Task4-lane motorway (100 km/h) merge and exit4-lane motorway (100 km/h) drive in right lane of straight section2-lane road (100 km/h) drive around traffic circlerural road (80 km/h) turn right a cross streetrural road (80 km/h) turn left just before a curverural road (80 km/h) drive on straight section

Driving took place under four test conditions: 1 no-task (control) condition and threesecondary task conditions. Those secondary tasks consisted of a visual detection task,a visual addition task, and an auditory addition task. In the visual detection task,subjects said the Dutch equivalent of yes when a display mounted high on the centerconsole was illuminated. This occurred for 0.75 s every 2 to 4 s. In the visual additiontask, subjects added 12 to the number shown on the console display and spoke theanswer. In the auditory addition task, the number was presented auditorally (for 1 to1.5 s) instead of visually. Baseline secondary task data were obtained for two 2-minsessions with the car stopped.

The test vehicle was a Volvo 240 station wagon with a manual transmission. A roof-mounted camera recorded the forward scene. A dashboard-mounted camera recordeddriver eye movements. Foot control and steering wheel inputs, speed, and lane positionwere sampled at 4 Hz.

Serving as subjects were 24 drivers familiar with the road course. Half had beenlicensed for less than a year; the others were experienced drivers.

Measures examined in this experiment included reduction of secondary task scores dueto driving (the difference between the baseline and driving conditions), eye glance

Page 57: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

46

measures (glance frequencies and durations to four locations), and nine drivingperformance measures. Of the nine driving measures, three were collected for allsegments of the test route: speed, standard deviation of speed, and steering wheelaction rate (SAR). SAR is the number of wheel movements per s. According to Verwey(1993) the movement threshold was 5 degrees/s. The other six measures were specificto maneuvers such as merging or right turns: time to merge, distance required, speedafter merging, and time, distance and speed of braking before intersection. Eyefixations were reduced manually in real time by an experimenter pressing a button toindicate the object fixated (right, left, interior mirror, display).

Approximately 2600 responses were obtained for each of the three secondary tasks.Error rates were 34 percent (visual detection), 40 percent (visual addition), and 22percent (auditory addition). Figure 16 shows the relative performance on each of thethree secondary tasks for various road segments for both inexperienced andexperienced drivers. There were significant differences between task types, withperformance for each task type depending upon the driving situation, but not the trafficdensity.

The primary differences in driving demands seem to be related to visual input, with boththe visual detection and visual addition secondary tasks reflecting large differencesbetween driving situations. Further, statistical analysis revealed a significant interactionof secondary tasks with driver experience, with inexperienced drivers showing muchgreater degradations for some task-driving situation combinations.

These results indicate that for experienced drivers, the primary limitation is visual load,not cognitive load. The visual tasks vary most widely with the driving situation whereasthe auditory task did not vary very much. However, for the inexperienced drivers,cognitive loading is a somewhat of a problem, as well, since all tasks (including auditoryaddition) vary with the driving situation. Verwey suggests that the absence ofinterference with auditory addition for experienced drivers may be because the basicdriving skills are so highly automated.

Note: While not fully explained in the paper, it appears that relative performance refersto the error rate relative to that of auditory addition secondary task performed whiledriving on a straight motorway.

Figure 16. Secondary task performance for various tasks.[67]

With regard to eye movements, there were differences due to driving situations causingdrivers to glance in different locations. For example, drivers looked more to the left forleft turns and more to the right for right turns. Quite noteworthy, were differences in thenumber of glances to the mirror and display as a function of the driving situation. (Seefigures 17 and 18.) When mirror usage was not central to driving (e.g., tasks other than

Page 58: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

47

merging), the number of mirror glances was sensitive to workload, with more mirrorglances occurring for easier driving situations. Verwey suggests that mirror usage isconsidered as less important than other visual tasks and is one of the first tasks shedwhen workload increases.

Figure 17. Number of mirror glances for various tasks.[67]

Figure 18. Number of display glances for various tasks.[67]

Patterns for glance durations were similar to those for glance frequency, though theywere not examined in the same detail (in part because there were fewer significantdifferences). In brief, the easier the driving task, the longer the average fixation.

Many of the driving performance measures were affected by the driving task (thefrequency of accelerator, brake and clutch depressions, driving speed, the standarddeviation of speed, etc.). None of these performance variables was affected by thesecondary task, though brake pedal frequency increased and mean speed decreasedwith traffic density. Verwey notes that the lack of secondary task effects supports amultiple resource theory of driving (with separate resources for visual, motor, andcognitive processing) for experienced drivers. Verwey also claims that single resourcetheory best supports the inexperienced drivers' data (because differences in drivingsituations influenced performance in all secondary tasks). According to Levison (1993),an alternative explanation is that "multiple resource theory applies to both classes ofdrivers, but that the cognitive resource must be shared among all tasks—in this case,driving and the auditory side task.[68] (This assumption is made by the Integrated DriverModel.[19]) Assume that the driving task imposes a lower demand on cognitiveresources for experienced drivers than inexperienced drivers because of the increase ininformation processing that accompanies learning. Because the experienced drivershave more "spare capacity" than the inexperienced drivers, the experienced drivers candevote the necessary cognitive resources to the audition task, whereas theinexperienced drivers come up short. Thus, one might claim that differences betweendriver classes are due to the efficiency of information processing associated with thedriving task, and not to differences in the number of independent resources." Hence,the multiple resource theory of driving could explain the performance of bothinexperienced and experienced drivers.

Page 59: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

48

Steering action rate was affected by several factors and was apparently sensitive toworkload. Experienced drivers moved the steering wheel less often than inexperienceddrivers (0.42 vs. 0.46 times/s). The rate also varied with the secondary task, especiallyfor tasks with visual demands. (See figure 19.)

Figure 19. SAR and secondary task.[67]

The following conclusions emerged from this research.

1. Verwey claims the behavior of experienced drivers may be best explained bymultiple resource theory, while inexperienced drivers fit single channel theory.Levison claims multiple resource theory explains the performance of both classesof drivers.

2. Driving workload was primarily determined by the driving scenario, not the trafficdensity.

3. The primary limitation of driving is visual, and hence, tasks that contain a purelyvisual load (e.g., peripheral detection) are most likely to be sensitive to drivingdemands. That is, limitations are more perceptual than cognitive in nature.

4. Steering action rate may be a useful online measure of driving workload.

5. Glance frequency appears to be more sensitive to workload than glance duration.

6. When mirror use is discretionary, it seems to be sensitive to driving workload.

DINGUS, HULSE, KRAGE, SZCZUBLEWSKI, AND BERRY (1991)

This paper concerned the evaluation of predrive functions associated with the TravTekinterface, a turn arrow and voice-based navigation system recently examined in anoperational field test in Orlando, Florida.[69,70,71] During the development of theinterface, subjects were given a wide variety of tasks to complete. The TravTekinterface was shown on a touch screen CRT connected to a Macintosh computer. Thegraphics were generated in SuperCard. In addition, a video camera was placed behindthe subject to record the general nature of the interaction.

A total of 72 people participated in this experiment, equally distributed among three agegroups and both sexes. There were seven basic tasks—select an unfamiliar address,select a previously stored destination, determine street names where congestion ispresent, store a destination and route for future reference, use the yellow pages toselect a business, set voice messaging options, and summon emergency service.Tasks were reasonably complex, involving as many as 7 separate screens and 5 to 20

Page 60: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

49

button presses. The experimenter read a description of the task while subjects saw themain menu. The computer recorded each touch screen button press. At the end of theexperiment, subjects were given a survey and rated the difficulty of each task.

Table 14 shows the task completion times. In general, these times were correlated withthe number of errors, but a correlation coefficient is not given. Most of the taskcompletion times were associated with recovery from errors, not with choosing aninefficient access method.

Table 14. Navigation task completion times.[69]

Task Time(s)

select an unfamiliar address 130select a previously stored destination 50determine area street names where traffic congestion is present 240store a destination and route for future reference 160use a yellow pages feature to select a business 95set voice messaging options to a desired destination 40summon emergency service 40

The importance of this study is that it demonstrates a laboratory method for assessingthe usability of driver interfaces, showing the utility of both performance and subjectivedata. This research was made possible by recent advances in rapid prototypingtools.[72]

TRAVTEK EVALUATION

This research is quite different from those mentioned previously. Rather thanevaluating a system or alternative systems by a few drivers in the laboratory or on theroad, this project involved approximately 100 cars, potentially thousands of drivers, anda real, functioning traffic information system. The only other operational test in theUnited States, Pathfinder, conducted in Los Angeles, was far less ambitious than theTravTek project.[73] A detailed overview of the TravTek test protocol is given inreferences 74 and 75. For an overview of the evaluation of other related projects, seereferences 76 and 77.

In brief, 10 experimental approaches were examined—(1) a field study with rental users,(2) a field study with local users, (3) a yoked driving study, (4) the Orlando test networkstudy, (5) the camera car study, (6) a survey of rental and local users, (7) debriefing andinterviews in several studies, (8) traffic probe studies, (9) modeling and analyses, and(10) a global evaluation. The first five approaches emi/hasize human performance andbehavior.[75]

The rental car users study involved people who rent TravTek cars from Avis and amatched control group. Within the TravTek group, there were three subgroups (allfunctions, navigation and service function but no real-time data, service functions only).All drivers participated in the questionnaire study, and a subset were interviewed. Data

Page 61: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

50

concerning vehicle location, heading, speed, and stops were automatically timestamped and logged, as were all keypresses associated with the TravTek interface.

The local users study was similar to the rental car study, except that people familiar withOrlando had the test vehicles for several months. The dependent variables were thesame.

In the yoked driving study, hired drivers were assigned to the three versions of theTravTek interface. Pairs of drivers (one without a Travtek interface, one with a TravTekinterface) went from specified origins to specified destinations at the same time, soweather, congestion, etc. were matched. The focus of this approach was on the timeand distance savings associated with real-time traffic information.

The Orlando Test Network Study was similar to the yoked experiment, except that therewere a network of routes between origin-destination pairs. Hired subjects drovevehicles with either a hard copy map, a route guidance system, a moving map with aroute guidance system, a moving information map, or voice guidance. Trip times andexperimenter ratings were the primary dependent measures.

The Camera Car Study provided a detailed analysis of driver performance. Videocameras were focused on the road scene, the driver, and the outside lane line.Dependent variables included eye glance measures, speed variance, and laneexcursions. Supplemental information was provided from an accompanyingexperimenter’s log. Drivers performed predrive functions, baseline tasks (e.g., usecellular phone) and drove the Orlando test network. Each driver participated in fourconditions -- hard copy map, moving map with route overlay, route guidance, and voiceguidance.

This operational test was extremely thorough and yielded valuable data concerning therelationship between human performance measures (e.g., glance times), behaviormeasures (e.g., route choices), and safety (e.g., accidents).

CLOSING COMMENT ON MEASURES AND METHODS

As a whole, these studies show that the research community is far from reaching aconsensus either on the research protocol to be used or on the appropriate measures.Particularly lacking are any attempts to replicate research results. In other areas ofscience, "truth" is established when many independent investigators examine a questionusing similar methods and reach the same conclusion. Such work has not been carriedout in driving science, in part because resources are so scarce that replication has beenavoided.

Page 62: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

51

TABULAR SUMMARIES OF METHODS AND MEASURES

Now that the reader has a sense of the methods and techniques used, it is appropriateto look for trends. In the following tables are all the studies, to the author's knowledge,that examine the use of navigation systems, as well as more general studies oftimesharing. While relevant, studies of the use of cellular phones have not beenincluded. Those studies appear in other reports funded by this project .[11] The reportslisted in the following tables have been partitioned into three categories: methodologicalexperiments, general studies seeking information regarding interface design, andspecific interface experiments. Many of the experiments examined fit into severalcategories; however, each experiment was placed into only one category, for simplicity.

Table 15 concerns methodological studies. Included here are some basic studiesconcerning timesharing, experiments that concern the sensitivity of various humanperformance characteristics, and related issues. To keep the scope reasonable, onlystudies that included driving-like tasks as one of the timesharing activities have beenincluded. Had the scope been expanded to include general tracking studies, the listwould have been enormous. Just as rare were operational tests, in part because of themillions of dollars they often cost to execute.

Of the 23 items listed in table 15, 10 were conducted on the road, 5 were conducted in adriving simulator, 2 were conducted in both contexts, 4 were conducted in part tasksimulators, and only 1 was a true laboratory experiment. In some sense, the value forthe on-road category can be misleading as 4 of those studies (conducted by Zwahlen)took place on an abandoned airport runway. None of these experiments wereconducted on test tracks, most likely because of the cost of access. Somewhatunexpected to the author, was the small number of laboratory studies. This may havebeen due to the selection criteria, not the absence of useful material in the literature.

There was no consistent pattern for the choice of independent or dependent measures,or in the results. However, it was clear that for the simulator and on-road experiment,the number of dependent measures collected was large with a half dozen being typical.Dependent measures included response times and error rates for in-vehicle tasks, heartrate variability, eye fixations frequencies and durations, tracking-time-off-target, steeringwheel angle statistics, time to line crossing, mean speed, lateral deviation, ratings ofattentional demand, and the number and severity of unsafe driving actions.

Page 63: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies.

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Bos, Green,and Kerst(1988)[58]

lab response time, errorrate

digit size,taskcondition

either look at IP andpress 1 or 2 buttons(speeding/notspeeding), look ascreen for left/rightarrow, if not arrowpress button, orRT+driving simulator

RT+ arrows gave best data

Bouis, Voss,Geiser, andHaller(1979)[78]

part taskdrivingsimulator

response time forsecondary task, RT tounexpected signal

displayformat-visual,auditory,combined

track moving targetand hold varyingspeed while pressingbuttons in response tolights

shorter RT's for combineddisplays, shorter RT's for lightsand tones that are continuous(vs. intermittent)

Brouwer,Ickenroth,VanWolffelaar,and Ponds(1990)[79]

in parttasksimulator

similar to Ponds,Brouwer, and vanWolffelaar (1988)

driver age don't have yet don't have yet

Daimon(1992)[80]

on road heart rate variability,thinking aloudcomments, eyefixation durations andfrequencies

map or navsystem

drive route using navsystem or map

lower peak power (in heart ratevariability frequency spectrum)with map, fewer eye fixations tomap

Daimon(1992)[80]

simulator tone identification time(secondary task-0, 1,or 2 tones lagged),time off target intracking, eye fixationduration andfrequency

secondarytask

drive and respond tosecondary tasks andnavigation display

longer fixation times but fewerfixations with paper map, lag 2response time was most likelyto show map-nav systemdifference

Page 64: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Godthelp(1988)[41]

on road Steering wheel angle,yaw rate, and lateralposition, TLC

speed when tone presented,stop steering thenresume at latestmoment

(TLCs=1.3 secs, TLCmin=1.1sec,constant was the minimumlateral distance to the laneboundary (about 15 cm)

Godthelp,Milgram, &Blaauw(1984)[40]

onunusedhighway

Steering wheel angle,yaw rate, and lateralposition, TLC

speed drive and press buttonfor 0.55 s look(occlusion)

time-to-line-crossing at the end(TLCe) of each occlusion periodwas about 1.57 times theocclusion duration, TLC wasmeasure of safety

Grant andWierwille(1992)[81]

on roadand inlab

task duration timingmethod (on-road, real-time, slo-mo),task duration(short, med,long)

watch real driver carryout reach for controlsand time withstopwatch, watch tapein real time in lab,watch tape at 1/6speed

slow motion led to larger timeestimates, real time led tosmaller time estimates, on-roadled to no biases

Hardee,Johnston,Kuiper, andThomas(1990)[64]

part tasksimulator

mean speed, thenumber of pedestriansdetected, the taskcompletion times foreach control, and thefrequency and types oferrors associated witheach control

control ofinterest

maintain speed, detectpedestrians on road,use control

none given

Page 65: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Hicks andWierwille(1979) [39]

VPIdrivingsimulator

lateral deviation, yawangle deviation, andtwo-degree steeringwheel reversals/unittime, rated attentionaldemand (extremelylow to extremely high)

varycrosswinds(workload)

drive either whilereading random digits(secondary task), withocclusion (200 mslooks), or while heartrate was recorded

lateral dev, yaw and reversalsall affected by workload,reversals was most affected,ratings affected by workload,occlusion, secondary taskperformance, heart rate notaffected

KorukawaandWierwille(1990)[82]

insimulatorand onroad

task completion time,hand-off wheel time,duration and # ofglances to road and IP

task (17-press buttonon radio, turnknob, etc.)

steer car or simulatorand reach for controlon command

simulator times were close to in-car times, crosswind reduced #glances to IP and increased toroad

MacAdam(1992)[83 ]

drivingsimulator

lateral deviation,heading angle,standard deviation ofwheel angle, yaw rate,lateral acceleration,mean time on sidetasks

side task,choice ofdependentmeasure

steer on straight roadand wind side gustswhile performingsecond task (RT tosingle letter, 2-choiceRT with letters, 2 digitaddition)

standard deviation of lateralposition and heading (yaw)angle were most sensitive toside tasks, other measureswere insensitive

Noy (1989,1990)[63]

simulator standard deviation oflane position, laneexceedence ratio, timeto line crossing,headway, speed, andstandard deviation ofspeed, + secondarytask (various RT's), +TLX, dwell time andlook frequency

secondarytask present,driving taskdifficulty

while driving eitherdetection of lines(perceptual task),Sternberg memorytask, or driving alone

perceptual and memory taskhad similar effects on driving,TLC and lane variance affectedby driving task difficulty, dwelltime and look frequencyaffected by driving load, TLXlinked to task difficulty, driversexecuted secondary tasks whendriving was good

Page 66: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Ponds,Brouwer,and vanWolffelaar(1988)[84 ]

in parttasksimulator

tracking time-on-target, % correct in dotcounting

driver age(21 to 37, 40to 58, 61 to80)

steer on straight roadto counteract sidewinds, count # dotsshown on screen

most differences were betweenold and other drivers, elderlydrivers had decreased ability todivide attention

Quimby(1988)[46]

on road number and severityof unsafe drivingactions

age, sex,maneuvertype, etc.

just drive little correlation with accidentson route

Senders,Kristofferson, Levison,Dietrich, andWard(1967)[36,37]

on road(unusedInterstate or testtrack)

viewing time, speed(varied withexperiment)

speed, radiusof curve,road type(varied withexperiment)

occlusion-either pressbutton to raise faceshield or select drivingspeed

either varying speed or looktime lead to same results,typical look/no look of 0.5/2.0 sat 97 km/h (60 mi/hh

Verwey(1991)[67]

on road glance frequencies &durations to variousplaces, speed,standard deviation ofspeed, steering wheelaction rate; 6measures werespecific to merging orright turns-time tomerge, distancerequired, speed aftermerging, and time,distance and speed ofbraking beforeintersection

secondarytask (visualdetection,visualaddition,auditoryaddition),driving alone

driving or driving +visual detection,visual addition,auditory addition

visual tasks interfere most,fewer mirror glances with hi taskdemands; glance frequencymore sensitive than duration;driving performance measuresaffected by driving task (freq ofaccelerator, brake & clutchdepressions, driving speed, std.dev. of speed, . Brake pedalfrequency increased and meanspeed decreased w/ trafficdensity; SAR sensitive toworkload, other drivingperformance variablesunaffected

Page 67: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Verwey(1989)[85 ]

part tasksimulator

% of time off track,RMS tracking error,RT to nav information

stimulusmodality (textvs. arrows),familiar vs.unfamiliarintersections

countersteer randommovements ofprojected slide; shownguidance info and thenroad scene, indicatedwhich way to go bypressing button

least tracking error with auditorynav, more with arrows and mostwith text, RT shows samepattern favoring auditoryguidance

Walker,Alicandri,Sedney, andRoberts(1990)[86 ]

(see alsoWalker,Alicandri,Sedney, andRoberts,1991,1992)[87,88]

insimulator

heart rate, reactiontime to gaugechanges, speed(minimum, mean,variance, skew),lateral deviation

auditory vs.visualnavigationsystem,complexity ofsystem (3levels),nature ofloading(perceptual,cognitive,psychomotor)

drive route followingadvice of navigationsystem

heart rate was not sensitive toloading of nav devicedifferences, some differences inRT due to navigation devices(longer for complex types),significant differences in speed(slower for more complexdesigns, especially complexvisual), lateral positionmeasures (average andvariance of deviation) wereunaffected by the navigationdevice

ZwahlenandBalasubramanian(1974)[42]

unusedairportrunway

path error steer/nosteer

either close eyes anddrive or do that andtake hands off wheel

sy=k *D * T0.5

k=0.025 (0.683 in metric units)for steering with no visual input

Zwahlenand DeBald(1986)[43]

unusedairportrunway

path error eyes closedor reading

drive with eyes closedor read article or roadmap

eyes closed and reading gavesimilar results, k=0.076

Page 68: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 15. Methodological studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Zwahlen,Adams andSchwartz(1988)[44]

unusedairportrunway

path error panellocation,look/no lookto road

use simulated carphone

both factors affected laneexceedence probability

Zwahlen,Adams, andDeBald(1988)[43]

unusedairportrunway

path error use simulated touchpanel to turn radio on,etc., looked steadily atpanel until taskcompleted or road asneeded

task times of 5.02 and 8.39 s forradio and climate control, 3%excursions estimated for 3.7-m(12-ft) lane, developed eyefixation design guide

Page 69: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

58

Table 16 shows studies examining interface design; for example, the nature oflandmarks that drivers might find valuable. These studies are quite different from thosein the previous category in that surveys and verbal protocols can be used to determinedriver information needs. While these studies are less concerned with evaluation thansome of the methodological work, they are nonetheless important in that they reflecthow driver information systems should be designed.

Of the 15 studies listed in table 16, 8 were conducted on the road, 4 using surveys, 1was conducted in the laboratory, 1 was conducted using a driving simulator, and 1 isunknown. Again, on-the-road experiments are most common.

Dependent measures explored included ratings of information quality, how pleasedparticipants were with the interface, ratings of workload, response times and error ratesfor in-vehicle tasks, mean speed, speed variance, eye fixation frequencies anddurations, recall percentages, and navigation errors. Most noteworthy about thedependent measures chosen is their variety.

Page 70: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 16. Information gathering studies.

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Alm (1990)-exp 2[89]

survey directions to adestination, map

3 origin/destinationcombos

give directions to adestination and draw amap

landmarks, paths, and nodes allmentioned often, districts(except with map) and edgesrarely used, common landmarksinclude traffic lights, highwaysigns, shops, bridges, gasstations

Alm(undated)-exp 1[90]

survey directions to adestination

3 origin/destinationcombos

describe how to getthere

landmarks, paths, and nodes allmentioned often, districts andedges rarely used, commonlandmarks were traffic lightsand signs, buildings, andparking lots, references wereegocentric (not local or global)

Alm andBerlin(undated)[91]

on road ratings of informationquality (1 through 7),preferences for moreinformation, ease ofremembering info,ease of followinginstructions

amount ofinfo given (1,2, or 3 choicepoints)

drive to destinationwhile guided verbally

give info about 2 choice pointswhen the time between choicesis less than 10 s, otherwise 1;need some what to repeatmessage

Alm,Nilsson,Jarmark,Savelid, andHennings(undated)[92]

on road(easyroute)

how pleased theywere with the system,how easy it was touse, how distractingthe information was,etc., workload (TLX),

landmarks(present/ absent)

drive to destinationwhile guided verballyand visually

landmarks made the systemeasier to use and werepreferred

Page 71: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 16. Information gathering studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Davis(1989b)[93]-(see alsoDavis,1989a)[94]

on road problems in navigation intersectiontype

drive route usingnavigation aid

directions were modified(problems with closely spaceturns, timing, etc.)

Eberhard(1968)[95]

survey % responding - show film and givepeople surveyafterwards

94% said good idea but only43% would buy, wanted HUDand both arrows and words forlane changes

Green andWilliams(1992)[8]

in lab response time, errorrate

nav displaylocation(HUD vs. IP),displayformat (plan,perspective,aerial), roadgraphicdesign (solidvs. outline)

look at road sceneslide and press buttonif navigation systemshows same ordifferent intersectiongeometry

HUD better than IP, aerial andplan view better thanperspective, solid slightly betterthan outline.

Page 72: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 16. Information gathering studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Hook(1991)[96]

(see alsoHook andKarlgren(1991)[97];Brown,Gustavsson,Hook,Lindevall,and Waern(1991)[98],Waern1992)[99]

interviews directions to adestination

resident vs.tourist

describe how to getthere

landmarks were important,descriptions use different roadhierarchies, descriptions fromtwo groups differed in detail androute recommended

Kuiken,Miltenburg,and Winsum(1992)[100]

simulator speed, speedvariance, headway,gap acceptance whenpassing, occurrence ofincidents, trip time,SWAT mental load, #of accidents, # ofnavigation errors

drivingwithoutassistance,driving withnonintegratedapplications,driving withintegratedapplications

drive route as guidedby navigation system,place calls whiledriving

in only 1 of 7 scenarios waslevel of support significant, 1accident in no supportcondition, 3 in nonintegratedsupport condition, no differencein navigation errors betweenconditions

Page 73: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 16. Information gathering studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Labiale(1989)[101 ]

on road eye fixation,navigation errors,steering wheelmovement, recall %,vehicle

map format(only road onitinerary andcross streets,network,maps + textdirections,maps +auditoryguidance; allcombinedwith labelsfor some orall roads

drive route, eitherwhen moving orstopped shown mapwith route, recall 30 slater

no difference between andstopped, written guidance hadhighest recall, bare mapsworst.; relative # of errors forroad names than turn direction(left/right) varied with design,driver preferred map withauditory guidance, mean mapglance time=1.28 s, 10.6glances/30 s trial, driverdecrease speed by 15 km/h(9mi/h) when reading maps ormaps with written guidance; useauditory when driving, textdirections when stopped

Labiale(1990)[102]- exp 1(see alsoLabiale,Mamberti,Baez,Conus, andAupetit,(1988)[103]

on road % of info unitsrecalled, preferences,eye fixation data,steering wheelmovements, speed

# of units oftraffic info,format (visualtext, singleauditory,repeatedauditory)

while driving traffic infomessage is presented,recall it 30 s later

no differences due to format,inverse relationship between #of items and recall, auditoryinformation was preferred,fixation durations increase with# of info units (1.18 to 1.35secs), visual displays affectedcourse control more thanauditory, more likely to reducespeed with visual message thanauditory one

Page 74: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 16. Information gathering studies (continued).

Experiment Domain Dependant variable Independentvariable

Task Results/comments

Labiale,(1990)[102] -exp 2 (seealso Labiale,Mamberti,Baez,Conus, andAupeti,1988)[103]

on road % of info unitsrecalled, preferences,eye fixation data

map plusauditory orvisualguidance, 1or 3 turns

while driving routeguidance message ispresented, recall it30 s later

significant advantage for visual3 turn case only, auditoryguidance was preferred,auditory format had fewer (8.6vs. 10.9) and shorter fixations(1.25 vs. 1.5 s)

Labiale(undated)[104]

on road keyboard use time,screen viewing time,route selection time

number ofroute nodes

enter several routesand select best onewhile driving

time to enter route and evaluateit was 86 s for 1st, 55 s foralternative, therefore use whenvehicle is stationary

Schraagen(1990)[105]

on road navigation errors, # oftimes landmarks, etc.are mentioned

navigationability,enlargedstreet namesat turns

study map then thinkaloud as 4 routes aredriven

enlarged road names at turnpoints led to fewer navigationerrors, poor navigatorsmemorized fewer turns andspent more time on streetnames, street names attendedto most (about 1/2 of time)followed by road signs,topological knowledge,landmarks and road signs

SperandioandDessaigne(1988)[106]

? (inFrench)

reading speed, recall visual orauditory withor withoutrepetition

? auditory messages moreconvenient, maps or graphicsimprove efficiency of visualmessages

Page 75: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

64

Table 17 shows the specific interface evaluation experiments, 19 studies in all. Theseare closest to the focus of this section. Of the 19 studies listed, 12 were conducted onthe road, 3 using true laboratory methods, 2 using a part task simulator, 1 using a fieldsurvey, and only 1 using a driving simulator. Hence, in contrast to the methodologicalstudies listed in table 15, most of the application-oriented experiments were conductedon the road. If the past is a predictor of the future, interface evaluations will generallybe conducted using instrumented vehicles.

There was not any consistency across studies of the dependent measures examined,ranging the gamut from various measures of steering wheel movements, laneexcursions, in-vehicle task completion times and errors, workload estimates, frequencyand duration of glances to various locations, violations of traffic laws, etc.

Contained in these 3 tables (tables 15, 16, and 17) are the studies known by the authorwhen this report was drafted, that related to in-vehicle information systems. With aknowledge of methods and techniques used for studies, as discussed in the first part ofthis report, these summaries of methods (road, lab, simulator, etc.) for each of theresearch types (methodological studies, interface design, interface evaluation) mayreveal important trends. Looking at tables 15 through 17 as a group, it is clear that on-road studies of driving predominate with 30 of the 57 being conducted on the road. Therelative fraction increases and the focus of the research moves from basic research toapplication. Perhaps this trend supports the concept that as research moves from basicto application, the design of the study should include more on-road data collection. Themethod selected for research done for in-vehicle information systems should thenconsider the operational impact of the technology being examined. If the IVHStechnology is to be used in conjunction with the operational driving task, then theresearch method should be one which allows data collection in the operational drivingenvironment. An isolated lab test of a car phone, which is to be used while driving,would not provide the desired data. An isolated lab test of a pre-trip planning tool,which would be used on the roadside, or in the home or office prior to driving, wouldprovide useful data.

Along with selection of research methods, research measures need to be considered. Itis noteworthy that in the research reported in tables 15 through 17 there is a lack of anyconsistent pattern in the selection of dependent measures. Across all three tables, thesame dependent measures are mentioned, however.

An examination of these measures appears the discussion section that follows and in asubsequent report.[18]

Page 76: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies.

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Allen, Stein,Rosenthal,Ziedman,Torres, andHalati(1991)[65]

part tasksimulator(slides ofroadscene &navdisplay onCRT)

% diverting at eachexit

display type -ETAK-likestatic map,static map w/congestioninfo, dynamicmap w/ altroute &auditory trafficmessages,arrow-basedrouteguidance

drive simulated tripand decide when todivert

route guidance led to earliestdiversion

Antin(1987)[107]

on road Steering wheelmovements, speed,foot control use(computer recorded),lane excursions(manual), direction ofgaze from camera,time to read ETAK

memorizedroute (thecontrolcondition), vs.map vs. ETAK

drive route frommemory, using map,or using ETAK

no difference between map andETAK, eye fixation frequenciesand durations used as keymeasures

(see also Antin, Dingus, Hulse,and Wierwille, 1986), Antin,Dingus, Hulse, and Wierwille,1990, Dingus, Antin, Hulse, andWierwille, 1986)

Page 77: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Burgette(1991)[74]/Fleischman(1991)[75]-TravTekevaluation

on road trip time, route errors,times for variousinputs, etc. (varieswith experiment)

visual orauditoryformat, routetype & length,etc.

(1) field study w/rental users, (2) fieldstudy w/ local users,(3) yoked drivingstudy, (4) Orlandotest network study,(5) camera car study,(6) survey, (7)debriefing &interviews

in progress

Dingus(1988)[48]

on road Steering wheelmovements, speed,foot control use(computer recorded),lane excursions(manual), direction ofgaze from camera,time to read ETAK oruse existing control ondisplay on command

control (tone),display(speedo), orfunction (nextcross street)to select

read ETAK or useexisting control ondisplay on commandwhile driving

means and standard deviationsfor each dependent variable foreach item to use, few laneexcursions, workload measure,driver adapted in response toexternal demands

(see also Dingus, Antin, Hulse,and Wierwille, 1986), Wierwille,Antin, Dingus, and Hulse,1988), Dingus, Antin, Hulse,and Wierwille, 1989)

Dingus,Hulse,Krage,Szczublewski, and Berry(1991)[69]

Mac in lab task completion times task tocomplete

select address ordestination,determine st name,store dest., useyellow pages, setvoice options,summon emer service

task times of 40 to 240 s

Page 78: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Hulse(1988)[52]

on road Steering wheelmovements, speed,foot control use(computer recorded)Lane excursions(manual), direction ofgaze from camera,time to read ETAK,workload estimates

attentionaldemand(anticipated,unanticipated)

drive route and useETAK to get there

use of navigator was responsiveto anticipated external demands(to minimize overload),correlation of workloadestimates with fixation % onroad and on display was low,good correlation betweenobjective and subjectiveworkload estimates

(see also Wierwille, Hulse,Fischer, and Dingus, 1987,Wierwille, Hulse, Fischer, andDingus, 1988; Hulse, Dingus,Fischer and Wierwille, 1989;Wierwille, Hulse, Fischer, andDingus, 1991)

Page 79: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Labiale(1992)[108]

on road % correct recall map format(only road onitinerary andcross streets,network,maps + textdirections,maps +auditoryguidance; allcombined withlabels forsome or allroads), driverage

drive route; and recallmap 30 s after beingshown

recall was best withmap+written instructions,followed by simplified map,map+aural, map alone, agedifferences but no largeinteractions

McKnightandMcKnight(1992)[109]

laboratory-part tasksimulation

time looking atnavigation display, %of missed turns, % ofhazards missed,steering wheel andbrake position

driver age,navigationdisplay (staticarea map,strip map,strip map withposition,guidancearrows, stripmap withposition &arrows)

watch 25 minutevideotape of route,respond to hazardsby braking,accelerating, orturning, operate turnsignal to signal whenturn street is nextstreet

tone prior to turn helped, % ofmissed turns for guidancedisplay was half of others(including guidance + position),% of time spent looking forguidance was 1/3 of others, noeffect on failure to respond ofdisplay type, drivers preferredposition+guidance, guidancealone received a low rank

Page 80: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Morita andOgawa(1992)[110]

on road brake and acceleratorapplicationsassociated with use, #glances to display,general impression

visual vs.auditoryguidance

drive route usingguidance system

fewer fixations when auditoryguidance added, foot controldata not obtained satisfactorily-not usable, timing of messageshad big affect on systemusability

Pauzie andMarin-Lamellet(1989)[111]

on road eye fixations,navigation errors

intersectiontype, papervs. map

drive route asdirected by map orarrow display

screen watched more than rearview mirror, nav displayrequired more time for olderdrivers (gives mean times andfrequencies for mirrors,landmarks, road, etc.)

Popp andFarber(1991)[112]

Mercedesdrivingsimulator

frequency andduration of glances todisplays, std deviationof lane position, speedvariance, mental load(heart rate)

location ofdisplay-clustervs. centerconsole,amount ofmap detail

drive two routes peripheral display requiredmore glances and they werelonger, lane variance wasgreater with central display, butno differences in speed, heartrate for 2 locations differed

Rothery,Thompson,and vonBuseck(1968)[113]

in lab RT symbol vs.text

show road scene,operate turn signal ifcalled for by message(keep left, exit left,etc.)

symbols better for exiting, textbetter for lane positioning

Rothery,Thompson,and vonBuseck(1968)[113]

in lab RT upper vs.lower case,addition ofarrows (Keepright, etc.)

move turn signal leverin correct direction

no significant differences.

Page 81: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

Rothery,Thompson,and vonBuseck(1968)[113]

on road response time,preference ratings

micromap vs.text

approach traffic circle,view display(controlling duration),then drive route

words took less time and werepreferred

Staal(1987)[114]

field response to surveyquestions

- survey completedafter returning rentalcar with nav system

most thought ETAK was easierto use than a map, easy tolearn, and would be useful inmajor cities

Streeter,Vitello, andWonsiewicz(1985)[115]

on road navigation errors customizedroute maps,voiceguidance orboth

drive route usingnavigation aid

drivers who listened todirections drove fewer miles,took less time, and made 70%fewer errors than map users,performance with both wasbetween map alone and voicealone

Verwey, andJanssen(1988)[116]

on road violations of trafficlaws, driving time, # ofnav errors, mentalload (SWAT)

map vs.arrows vs.auditory, routecomplexity

drive three routesusing a navigationsystem

no differences in traffic ruleviolations, as traffic becameheavier the advantage ofelectronic systems increased,for complex routes driving timewas 30% less with electronicsystems, they made most errorswith maps and fewest withauditory guidance, subjects feltmore load with maps

Voss andHaller(1982)[117]

on road rating of workload,route decision errors,glance frequenciesand durations

graphicshown

driver route followinginstructions on visualdisplay (ALI system)

1-2 glances of 0.8 - 0.9 srequired to read display, notoverload

Page 82: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

Table 17. Interface evaluation studies (continued).

Experiment Domain Dependent variable Independentvariable

Task Results/comments

West,Kemp, andHack(1989)[118]

on road % correct of symbolsrecognized, % ofretrieval/entry taskscompleted, %completing test drives,% of drivers reportingdifficulty (commandlate, symbol unclear,etc.)

manual vs.verbalinstructions,day vs. night,professionalvs. domesticdrivers

instructed in use ofAutoguide, say whatsymbols meant, enterand retrievedestinations, drivefour routes

found Autoguide helpful, safe,and easy to use, just as usableat night, no problems withentering destinations as gridcoordinates, problems withinstruction timing could lead tohazards

Page 83: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific
Page 84: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

73

WHICH MEASURES HAVE BEEN USED TO ASSESSDRIVER INFORMATION SYSTEMS?

A summary of the measures that have been considered in the research examined forthis report follows. Measures can be divided into two types: measures of input, or whatthe driver does to the car, and measures of output, or what the car does as a result ofthe driver (performance).

Input control measures are relatively easy to obtain. They are summarized in table 18.Input control measures are divided into three categories, primary (related to real-timecontrol of the car), secondary (not related to real-time control), and overall measures ofdriver input. Primary measures include a variety of parameters related to the steeringwheel and accelerator use, including both means and standard deviations of positions,and statistical measures of movement. Secondary tasks include using a car phone oradjusting the radio, as well as task added to assess spare information handlingcapacity. Most would agree that the safety of the driving task should not becompromised by the addition of the secondary task. With this logic, the outputmeasures of the combined tasks would be merely looking at the driving performance,the same measures as for the primary task alone. Of the measures listed, consideringdriver vision as an input measure is not an ideal fit to this scheme. The rationale usedwas that measures of vision are indicators of input to the driver (as opposed to thevehicle) and hence is more appropriate to consider as an input rather than an output.

Table 18. Input measures of driving behavior and performance.

Category Subcategory MeasurePrimary task control input-lateral number of steering wheel movements per

unit time, number of steering wheelreversals, Steering wheel Reversal Rate(SRR), Steering wheel Action Rate (SAR),mean steering wheel angular change,variance of steering wheel angular changes

control input-longitudinal

mean throttle position, throttle variance,number of brake applications, mean brakepressure/brake application force, brakingpressure variance, number of clutchdepressions

Secondary task In-vehicle system use response time, error rate or percentagedetection performance response time (to brake lights), error rates

Overall driver vision frequency and duration of glances to road,mirrors, in-vehicle display

If the IVHS technology is used in conjunction with the driving task, input measures forthe driving task itself must not be neglected. Table 18 should be helpful in ensuring thatthe various input measured used by researchers to date have been considered. Again,these are distilled from an examination of all the studies listed in tables 15 through 17,and are felt to represent the major categories of input that should be considered tomeasure effects of advanced IVHS technologies on driver performance.

Page 85: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

74

Measures of control performance (output) that have been examined are summarized intable 19. For the primary driving task they include absolute measures of lane positionand yaw, their variance, and the first and s derivatives of yaw angle (rate andacceleration). Generally, lateral rate and lateral acceleration are not examined, thoughlateral acceleration (g) is believed to be an important determinant of driver comfort inlane change and turning maneuvers. For secondary tasks such as those considered inIVHS technologies, output measures are the consequences of slow responses ormissed warning signals.

Table 19. Output measures of driving behavior and performance.

Category Subcategory MeasurePrimary task vehicle response

(output)-lateralmean and mean absolute lateral deviation(path error), number of lane exceedencesand percent of time outside of lane, lateraldeviation variance (lane variance), lateralacceleration, yaw angle, yaw rate, yawacceleration

vehicle response(output) - longitudinal

mean speed, speed variance, meanacceleration/deceleration, number ofdecelerations exceeding a specified g level,mean headway, headway variance (rangerate)

Secondary task In-vehicle system use navigation errors, ratings of ease of usedetection performance number of pedestrians or lead vehicles

struck, etc.Overall crashes accidents, near misses

quality of driving QOD, unsafe driver actions, TTC, TLCworkload TLX, SWATtravel operations trip time, distance traveled, average speed,

number of stopsphysiological heart rate variability, GSR

Page 86: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

75

WHICH MEASURES SHOULD BE CONSIDERED FOR FUTUREASSESSMENT PROTOCOLS?

SELECTION CRITERIA

To select measures for assessment, a set of criteria is needed to guide the selectionprocess. There is a considerable parallel between the selection of the test protocol(discussed in the followon report) and the selection of particular dependent measuresdescribed here.[18] The criteria for the selection of dependent measures are as follows:

1. Indicativeness—The dependent measure reflects the underlying notion or hypothesiswhich the study is to address. It is important to clearly state the research question priorto selection of measures. As an example, safety and ease of use can be two quitedifferent research parameters, and may have different measures.

2. Sensitivity to design differences— This is an extension of indicativeness. Changes inthe product or service that have real impact should be measureable. This is necessaryfor engineering analyses.

3. Risk to drivers and experimenters—Driving can be dangerous. Measures that addunnecessary risk to the driving task should be avoided. Also needing consideration areminimizing pain and even embarrassment to subjects.

4. Ease of Measurement—In deciding when to collect data, the cost of the collectioneffort must be weighed against the benefits of the data. Easy to measure impliesminimal equipment and minimal software.

5. Analyzable—Some measures are either difficult to reduce because of the physicalformat of the data collected or the need for special statistical tools.

6. Repeatable—Replicability is a cornerstone of scientific methods and in establishingtruth. For measures to be repeatable, it is important to know which factors affectrepeatability and to be able to control those factors. For example, the radius ofcurvature affects the difficulty of driving a road and the associated workload. Hence,comparable (or preferrably identical) roads should be driven for comparisons of in-vehicle displays.

7. Acceptance by the scientific and engineering community—The results of research areto be applied by both designers and researchers. If the likely users of the experimentalresults do not accept the measurement protocol, they are unlikely to be convinced bythe results. Part of this involves understanding of the measurement itself. In the humanfactors domain, this has been a problem with the application of spatial frequency-basedmeasures of vision to the assessment of image quality. Users often do not understandthose measures.

8. Fits into an available experimental context—Driving measures tend to be contextspecific. For example, the measurements collected in a survey are usually quitedifferent from those collected while driving a vehicle on a test track. Hence, if for somereason a test track protocol has been selected, consideration of survey-related

Page 87: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

76

measures in most cases should be dropped. In other cases, the data collectioncapability may not exist. For example, on-the-road measurement of driver eye fixationscan be very informative, but collection of eye fixations requires a vehicle outfitted withvery special recording equipment, equipment that is not widely available.

The selection of measures, both of driver input and output (performance), should bedone with the above considerations. The choice of measures will to a great extentdetermine the worth of the research. As stated, to examine the indicativeness of ameasure, a clear description of the purpose of the measurement is required. As notedin the preface to this report, some of the goals of IVHS are to improve traffic operations,reduce accidents, and reduce air pollution from vehicles. The contract for this projectrefers to (1) quantifying the influence of safety, (2) quantifying the effectiveness ofinformation transfer, and (3) assessing driver comfort, convenience, and confidence.Hence, the qualities of interest, both to this project and to IVHS in general are:

• Safety—Reducing crashes.

• Operational—Being more efficient, saving time and energy, providing increasedcapacity and increased functionality.

• Enhancing the experience of driving—Making driving more enjoyable, even fun.This is the personal aspect of driving.

It is in the context of these various sets of goals and the selection criteria givenpreviously that the measures of interest will be considered.

Even considering the above in the design of the study, it should be understood thatselecting good measures of driving performance is not simplistic. It is clear that there isa need for multiple criteria, as suggested by the DRIVE Task Force and others.However, as noted by Robertson and Southall (1992), identifying exact levels of thosecriteria that represent safe (and acceptable) driving is premature given the current stateof knowledge.[24] Therefore, interpreting the meaning of a change in driver performanceis somewhat arbitrary. Short of creating a danger to other drivers on the road,secondary tasks which require some amount of driver attention are commonlyacceptable. The amount of attention that should be reserved for safe driving is left up tothe driver's personal judgment. With this qualification in mind, research can be doneexamining the relative impact of secondary tasks, but conclusions on the significance ofthese results will be subject to nonscientific interpretation. Looking to existing researchfor insights on the complicated problem of interpreting measures of performance,Wierwille has shown that people adapt to task demands. Those demands can eitheroverload drivers' overall capacity, or overload particular channels. The literaturesuggests the most common overload is of the visual channel.[67]

Having discussed matters that should be considered when selecting researchmeasures, the following section considers the various categories of output measuresand input measures likely to be used in IVHS research.

Page 88: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

77

OUTPUT MEASURES - LATERAL CONTROL

In spite of these limitations, the current state of knowledge provides considerable insightinto the selection of measures. One approach is to first consider measures that havethe most direct impact on consequences (the output measures of table 19). Clearly,primary task output measures are indicative of safety. If a vehicle is wandering in thelane, crashes are more likely. Variations in lateral position will also have an operationalpenalty by disrupting traffic flow, and make driving more difficult, making the drivingexperience less pleasant.

Of the measurements in this category, lane exceedences would seem to be an obviouschoice, since they represent collision opportunities. As Zwahlen has shown, giving thedriver an attention-demanding task can cause the driver to deviate from straight ahead.When that reaches extremes, drivers actually wander outside of their lanes (andpotentially collide with another vehicle or roadside object). However, lane exceedencesoccur infrequently, so that measure tends to be insensitive to differences in theattentional demands of various in-vehicle displays. As Wierwille has shown, laneexceedences are not well correlated with other driving performance measures.

As suggested by MacAdam (1992), the standard deviation of lateral position is a moresensitive measure than mean deviation from the center.[83] Noy (1989, 1990) alsofound that the standard deviation was affected by task difficulty.[62,63] In brief, whendrivers pay less attention to the control/steering task (due to fatigue or the attentionaldemands of in-vehicle displays), they make fewer path corrections, but the correctionsthey make are larger. This behavior is most directly reflected in the increase in lanevariance. To a lesser extent, this is also reflected in an increase in mean yaw angle.

Hence, measurements in this category are indicative of safety and operationalproblems, and at least at a surface level, this notion is accepted within the scientific andengineering communities. Exactly how these measures reflect the personal experienceof driving is unknown though clearly less lateral control is not desired. It certainly willmake drivers uncomfortable. The sensitivity of measures based on vehicle lateralposition to design differences varies widely with the particular measurement in thiscategory, with further research needed to determine the exact relationship. In itself,collecting these measurements poses no risk to drivers and experimenters. Howeversince the basic research to determine the association may require exploration of riskysituations, basic research may pose some risk to drivers.

As indicated previously, there are significant technological hurdles to be overcome inthe measurement of vehicle lateral position. Few researchers have collected measuresof this type, and data on repeatability are limited. Further, collection of this class ofmeasures typically requires an instrumented vehicle with a lane tracker[119] At this timethere are probably fewer than 10 vehicles in the world with that capability. In most lanetrackers, a video image is scanned for lane markers, and after geometrictransformations, the lateral position is determined. Only a few lane trackers candetermine yaw angle.

If only lane exceedences are desired, they can be obtained by periodically looking outthe window and manually recording position, or by post-test review of a forward scene

Page 89: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

78

videotape (or from a camera attached to the side of the car and aimed downward). In adriving simulator, lateral position (as well as yaw angle) is one of the results of vehicledynamics calculations and can readily be saved to a file.

OUTPUT MEASURES - SPEED CONTROL

If a vehicle is driving at a variable speed or too fast, crashes are more likely. While highspeeds are associated with greater throughput and a more pleasant driving experience,variability in speed reduces road throughput, and makes driving more difficult,diminishing the experience of driving. Thus, measures of speed control are indicative ofsafety, operational, and personal aspects of driving, though in a complex manner.

Both mean speed and speed variance may be affected by the use of in-vehicle displaysand have been shown to be affected by external demands, though additional researchto address the sensitivity of speed control measures is desired.[67] In general, whenpeople are given in-vehicle tasks with heavy attention demands, they tend to slow downto provide themselves with a greater safety margin. This is sometimes an unconsciousbehavior. Also, because they attend to speed less, their speed may be more variable,even likely to increase because mean speed and speed variance tend to be correlated.Obviously, in braking situations, rates and accelerations could be affected by taskdemands associated with in-vehicle displays; however, such measures concerntransient events, which, again, are more difficult to assess.

A consequence of choosing a particular speed while driving is the headway between thesubject's vehicle and a lead vehicle. Headway and headway variance are linked to thefrequency of rear-end collisions, and are therefore worth considering.

Measurements of speed control do not usually pose any special risk to drivers orexperimenters. In contemporary vehicles recording speed is quite easy with the speedsignal being an output of the electronic engine controller. Some filtering of the signalmay be required before it can be processed by a computer. The measurement ofacceleration requires somewhat more complex and expensive sensors, but the effort isonly somewhat greater than that for speed measures. Headway measurement iscomplex, requiring either custom-made radar-, laser-, or sonar-based sensors.Headway measurement is particularly difficult on curves. In the future, when vehiclesare outfitted with intelligent cruise control systems or collision avoidance systems,recording headway measures may be as straightforward as current methods forrecording speed. All of these measures of speed control are analyzable and accepted;though due to current sensor limitations, there are limits to the repeatability of headwaymeasurements. These limitations are not present for laboratory or simulatorexperiments.

OUTPUT MEASURES - SECONDARY TASKS

Secondary tasks include real in-vehicle tasks that may add to the driver's workload andartificial tasks used to assess the processing capacity remaining. Real tasks includedialing phones and using navigation systems, with their respective performancemeasures being the number of calls successfully completed and navigation errors.

Page 90: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

79

These measures are directly indicative of operational performance (in this case, theeffectiveness of information transfer).

Measures of detection performance, intended to assess spare capacity, have not beenused very often. In brief, the concept is that driving involves not only maintaining apath, but searching for objects of concern, and are indicative of safety margins. Thisincludes pedestrians that might dart into the path of a moving car, responding to thebrake lamps of a lead vehicle, looking for vehicles to cross one's path at intersections,etc.[64] Tasks can also be somewhat artificial such as pressing buttons when instrumentpanel gauges go out of tolerance, response time to single letters, two-choice responsetime with letters, two-digit addition, dot counting, etc.[83,84,86] There does not seem tobe a simple pattern to explain the results. Sometimes the secondary task is affected bythe presence of an in-vehicle task and sometimes it is not. The clearest perspectivecomes from the work of Noy, which emphasizes the importance of within-modalityinterference.

There is no standard method for collecting or analyzing secondary task data. Eachresearcher chooses a method compatible with the equipment and resources available tothem.

These measures should be sensitive to in-vehicle attentional demands because manyof the situations can be precursors to accidents, however their sensitivity to designvariations has not been established. The weaknesses of these measures is that theyare discrete. To assess an in-vehicle display, the timing of the event relative to in-vehicle system use (and the extent to which the event unfolds over time) is important.Further, while such events are relatively easy to schedule in a driving simulator, many ofthem (e.g., pedestrians crossing the vehicle's path) are difficult to safely execute on theroad, posing a risk to the driver, experimenter, and other road users. With somecreativity (such as using foam core outlines of pedestrians), risk can reduced, but thedevelopment of test facilities using such approaches may be costly. Also, becausethese events are unique, once they have occurred, their surprise value is gone and theirrepeated presentation diminishes their utility. Repeatability within individuals istherefore difficult to assess. The initial outcome, however, can be telling in terms of thesafety of a system.

The extent to which secondary task measures relate to drivers' comfort, convenience,and confidence in information systems is unknown. This topic has not been examinedin the literature.

OUTPUT MEASURES - OVERALL

In the past, unsafe designs have been identified by counting how often those designswere associated with accidents. However, many IVHS technologies have yet to beimplemented. Their association with crashes has yet to be established. Even after theyare implemented, most crash data bases do not provide a means for identifying if anIVHS device was present or being used prior to a crash, so establishing a connectionbetween devices and crashes will be difficult. The extent to which crash data reflectoperational benefits (e.g., ease of information transfer) or the quality of the experienceof driving are unknown.

Page 91: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

80

Looking at the selection of crash measures as a source of output information is anotherpossibility for researchers. Traditionally, such information has been contained in databases assembled by the Federal and State governments. Each data base has its ownstructure, though fairly routine statistical methods are used to identify relationshipsbetween variables. There are minimal drawbacks with analysis or acceptance of theresults. For IVHS applications, these analyses are not yet possible as the presence ofIVHS devices is not coded, so there is no data to analyze.

In searching for surrogate measures, subjective assessment of the quality of drivingcould be indicative of when crashes might occur. Subjective quality of driving has notbeen used to examine operational or individual performance on the road, and itssensitivity to interface design differences has not been examined. Quality of driving isnot a prime candidate for assessing ease of operation, driver comfort with in-vehiclesystems, or related matters. As another potential measure, it is commonly believed thatdriving experts, such as driver trainers, can identify dangerous acts that drivers perform.Those acts can be precursors to accidents. Quimby's (1988) work suggests that thecorrelation is not very good; nonetheless, common belief in the linkage persists.[46] Theweakness of this method is the reliance on trained observers and the difficulty ofcalibrating those observers to achieve repeatable results. Quantification of unsafedriving behaviors and their validation using simulation is needed. Thus, while theequipment needs for quality of driving assessments are minimal, there are manyunresolved questions about the data obtained from such evaluations.

Direct subjective assessment of driving workload is also a possibility (e.g., TLX, SWAT).SWAT and TLX were described in the initial section of this report in conjunction with thediscussion of Zaidel's 1991 report. Work by Wierwille and others suggests thatworkload ratings can be indicative of primary task (safety-related) demands, but itremains unclear what should be emphasized—average workload or peak workload.Workload ratings are sensitive to operational differences of in-vehicle devices but not assensitive as direct measures of driving performance. It is not known if they reflectdifferences related to the experience of driving. The workload literature is voluminous,clearly establishing that workload measures are analyzable, repeatable, and wellaccepted. Workload assessments can be conducted in a wide variety of contexts.

Summary measures of driving show promise of being useful for practical assessment ofthe safety of in-vehicle systems. Researchers at TNO have expressed interest in bothTLC and TTC.[40,41] (For a description of TLC and TTC, see earlier sections of thisreport describing Godthelp's research.) The driver's goal on a moment-to-moment basisis to minimize the opportunity for collisions; hence, TTC should be a measure of howsafely one is driving. The difficulty with TTC is that computing it requires a humananalysis of each video frame, computation of the trajectories of everything in the scene,and then predictions about potential conflicts with each object. These calculationsrequire such a considerable effort that few studies have examined these measures.The development of equipment to compute TTC on an ongoing basis should be apriority item.

TLC is somewhat easier to determine, in that it requires information only on a vehicle'slateral position, yaw angle and rate, and forward velocity and acceleration, as well as

Page 92: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

81

data on road curvature. This information can be obtained from a lane tracker and fromvehicle speed and acceleration sensors. In some cases it may be possible to obtain allthe needed data from an advanced video lane tracker. This computational capability isnot often available, which is one reason why TLC is often not used. Development ofhardware to determine TLC automatically is appropriate for technical development.TTC and TLC show considerable promise, but matters pertaining to analyzability,repeatability, and data collection hardware need to be addressed.

It seems likely that these summary measures could reflect ease of operation. If thedriver is distracted by the in-vehicle system, lane position will be more variable, resultingin decreases in TTC and TLC. Driver comfort with the system should also decrease.However, there is no data to address the operational and personal connections withthese summary measures.

Travel operations measures (trip times, number of turns, etc.) are accepted measures ofthe operational performance of an interface. In a secondary way, they are connectedto safety in that greater exposure to the road (more time on the road, more turns)provides more opportunity for crashes. Travel operations measures were widely usedin the TravTek project. They are straightforward to collect and analyze. Distance datamay come from manual reading of odometers or counting of wheel pulses from a speedsensor in a instrumented vehicle. In a simulator these data are directly available fromthe vehicle dynamics calculations. Data on turns may be manually recorded in real timeor post processed in a manual review of videotapes of test sessions.

Potential physiological indicators reported in driving studies include heart rate, thevariance of heart rate (arrhythmia), respiration rate, and galvanic skin response (GSR).Heart rate is generally not sensitive to measures of attentional demand, but ratevariability may be).[86,80] Physiological measures tend to be more common in studiesconducted in Japan, than in studies conducted in the United States and Europe. Ingeneral, physiological measures are most sensitive to the experience of driving and lesssensitive to operational differences and safety. These measures require considerableexperience to collect. Special instrumentation is also required to amplify and filtersignals. There is some debate as to how best to analyze this type of data.

INPUT MEASURES - LATERAL CONTROL

Driving task execution measures concern the actions the driver carries out to sense andmaneuver the car on a moment-to-moment basis. Measures of interest include meanand variance of steering wheel angle, steering wheel reversals, and the spectrum ofsteering wheel input. Of the steering wheel measures, the number of reversals overtime and the spectrum of input appear to be the most sensitive to changes in drivingbehavior. Spectral qualities of steering wheel position are more difficult to analyze thanother measures of steering behavior. Of these measures, steering wheel action ratesseem to be most indicative of task loading, though further research on the topic oflateral control is desired. These measures should be sensitive to the operationaldemands of in-vehicle devices since time spent operating the device is not spentsteering. They should also be indicative of safety since not attending to the primarytask of steering may lead to an accident. In fact, it could be that steering input is a

Page 93: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

82

better measure of safety than various measures of lateral position because the inertia ofthe vehicle "filters out" some of the input differences.

Recording of steering wheel position is usually accomplished using a stringpotentiometer connected to a computer. In future drive by wire vehicles, the steeringwheel angle may be directly accessible from a steering motor controller.

INPUT MEASURES - SPEED CONTROL

Speed control measures of interest include mean throttle (accelerator) position, throttlevariance, the number of brake pedal actuations, and the number of throttle actuations.Just as with lateral control measures, throttle position measurements may be betterindicators of driver performance that the vehicle output measures (speed, lane position)because the output is not smoothed by the vehicle dynamics. Throttle opening, ameasure directly related to throttle position, can be obtained from the electronic enginecontroller and recorded by a computer.

It is suspected that speed control measures may be indicative of both the safety andoperational performance of in-vehicle systems, though the strength of thoserelationships is unknown. The connection of measures of speed control with theexperience of driving is also unknown.

INPUT MEASURES - SECONDARY TASKS

Response times and response errors are the most commonly used measures ofsecondary task performance, with the measure depending upon the task selected. Thedata collection protocol is task specific. The reservations expressed concerning outputmeasures of secondary tasks also hold for input measures as well, since thereservations are related more to the task than the measure. Most of these drawbacksare not present in fairly simple secondary tasks, such as pressing buttons on thesteering wheel when lights mounted on the hood of a test vehicle are detected. It isuncertain, however, how strong the connection is between the secondary taskdependent measure (light detection time, percentage of lights not detected) and safety-related variables such as crashes. The connection with operational and personalcharacteristics is even more remote, and acceptance of them by the engineeringcommunity is less than for other measures.

In contrast to abstract tasks, measurement of performance in the completion of real in-vehicle tasks (such as the time to dial a phone) is well accepted as a measure of theoperational performance of the device used. Such measures are indicative of designdifferences.[11] Those measures should be related to the enjoyment of using a phone orother in-vehicle device. For warning systems (such as IVSAWS), performancemeasures such as detection time and errors are viewed as operational measures ofsuch systems. Task performance measures tend to be easy to collect and analyze, andare repeatable.

As with many of these measures, the extent to which secondary task input measuresreflect the experience of driving is unknown. There is no reason to expect a directlinkage.

Page 94: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

83

INPUT MEASURES - DRIVER VISION

Eye fixation data can be extremely informative, but are very difficult to collect andanalyze.[120] They can be indicative of safety, operational, and personal aspects ofinterfaces, and are sensitive to design differences. Repeatability within individuals hasnot been given much attention. Eye fixation data are widely accepted by scientists andengineers. Clearly, the likelihood of an accident increases with the number and lengthof eye fixations away from the road scene. The literature suggests that when presentedwith in-vehicle visual demands, the first task to drop out is mirror scanning.

Eye fixations can either be collected by aiming a camera at a driver and recordingwhere he or she looks or by using special recording devices. While the direct recordingmethod seems straightforward, the frame-by-frame reduction of the data can take 30 to40 times the recording time, a costly process. Analysis beyond fixation durations andfrequencies (to examine patterns) is very time consuming.

Systems that automatically record fixation coordinates cost from $25,000 up to$100,000, and are beyond what most research organizations can afford. Further, manysystems restrict the field of view, making driving more difficult. Nevertheless, visualscanning behavior can be an important index of potential safety problems. Where eyefixations can be economically recorded, they should be.

Some sense of the attentional demands of driving can be obtained indirectly usingeither helmets or goggles that temporarily block the view of the driver (e.g., Senders,Kristofferson, Levison, Dietrich, and Ward, 1967a,b).[36,37] To date, this approach hasbeen used primarily to determine the demands of the primary task, not the loading of in-vehicle tasks. One potential manipulation would be to reduce input and make theprimary task so difficult that use of an added in-vehicle display would sharply degradethe primary task. This degradation is likely to occur since the primary source ofoverload is visual, as mentioned earlier. Use of such a method for routine assessmentof in-vehicle systems seems excessively complex, though it can be useful for theoreticalanalyses. There are also significant risks to the driver.

DATA ANALYSIS

Most of these measures described in this section are collected in real time byinstrumented vehicles or simulators, with sampling typically occurring several times pers. To reduce the data, the data are first filtered to identify and correct faulty data. Thisprocess involves examining histograms of data to identify outlier and short-termmeasures of variability. This is often done manually for each test session for eachdriver subject. Faulty data can occur as the result of electrical malfunctions,environmental interference with the lane tracker, typing errors in identifying file names,and for a variety of other reasons. Because the environment is less harsh, there aregenerally fewer problems with simulator data than data collected on the road.Anomalies may require the manual review of session videotapes.

The next step involves computing summary statistics for each driver by task and roadsegment. Typically this is done using computer software for one driver session at a

Page 95: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

84

time as a further check of the data. The results of those analyses are then entered intostandard statistics packages for computation of ANOVA and regression statistics, aswell as correlation statistics.

SUMMARY ON MEASURES

It should be apparent that there is no single best measure or limited set ofmeasurements that are appropriate for assessing the safety, operational, and personalaspects of driver information systems. Considerations pertaining to the selection ofmeasures was provided. For many of the input and output measurements discussed inthis section, data on repeatability is lacking. Several of them have significantinstrumentation requirements; others have significant data analysis requirements.Given these limitations, the next section provides general recommendations as to whichmeasures to consider collecting in studies of driver interfaces. Readers interested infurther discussion of these measures should see reference 18.

Page 96: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

85

CONCLUSIONS

Again, the selection of measures by the researcher should reflect the use of theequipment or system being examined. In the foregoing discussion, the primaryemphasis has been on safety. Ease of use is also important, but the measures ofusability tend to be very system-specific. In the case of route guidance systems, themeasurements of interest are the time to learn the system, the number of wrong turnsmade per trip, and the time to reach a destination. For vehicle monitoring and IVSAWSsystems, the appropriate dependent measures are the time to read a display or hear amessage, and the probability that a correct response ensues. For car phones,candidate measures include the time to dial a phone number and the frequency oferrors.

Thus, this report suggests that the standard deviation of lane position, mean speed, andspeed variance are likely to reflect safety, and, to some extent, ease-of-use problemswith in-vehicle displays. Speed-related measures are easy to collect. Lane position,especially on the road, is more difficult. However, because maintenance of speed andlane position are protected tasks, they are unlikely to be perturbed when the risk to thedriver is moderate.

More sensitive to attentional demands are eye fixation data. Ordinarily, as demandsincrease, the fraction of time spent looking at mirrors decreases. Displays that aredifficult (and potentially less safe) to use have longer fixation times and require moreglances. The drawback of eye fixation data is the considerable difficulty in collectingand analyzing it. The standard deviation of lane position, speed and speed variance,and eye fixation distributions are the preferred measures for the assessment of in-vehicle displays.

Direct performance measures, such as response times and error rates, certainly reflectthe ease of use of in-vehicle systems. However, data linking specific response timesand error rates to specific numbers of accidents do not exist. Response time and errordata are most useful for comparing alternative interface design and, using simpleexperiments, deciding which design is best. Hence, they may be difficult to relate tosafety. Nonetheless, to assess operational performance, it is essential that time anderror measures be collected.

Also of interest are TTC and TLC, measures suspected to be tightly linked withaccidents. While estimates for them can be readily obtained in simulators, obtainingthese measures in test vehicles is problematic. The development of equipment tomeasure TTC and TLC is needed.

Less useful are measures of secondary task performance. While they can be indicativeof specific overloads (especially visual), task performance is difficult to relate to levels ofdriving safety (as measured by the number of crashes).

Finally, some researchers favor the use of physiological measures as indicators ofdriving workload. While the connection of some with driving pleasure is clear, the linkwith performance is not. However, given this interest in exploring driver comfort,convenience, and comfort, these measures need further attention.

Page 97: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

86

Overall, direct measures of driving performance (standard deviation of lane position,speed and speed variance) are preferred as indicators of the safety and ease of use ofdriver information systems. For visually based systems, eye fixations should also beexamined. If ease-of-use requirements are to be taken seriously, then system-specificperformance measures (e.g., number of wrong turns for a navigation system) shouldalso be collected. Physiological measures also need further attention, but at the level ofbasic research rather than product evaluation. Again, for a further discussion of thesemeasures, see the followon report.[18]

For many aspects of automotive engineering—development, design, andproduction—there are tradeoffs. That is true in safety engineering evaluations as well.Objectives vary, as do the funds, equipment, and schedule to achieve them. While theselection of measures of effectiveness in the foregoing discussion considers what isscientifically reasonable to do, not everyone has the resources necessary and it maynot be practical to collect these measures. In evaluating a test protocol, this must bekept in mind.

Hopefully, this report has provided a summary of research methods and measuresemployed in key studies pertaining to driver interface evaluation; provided insight intothe selection process; and offered useful suggestions for the selection of measures.This information has been provided with the intent of guiding assessment protocols tofacilitate the evaluation of IVHS technologies.

Page 98: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

87

REFERENCES

1 Green, P. (1993). Human Factors of In-Vehicle Driver Information Systems: AnExecutive Summary (Technical Report UMTRI-93-18), Ann Arbor, MI: TheUniversity of Michigan Transportation Research Institute.

2 Green, P., Williams, M., Serafin, C., and Paelke, G. (1991). Human FactorsResearch on Future Automotive Instrumentation: A Progress Report,Proceedings of the 35th Annual Meeting of the Human Factors Society, pp.1120-1124.

3 Green, P. (1992). American Human Factors Research on In-Vehicle NavigationSystems (Technical Report UMTRI-92-47), Ann Arbor, MI: The University ofMichigan Transportation Research Institute.

4 Brand, J.E. (1990). Attitudes Toward Advanced Automotive Display Systems:Feedback from Driver Focus Group Discussions (Technical Report UMTRI-90-22), Ann Arbor, MI: The University of Michigan Transportation ResearchInstitute.

5 Green, P., and Brand, J. (1992). Future In-Car Information Systems: Input fromFocus Groups (SAE paper 920614), Warrendale, PA: Society of AutomotiveEngineers.

6 Green, P., Serafin, C., Williams, M., and Paelke, G. (1991). "What Functionsand Features Should Be in Driver Information Systems of the Year 2000?" (SAEpaper 912792), Vehicle Navigation and Information Systems Conference(VNIS'91), Warrendale, PA: Society of Automotive Engineers, pp. 483-498.

7 Serafin, C., Williams, M., Paelke, G., and Green, P. (1991). Functions andFeatures of Future Driver Information Systems (Technical Report UMTRI-91-16),Ann Arbor, MI: The University of Michigan Transportation Research Institute.

8 Green, P., and Williams, M. (1992). "Perspective in Orientation/NavigationDisplays: A Human Factors Test," Conference Record of Papers, the ThirdInternational Conference on Vehicle Navigation and Information Systems(VNIS’92), (IEEE Catalog # 92CH3198-9), Piscataway, NJ: Institute of Electricaland Electronics Engineers, pp. 221-226.

9 Williams, M. and Green, P. (1992). Development and Testing of Driver Interfacesfor Navigation Displays (Technical Report UMTRI 92-21), Ann Arbor, MI: TheUniversity of Michigan Transportation Research Institute.

10 Paelke, G., Green, P., and Wen (1993). Development and Testing of a TrafficInformation System Driver Interface Technical Report UMTRI 93-20), Ann Arbor,MI: The University of Michigan Transportation Research Institute.

Page 99: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

88

11 Serafin, C., Wen, C., Paelke, G. and Green, P. (1993). Development and HumanFactors Tests of Car Telephones. (Technical Report 93-17), Ann Arbor, MI: TheUniversity of Michigan Transportation Research Institute.

12 Hoekstra, E., Williams, M., and Green, P. (1993). Development and DriverUnderstanding of Hazard Warning and Location Symbols for IVSAWS (TechnicalReport UMTRI-93-16), Ann Arbor, MI: The University of Michigan TransportationResearch Institute.

13 Williams, M., Hoekstra, E., and Green. P. (1993). Development and Evaluation ofa Vehicle Monitor Driver Interface (Technical Report UMTRI 93-22), Ann Arbor,MI: The University of Michigan Transportation Research Institute.

14 Green, P., Hoekstra, E., Williams, M., Wen, C., and George, K. (1993).Examination of a Videotape-Based Method to Evaluate the Usability of RouteGuidance and Traffic Information Systems (Technical Report UMTRI-93-31), AnnArbor, MI: The University of Michigan Transportation Research Institute.

15 Green, P., Hoekstra, E., Williams, M., George, K. and Wen, C. (1993). Initial On-the-Road Tests of Driver Information System Interfaces: Route Guidance, TrafficInformation, IVSAWS, and Vehicle Monitoring (Technical report UMTRI-93-32),Ann Arbor, MI: The University of Michigan Transportation Research Institute.

16 Green, P., Hoekstra, E., and Williams, M. (1993) On-the-Road Tests of DriverInterfaces: Examination of a Route-Guidance System and a Car Phone(Technical Report UMTRI-93-35), Ann Arbor, MI: The University of MichiganTransportation Research Institute.

17 Green, P. (1993). Measures and Methods Used to Assess the Safety andUsability of Driver Information Systems (Technical Report UMTRI 93-12), AnnArbor, MI: The University of Michigan Transportation Research Institute.

18 Green, P. (1993). Suggested Procedures and Acceptance Limits for Assessingthe Safety and Ease of Use of Driver Information Systems (Technical ReportUMTRI-93-13), Ann Arbor, MI: The University of Michigan TransportationResearch Institute.

19 Levison and Cramer. (1993). Description of the Integrated Driver Model (BBNReport 7840), Cambridge, MA: Bolt, Beranek and Newman.

20 Levison, W.H. (1993). A Simulation Model for the Driver's Use of In-VehicleInformation Systems (TRB Paper 930935), paper presented at the TransportationResearch Board Annual Meeting, Washington, DC., January 10-14, 1993.

21 Wickens, C.D. (1989). "Resource Management and Time-sharing," in Elkind, J.I.,Card, S.K., Hochberg, J. and Huey, B. (eds.), Human Performance Models forComputer-Aided Engineering, Washington, DC: National Academy Press.

Page 100: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

89

22 DRIVE Safety Task Force (1991). Guidelines on Traffic Safety, Man-MachineInteraction and System Safety, June, Brussels, Belgium.

23 Zaidel, D.M. (1991). Specification of a Methodology for Investigating the HumanFactors of Advanced Driver Information Systems (Technical Report TP 11199(E)), Ottawa, Canada: Road Safety and Motor Vehicle Regulation, TransportCanada.

24 Robertson, J. and Southall, D. (1992). Summary Report on the Feasibility ofDeveloping a Code of Practice for Driver Information Systems, Loughborough,Leicestershire, United Kingdom: ICE Ergonomics Ltd.

25 Labiale, G. (1984). Approche Psycho-Ergonomique des Systemes D'Informationet de Guidage Routier des Conducteurs Automobiles (Psycho-ErgonomicApproach to Driver Information and Route Guidance System) (INRETS TechnicalReport IRT, NNE 62), Institut de Recherche des Transports Centre D'Evaluationet de Recherche des Nuisances et de L'Energie, October.

26 Carsten, O. (1991). personal communication.

27 Franzen, S. (1991). personal communication.

28 Stevens, A. and Pauzie, A. (1992). "Safety Standards for Driver InformationSystems," The 3rd International Conference on Vehicle Navigation andInformation Systems (VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway, NJ:Institute of Electrical and Electronics Engineers, pp. 26-31.

29 Zaidel, D.M. (1991). Specification of a Methodology for Investigating the HumanFactors of Advanced Driver Information Systems (Technical Report TP 11199(E)), Ottawa, Canada: Road Safety and Motor Vehicle Regulation, TransportCanada.

30 Reid, G.B. and Nygren, T.E. (1988). "The Subjective Workload AssessmentTechnique: a Scaling Procedure for Measuring Mental Workload," pp. 185-218 inP.S. Hancock and N. Meshkati (eds.), Human Mental Workload, Amsterdam, TheNetherlands: Elsevier Science.

31 Lysaght, R.J., Hill, S.G., Dick, A.O., Plamondon, B.D., Linton, P.M., Wierwille,W.W., Zaklad, A.L., Bittner, A.C., and Wherry, R.J. (1989). Operator Workload:Comprehensive Review and Evaluation of Operator Workload Methodologies(Technical Report 851). Alexandria, VA: U.S. Army Research Institute.

32 Hart, S.G. and Staveland, L.E. (1988). "Development of a NASA-TLX (Task LoadIndex): Results of Empirical and Theoretical Research," in P.S. Hancock and N.Meshkati (eds.), Human Mental Workload, Amsterdam, The Netherlands: ElsevierScience.

Page 101: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

90

33 Hendy, K.C. (1990). "Measurement and Prediction of Operator InformationProcessing Load," pp. 19-25 in R.M. Heron (ed.), Proceedings of the SecondSeminar in Transportation Ergonomics (publication TP 8855), Montreal, Quebec,Canada: Transport Canada, Transportation Development Center.

34 Knowles, W.B. (1963). "Operator Loading Tasks," Human Factors, 5, 155-163.

35 Michon, J.A. (1966). "Tapping Regularity as a Measure of Perceptual MotorLoad," Ergonomics, 9, 401-412.

36 Senders, J.W., Kristofferson, A.B., Levison, W.H., Dietrich, C.W., and Ward, J.L.(1967b). The Attentional Demand of Automobile Driving (BBN Technical Report1482), Cambridge, MA: Bolt Beranek and Newman, Inc.

37 Senders, J.W., Kristofferson, A.B., Levison, W.H., Dietrich, C.W., and Ward, J.L.(1967a). "The Attentional Demand of Automobile Driving," Highway ResearchRecord # 195, Washington, DC: Highway Research Board, 15-32.

38 Senders, J.W., Kristofferson, A.B., Levison, W.H., Dietrich, C.W., and Ward, J.L.(1966). An Investigation of Automobile Driver Information Processing (BBNTechnical Report 1335), Cambridge, MA: Bolt Beranek and Newman, Inc. (NTISPB 170 879.

39 Hicks, T.G. and Wierwille, W.W. (1979). "Comparison of Five Mental WorkloadAssessment Procedures in a Moving Base Driving Simulation," Human Factors,April, 21(2), 129-144.

40 Godthelp, H., Milgram, P., and Blaauw, G.J. (1984). "The Development of aTime-Related Measure to Describe Driving Strategy," Human Factors, June,26,(3), 257-268.

41 Godthelp, H. (1988). "The Limits of Path Error-Neglecting in Straight LineDriving," Ergonomics, 31(4), 609-619.

42 Zwahlen, H.T. and Balasubramanian, K.N. (1974). "A Theoretical andExperimental Investigation of Automobile Path Deviations When Driver Steerswith No Visual Input," Transportation Research Record # 520, 25-37.

43 Zwahlen, H.T. and DeBald, D.P. (1986). "Safety Aspects of Sophisticated In-Vehicle Information Displays and Controls," Proceedings of the Human FactorsSociety-30th Annual Meeting-1986, Santa Monica, CA: The Human FactorsSociety, pp. 256-260

44 Zwahlen, H.T., Adams, C.C., Jr., and DeBald, D.P. (1988). "Safety Aspects ofCRT Touch Panel Controls in Automobiles," in Vision in Vehicles II [A.G. Gale, etal. (eds.)], The Netherlands: North Holland, pp. 335-344.

Page 102: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

91

45 Zwahlen, H.T., Adams, C.C., Jr., and Schwartz, P.J. (1988). "Safety Aspects ofCellular Telephones in Automobiles," 18th International Symposium onAutomotive Technology and Automation (paper 88058).

46 Quimby, A.R. (1988). In-Car Observation of Unsafe Driving Actions (ResearchReport ARR 153), Victoria, Australia: Australian Road Research Board, April.

47 Van Der Horst, R. and Godthelp, H. (1989). "Measuring Road User Behaviorwith an Instrumented Car and an Outside-the-Vehicle Video ObservationTechnique," Transportation Research Record # 1213, Washington, DC:Transportation Research Board, 72-81.

48 Dingus, T.A. (1988). Attentional Demand for an Automobile Moving-MapNavigation System (unpublished Ph.D. dissertation), Blacksburg, VA: VirginiaPolytechnic Institute and State University, Department of Industrial Engineeringand Operations Research.

49 Wierwille, W.W., Antin, J.F., Dingus, T.A., and Hulse, M.C. (1988). "VisualAttentional Demand of an In-Car Navigation Display," in Gale, A.G. et al. (eds.),Vision in Vehicles II, Amsterdam, The Netherlands: Elsevier Science, pp. 307-316.

50 Dingus, T.A., Antin, J.F., Hulse, M.C., and Wierwille, W.W. (1986). HumanFactors Test and Evaluation of an Automobile Moving-Map Navigation System,Part I Attentional Demand Requirements, (IEOR Department Report 86-03),Blacksburg, VA: Virginia Polytechnic Institute and State University, Departmentof Industrial Engineering and Operations Research, September.

51 Dingus, T.A., Antin, J.F., Hulse, M.C., and Wierwille, W.W. (1989). "AttentionalDemand Requirements of an Automobile Moving-Map Navigation System,"Transportation Research, 23A(4), 301-315.

52 Hulse, M.C. (1988). Effects of Variations in Driving Task Attentional Demand onIn-Car Navigation System Usage (unpublished master’s thesis), Blacksburg, VA:Virginia Polytechnic Institute and State University, March.

53 Wierwille, W.W., Hulse, M.C., Fischer, T.J., and Dingus, T.A. (1987). Effects ofVariations in Driving Task Attentional Demand on In-Car Navigation SystemUsage (IEOR Department Report 87-02, General Motors Research LaboratoriesReport CR-88/02/OS), Blacksburg, VA: Virginia Polytechnic Institute and StateUniversity, September.

54 Wierwille, W.W., Hulse, M.C., Fischer, T.J., and Dingus, T.A. (1988). StrategicUse of Visual Resources by the Driver While Navigating with an In-CarNavigation Display System, (SAE paper 885180), Warrendale, PA: Society ofAutomotive Engineers.

Page 103: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

92

55 Hulse, M.C., Dingus, T.A., Fischer, T.A., and Wierwille, W.W. (1989). "TheInfluence of Roadway Parameters on Driver Perception of Attentional Demands,"pp. 451-456 in Mital, A. (ed.), Advances in Industrial Ergonomics and Safety, 1,New York, NY: Taylor and Francis.

56 Wierwille, W.W., Hulse, M.C., Fischer, T.J., and Dingus, T.A. (1987). Effects ofVariations in Driving Task Attentional Demand on In-Car Navigation SystemUsage (IEOR Department Report 87-02, General Motors Research LaboratoriesReport CR-88/02/OS), Blacksburg, VA: Virginia Polytechnic Institute and StateUniversity, September.

57 Wierwille, W.W., Hulse, M.C., Fischer, T.J., and Dingus, T.A. (1988). StrategicUse of Visual Resources by the Driver While Navigating with an In-CarNavigation Display System, (SAE paper 885180), Warrendale, PA: Society ofAutomotive Engineers.

58 Bos, T., Green, P., and Kerst, J. (1988). How Should Instrument Panel DisplayLegibility Be Tested? (Technical Report UMTRI-88-35), Ann Arbor, MI: TheUniversity of Michigan Transportation Research Institute, (NTIS No. PB 90141375/AS).

59 Green, P., Goldstein, S., Zeltner, K., and Adams, S. (1988). Legibility of Text onInstrument Panels: A Literature Review (Technical Report UMTRI-88-34). AnnArbor, MI: The University of Michigan Transportation Research Institute. (NTISNo. PB 90 141342/AS).

60 Boreczky, J., Green, P., Bos, T., and Kerst, J. (1988). Effects of Size, Location,Contrast, Illumination, and Color on the Legibility of Numeric Speedometers(Technical Report UMTRI-88-36). Ann Arbor, MI: The University of MichiganTransportation Research Institute, (NTIS No. PB 90 150533/AS).

61 Noy, Y.I. (1987). "Theoretical Review of the Secondary Task Methodology forEvaluating Intelligent Automobile Displays," Proceedings of the Human FactorsSociety-31st Annual Meeting, Santa Monica, CA: The Human Factors Society,pp. 205-209.

62 Noy, Y.I. (1989). "Intelligent Route Guidance: Will the New Horse Be as Goodas the Old?" Vehicle Navigation and Information Systems Conference (VNIS’89),Warrendale, PA: Society of Automotive Engineers, pp. 49-55.

63 Noy, Y.I. (1990). Attention and Performance While Driving with Auxiliary In-Vehicle Displays (Transport Canada Publication TP 10727 (E)), Ottawa, Ontario,Canada: Transport Canada, Traffic Safety Standards and Research, ErgonomicsDivision.

Page 104: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

93

64 Hardee, H.L., Johnston, C.M., Kuiper, J.W., and Thomas, W.E. (1990). "Towarda Methodology for Evaluating Instrument-Panel Controls," Proceedings of theHuman Factors Society 34th Annual Meeting, Santa Monica, CA: Human FactorsSociety, pp. 613-617.

65 Allen, R.W., Stein, A.C., Rosenthal, T.J., Ziedman, D., Torres, J.F., and Halati, A.(1991). "A Human Factors Simulation Investigation of Driver Route Diversionand Alternate Route Selection Using In-Vehicle Navigation Systems," VehicleNavigation and Information Systems Conference Proceedings (VNIS’91),Warrendale, PA: Society of Automotive Engineers, pp. 9-26.

66 Allen, R.W. (1991). personal communication.

67 Verwey, W.B. (1991). Towards Guidelines for In-Car Information Management:Driver Workload in Specific Driving Situations (Technical Report IZF 1991 C-13),Soesterberg, The Netherlands: TNO Institute for Perception.

68 Levison, W. (1993). personal communication.

69 Dingus, T.A., Hulse, M.C., Krage, M.K., Szczublewski, F.E., and Berry, P. (1991)."A Usability Evaluation of Navigation and Information System “Pre-Drive”Functions," Vehicle Navigation and Information Systems Conference (VNIS’91),(SAE paper 912794), Warrendale, PA: Society of Automotive Engineers, pp. 527-536.

70 Krage, M.K. (1991). "The TravTek Driver Information System," The 3rdInternational Conference on Vehicle Navigation and Information Systems(VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway, NJ: Institute of Electricaland Electronics Engineers, pp. 739-748.

71 Rillings, J.H. and Lewis, J.W. (1991). "TravTek," The 3rd InternationalConference on Vehicle Navigation and Information Systems (VNIS'92) (IEEEcatalog # 92CH3198-9), Piscataway, NJ: Institute of Electrical and ElectronicsEngineers, pp. 729-737.

72 Green, P., Boreczky, J., and Kim, Seung-Yun (Sylvia). (1990). "Applications ofRapid Prototyping to Control and Display Design." (SAE paper #900470, SpecialPublication SP-809), Warrendale, PA: Society of Automotive Engineers.

73 Mammano, F. (1993). personal communication.

74 Burgett, A.L. (1991). "Safety Evaluation of TravTek," Vehicle Navigation andInformation Systems Conference Proceedings (VNIS’91), Warrendale, PA:Society of Automotive Engineers, pp. 819-825.

Page 105: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

94

75 Fleischman, R.N. (1991). "Research and Evaluation Plans for the TravTek IVHSOperational Field Test," Vehicle Navigation and Information SystemsConference Proceedings (VNIS’91), Warrendale, PA: Society of AutomotiveEngineers, pp. 827-837.

76 de Groot, M.T. (1992). "Rhine-Corridor: An RDS-TMC Pilot for Radio TrafficInformation," The 3rd International Conference on Vehicle Navigation andInformation Systems (VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway, NJ:Institute of Electrical and Electronics Engineers, pp. 8-13.

77 Underwood, S. (1992). "Concept for Evaluation of the FAST-TRAC Program,"The 3rd International Conference on Vehicle Navigation and Information Systems(VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway, NJ: Institute of Electricaland Electronics Engineers, pp. 626-631.

78 Bouis, D., Voss, M., Geiser, G., and Haller, R. (1979). "Visual vs. AuditoryDisplays for Different Tasks of a Car Driver," Proceedings of the Human FactorsSociety-23rd Annual Meeting, 35-39, Santa Monica, CA: Human Factors Society.

79 Brouwer, W.H., Ickenroth, J., Van Wolffelaar, P.C. and Ponds, R.W.H.M. (1990)."Divided Attention in Old Age: Difficulty in Integrating Skills," in R. J. Takens, etal. (eds.), European Perspectives in Psychology, New York, NY: Wiley.

80 Diamon, T. (1992). "Driver's Characteristics and Performances When Using In-vehicle Navigation System [sic]," The 3rd International Conference on VehicleNavigation and Information Systems (VNIS'92) (IEEE catalog # 92CH3198-9),Piscataway, NJ: Institute of Electrical and Electronics Engineers, 251-260.

81 Grant, B.S. and Wierwille, W.W. (1992). "An Accuracy Analysis of Techniquesfor Measuring the Durations of In-car Manual Tasks," Proceedings of the HumanFactors Society 36th Annual Meeting, Santa Monica, CA: Human FactorsSociety, pp. 1190-1194.

82 Korukawa, K. and Wierwille, W.W. (1990). "Validation of a Driving SimulationFacility for Instrument Panel Task Performance," Proceedings of the HumanFactors Society 34th Annual Meeting, Santa Monica, CA: Human FactorsSociety, pp. 1299-1303.

83 MacAdam, C.C. (1992). Truck Driver Informational Overload (Technical ReportUMTRI-92-36), Ann Arbor, MI: The University of Michigan TransportationResearch Institute.

84 Ponds, R.W.H.M., Brouwer, W.H., and van Wolffelaar, P.C. (1988). "AgeDifferences in Divided Attention in Simulated Driving Task," Journal ofGerontology, 43(6), 151-156.

Page 106: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

95

85 Verwey, W.B. (1989). "Simple In-Car Route Guidance Information from AnotherPerspective: Modality versus Coding," First Vehicle Navigation and InformationSystems Conference (VNIS'89), Piscataway, NJ: Institute of Electrical andElectronics Engineers, pp. 56-60.

86 Walker, J., Alicandri, E., Sedney, C., and Roberts, K. (1990). In-VehicleNavigational Devices: Effects on Safety and Driver Performance (TechnicalReport FHWA-RD-90-053), Washington, DC: U.S. Department of Transportation,Federal Highway Administration.

87 Walker, J., Alicandri, E., Sedney, C., and Roberts, K. (1991). In-VehicleNavigational Devices: Effects on Safety and Driver Performance, VehicleNavigation and Information Systems Conference Proceedings (VNIS’91),Warrendale, PA: Society of Automotive Engineers, 499-526.

88 Walker, J., Alicandri, E., Sedney, C., and Roberts, K. (1992). "In-VehicleNavigational Devices: Effects on Safety and Driver Performance," Public Roads,56(1), 9-22.

89 Alm, H. (1990). "Drivers' Cognitive Models of Routes," in Van Winsum, W., Alm,H., Schraagen, J.M., and Rothengatter, T., Laboratory and Field Studies onRoute Representation and Driver's Cognitive Models of Routes (DeliverableGIDS/NAV2, DRIVE Project V1041), Haren, The Netherlands: Traffic ResearchCentre, University of Groningen.

90 Alm (undated). Drivers' Cognitive Models of Routes (unpublished manuscript),Linkoeping, Sweden: Swedish Traffic Institute.

91 Alm. H. and Berlin, M. (undated) What is the Optimal Amount of Information froma Verbally-Based Navigation System? (unpublished manuscript), Linkoeping,Sweden: Swedish Traffic Institute.

92 Alm, H., Nilsson, L., Jarmark, S., Savelid, J., and Hennings, U. (1992). TheEffects of Landmark Presentation on Driver Performance and Uncertainty in aNavigation Task—A Field Study, (Report No. 692A), Linkoeping, Sweden:Statens Vaeg- och Trafikinstitut.

93 Davis, J.R. (1989). Back Seat Driver: Voice Assisted Automobile Navigation(unpublished Ph.D. dissertation), Cambridge, MA: Massachusetts Institute ofTechnology.

94 Davis, J.R. (1989). "The Back Seat Driver: Real Time Spoken DrivingDirections," First Vehicle Navigation and Information Systems Conference(VNIS'89), Piscataway, NJ: Institute of Electrical and Electronics Engineers, pp.146-150.

Page 107: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

96

95 Eberhard, J.W. (1968). Driver Information Requirements and AcceptanceCriteria, Highway Research Record, # 265, Washington, DC: National ResearchCouncil, Highway Research Board, 19-30.

96 Hook, K. (1991). An Approach to Route Guidance (Licentiate thesis), (TechnicalReport 91-019-DSV), Stockholm, Sweden: Stockholm University, Department ofComputer and Systems Sciences.

97 Hook, K. and Karlgren, J. (1991). Some Principles for Route DescriptionsDerived from Human Advisers (SICS Technical Report R91:06), Stockholm,Sweden: Swedish Institute of Computer Science.

98 Brown, C., Gustavsson, R., Hook, K., Lindevall, P., and Waern, A. (1991). FinalReport on Interactive Route Guidance (SICS Technical Report T91:18), Kista,Sweden: Swedish Institute of Computer Science, Knowledge Based SystemsLaboratory.

99 Waern, A. (1992). Planning and Reasoning for Human-Machine Interaction inRoute Guidance (Licentiate thesis) (Technical Report 92-0006-DSV), Stockholm,Sweden: Stockholm University, Department of Computer and Systems Sciences.

100 Kuiken, M.J., Miltenburg, G.M., and van Winsum, W. (1992). "Drivers' Reactionsto an Intelligent Driver Support System (GIDS) Implemented in a DrivingSimulator," The 3rd International Conference on Vehicle Navigation andInformation Systems (VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway, NJ:Institute of Electrical and Electronics Engineers, pp. 176-181.

101 Labiale, G. (1989). Influence of in Car Navigation Map Displays on DriversPerformances (SAE paper 891683), Warrendale, PA: Society of AutomotiveEngineers.

102 Labiale, G. (1990). "In-Car Road Information: Comparisons of Auditory andVisual Presentations," Proceedings of the Human Factors Society 34th AnnualMeeting, Santa Monica, CA: Human Factors Society, pp. 623-627.

103 Labiale, G., Mamberti, M.L., Baez, D., Conus, Y., and Aupetit, J. (1988).Comprehension et Memorisation des Messages Visuels et Auditifs D'InformationRoutiere (Comprehension and Retention of Visual and Auditory Road SafetyMessages) (INRETS report 66), Bron, France: Laboratoire Energie Nuisances,Institut National de Recherche sur les Transports et leur Securite.

104 Labiale, G. (undated). Ergonomics of In-car Road Information System: RDS Inf-Flux, unknown source, 1489-1491.

Page 108: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

97

105 Schraagen, J.M.C. (1990). "Use of Different Types of Map Information for RouteFollowing in Unfamiliar Cities," in van Winsum, W., Alm, H., Schraagen, J.M. andRothengarter, T., Laboratory and Field Studies on Route Representation andDrivers' Cognitive Models of Routes (Deliverable GIDS/NAV2), Haren, TheNetherlands: University of Groningen, Traffic Research Center.

106 Sperandio, J.C. and Dessaigne, M.F. (1988). "Une Comparaison ExperimentaleEntre Model de Presentation Visuels ou Auditifs de Messages D'InformationsRoutieres a des Conducteurs Automobiles (An Experimental Comparison ofVisual and Auditory Messages Displayed to Car Drivers)," Le Travail Humain,51(3), 257-269.

107 Antin, J.F. (1987) Automobile Navigation Methods: Effectiveness, Efficiency, andStrategy, (unpublished doctoral dissertation), Blacksburg, VA: VirginiaPolytechnic Institute and State University.

108 Labiale, G. (1992). "Driver Characteristics and In-car Map Display MemoryRecall Performance," The 3rd International Conference on Vehicle Navigationand Information Systems (VNIS'92) (IEEE catalog # 92CH3198-9), Piscataway,NJ: Institute of Electrical and Electronics Engineers, pp. 227-232.

109 McKnight, A.J. and McKnight, A.S. (1992). Effect of In-Vehicle NavigationInformation Systems Upon Driver Attention (Technical Report), Washington, DC:AAA Foundation for Traffic Safety.

110 Morita, H. and Ogawa, M. (1992). "Experimental Studies for Toyota Voice RouteGuidance Using Beacon," Surface Transportation and the Information Age:Proceedings of the IVHS America 1992 Annual Meeting, Washington, DC: IVHSAmerica, 614-622.

111 Pauzie, A. and Marin-Lamellet, C. (1989). "Analysis of Aging Drivers BehaviorsNavigating with In-Vehicle Visual Display Systems," Vehicle Navigation andInformation Systems (VNIS'89), Piscataway, NJ: Institute of Electrical andElectronics Engineers, pp. 61-67.

112 Popp, M.M. and Farber, B. (1991). "Advanced Display Technologies, Route GuidanceSystems, and the Position of Displays in Cars," Vision in Vehicles III (in Gale, A.G., etal.), Amsterdam, The Netherlands: Elsevier.

113 Rothery, R.W., Thompson, R.R., and von Buseck, C.R. ( 1968). A Design for anExperimental Route Guidance System, Volume III, Driver Display ExperimentalEvaluation (Technical Report GMR-815-III), Warren, MI: General MotorsResearch Laboratories.

114 Staal, D. (1987). ETAK Navigational System Evaluation in Los Angeles and SanFrancisco - Final Results (letter to J. Sullivan of National Car Rental).

Page 109: Measures and Methods Used to Assess the Safety and ...driving/publications/UMTRI-93-12.pdfmeasures are very weak predictors of safety and usability. To assess usability, application-specific

98

115 Streeter, L.A., Vitello, D., and Wonsiewicz, S.A. (1985). "How to Tell PeopleWhere to Go: Comparing Navigational Aids," International Journal of Man-Machine Studies, 22, 549-562.

116 Verwey, W.B. and Janssen, W.H. (1988). "Driving Behavior with Electronic In-Car Navigation Aids," Proceedings of Road Safety in Europe (VTI Report 342A),Linkoping, Sweden: Swedish Road and Traffic Safety Research Institute,pp. 81-97.

117 Voss, M. and Haller, R. (1982). "Processing of On-board Route GuidanceInformation in Real Traffic Situations," Analysis, Design and Evaluation of Man-Machine Systems, Oxford, United Kingdom: Pergammon, 371-375.

118 West, R., Kemp, R., and Hack, S. (1989). Autoguide System Proving andUsability Trials (Contractor Report 181), Crowthorne, Berkshire: United Kingdom:Transport and Road Research Laboratory, Department of Transport.

119 Sweet, R.E. and Green, P. (1993). "UMTRI's instrumented car," UMTRIResearch Review, January-February, 1-11.

120 Green, P. (1992). Review of Eye Fixation Recording Methods and Equipment(Technical Report UMTRI-92-28), Ann Arbor, MI: University of MichiganTransportation Research Institute.