Micro Task Interface Evaluation Method


micro tasks (swipe to start)

our goal: a standardized method for benchmarking control room interfaces

• Tablet tool for data collection
• Can be used stand-alone or linked to simulator
• Standardized data collection procedure
• Standardized method for question generation
• Standardized set of questions
• Database

micro tasks are for: Evaluation / HRA / Training

Efficiently and objectively benchmark innovative displays against conventional interfaces

IFE process overview display on tablet

Example

IFE will conduct a micro task evaluation of this set-up in December 2015 at a U.S. training simulator

Compare

IFE design concept for overview displays on tablet, developed for the 2015 U.S. simulator study (screen 1 of 3)

example: innovative vs. conventional

[Mass balance display: conventional vs. innovative]

Do the innovative displays lead to faster, more reliable identifications and decisions?

[Mass balance display: conventional vs. innovative]

Is performance (time and reliability) with innovative displays at least as good as with conventional?

How to test performance benefits of new interface solutions?

Scenario-based methods:
• Observational or self-report
• Qualitative insights
• Relatively few data points

Micro tasks:
• Decontextualised
• Performance-based
• Quantitative data
• Large amounts of data

flavors of t&e methods

micro tasks are…

• Large number of questions: related to systems, components, procedures, etc.
• Varying levels of difficulty, including higher-level decision making
• Different display conditions, e.g. innovative vs. conventional displays
• We measure response time / accuracy, compare data between conditions, and, if needed, review eye-tracking recordings to understand anomalies (see the sketch after this list)

detection/decision tasks under time pressure
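The measurement idea is simple enough to sketch in code. The following minimal Python illustration shows how one trial could be represented, timed, and scored; it is a hypothetical sketch, not IFE's actual app code, and the class names, fields, and run_trial helper are all assumptions.

```python
# Minimal sketch (hypothetical names throughout) of one micro task trial:
# present a question, time the response, record accuracy and condition.
import time
from dataclasses import dataclass

@dataclass
class MicroTask:
    question: str        # e.g. "How many condensate pumps are running on turbine 31?"
    options: list        # answer alternatives shown on the tablet
    correct_answer: str
    condition: str       # display condition, e.g. "innovative" or "conventional"

@dataclass
class TrialResult:
    task: MicroTask
    answer: str
    response_time_s: float  # time from question onset to answer

    @property
    def correct(self) -> bool:
        return self.answer == self.task.correct_answer

def run_trial(task: MicroTask, get_answer) -> TrialResult:
    """Present a task, wait for the operator's answer, and time the response."""
    start = time.monotonic()
    answer = get_answer(task)  # blocks until the operator answers on the tablet
    return TrialResult(task, answer, time.monotonic() - start)
```

Aggregating many such trial records per display condition yields the response-time and accuracy comparisons described above.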

• Micro task tablet app linked to simulator: the app receives signals from the simulator and can send signals / commands to it
• Dynamic scenarios, to cover monitoring / vigilance tasks
• We can now record operator actions, e.g. operating components such as "start RCPs" (see the sketch below)
• New systems for aggregating data and for managing task lists
• Makes it easier to set up and manage a study

new in 2015
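As a concrete illustration of the new action-recording capability, here is a minimal hypothetical sketch; the deck does not specify how the simulator reports operator actions, so the signal format and the CSV layout below are assumptions.

```python
# Hypothetical sketch: logging operator actions reported by the simulator
# so they can be aggregated with micro task answers. The signal format
# ({"component": ..., "action": ...}) and file layout are assumptions.
import csv
import time

def log_operator_action(log_path, operator_id, signal):
    """Append one simulator-reported action, e.g. "start RCPs", to the study log."""
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow(
            [time.time(), operator_id, signal["component"], signal["action"]]
        )

# e.g. the simulator reports that operator 3 started RCP-2
log_operator_action("actions.csv", "operator-3",
                    {"component": "RCP-2", "action": "start"})
```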

example

4 operators working in the simulator individually (no communication)

We control which displays are available (e.g. innovative, conventional)

Instructions: "Please answer the questions correctly, but also as quickly as possible. It is very important that you work as fast as you can."

Run 1 (swipe to start)

How many condensate pumps are running on turbine 31?

1 / 2 / 3

(swipe to continue)

“How many condensate pumps are running on turbine 31?”

Average identification time: Conventional 8 sec, Innovative 6 sec

Is the subcooling margin sufficient?

Yes / No

(swipe to continue)

Which steam generators are faulty?

SG-1 / SG-2 / SG-3

(swipe to continue)

Should safety injection be stopped?

Yes / No

(swipe to continue)

What is the narrow-range level in steam generator 1?

[Numeric keypad entry: digits 0-9, unit %] Example answer: 37

(swipe to continue)

up to 200 questions per hour per operator; with 4 operators in the simulator working individually, that means we can run up to 800 questions per hour

Tablet can trigger events in the simulator (e.g. start a tube leak) and receive signals from the simulator (e.g. RCP-2 was started)
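The deck does not describe the link protocol, so the sketch below is purely illustrative: JSON over TCP, the endpoint, and every message field are assumptions.

```python
# Hypothetical sketch of the bidirectional tablet-simulator link.
# JSON over TCP, the host/port, and all message fields are assumptions.
import json
import socket

with socket.create_connection(("simulator.local", 9000)) as sock:  # assumed endpoint
    reader = sock.makefile("r")

    # Tablet triggers an event in the simulator, e.g. start a tube leak
    request = {"type": "trigger_event", "event": "tube_leak", "target": "SG-1"}
    sock.sendall((json.dumps(request) + "\n").encode())

    # Tablet receives a signal from the simulator, e.g. "RCP-2 was started"
    signal = json.loads(reader.readline())
    print("simulator signal:", signal)
```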

Video: https://vimeo.com/131387407

results

data source: where does this data come from?

6840 data points: 3420 response-time measures and 3420 accuracy measures

20 operators

5 hours simulator time in total for the whole data collection

innovative displays are superior

[Figure: HSI mean performance times by HSI design (LSD (Innovative), OWD (Conventional), LSD and OWD); y-axis: performance time (sec), 15 to 26. Current effect: F(2, 8003) = 21.889, p = .00057. Vertical bars denote 0.95 confidence intervals.]

Operators were faster with innovative displays than with conventional displays


Highly statistically significant difference
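For readers who want to run the same kind of comparison on their own trial logs: the deck reports the effect only as F(2, 8003) = 21.889, p = .00057, so the exact model used is unknown. A minimal sketch, assuming a simple one-way ANOVA over per-trial performance times, with hypothetical file and column names:

```python
# Minimal sketch (not the original analysis script): one-way ANOVA of
# performance time across the three HSI design conditions.
# The file name and column names are assumptions.
import pandas as pd
from scipy import stats

trials = pd.read_csv("micro_task_trials.csv")  # one row per answered question

# Per-trial performance times grouped by HSI design
# (e.g. "LSD", "OWD", "LSD and OWD")
groups = [g["performance_time_s"].values
          for _, g in trials.groupby("hsi_design")]

f_stat, p_value = stats.f_oneway(*groups)
df_between = len(groups) - 1          # 3 conditions -> 2
df_within = len(trials) - len(groups) # total trials minus conditions
print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.5f}")

# Condition means, analogous to the bar chart above
print(trials.groupby("hsi_design")["performance_time_s"].mean())
```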

Slightly more accurate with innovative displays than conventional displays

[Chart: percentage of correct answers, innovative vs. conventional]

[Chart: performance variability, innovative vs. conventional]

conclusion

innovative displays in the 2014 study are:

• As reliable as conventional displays
• Faster to read

summary

• Generate quantitative results about performance benefits (time and reliability) of digital/innovative CR interfaces
• Highly efficient, objective method
• Compare to existing reference data
• Qualitative insights via eye tracking
• Data directly supports HRA
• Highly customisable and precise: generate exactly the data needed for the HRA

big picture

micro task tablet app = a mobile companion for Human Factors / Human Reliability specialists

Micro task tablet app

• Standardized question sets
• Database

Application areas:
• training
• HRA
• interface design
• gamification
• evaluation of training
• validation
• benchmarking within / across organisations

what next?

Benchmarking studies 2015 / 2016

benchmarking studies

(1) 2015 study at a U.S. training simulator: analog interfaces vs. IFE displays

(2) 2016 study of a partially digital control room: analog interfaces vs. 1990s-vintage overview display vs. IFE display

(3) 2016 study of a fully digital control room: measure operator performance and reliability in a fully digital control room and benchmark against data from studies 1 and 2

• Micro task database
• Micro tasks for training, including a gamification approach
• Micro tasks for actions outside the control room (field operator, mechanical, etc.)

roadmap

• Micro tasks for team decisions
• Add secondary-task capability
• Add workload-measurement capability

roadmap

michael.hildebrandt@ife.no
