Chapter 4: Performance Metrics
Presenter: 00335011 魏傳諺
Agenda
• Preface
• Task Success
• Time-on-Task
• Errors
• Efficiency
• Learnability
Preface of Performance Metrics
• Based on specific user behaviors
– User behaviors
– The use of scenarios or tasks
• How well users are actually using a product
• Useful to estimate the magnitude of a specific usability issue
– How many people are likely to encounter the same issue after the product is
released?
– How many users are able to successfully complete a core set of tasks using a
product?
• Not the magical elixir for every situation
– sample size
– time & money
– tell the “what” very effectively but not the “why”
Five Basic Types
• Task Success
– The most widely used performance metric
– How effectively users are able to complete a given set of tasks
• Time-on-Task
– How much time is required to complete a task
• Errors
– Reflect the mistakes made during a task
• Efficiency
– The amount of effort a user expends to complete a task
• Learnability
– How performance changes over time
TASK SUCCESS
Task Success
• The most common usability metric
• As long as the user has a well-defined task, you can measure
success
Collecting Any Type of Success Metric
• Each task must have a clear end-state
– Define the success criteria before data collection
• Find the current price for a share of Google stock (clear end-state)
• Research ways to save for your retirement (not a clear end-state)
• Way to collect success data
– Verbally articulate the answer after completing the task
– Provide their answers in a more structured way
• Try to avoid write-in answers if possible
• In some cases the correct solution to a task may not be verifiable
– depends on the user’s specific situation
– testing is not being performed in person
Binary Success
• Either participants complete a task successfully or they don’t
• How to Collect and Measure
– Code each task as 1 (success) or 0 (failure)
• How to Analyze and Present
– By individual task
– By user or type of user
• Frequency of use
• Previous experience using the product
• Domain expertise
• Age group
• Can calculate the percentage of tasks that each participant successfully completed
– Binary data → continuous data
• Calculating Confidence Intervals
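The slide does not say which interval to use. As a minimal sketch, the example below assumes the adjusted Wald (Agresti-Coull) interval, which is a common choice for small usability samples; the function name and the sample data are hypothetical.

```python
import math

def adjusted_wald_interval(successes, trials, z=1.96):
    """Return (low, high) adjusted-Wald bounds for a binary success rate.

    z=1.96 corresponds to a 95% confidence level.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    n_adj = trials + z ** 2                   # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# Hypothetical data: 1 = success, 0 = failure, one task, ten participants.
results = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
rate = sum(results) / len(results)
low, high = adjusted_wald_interval(sum(results), len(results))
print(f"success rate {rate:.0%}, 95% CI {low:.0%} to {high:.0%}")
```

With only ten participants the interval is wide, which is why a success rate is best reported together with its interval.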
Levels of Success
• Partially completing a task?
– coming close to fully completing a task may provide value to the
participant
– Helpful for you to know
• Why some participants failed to complete a task
• With which particular tasks they needed help
Levels of Success (cont’d)
• How to Collect and Measure
– Must define the various levels
– Based on the extent or degree to which a participant completed the task
• Complete Success, Partial Success, and Failure
• What constitutes ‘‘giving assistance’’ to the participant
• Assign a numeric value for each level
• Does not differentiate between different types of failure
– Based on the experience in completing a task
• No Problem, Minor Problem, Major Problem, and Failure/Gave up
• Ordinal data → no average score
– Based on the participant accomplishing the task in different ways
• Depending on the quality of the answer (does not need a numeric score)
Levels of Success (cont’d)
• How to Analyze and Present
– To create a stacked bar chart
– To report a “usability score”
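Both presentations can be derived from the same per-participant records. The sketch below assumes the three-level scheme (Complete Success, Partial Success, Failure) and illustrative weights of 1.0, 0.5, and 0.0; the slides only say to assign a numeric value to each level, so the weights and the sample data are assumptions.

```python
from collections import Counter

# Assumed weights for illustration; choose values that fit your own study.
LEVEL_WEIGHTS = {"complete": 1.0, "partial": 0.5, "failure": 0.0}

def level_shares(levels):
    """Share of participants at each level, i.e. the segments of one stacked bar."""
    counts = Counter(levels)
    return {level: counts[level] / len(levels) for level in LEVEL_WEIGHTS}

def success_score(levels):
    """Mean of the numeric values assigned to each participant's level."""
    return sum(LEVEL_WEIGHTS[level] for level in levels) / len(levels)

# Hypothetical outcomes for one task, eight participants.
task_levels = ["complete", "partial", "complete", "failure",
               "partial", "complete", "complete", "partial"]
shares = level_shares(task_levels)
score = success_score(task_levels)
print(shares, score)
```

A single score hides the mix of levels behind it, so the per-level shares are worth keeping alongside it.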
Issues in Measuring Success
• How to define whether a task was successful?
– When unexpected situations arise
• Make note of them
• Afterward try to reach a consensus
• How or when to end a task
– Stopping rule
• Complete task / Reach the point at which they would give up or seek
assistance
• “Three strikes and you’re out”
• Set a time limit
– If the participant is becoming particularly frustrated or agitated
TIME-ON-TASK
Time-on-Task
• Way to measure the efficiency of any product
– The faster a participant can complete a task, the better the experience
• Exceptions to the assumption that faster is better
– Game
– Learning
Importance of Measuring Time-on-Task
• Particularly important for products
– where tasks are performed repeatedly by the user
• The side benefits of measuring time-on-task
– Increased efficiency → cost savings → actual ROI
How to Collect and Measure Time-on-Task
• The time elapsed between the start of a task and the end of a task
– In minutes
– In seconds
• Measure by any time-keeping device
– Start time & End time
– Two people record the times
• Automated Tools for Measuring Time-on-Task
– less error-prone
– Much less obtrusive
• Turning on and off the Clock
– Rules about how to measure time
• Start the clock as soon as they finish reading the task
• End timing when the participant hits the “answer” button
• Stop timing when the participant has stopped interacting with the product
How to Collect and Measure Time-on-Task (cont’d)
• Tabulating Time Data
Analyzing and Presenting Time-on-Task Data
• Ways to present
– Mean
– Median
– Geometric mean
• Ranges
– Time interval
• Thresholds
– Whether users can complete certain tasks within an acceptable amount of time
• Distributions and Outliers
– Exclude outliers (> 3 SD above the mean)
– Set up thresholds
– determine the fastest possible time
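A minimal sketch of these summaries, assuming times are recorded in seconds for a single task. The function name, the 45-second threshold, and the sample data are hypothetical; the outlier rule is the 3-SD rule listed above.

```python
import statistics

def summarize_times(times, threshold):
    """Summarize time-on-task after dropping values more than 3 SD above the mean."""
    mean_all = statistics.mean(times)
    sd_all = statistics.stdev(times)
    kept = [t for t in times if t <= mean_all + 3 * sd_all]
    return {
        "mean": statistics.mean(kept),
        "median": statistics.median(kept),
        "geometric_mean": statistics.geometric_mean(kept),
        "excluded": len(times) - len(kept),
        # Share of remaining participants who finished within the threshold.
        "within_threshold": sum(t <= threshold for t in kept) / len(kept),
    }

# Hypothetical times in seconds; the last participant is an extreme outlier.
times = [30, 32, 35, 38, 40, 41, 42, 44, 45, 47, 48, 50, 52, 55, 400]
summary = summarize_times(times, threshold=45)
print(summary)
```

Note that the 3-SD rule cannot flag anything in very small samples (roughly ten observations or fewer), because a single extreme value inflates the standard deviation too much; a fixed threshold is the more practical option there.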
Issues to Consider When Using Time Data
• Only Successful Tasks or All Tasks?
– Advantage of only including successful tasks
• A cleaner measure of efficiency
– Advantage of including all tasks
• A more accurate reflection of the overall user experience
• An independent measure in relation to the task success data
– If the participant always determined when to end a task → include all times
– If the moderator sometimes decided when to end a task → include only successful tasks
• Using a Think-Aloud Protocol?
– Think-aloud protocol: to gain important insight
– Have an impact on the time-on-task data
– Retrospective probing technique
• Should You Tell the Participants about the Time Measurement?
– Perform the tasks as quickly and accurately as possible
ERRORS
Errors
• Usability issue vs. Error
– A usability issue is the underlying cause of a problem
– One or more errors are a possible outcome
• Errors
– incorrect actions that may lead to task failure
When to Measure Errors
• When you want to understand the specific action or set of actions
that may result in task failure
• Errors can tell
– How many mistakes were made
– Where they were made within the product
– How various designs produce different frequencies and types of errors
– How usable something really is
• Three general situations where measuring errors might be useful
– When an error will result in a significant loss in efficiency
– When an error will result in significant costs
– When an error will result in task failure
What Constitutes an Error?
• No widely accepted definition of what constitutes an error
• Based on many different types of incorrect actions by the user
– Entering incorrect data into a form field
– Making the wrong choice in a menu or drop-down list
– Taking an incorrect sequence of actions
– Failing to take a key action
• Determine what constitutes an error
– Make a list of all the possible actions
– Define many of the different types of errors that can be made
What Constitutes an Error? (cont’d)
Collecting and Measuring Errors
• Not always easy
– Need to know what the correct (set of) action(s) should be
• Consideration
– Only a single error opportunity
– Multiple error opportunities
• Way of organizing error data
– Record the number of errors for each task and each user
– Values range from 0 to the maximum number of error opportunities
Analyzing and Presenting Errors
• Tasks with a Single Error Opportunity
– Look at the frequency of the error for each task
• Frequency of errors
• Percentage of participants who made an error for each task
– From an aggregate perspective
• Average the error rates for each task into a single error rate
• Take an average of all the tasks that had a certain number of errors
• Establish maximum acceptable error rates for each task
• Tasks with Multiple Error Opportunities
– Look at the frequency of errors for each task → error rate
– The average number of errors made by each participant for each task
– Which tasks fall above or below a threshold
– Weight each type of error with a different value and then calculate an “error score”
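For tasks with multiple error opportunities, the calculations can be sketched as follows. The error rate here is assumed to be total errors divided by total error opportunities; the error types, their weights, and the sample counts are hypothetical.

```python
def error_rate(errors_per_participant, opportunities):
    """Total errors divided by total error opportunities for one task."""
    total_opportunities = opportunities * len(errors_per_participant)
    return sum(errors_per_participant) / total_opportunities

def mean_errors(errors_per_participant):
    """Average number of errors made by each participant on one task."""
    return sum(errors_per_participant) / len(errors_per_participant)

def weighted_error_score(counts_by_type, weights):
    """Sum of error counts, each multiplied by the weight of its error type."""
    return sum(weights[kind] * count for kind, count in counts_by_type.items())

# Hypothetical task with 5 error opportunities and six participants.
errors = [0, 2, 1, 0, 3, 1]
rate = error_rate(errors, opportunities=5)
average = mean_errors(errors)

# Assumed weights: a major error counts three times as much as a minor one.
ERROR_WEIGHTS = {"minor": 1, "major": 3}
score = weighted_error_score({"minor": 5, "major": 2}, ERROR_WEIGHTS)
print(rate, average, score)
```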
Issues to Consider When Using Error Metrics
• Make sure you are not double-counting errors
• Need to know
– An error rate, and
– Why different errors are occurring
• Sometimes an error is the same as failing to complete a task
– In that case, report errors as task failure
EFFICIENCY
Efficiency
• Time-on-task
• Look at the amount of effort required to complete a task
– In most products, the goal is to minimize the amount of effort
– two types of effort
• Cognitive
– Finding the right place to perform an action
– Deciding what action is necessary
– Interpreting the results of the action
• Physical
– The physical activity required to take action
Collecting and Measuring Efficiency
• Identify the action(s) to be measured
• Define the start and end of an action
• Count the actions
• Actions must be meaningful
– Incremental increase in cognitive effort
– Incremental increase in physical effort
• Look only at successful tasks
Analyzing and Presenting Efficiency Data
• The number of actions each participant takes to complete a task
– if some tasks are more complicated than others, it may be misleading
• Lostness
– N: The number of different web pages visited while performing the task
– S: The total number of pages visited while performing the task
– R: The minimum (optimum) number of pages that must be visited to
accomplish the task
– A perfect lostness score would be 0
– Participants with a lostness score greater than 0.5 definitely did appear
to be lost
– The average lostness value
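The formula that combines N, S, and R does not appear in the text above. The sketch below assumes the commonly cited definition, L = sqrt((N/S - 1)^2 + (R/N - 1)^2), which gives 0 for a perfect path; the function name and the page counts are hypothetical.

```python
import math

def lostness(n_unique, s_total, r_minimum):
    """Lostness from unique pages (N), total pages (S), and minimum pages (R).

    Assumed formula: sqrt((N/S - 1)^2 + (R/N - 1)^2).
    """
    if r_minimum <= 0 or n_unique <= 0 or n_unique > s_total:
        raise ValueError("expected 0 < R and 0 < N <= S")
    return math.sqrt((n_unique / s_total - 1) ** 2 + (r_minimum / n_unique - 1) ** 2)

# Perfect navigation: every visited page was necessary and visited once.
perfect = lostness(n_unique=4, s_total=4, r_minimum=4)

# Hypothetical lost participant: 12 page views over 6 distinct pages, 3 needed.
lost = lostness(n_unique=6, s_total=12, r_minimum=3)
print(perfect, round(lost, 3))
```

The second participant scores above the 0.5 mark mentioned above, so they would be classified as lost.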
Analyzing and Presenting Efficiency Data (cont’d)
Efficiency as a Combination of Task Success and Time
• Task Success + Time-on-Task
• Core measure of efficiency
– The ratio of the task completion rate to the mean time per task
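As a worked example of this ratio, the sketch below assumes binary success codes and times in minutes for one task, so the result reads as completion rate per minute. The function name and the data are hypothetical.

```python
def efficiency(successes, times):
    """Task completion rate divided by mean time per task."""
    completion_rate = sum(successes) / len(successes)
    mean_time = sum(times) / len(times)
    return completion_rate / mean_time

# Hypothetical data for one task: 1 = success, 0 = failure; times in minutes.
successes = [1, 1, 0, 1]
minutes = [2.0, 3.0, 4.0, 3.0]
value = efficiency(successes, minutes)
print(f"{value:.0%} task completion per minute")
```

The unit of time matters: the same data expressed in seconds gives a value 60 times smaller, so the unit should be stated wherever the ratio is reported.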
LEARNABILITY
Learnability
• Most products, especially new ones, require some amount of learning
• Experience
– Based on the amount of time spent using a product
– Based on the variety of tasks performed
• Learning
– Sometimes quick and painless
– At other times quite arduous and time consuming
• Learnability
– The extent to which something can be learned
– How much time and effort are required to become proficient
– When learning happens over a short period of time → users try to maximize efficiency
– When learning happens over a longer period of time → greater reliance on memory
Collecting and Measuring Learnability Data
• Basically the same as they are for the other performance metrics
• Collect the data at multiple times
– Based on expected frequency of use
• Decide which metrics to use
• Decide how much time to allow between trials
• Alternatives
– Trials within the same session
– Trials within the same session but with breaks between tasks
– Trials between sessions
Analyzing and Presenting Learnability Data
• By examining a specific performance metric
• Interpret the chart
– Notice the slope of the line(s)
– Notice the point of asymptote, or essentially where the line starts to
flatten out
– Look at the difference between the highest and lowest values on the y-
axis
• Compare learnability across different conditions
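The three things to read off the chart can also be computed directly. The sketch below assumes one mean time-on-task value per trial and treats the curve as flat once the trial-to-trial change drops below 10%; that tolerance, the function names, and the data are assumptions for illustration.

```python
def trial_changes(values):
    """Change between consecutive trials (negative means faster)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def plateau_trial(values, tolerance=0.10):
    """First trial (1-based) whose relative change from the previous trial
    is below the tolerance, or None if the curve never flattens."""
    for index in range(1, len(values)):
        change = abs(values[index] - values[index - 1]) / values[index - 1]
        if change < tolerance:
            return index + 1
    return None

# Hypothetical mean time-on-task in seconds for five trials.
mean_times = [120, 80, 60, 55, 53]
changes = trial_changes(mean_times)          # slope between trials
spread = max(mean_times) - min(mean_times)   # highest minus lowest value
flat_from = plateau_trial(mean_times)        # where the line starts to flatten
print(changes, spread, flat_from)
```

Running the same calculation per condition gives a simple way to compare learnability across designs.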
Issues to Consider When Measuring Learnability
• What Is a Trial?
– Learning is continuous and without breaks in time
• Memory is much less a factor in this situation
• More about developing and modifying different strategies to complete a set
of tasks
• Take measurements at specified time intervals
• Number of Trials
– There must be at least two
– In most cases there should be at least three or four
– You should err on the side of more trials than you think you might need
to reach stable performance.
Thank you for listening