CS-364: Human – Computer Interaction Slide 1
COURSE CS-364 (OPTIONAL)
HUMAN – COMPUTER INTERACTION
UNIVERSITY OF CRETE
FACULTY OF SCIENCES AND ENGINEERING
COMPUTER SCIENCE DEPARTMENT
In this lecture we will be looking at:
The reasons to conduct evaluations
The various methods of evaluation and how they are classified
Heuristic evaluation
Slide 2CS-364: Human – Computer Interaction
Evaluation is an integral part of the design process
Initial design decisions are confirmed or rejected, and new requirements and possible interaction problems are revealed
Evaluation helps to ensure that the system behaves as expected and to assess whether the requirements and goals set during the requirements phase have been met
Consequently, evaluation offers a criterion for terminating the iterative development process
Slide 3CS-364: Human – Computer Interaction
“Common sense” design decisions
Assuming personal behaviour is representative
Continuing acceptance of habit and/or tradition
Implicit unsupported assumption in design
Unsupported early design decisions
Postponement of evaluation “until a more convenient time”
Formal evaluation using inappropriate subject groups
Experiments which cannot be analysed
Slide 4CS-364: Human – Computer Interaction
Traditional testing: focus on system functionality and ergonomics, later in the development process
Usability evaluation: focus on the users, the tasks and the user interface, in the early stages of design and development
Mainly used to:
▪ identify usability problems
▪ evaluate design alternatives
▪ assess the effect of the user interface on the users
▪ obtain feedback which can be used to improve design
▪ predict whether usability goals will be met
Slide 5CS-364: Human – Computer Interaction
Nowadays users expect much more than just a usable system; they also look for a pleasing and engaging experience
This means it is even more important to carry out an evaluation
As the Nielsen Norman Group (www.nngroup.com) notes: “User experience encompasses all aspects of the end-user's interaction . . . the first requirement for an exemplary user experience is to meet the exact needs of the customer, without fuss or bother. Next come simplicity and elegance, which produce products that are a joy to own, a joy to use.”
More about user experience in the UX lecture
Slide 6CS-364: Human – Computer Interaction
Some of the most common criteria used for classifying evaluation methods in the HCI literature are:
Who is involved (users/experts)
Where they are carried out (at the lab / in the field)
The development phase they apply to (design/implementation)
When they are carried out (formative/summative)
Their formality (formal/informal)
The objectiveness and type of their results (objective/subjective, qualitative/quantitative)
Slide 7CS-364: Human – Computer Interaction
A common approach is to classify evaluation into summative and formative
Summative evaluation takes place after a product is complete, or nearly so
It is often used before or after product release
This type of evaluation was typical of linear development processes, such as the waterfall model
In current practice, summative evaluation is mainly used to assess the compliance of a product against a standard or set of guidelines or to fine-tune and debug the beta version of a product
Slide 8CS-364: Human – Computer Interaction
The main drawback of summative evaluation is that the problems detected result in major iterations through the same sequence of development phases
Furthermore, summative evaluation does not support an iterative refinement process and consequently cannot support a user-centred design process
Formative evaluation takes place during all the phases of the development life-cycle and its results are fed back to design
This early evaluation brings the maximum pay-off, since problems are identified and corrected early
Slide 9CS-364: Human – Computer Interaction
Formative: a chef who periodically checks a dish while it is being prepared and makes adjustments to positively impact the end result
Summative: evaluating the dish after it is completed, like a restaurant critic who compares the meal with other restaurants
Slide 10CS-364: Human – Computer Interaction
Usability methods are divided into two main categories according to whether or not user participation is required:
empirical, or user-based
non-empirical
Since user-based methods are mostly used for evaluating the implementation, they are usually associated with summative evaluation
Non-empirical methods are predominantly used during the design and prototyping phases (formative evaluation)
This is not always the case. For example, users are also involved during design to compare alternatives or to identify semantic usability problems
Slide 11CS-364: Human – Computer Interaction
Empirical evaluation involves testing a user interface with real users, and requires a simulation, a prototype, or the full implementation of the system
The subjects must be representative of the user population and a sufficiently large sample is needed (typically at least eight subjects)
This type of evaluation is used to identify usability problems, assess the users’ opinion, and observe and measure user performance
Slide 12CS-364: Human – Computer Interaction
Non-empirical evaluation is also referred to as formative evaluation; it involves the detailed examination of a:
user interface design
specification document
abstract model
prototype
final system
by usability or domain experts as a means of detecting and correcting usability problems early in the development process, in order to minimise the cost (time and effort) of potential design changes
Slide 13CS-364: Human – Computer Interaction
Non-empirical evaluation is decomposed into analytic methods and inspection methods
Evaluation using analytic methods is called model-based evaluation and is used during the early phases of development
These methods use formal (formative) models of the system and the user based on tasks and/or grammar to predict user performance and cognitive load
Analytic methods have many drawbacks:
they are very difficult to apply
they require highly trained people
they are very costly
they take too much time
they do not scale up well to complex highly interactive user interfaces
they do not offer any design input
Slide 14CS-364: Human – Computer Interaction
As a result, they are not popular in the software industry and tend to be ignored
The research community seems to have an interest in the use of analytic evaluation methods and in the development of software components aiming to partly automate the evaluation of an interactive system using one of the available analytical models
The best-known method in this category is GOMS*, but there is a large variety of additional methods (e.g. Keystroke Level Model – KLM, Critical Path Method, etc.), each one covering different needs and serving different purposes
*GOMS (Goals, Operators, Methods, Selection rules) model (Card, Moran, and Newell): a human information processing model that predicts what skilled users will do in seemingly unpredictable situations (i.e. it assumes expert users and well-defined tasks)
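As a rough illustration of how an analytic model such as KLM predicts performance, the sketch below sums per-operator times for a task. The operator times are commonly cited approximate values attributed to Card, Moran, and Newell; they are assumptions here and do not come from this lecture. The example task encoding is hypothetical.

```python
# Keystroke-Level Model sketch: predicted expert task time is the sum of operator times.
# Operator times (seconds) are commonly cited approximations, treated here as assumptions.
KLM_OPERATOR_SECONDS = {
    "K": 0.2,   # press a key or button (average skilled typist)
    "P": 1.1,   # point with a mouse at a target
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def klm_estimate(operators: str) -> float:
    """Return the predicted time in seconds for an operator sequence such as 'MHPK'."""
    return sum(KLM_OPERATOR_SECONDS[op] for op in operators)


# Hypothetical task: think, reach for the mouse, point at a menu, click,
# point at a menu item, click.
task = "MHPKPK"
print(f"{klm_estimate(task):.2f} s")
```

Such an estimate only applies under the model's assumptions (expert users, error-free performance, well-defined tasks), which is part of why these methods scale poorly to complex interfaces.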
Slide 15CS-364: Human – Computer Interaction
Inspection methods involve the inspection of a user interface design or prototype by usability experts, sometimes with the participation of users and/or designers, in order to identify usability problems in an existing user interface design or task
These methods are used to provide recommendations for fixing problems, improving the usability of the design, or assessing compliance to a specific standard, or set of guidelines
The term inspection is used with reference to formal code and function inspection methods that have been used in software engineering for debugging and improving code
Slide 16CS-364: Human – Computer Interaction
The main advantages of these methods are that they are cheap, fast and easy to administer and perform
Their main disadvantage is that the outcomes depend heavily on the skill, knowledge and experience of the evaluators
There are 3 different approaches to user interface inspection:
evaluation of a user interface against a set of HCI-related heuristics
walkthrough methods
guidelines inspection
Slide 17CS-364: Human – Computer Interaction
A more formal form of heuristic evaluation is standards inspection, where the interface is assessed for compliance with a specific standard
This type of inspection must be performed by a person who is expert in the use and interpretation of the standard
Another form of inspection against heuristics is the consistency inspection, which is used to ensure the consistency of the “look and feel” of the products of the same company, or of different components of the same product
Slide 18CS-364: Human – Computer Interaction
Another approach to user interface inspection is the family of walkthrough methods, also referred to as storyboarding and tabletopping
These methods are used early in design when no user interface simulation or prototype exists, and they provide a means for systematically reviewing the validity of a user interface and its flow
Their main difference from heuristic methods is that they focus on the tasks rather than on the interface
Slide 19CS-364: Human – Computer Interaction
One representative method is the cognitive walkthrough which is used for evaluating the ease of learning of an interface
Cognitive walkthroughs have been criticised for being impractical, effortful and for not delivering the beneficial conceptual support of theory
Slide 20CS-364: Human – Computer Interaction
An extension of the traditional walkthrough is the pluralistic walkthrough developed by IBM
A group of usability experts, users, and product developers reviews a user interface design by following a task scenario and examining each element of interaction by posing a set of given questions, such as:
“Will the users try to achieve the right effect?”
“Will the user notice that the correct action is available?”
etc.
Slide 21CS-364: Human – Computer Interaction
This type of walkthrough is not just concerned with learnability issues, but with the general usability of the user interface
A possible argument would be that, since users participate in the procedure, this method should be classified as empirical
However, this is not the case, because users do not participate as subjects, but as experts, or as living examples of user requirements, preferences and cognitive models
Slide 22CS-364: Human – Computer Interaction
Compared to empirical experiments, inspection methods find fewer than half of the problems existing in a specific user interface (Desurvire, 1994), but they successfully identify most of the major ones
Additionally, inspection methods have been shown to find usability problems overlooked by user tests, and vice versa
For this reason, in practice, these two categories are used in combination
Slide 23CS-364: Human – Computer Interaction
Evaluation methods
▪ Empirical (User-based)
   ▪ Objective Assessment
      • Thinking aloud, Question asking protocol
      • Cooperative evaluation, Co-discovery learning, Coaching
      • Retrospective testing
      • Performance measurement
   ▪ Subjective Assessment (Query Technique)
      • Questionnaires
      • Focus Groups
      • Interviews
      Mental Effort
      User satisfaction & Perceived usability
▪ Non-empirical
   ▪ Inspection Methods
      ▪ Against HCI-based heuristics
         • Heuristic (Expert) Evaluation
         • Standards Inspection (Adherence & Compliance)
         • Consistency Inspection
      ▪ Walkthroughs / Storyboarding / Tabletopping
         • Cognitive Walkthrough
         • Pluralistic Usability Walkthrough
      ▪ Guidelines
         Paper-based
         Tools for working with guidelines
         General purpose
         Platform specific (Styleguides)
   ▪ Theory-Based Methods
Slide 24CS-364: Human – Computer Interaction
Factors distinguishing evaluation techniques:
Design vs. implementation
Laboratory vs. field studies
Subjective vs. objective
Qualitative vs. quantitative measures
Information provided
Immediacy of response
Intrusiveness
Resources
Slide 25CS-364: Human – Computer Interaction
Heuristic evaluation is the most popular usability inspection method
It is done as a systematic inspection of a user interface design for usability
Its goal is to find the usability problems in the design so that they can be attended to as part of an iterative design process
It involves having a small set of evaluators examine the interface and judge its compliance with recognized usability principles (the "heuristics")
Slide 27CS-364: Human – Computer Interaction
In general, heuristic evaluation is difficult for a single individual to do because one person will never be able to find all the usability problems in an interface
Luckily, experience from many different projects has shown that different people find different usability problems
Therefore, it is possible to improve the effectiveness of the method significantly by involving multiple evaluators
Slide 28CS-364: Human – Computer Interaction
An example from a case study of heuristic evaluation where 19 evaluators were used to find 16 usability problems in a voice response system allowing customers access to their bank accounts
The figure clearly shows that there is a substantial amount of non-overlap between the sets of usability problems found by different evaluators
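The non-overlap between evaluators can be made concrete with set operations. The sketch below uses invented findings for three hypothetical evaluators (the identifiers and data are illustrative and are not the case-study data) and shows that the union of their findings is larger than what any single evaluator found.

```python
# Illustrative data only: problem IDs found by three hypothetical evaluators.
findings = {
    "evaluator_a": {"P1", "P2", "P5"},
    "evaluator_b": {"P1", "P3"},
    "evaluator_c": {"P1", "P4", "P5", "P6"},
}


def aggregate(findings: dict[str, set[str]]) -> set[str]:
    """Union of all problems found by at least one evaluator."""
    return set().union(*findings.values())


def found_by_all(findings: dict[str, set[str]]) -> set[str]:
    """Problems that every evaluator found (the 'easy' ones)."""
    return set.intersection(*findings.values())


all_problems = aggregate(findings)
best_single = max(len(found) for found in findings.values())
print(len(all_problems), best_single, sorted(found_by_all(findings)))
```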
Slide 29CS-364: Human – Computer Interaction
Some usability problems are so easy to find that they are found by almost everybody, but there are also some problems that are found by very few evaluators
Furthermore, one cannot just identify the best evaluator and rely solely on that person's findings
The recommendation is normally to use 3 to 5 evaluators since one does not gain that much additional information by using larger numbers
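One way to see these diminishing returns is the problem-discovery model usually attributed to Nielsen and Landauer, in which the expected proportion of problems found by i independent evaluators is 1 - (1 - λ)^i. The sketch below assumes that model and a per-evaluator detection rate of 0.3, which is an illustrative value rather than a figure from this lecture; real rates vary with the project and the evaluators' expertise.

```python
def proportion_found(evaluators: int, detection_rate: float = 0.3) -> float:
    """Expected share of problems found, assuming independent evaluators
    who each find any given problem with probability `detection_rate`."""
    return 1.0 - (1.0 - detection_rate) ** evaluators


for n in (1, 3, 5, 10):
    print(n, round(proportion_found(n), 2))
```

Under these assumptions, going from 3 to 5 evaluators adds more than going from 5 to 10, which is in line with the recommendation above.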
Slide 30CS-364: Human – Computer Interaction
Heuristic evaluation is performed by having each individual evaluator inspect the interface alone
The results of the evaluation can be recorded either as written reports from each evaluator or by having the evaluators verbalize their comments to an observer as they go through the interface
Written reports have the advantage of presenting a formal record of the evaluation, but require additional effort from the evaluators and need to be read and aggregated by an evaluation manager
Slide 31CS-364: Human – Computer Interaction
Using an observer adds to the overhead of each evaluation session, but reduces the workload on the evaluators
The results of the evaluation are available fairly soon after the last evaluation session since the observer only needs to understand and organize one set of personal notes, not a set of reports written by others
The observer can assist the evaluators in operating the interface in case of problems, such as an unstable prototype, and help if the evaluators have limited domain expertise and need to have certain aspects of the interface explained
Slide 32CS-364: Human – Computer Interaction
Typically, a heuristic evaluation session for an individual evaluator lasts between one and two hours
Longer evaluation sessions might be necessary for larger or very complicated interfaces with a substantial number of dialogue elements, but it would be better to split up the evaluation into several smaller sessions, each concentrating on a part of the interface
During the evaluation session, the evaluator goes through the interface several times, inspects the various dialogue elements, and compares them with a list of recognized usability principles (the heuristics)
These heuristics are general rules that seem to describe common properties of usable interfaces
Slide 33CS-364: Human – Computer Interaction
In addition to the checklist of general heuristics, the evaluator obviously is also allowed to consider any additional usability principles or results that come to mind that may be relevant for any specific dialogue element
Furthermore, it is possible to develop category-specific heuristics that apply to a specific class of products as a supplement to the general heuristics
Slide 34CS-364: Human – Computer Interaction
In principle, the evaluators decide on their own how they want to proceed with evaluating the interface
However, a general recommendation is that they go through the interface at least twice
The first pass would be intended to get a feel for the flow of the interaction and the general scope of the system
The second pass then allows the evaluator to focus on specific interface elements while knowing how they fit into the larger whole
Slide 35CS-364: Human – Computer Interaction
Since the evaluators are not using the system as such (to perform a real task), it is possible to perform heuristic evaluation of user interfaces that exist on paper only and have not yet been implemented
This makes heuristic evaluation suited for use early in the usability engineering lifecycle
If the system is intended as a walk-up-and-use interface for the general population or if the evaluators are domain experts, it will be possible to let the evaluators use the system without further assistance
Slide 36CS-364: Human – Computer Interaction
If the system is domain-dependent and the evaluators are fairly naive with respect to the domain of the system, it will be necessary to assist the evaluators to enable them to use the interface
One approach that has been applied successfully is to supply the evaluators with a typical usage scenario, listing the various steps a user would take to perform a sample set of realistic tasks
Slide 37CS-364: Human – Computer Interaction
The output of a heuristic evaluation is a list of usability problems in the interface, with references to those usability principles that, in the opinion of the evaluator, were violated by the design in each case
It is not sufficient for evaluators to simply say that they do not like something; they should explain why they do not like it with reference to the heuristics or to other usability results
The evaluators should try to be as specific as possible and should list each usability problem separately
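A problem report of this kind can be captured in a small record. The sketch below is one possible structure, not a prescribed format: the class name, field names, and example entry are hypothetical, and the heuristic name is taken from the list of heuristics given later in this lecture.

```python
from dataclasses import dataclass, field


@dataclass
class UsabilityProblem:
    """One separately listed problem, tied to the principle(s) it violates."""
    location: str      # where in the interface the problem occurs
    description: str   # what is wrong, stated as specifically as possible
    violated_heuristics: list[str] = field(default_factory=list)

    def is_justified(self) -> bool:
        # "I do not like it" is not enough: at least one principle must be cited.
        return len(self.violated_heuristics) > 0


# Hypothetical example entry
problem = UsabilityProblem(
    location="Login dialogue",
    description="Failure message shows only an internal error code",
    violated_heuristics=["Help users recognize, diagnose, and recover from errors"],
)
print(problem.is_justified())
```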
Slide 38CS-364: Human – Computer Interaction
There are two main reasons to note each problem separately:
First, there is a risk of repeating some problematic aspect of a dialogue element, even if it were to be completely replaced with a new design, unless one is aware of all its problems
Second, it may not be possible to fix all usability problems in an interface element or to replace it with a new design, but it could still be possible to fix some of the problems if they are all known
Slide 39CS-364: Human – Computer Interaction
Heuristic evaluation does not provide a systematic way to generate fixes to the usability problems or a way to assess the probable quality of any redesigns
However, many usability problems have fairly obvious fixes as soon as they have been identified
Even in these simple cases, though, the designer has no information to help design the exact changes to the interface
Slide 40CS-364: Human – Computer Interaction
Severity ratings can be used to allocate the most resources to fix the most serious problems and can also provide a rough estimate of the need for additional usability efforts
If the severity ratings indicate that several disastrous usability problems remain in an interface, it will probably be inadvisable to release it
But one might decide to go ahead with the release of a system with several usability problems if they are all judged as being cosmetic in nature
Slide 41CS-364: Human – Computer Interaction
The severity of a usability problem is a combination of three (+1) factors:
Slide 42CS-364: Human – Computer Interaction
• The frequency with which the problem occurs: Is it common or rare?
• The impact of the problem if it occurs: Will it be easy or difficult for the users to overcome?
• The persistence of the problem: Is it a one-time problem that users can overcome once they know about it or will users repeatedly be bothered by the problem?
• + Market impact of the problem: Certain usability problems can have a devastating effect on the popularity of a product, even if they are "objectively" quite easy to overcome
Even though severity has several components, it is common to combine all aspects of severity in a single severity rating as an overall assessment of each usability problem in order to facilitate prioritising and decision-making
The following 0 to 4 rating scale can be used to rate the severity of usability problems:
Slide 43CS-364: Human – Computer Interaction
0 = I don't agree that this is a usability problem at all
1 = Cosmetic problem only: need not be fixed unless extra time is available on project
2 = Minor usability problem: fixing this should be given low priority
3 = Major usability problem: important to fix, so should be given high priority
4 = Usability catastrophe: imperative to fix this before product can be released
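The 0 to 4 scale and the release considerations discussed earlier can be expressed as a small enumeration. This is only a sketch: the identifier names and the release rule (any catastrophe blocks the release) are assumptions that paraphrase the scale, not a prescribed procedure.

```python
from enum import IntEnum


class Severity(IntEnum):
    NOT_A_PROBLEM = 0
    COSMETIC = 1
    MINOR = 2
    MAJOR = 3
    CATASTROPHE = 4


def release_blocked(ratings: list[Severity]) -> bool:
    """Assumed rule: any usability catastrophe blocks the release."""
    return any(rating == Severity.CATASTROPHE for rating in ratings)


def fix_order(ratings: dict[str, Severity]) -> list[str]:
    """Problem IDs ordered from most to least severe."""
    return sorted(ratings, key=lambda pid: ratings[pid], reverse=True)


# Hypothetical ratings for three problems
example_ratings = {"P1": Severity.COSMETIC, "P2": Severity.MAJOR, "P3": Severity.MINOR}
print(release_blocked(list(example_ratings.values())), fix_order(example_ratings))
```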
It is difficult to get good severity estimates from the evaluators during a heuristic evaluation session when they are more focused on finding new usability problems
Also, each evaluator will only find a small number of the usability problems, so a set of severity ratings of only the problems found by that evaluator will be incomplete
Instead, severity ratings can be collected by sending a questionnaire to the evaluators after the actual evaluation sessions, listing the complete set of usability problems that have been discovered, and asking them to rate the severity of each problem
Slide 44CS-364: Human – Computer Interaction
Since each evaluator has only identified a subset of the problems included in the list, the problems need to be described in reasonable depth, possibly using screen-dumps as illustrations
The descriptions can be synthesized by the evaluation observer from the aggregate of comments made by those evaluators who had found each problem or, if written evaluation reports are used, from the descriptions in the reports
These descriptions allow the evaluators to assess the various problems fairly easily even if they have not found them in their own evaluation session
Slide 45CS-364: Human – Computer Interaction
Typically, evaluators need only spend about 30 minutes to provide their severity ratings
It is important to note that each evaluator should provide individual severity ratings independently of the other evaluators
Often, the evaluators will not have access to the actual system while they are considering the severity of the various usability problems
Slide 46CS-364: Human – Computer Interaction
It is possible that the evaluators can gain additional insights by revisiting parts of the running interface rather than relying on their memory and the written problem descriptions
Experience indicates that severity ratings from a single evaluator are too unreliable to be trusted
As more evaluators are asked to judge the severity of usability problems, the quality of the mean severity rating increases rapidly, and using the mean of a set of ratings from three evaluators is satisfactory for many practical purposes
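Combining independent ratings is straightforward to sketch. The example below averages hypothetical 0 to 4 ratings from three evaluators for each problem and ranks the problems by mean severity; the problem IDs and ratings are invented for illustration.

```python
from statistics import mean

# Hypothetical independent ratings (0-4) from three evaluators per problem.
ratings = {
    "P1": [4, 3, 4],
    "P2": [1, 2, 1],
    "P3": [3, 3, 2],
}


def mean_severity(ratings: dict[str, list[int]]) -> dict[str, float]:
    """Mean rating per problem; every rating must lie on the 0-4 scale."""
    for pid, values in ratings.items():
        if not all(0 <= value <= 4 for value in values):
            raise ValueError(f"rating out of range for {pid}")
    return {pid: mean(values) for pid, values in ratings.items()}


means = mean_severity(ratings)
for pid in sorted(means, key=means.get, reverse=True):
    print(pid, round(means[pid], 2))
```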
Slide 47CS-364: Human – Computer Interaction
Heuristic evaluation is explicitly intended as a "discount usability engineering" method
Independent research has indeed confirmed that heuristic evaluation is a very efficient usability engineering method
As a discount usability engineering method, heuristic evaluation is not guaranteed to provide "perfect" results or to find every last usability problem in an interface
Slide 48CS-364: Human – Computer Interaction
Usability problems can be located in a dialogue in four different ways:
at a single location in the interface
at two or more locations that have to be compared to find the problem
as a problem with the overall structure of the interface
as something that ought to be included in the interface but is currently missing
An analysis of 211 usability problems (Nielsen 1992) found that the difference between the four location categories was small and not statistically significant
In other words, evaluators were approximately equally good at finding all four kinds of usability problems
Slide 49CS-364: Human – Computer Interaction
Even though heuristic evaluation finds many usability problems that are not found by user testing, it is also the case that it may miss some problems that can be found by user testing
Evaluators are probably especially likely to overlook usability problems if the system is highly domain-dependent and they have little domain expertise
Since heuristic evaluation and user testing each find usability problems overlooked by the other method, it is recommended that both methods be used
Slide 50CS-364: Human – Computer Interaction
Because there is no reason to spend resources on evaluating an interface with many known usability problems only to have many of them come up again, it is normally best to use iterative design between uses of the two evaluation methods
Typically, one would first perform a heuristic evaluation to clean up the interface and remove as many "obvious" usability problems as possible
After a redesign of the interface, it would be subjected to user testing both to check the outcome of the iterative design step and to find remaining usability problems that were not picked up by the heuristic evaluation
Slide 51CS-364: Human – Computer Interaction
Visibility of system status
The system should always keep users informed about what is going on, through appropriate feedback within reasonable time
Match between system and the real world
The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms
Follow real-world conventions, making information appear in a natural and logical order
Slide 52CS-364: Human – Computer Interaction
User control and freedom
Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo
Consistency and standards
Users should not have to wonder whether different words, situations, or actions mean the same thing
Follow platform conventions
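The "Support undo and redo" recommendation under user control and freedom can be illustrated with a minimal two-stack sketch. The class and method names are hypothetical, and storing full copies of the state is a simplification chosen for brevity.

```python
class UndoableText:
    """Minimal undo/redo sketch using two stacks of previous states."""

    def __init__(self) -> None:
        self.text = ""
        self._undo: list[str] = []
        self._redo: list[str] = []

    def insert(self, chars: str) -> None:
        self._undo.append(self.text)
        self._redo.clear()  # a new action invalidates the redo history
        self.text += chars

    def undo(self) -> None:
        if self._undo:  # nothing happens if there is nothing to undo
            self._redo.append(self.text)
            self.text = self._undo.pop()

    def redo(self) -> None:
        if self._redo:
            self._undo.append(self.text)
            self.text = self._redo.pop()


doc = UndoableText()
doc.insert("Hello")
doc.insert(" world")
doc.undo()   # the user leaves the unwanted state in one step
print(doc.text)
doc.redo()
print(doc.text)
```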
Slide 53CS-364: Human – Computer Interaction
Error prevention
Even better than good error messages is a careful design which prevents a problem from occurring in the first place
Recognition rather than recall
Make objects, actions, and options visible
The user should not have to remember information from one part of the dialogue to another
Instructions for use of the system should be visible or easily retrievable whenever appropriate
Slide 54CS-364: Human – Computer Interaction
Flexibility and efficiency of use
Accelerators -- unseen by the novice user -- may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users
Allow users to tailor frequent actions
Aesthetic and minimalist design
Dialogues should not contain information which is irrelevant or rarely needed
Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility
Slide 55CS-364: Human – Computer Interaction
Help users recognize, diagnose, and recover from errors
Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution
Help and documentation
Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation
Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too extensive
Slide 56CS-364: Human – Computer Interaction
Slide 57CS-364: Human – Computer Interaction