CHI99 Panel: Comparative Evaluation of Usability Tests

Presentation by Rolf Molich, DialogDesign, Denmark
Take a web-site.
Take nine professional usability teams.
Let each team usability test the web-site.
Are the results similar?
What Have We Done?
Nine teams have usability tested the same web-site:
– Seven professional teams
– Two student teams

Test web-site: www.hotmail.com (free e-mail service)
Panel Format
Introduction (Rolf Molich)
Five minute statements from five participating teams
The Customer’s point of view (Meeta Arcuri, Hotmail)
Conclusions (Rolf Molich)
Discussion - 30 minutes
Purposes of Comparison
Survey the state of the art in professional usability testing of web-sites.
Investigate the reproducibility of usability test results.
Non-Purposes of Comparison
To pick a winner
To make a profit
Basis for Usability Test
Web-site address: www.hotmail.com
Client scenario
Access to client through intermediary
Three weeks to carry out test
What Each Team Did
Run standard usability test
Anonymize the usability test report
Send the report to Rolf Molich
Problems Found
Total number of different usability problems found: 300

Found by seven teams:    1
Found by six teams:      1
Found by five teams:     4
Found by four teams:     4
Found by three teams:   15
Found by two teams:     49
Found by one team:     226 (75%)
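This overlap distribution is straightforward to compute once every team's findings have been mapped onto one shared problem list. A minimal sketch in Python, using a small hypothetical mapping rather than the actual CUE-2 data:

```python
from collections import Counter

# Hypothetical per-team findings, mapped to shared problem IDs.
# (Toy data for illustration; the real study had nine teams and 300 problems.)
team_findings = {
    "A": {"p1", "p2", "p3"},
    "B": {"p2", "p3", "p4", "p5"},
    "C": {"p3", "p6"},
}

# Count, for each problem, how many teams reported it.
teams_per_problem = Counter(
    p for problems in team_findings.values() for p in problems
)

# Distribution: how many problems were found by exactly k teams?
distribution = Counter(teams_per_problem.values())
total_problems = len(teams_per_problem)

for k in sorted(distribution, reverse=True):
    share = 100 * distribution[k] / total_problems
    print(f"found by {k} team(s): {distribution[k]} ({share:.0f}%)")
```

Even in this toy example the single-team bucket dominates, which is the shape of the distribution above.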
Comparative Usability Evaluation 2
Barbara Karyukina, SGI (USA)
Klaus Kaasgaard & Ann D. Thomsen, KMD (Denmark)
Lars Schmidt and others, Networkers (Denmark)
Meghan Ede and others, Sun Microsystems, Inc., (USA)
Wilma van Oel, P5 (The Netherlands)
Meeta Arcuri, Hotmail, Microsoft Corp. (USA) (Customer)
Rolf Molich, DialogDesign (Denmark) (Coordinator)
Comparative Usability Evaluation 2
Joseph Seeley, NovaNET Learning Inc. (USA)
Kent Norman, University of Maryland (USA)
Torben Norgaard Rasmussen and others, Technical University of Denmark
Marji Schumann and others, Southern Polytechnic State University (USA)
Presentation by Barbara Karyukina, SGI, Wisconsin, USA
Challenges:
– Twenty functional areas
– User preferences questions

Possible Solutions:
– Two usability tests
– Surveys
– User notes
– Focus groups
Results:
– 26 tasks + 10 interview questions
– 100 findings
Presentation by Klaus Kaasgaard, Kommunedata, Denmark

Slides currently not available
Presentation by Lars Schmidt, Framtidsfabriken Networkers, Denmark
Team E
Framtidsfabriken Networkers Testlab, Denmark
Key learnings CUE-2
Setting up the test:
– Insist on dialog with the customer
– Secure complete understanding of user groups and user tasks
– Narrow down test goals

Writing the report:
– Use screen dumps
– State conclusions; skip the premises
– Test the usability of the usability report
Improving Test Methodology
Searching for usability and usefulness:
– Hook up with different methodologies (e.g. interviews)

Focus on website context:
– Test against e.g. YahooMail
– Test against software-based email clients
Presentation by Meghan Ede, Sun Microsystems, California, USA
Hotmail Study Requests
– 18 specific features, e.g. Registration, Login, Compose...
– 6 questions, e.g. "How do users currently do email?"
= 24 potential study areas
Usability Methods
– Expert review: 6 reviewers, 6 questions
– Usability study: 6 participants (3 + 3), 5 tasks (with sub-tasks)
Report Description
1. Executive Summary: 4 main high-level themes; brief study description
2. Debriefing Meeting Summary: 7 areas (e.g. overall, navigation, power features, ...)
3. Findings: 31 sections (study requests, extra areas, bugs, task times, study Q & A)
4. Study Description

Total: 36 pages, 150 findings
Lessons Learned
Importance of close contact with product team
Consider including:
– severity ratings
– more specific recommendations
– screen shots
Discussion Issues
How can we measure the usability of our reports?
How should we deal with the difference between the number of problems found and the number included in the report?
Presentation by Wilma van Oel, P5, The Netherlands
Wilma van Oel
P5, adviseurs voor produkt- & kwaliteitsbeleid (quality & product management consultants)
Amsterdam, the Netherlands
Structure of Presentation
1. Introduction
2. Deviations in approach
– Test design
– Results and recommendations

3. Lessons for the future
– Change in approach?
– Was it worth the effort?
Introduction
• Company: P5 Consultants
• Personal background: psychologist
Test design
– Subjects: n=11, pilot, 'critical users', 1-hour sessions
– Data collection: logging software, video recording
– Methods: lab evaluation + informal approach
– Techniques: exploration, task execution, think-aloud, interview, questionnaire
– Tool: SUS
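SUS here is the System Usability Scale (Brooke), a ten-item questionnaire that yields a single 0-100 score. The standard scoring, sketched below for reference, subtracts 1 from each odd (positively worded) item, subtracts each even (negatively worded) item from 5, and scales the sum by 2.5:

```python
def sus_score(responses):
    """SUS score (0-100) from ten 1-5 Likert responses.

    Items 1, 3, 5, 7, 9 are positively worded: contribute (response - 1).
    Items 2, 4, 6, 8, 10 are negatively worded: contribute (5 - response).
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    contributions = sum(
        (r - 1) if i % 2 == 0 else (5 - r)  # i == 0 is item 1
        for i, r in enumerate(responses)
    )
    return contributions * 2.5

# A respondent answering 3 (neutral) on every item scores 50.
print(sus_score([3] * 10))  # 50.0
```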
A Test Session
Results and recommendations
Negative: n = median
Positive: n > mean
Recommendations: general, not 'how'
Results: 'general'; severity?
Lessons for the future
Change in approach?
– Methods: add a usability inspection method
– Procedure: extensive analysis, add session time
– Results: less general; severity?

Was it worth the effort?
– Company: to get experience & benchmarking
– Personally: to improve skills, knowledge
Presentation by Meeta Arcuri, Microsoft Corporation, California, USA
Meeta Arcuri
User Experience Manager
Microsoft Corp., San Jose, CA
CUE - 2 The Customer’s Perspective
New findings: ~4%

Validation of known issues: ~67%
– Previous findings from our lab tests
– Findings from on-going inspections

Remainder: beyond Hotmail usability
– Business reasons for not changing
– Out of Hotmail's control (partner sites)
– Problems generic to the web
Customer Summary of Findings
– Quick-and-dirty results
– Recommendations for problem fixes
– Participant quotes: get tone/intensity of feedback
– Exact number of participants who encountered each issue
– Background of participants
– Environment (browser, speed of connection, etc.)
Report Content: Positive Observations
– Fresh perspectives
– Lots of data on non-US users
– Recommendations from participants
– Trend reporting
– Report of outdated material on site (some help files)
– Appreciate positive findings, comments
Report Content: Weaknesses

– Some recommendations not sensitive to web issues (performance, security)
– At least one finding irreproducible (not preserving fields in the registration form)
– Frequency of reported issues was sometimes vague
– Some descriptions terse, vague; had to decipher

Additional Strengths of Reports

– Cross-validate new findings with Hotmail Customer Service reports
– Lots of good data to cite in planning meetings
– Some good recommendations given by labs and participants
How Hotmail Will Use Results
– Focused, iterative testing would give better results
– Wide array of user data very valuable
– Overall: good qualitative and quantitative data to help prioritize, schedule, and improve the usability of Hotmail
Presentation by Rolf Molich, DialogDesign, Denmark
Comparison of Tests
– Based only on test reports
– Liberal scoring
– Focus on major differences
– Two generally recognized textbooks:
  – Dumas and Redish, ”A Practical Guide to Usability Testing”
  – Jeff Rubin, ”Handbook of Usability Testing”
Resources
Team                         A    B    C    D    E    F    G    H    J
Person hours used for test  136  123   84  (16) 130   50  107   45  218
# Usability professionals     2    1    1    1    3    1    1    3    6
Number of tests               7    6    6   50    9    5   11    4    6
Usability Results
Team                  A    B    C    D    E    F    G    H    J
# Positive findings   0    8    4    7   24   25   14    4    6
# Problems           26  150   17   10   58   75   30   18   20
% Exclusive          42   71   24   10   57   51   33   56   60
Usability Results
Team                         A    B    C    D    E    F    G    H    J
# Problems                  26  150   17   10   58   75   30   18   20
% Core problems (100%=26)   38   73   35    8   58   54   50   27   31
Person hours used for test 136  123   84   NA  130   50  107   45  218
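The "% Exclusive" and "% Core problems" rows can be derived from the same shared problem mapping: a finding is exclusive to a team if no other team reported it, and the core is a fixed reference list (26 problems in the study). A hedged sketch with hypothetical data:

```python
# Hypothetical per-team findings mapped to shared problem IDs
# (toy data; the real comparison covered nine teams and a 26-problem core).
team_findings = {
    "A": {"p1", "p2", "p3"},
    "B": {"p2", "p4"},
    "C": {"p3", "p5", "p6"},
}
core = {"p2", "p3", "p4"}  # hypothetical reference list; 100% = len(core)

results = {}
for team, problems in team_findings.items():
    # Union of every other team's findings.
    others = set().union(*(f for t, f in team_findings.items() if t != team))
    pct_exclusive = 100 * len(problems - others) / len(problems)
    pct_core = 100 * len(problems & core) / len(core)
    results[team] = (round(pct_exclusive), round(pct_core))
    print(f"{team}: {results[team][0]}% exclusive, {results[team][1]}% of core")
```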
Problems Found
Total number of different usability problems found: 300

Found by seven teams:    1
Found by six teams:      1
Found by five teams:     4
Found by four teams:     4
Found by three teams:   15
Found by two teams:     49
Found by one team:     226 (75%)
If Hotmail is typical, then the total number of usability problems on a typical web-site is huge, much larger than you can hope to find in one series of usability tests.
Usability testing techniques can be improved
We need more awareness of the Usability of Usability work
Conclusion
Download test reports and slides: http://www.dialogdesign.dk/cue2.htm