Christos Katsanos | [email protected] Nikolaos Tselios | [email protected] Nikolaos Avouris | [email protected] AutoCardSorter: Designing the Information Architecture of a Web Site Using Latent Semantic Analysis ACM SIGCHI | Florence, Italy | 5-10 April, 2008

Chi 2008 katsanos et al auto_cardsorter_final


Page 1: Chi 2008 katsanos et al auto_cardsorter_final

Christos Katsanos | [email protected]
Nikolaos Tselios | [email protected]
Nikolaos Avouris | [email protected]

AutoCardSorter: Designing the Information Architecture of a Web Site Using Latent Semantic Analysis

ACM SIGCHI | Florence, Italy | 5-10 April, 2008

Page 2: Chi 2008 katsanos et al auto_cardsorter_final


Purpose & Motivation

Automate Structural Design of Information Spaces

Increase efficiency and flexibility for practitioners

Page 3: Chi 2008 katsanos et al auto_cardsorter_final

Why is it important?

Structural design greatly affects user experience

Current approaches (e.g. Card Sorting) are often neglected:
- Time constraints
- Cost to recruit users and run the studies
- Increased complexity for data analysis
- Challenging for large sites (>100 pages)

Page 4: Chi 2008 katsanos et al auto_cardsorter_final


Our tool-based Methodology

Page Text Descriptions

=> Semantic Similarity Measure (e.g. LSA)

=> Semantic Similarity Matrix

=> Hierarchical Clustering Algorithms

=> Interactive Tree Structure

Additional Support:
1. Number of Groups
2. Cross-Hierarchy Links
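The clustering stage of the pipeline above can be sketched as follows. This is a minimal illustration, not the tool's implementation: it assumes average linkage, a precomputed similarity matrix, and a caller-supplied number of groups. The function name `average_linkage` and the values in `SIM` are hypothetical and are not data from the study.

```python
from itertools import combinations


def average_linkage(sim, labels, n_groups):
    """Merge the two most similar clusters until n_groups remain.

    sim is a symmetric page-by-page similarity matrix (list of lists).
    Cluster-to-cluster similarity is the mean pairwise similarity
    (average linkage). Returns (groups, merges); merges lists each
    merge in order, which is the information a dendrogram displays.
    """
    clusters = [[i] for i in range(len(labels))]
    merges = []

    def cluster_sim(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Pick the pair of clusters with the highest average similarity.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda ab: cluster_sim(clusters[ab[0]], clusters[ab[1]]),
        )
        score = cluster_sim(clusters[a], clusters[b])
        merged = clusters[a] + clusters[b]
        merges.append((sorted(labels[i] for i in merged), round(score, 3)))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)

    return [sorted(labels[i] for i in c) for c in clusters], merges


# Hypothetical similarity values for five pages (not data from the study).
PAGES = ["P1", "P2", "P3", "P4", "P5"]
SIM = [
    [1.0, 0.9, 0.1, 0.2, 0.8],
    [0.9, 1.0, 0.2, 0.1, 0.7],
    [0.1, 0.2, 1.0, 0.6, 0.1],
    [0.2, 0.1, 0.6, 1.0, 0.2],
    [0.8, 0.7, 0.1, 0.2, 1.0],
]

groups, merges = average_linkage(SIM, PAGES, n_groups=2)
print(groups)
for members, score in merges:
    print(score, members)
```

Changing `n_groups` corresponds to cutting the tree at a different height, which is the kind of choice the "Number of Groups" support is meant to help with.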

Page 5: Chi 2008 katsanos et al auto_cardsorter_final


The tool Interface: AutoCardSorter

Page Descriptions

Analysis Options:
- Semantic Similarity Measure
- Clustering Algorithm Type

Interactive Dendrogram

Page 6: Chi 2008 katsanos et al auto_cardsorter_final

Validation Study Design


Page 7: Chi 2008 katsanos et al auto_cardsorter_final

Validation Study Design

AutoCardSorter vs Card Sorting

- Investigate quality of results & efficiency
- Health & Nutrition Site
- Same content item descriptions
- 18 representative users

Page 8: Chi 2008 katsanos et al auto_cardsorter_final

Measures & Analysis


Page 9: Chi 2008 katsanos et al auto_cardsorter_final

Validity: Similarity-Matrices Correlation

      P1    P2    P3    P4    P5
P1    -
P2    0.94  -
P3    0.11  0.33  -
P4    0.33  0.28  0.11  -
P5    0.50  0.83  0.06  0.06  -

      P1    P2    P3    P4    P5
P1    -
P2    0.62  -
P3    0.21  0.14  -
P4    0.49  0.51  0.83  -
P5    0.61  0.11  0.21  0.92  -

AutoCardSorter: each cell is the LSA similarity of a page pair, e.g. LSA(P5, P1)

Card Sorting: each cell is the frequency with which users placed the two pages in the same pile, e.g. P1 and P5
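A sketch of how the two matrices on this slide could be built and compared. It assumes every participant sorts every page and uses a plain Pearson correlation over the page pairs; the significance test reported later in the slides is not reproduced here. All names (`cooccurrence`, `matrix_correlation`) and all values in `SORTS` and `TOOL_SIM` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cooccurrence(sorts, pages):
    """Fraction of participants who put each pair of pages in the same pile.

    sorts holds one entry per participant; each entry is a list of piles,
    and each pile is a list of page names. Assumes every participant
    placed every page in exactly one pile.
    """
    counts = {pair: 0 for pair in combinations(pages, 2)}
    for piles in sorts:
        pile_of = {page: k for k, pile in enumerate(piles) for page in pile}
        for a, b in counts:
            if pile_of[a] == pile_of[b]:
                counts[(a, b)] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def matrix_correlation(tool_sim, user_freq):
    """Correlate the two matrices over their shared page pairs."""
    pairs = sorted(user_freq)
    return pearson([tool_sim[p] for p in pairs], [user_freq[p] for p in pairs])


# Hypothetical card-sorting data: three participants, four pages.
PAGES = ["P1", "P2", "P3", "P4"]
SORTS = [
    [["P1", "P2"], ["P3", "P4"]],
    [["P1", "P2", "P3"], ["P4"]],
    [["P1"], ["P2"], ["P3", "P4"]],
]
# Hypothetical tool similarities for the same page pairs.
TOOL_SIM = {
    ("P1", "P2"): 0.8, ("P1", "P3"): 0.3, ("P1", "P4"): 0.1,
    ("P2", "P3"): 0.4, ("P2", "P4"): 0.2, ("P3", "P4"): 0.7,
}

freq = cooccurrence(SORTS, PAGES)
r = matrix_correlation(TOOL_SIM, freq)
print(freq)
print(r)
```

Only the entries below the diagonal are compared, since both matrices are symmetric and the diagonal carries no information.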

Page 10: Chi 2008 katsanos et al auto_cardsorter_final

Validity: % Agreement of Design

1) Hierarchical Cluster Analysis of Card Sorting Data

2) AutoCardSorter vs User-Data Dendrogram

a) Eigenvalue Analysis to ‘cut’ objectively

b) User structure => Ideal

c) In Agreement => Longer sequence of pages grouped together in the same category as Ideal
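One possible reading of the agreement measure, sketched under explicit assumptions: the slides do not spell out the exact counting rule, so this version matches each tool cluster to the ideal (user-derived) category it overlaps most and counts the shared pages as "in agreement". It is not necessarily the authors' procedure, and the name `percent_agreement` and the example groupings are hypothetical.

```python
def percent_agreement(tool_groups, ideal_groups):
    """Percentage of pages grouped as in the ideal structure.

    ASSUMPTION (not stated on the slides): each tool cluster is matched
    to the ideal category with which it shares the most pages, and only
    the pages in that overlap count as being in agreement.
    """
    total = sum(len(group) for group in ideal_groups)
    agreed = 0
    for cluster in tool_groups:
        agreed += max(
            len(set(cluster) & set(category)) for category in ideal_groups
        )
    return 100.0 * agreed / total


# Hypothetical groupings of five pages (not data from the study).
ideal = [["P1", "P2", "P5"], ["P3", "P4"]]
tool = [["P1", "P2"], ["P3", "P4", "P5"]]

print(percent_agreement(tool, ideal))
print(percent_agreement(ideal, ideal))
```

Both dendrograms have to be cut into groups before a comparison like this can be made, which is the role of the eigenvalue analysis in step a).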


Page 11: Chi 2008 katsanos et al auto_cardsorter_final

Efficiency: Total Time Required

[Figure labels: AutoCardSorter, Card Sorting]

Page 12: Chi 2008 katsanos et al auto_cardsorter_final

Study Results


Page 13: Chi 2008 katsanos et al auto_cardsorter_final

Study Results - Validity

AutoCardSorter produced results comparable in quality to Card Sorting:

Similarity-Matrices Correlation = 0.80 (p < 0.01)

% Agreement of Design = 100%

[Figure labels: AutoCardSorter, Card Sorting]

Page 14: Chi 2008 katsanos et al auto_cardsorter_final

Study Results - Efficiency


Page 15: Chi 2008 katsanos et al auto_cardsorter_final

Discussion - Advantages

Increased efficiency (x27)

Reduces resources required

Explore alternative solutions early

Simple to learn and apply

Easy to apply to large sites (>100 pages)

Possibility for wider adoption

Page 16: Chi 2008 katsanos et al auto_cardsorter_final


Discussion – Current Limitations

Lack of qualitative feedback

No insight into category labels

Page 17: Chi 2008 katsanos et al auto_cardsorter_final

Future Research

More validation studies in different domains

Additional constraints (e.g. group size)

Improvements to the algorithm:
- Dynamic semantic similarity algorithms (e.g. LSA-IR)
- Alternatives to Hierarchical Clustering (e.g. Factor Analysis)

Page 18: Chi 2008 katsanos et al auto_cardsorter_final

A Demo - Sit back and enjoy


Page 19: Chi 2008 katsanos et al auto_cardsorter_final

Summary & Questions

Proposed an approach that automates structural design of an information space.

The validation study showed a substantial effectiveness gain, with results similar to those of a user-based technique

Cheap + Fast + Easy = Possibility for wider adoption


Complementary to user-based methods

Christos Katsanos | [email protected]

Page 20: Chi 2008 katsanos et al auto_cardsorter_final

Extra Slides


Page 21: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Details

                                    Health & Nutrition   Educational Portal   Travel & Tourism Site
# of Participants in Card Sorting   18                   26                   34
# of Items to Sort                  16                   27                   38

Page 22: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Summary of Results

                                    Health & Nutrition   Educational Portal   Travel & Tourism Site
Similarity-Matrices r (p < 0.01)    0.80                 0.52                 0.59
% Agreement of Design               100%                 93%                  87%
Efficiency (X Times Faster)         27                   11                   14

Page 23: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Efficiency

Page 24: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Number of Proposed Categories

Page 25: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Avg. Items/Proposed Category

Page 26: Chi 2008 katsanos et al auto_cardsorter_final

More Validation Studies: Correlation against Number of Items

Page 27: Chi 2008 katsanos et al auto_cardsorter_final

Statistical Semantic Similarity Measures - Overview

LSA: Latent Semantic Analysis (Landauer & Dumais, 1997)
- LSA-IR (Falconer et al., 2006)
- PLSA (Hofmann, 1999)

PMI: Point-wise Mutual Information (Manning & Schütze, 1999)
- PMI-IR (Turney, 2001)
- GLSA (Matveeva et al., 2005)

HAL: Hyperspace Analogue to Language (Lund & Burgess, 1996)
- COALS (Rohde et al., 2004)

Page 28: Chi 2008 katsanos et al auto_cardsorter_final

Latent Semantic Analysis

Similar documents tend to have common words

1) Parse text corpora that represent the users’ level of understanding

2) Calculate each word’s frequency of occurrence per document (term-document matrix, TDM)

3) Weight by word’s importance (document, domain)

4) Apply Singular Value Decomposition

5) LSA Index = cos(angle between document vectors), in [-1, 1]
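Steps 2 to 5 can be written compactly as below. This is a sketch of standard LSA rather than the exact configuration used by the tool: the log-entropy weighting and the number of retained dimensions k are assumptions, since the slides name neither. Here f_ij is the frequency of word i in document j, n is the number of documents, and v_a is the row of V_k that belongs to document a; terms with p_ij = 0 contribute zero to the sum.

```latex
% Steps 2-3: weighted term-document matrix (log-entropy weighting ASSUMED)
x_{ij} = \log(1 + f_{ij}) \cdot g_i, \qquad
g_i = 1 + \sum_{j=1}^{n} \frac{p_{ij} \log p_{ij}}{\log n}, \qquad
p_{ij} = \frac{f_{ij}}{\sum_{j'=1}^{n} f_{ij'}}

% Step 4: rank-k truncated singular value decomposition (k is ASSUMED)
X \approx X_k = U_k \Sigma_k V_k^{\top}

% Step 5: LSA index of documents a and b
\mathbf{d}_a = \Sigma_k \mathbf{v}_a, \qquad
\mathrm{LSA}(a, b) = \cos\theta_{ab}
  = \frac{\mathbf{d}_a \cdot \mathbf{d}_b}
         {\lVert \mathbf{d}_a \rVert \, \lVert \mathbf{d}_b \rVert}
  \in [-1, 1]
```

The local factor log(1 + f_ij) captures a word's importance within a document and the global factor g_i its importance across the corpus, which matches the "(document, domain)" distinction in step 3.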

Page 29: Chi 2008 katsanos et al auto_cardsorter_final

Card Sorting: Typical Effort in Person-Days

http://www.intranetleadership.com.au

Page 30: Chi 2008 katsanos et al auto_cardsorter_final

Complementary to user-based methods => What’s the point?

Far better than doing nothing

User-based methods can be applied in a more focused & formative manner

Get insight before running a study

Fast redesigns

Page 31: Chi 2008 katsanos et al auto_cardsorter_final

Why 2 validation measures?

Similarity-Matrices Correlation
- Strictest approach (compares measurements of semantic similarity)
- More general (does not presuppose cluster analysis)

% Agreement of Design
- Less strict
- How close are the ‘proposed’ designs?