Understanding Web Searching Secondary Readings and So On… Will Meurer for WIRED October 7, 2004


Introduction

• Why do we care about how people use the Web?
• Today’s topics (10/7, not the present age):
  – Implicit vs. explicit feedback
  – Representation effectiveness
  – Browser-based activities
  – History mechanisms
  – How do we cater to the people?
  – Resources
  – Research


Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)

• Implicit feedback (Morita & Shinoda):
  – Time spent on a page is directly related to user interest. Backed by many studies.
• Explicit feedback (this study):
  – Time spent on a page is similar for relevant and irrelevant content.
• Results suggest:
  – “Generalizability” is severely affected by explicit feedback methods.
  – Spend time to choose the right feedback type!
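To make the comparison concrete, here is a minimal sketch of checking whether reading time separates relevant from irrelevant pages. The function name `mean_time_by_relevance` and the sample records are hypothetical and are not data from either study; the point is only that near-equal group means give implicit feedback nothing to work with.

```python
from statistics import mean

def mean_time_by_relevance(records):
    """Average reading time (seconds) per explicit relevance label.

    `records` is a list of (seconds_on_page, user_marked_relevant) pairs.
    """
    groups = {True: [], False: []}
    for seconds, relevant in records:
        groups[relevant].append(seconds)
    # Skip a label that has no observations to avoid averaging an empty list.
    return {label: mean(times) for label, times in groups.items() if times}

# Hypothetical observations: (reading time, explicit relevance judgment).
records = [(30, True), (42, True), (36, True),
           (33, False), (39, False), (36, False)]

averages = mean_time_by_relevance(records)
print(averages)
```

If the two averages are close, as in this made-up sample, time on page alone would not predict the explicit judgment.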


Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)

• Why do the results differ?
  – Relevance was difficult to distinguish this time
  – Participants were truly interested in the content in former studies
  – Users may have rushed to finish in this experimental context


Representation Effectiveness
How we really use the Web (Krug, 2000)

Three “facts of life”:

1. “We don’t read pages. We scan them.”
  – Why? Hurry, necessity, habit
  – If we intend to read a page in its entirety, we save or print it! (ClearType project)


Representation Effectiveness
How we really use the Web (Krug, 2000)

2. “We don’t make optimal choices. We satisfice.”
  – Why? Hurry, quick access to and fro, less work than thinking
  – Generally, it’s more productive to guess.


Representation Effectiveness
How we really use the Web (Krug, 2000)

3. “We don’t figure out how things work.”
  – Why? Not important, “if it ain’t broke (baroque)…”
  – Is it important to us whether the user understands how it works or not? Why?


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

• Users get lost on the Web. Why?
• It is not just interaction between user and system, but among user, task, and information
• An analysis structure for browsing behavior is presented and tested: “The Interactivity Framework” or “How we should analyze cognitive strategies”


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

• The Interactivity Framework
  – User Level – Web experience, cognitive processes, cognitive style, knowledge (CS majors knew more about SE processes)
  – User Strategies – based on searching structure (or lack of), task nature

SEARCHING CONDITIONS (columns: FACT FINDING, EXPLORATORY)

DISPERSED STRUCTURE
• Look for data base algorithm in Java
• Look for criteria for the diagnosis of diseases
• Find all the available jobs for profession

CATEGORY STRUCTURE
• Look for word definition
• Find all information about 1997 Nobel Prize for Literature


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

– Information Structure
  • Internal (user’s) representation
  • External (system’s) representation
  • Computational Offloading – How much work does the user have to do to understand and how much does a representation help?
– Re-representation – How much it makes problem solving easier or more difficult
– Graphical Constraining – How it constrains inferences
– Temporal and Spatial Constraining – How it helps when distributed over time and space


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

SEARCHING TASK (columns: EXPERIENCED WEB-PARTICIPANTS, NOVICE WEB-PARTICIPANTS)

INFORMATION IN WEB DISPERSED STRUCTURE (e.g. find criteria for a psychological disease)
– SPECIFIC FACT FINDING:
  • Bottom-up
  • Mixed strategy at the beginning and selecting bottom-up
  • Start with top-down and change at the end to bottom-up
  • Start typing without knowing why
– EXPLORATORY:
  • Top-down

INFORMATION IN WEB CATEGORY STRUCTURE (e.g. find a job opening)
• Mixed strategy at the beginning and then selecting top-down
• Top-down
• Top-down following browser categories
• Start with bottom-up and change to top-down


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

• More Results
  – Experienced users searched with a plan
  – By having a plan you keep a more internal representation and focus your search
  – Inexperienced users were more influenced by external representations
  – Computational Offloading Results
    • Must explain
  – How have these issues changed?


Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)

• Conclusions
  – Cognitive strategies used by the participants depend on how the information is structured.
  – Interaction is a multi-dimensioned concept.
  – Search engine interfaces should be designed to have less restrictive external representation.


Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)

• User study of browsing events at Georgia Tech (xMosaic browser)
• Three main browsing strategies identified:
  – Search browsing – directed search, goal known
  – General purpose browsing – consulting highly likely sources for needed information (dictionary.com)
  – Serendipitous browsing – random
  – Most people use a combination of these


Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)

• Results
  – Users were patient 99% of the time for long page loads
  – 1222 unique sites accessed outside of GATech (~16% of Web servers)
  – Paths were calculated (sequences of page navigation)
    • Per session, paths of 7 different sites occurred 5 times
    • Per user, paths of 8 different sites occurred 9 times
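As a rough illustration of how repeated navigation paths could be tallied, the sketch below slides a fixed-length window over each session's page sequence and counts identical runs. The function name `count_paths`, the sample sessions, and the window length are all hypothetical; the slide does not describe the study's actual procedure.

```python
from collections import Counter

def count_paths(sessions, length):
    """Count every contiguous run of `length` pages across all sessions."""
    counts = Counter()
    for pages in sessions:
        # A session shorter than `length` contributes no windows.
        for i in range(len(pages) - length + 1):
            counts[tuple(pages[i:i + length])] += 1
    return counts

# Hypothetical sessions, not data from the study.
sessions = [
    ["index", "news", "sports", "index", "news", "sports"],
    ["index", "news", "sports", "scores"],
]

paths = count_paths(sessions, 3)
print(paths.most_common(1))
```

Counting per session versus per user would only change how the page sequences are grouped before they are passed in.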


Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)

• More Results
  – 2% of the retrieved pages were saved or printed
  – Based on user’s slope, browsing strategy categories were applied
  – Slope can also categorize usage patterns of Web documents
  – Users tended to operate in one small area of a site


Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)

• Design Strategies
  – Users averaged 10 pages per server
    • Make most important info within 2 or 3 jumps from the index
    • Do not put too many links on one page – increases search time (back, forward, back, site map, etc.)
  – Facilitate the likely visitor browsing patterns
    • Maybe make more than one version of your page?
    • Most work well in a “hub and spoke” environment
• The Future
  – Offer site tour based on most frequently traveled paths
  – Alter page design dynamically based on site trends


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Purpose: Provide empirical data to aid in the development of effective history mechanisms
  – Understand revisitation patterns
  – Evaluate current mechanisms and suggest best practices and methods
• Data Collection
  – Altered version of xMosaic to record activity
  – Survey of users afterward


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Revisitation Results
  – 58% recurrence rate (>40% are new pages!)
  – As people search they build their vocabulary
  – 7 browsing strategies
    • First-time visits to cluster of pages
    • Revisits to pages
    • Authoring of pages (high reload percentage)
    • Regular use of web-based apps
    • Hub-and-spoke (breadth-first approach)
    • Guided tour (e.g. next page links)
    • Depth-first search (following links deeply before returning to the index)
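The recurrence rate above can be read as the share of page visits that repeat an earlier visit. A minimal sketch, assuming it is computed as (total visits minus distinct URLs) divided by total visits; the function name `recurrence_rate` and the sample log are hypothetical.

```python
def recurrence_rate(visits):
    """Fraction of visits that go to a URL already seen earlier in the log."""
    if not visits:
        return 0.0
    distinct = len(set(visits))
    return (len(visits) - distinct) / len(visits)

# Hypothetical visit log: 10 visits over 5 distinct URLs.
log = ["a", "b", "a", "c", "a", "b", "d", "a", "e", "b"]

rate = recurrence_rate(log)
print(f"{rate:.0%} of visits were revisits")
```

Under this reading, a 58% rate would mean that a little over 40% of visits go to pages the user has not seen before, which matches the note on the slide.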


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Revisitation Results
  – Visit frequency as a function of distance
    • Users mostly revisit recently visited pages (within about 6 jumps)
    • 39% chance that the next URL will match one of the previous 6 pages visited
  – Access frequency
    • 60% of pages visited only once
    • 19% visited twice
    • 8% visited 3 times
    • 4% visited 4 times
  – Locality (not valuable for predicting next page)
    • Most locality sets were small
    • Only 2.5 to 4.5 URLs per set
    • Only 15% of pages were part of a locality set
  – Paths (not valuable for predicting next page)
    • Could these be captured and offered in a history mechanism?
    • Time per page could indicate path
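The "previous 6 pages" figure suggests a simple way to score a recency window: for each visit, check whether the URL appears among the last few visits. The sketch below assumes the window counts raw visits rather than distinct pages, and the function name `recency_hit_rate` and the sample log are hypothetical.

```python
def recency_hit_rate(visits, window=6):
    """Fraction of visits whose URL appears among the previous `window` visits."""
    if not visits:
        return 0.0
    hits = 0
    for i, url in enumerate(visits):
        recent = visits[max(0, i - window):i]  # up to `window` preceding visits
        if url in recent:
            hits += 1
    return hits / len(visits)

# Hypothetical visit log, not data from the study.
log = ["a", "b", "a", "c", "b", "d", "a", "e"]

print(recency_hit_rate(log, window=6))
print(recency_hit_rate(log, window=1))
```

Comparing the score at several window sizes gives a feel for how quickly the benefit of a longer recency list tails off.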


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Mechanism types
  – Recency Ordered
    • Sequential order based on time accessed
    • Repeated entries for revisitation
    • “Pruned” by keeping only first instance or only last
    • Simple for users to understand (they remember paths)
  – Frequency Ordered
    • Most visited at top, least visited at bottom
    • User interest changes, but the latest URLs must first build up frequency
    • How to break ties – last visited, earliest visited
    • When few items are on the list, this suffers
    • Difficult for users to understand
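The two orderings can be compared side by side on the same visit log. This is a sketch under stated assumptions: the function names are hypothetical, pruning keeps one entry per URL, and frequency ties are broken in favor of the most recently visited URL (one of the two tie-break options the slide mentions).

```python
from collections import Counter

def recency_ordered(visits, keep="last"):
    """Newest-first list with one entry per URL.

    keep="last" places each URL at its most recent visit;
    keep="first" places each URL at its first visit.
    """
    if keep == "last":
        return list(dict.fromkeys(reversed(visits)))
    # Order by first occurrence (oldest first), then flip to newest first.
    return list(reversed(dict.fromkeys(visits)))

def frequency_ordered(visits):
    """Most-visited first; ties go to the more recently visited URL."""
    counts = Counter(visits)
    last_seen = {url: i for i, url in enumerate(visits)}
    return sorted(counts, key=lambda u: (-counts[u], -last_seen[u]))

# Hypothetical visit log, oldest visit first.
visits = ["a", "b", "a", "c", "a", "d", "b"]

print(recency_ordered(visits))                # pruned, keeping last instance
print(recency_ordered(visits, keep="first"))  # pruned, keeping first instance
print(frequency_ordered(visits))
```

Note how a page visited once just now sits near the top of the recency list but near the bottom of the frequency list, which is the lag the frequency bullets describe.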


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Stack-based
  – Recently visited at top
  – Order and availability depend on:
    • Loading – causes page to be added to the top
    • Recalling – changes pointer to the currently displayed page
    • Revisiting – user reloads the page, has no effect on the stack
  – Keeps duplicates
  – Non-persistent vs. persistent (between sessions)
  – Better than recency at short distances
  – Users have difficulty understanding this model
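The three operations above map onto a small stack with a pointer. The sketch below is one possible reading, not the browser's actual code: the class name `StackHistory` is hypothetical, and it assumes that loading a page after going back discards the entries above the pointer, which is how back/forward lists commonly behave and would explain why availability depends on loading.

```python
class StackHistory:
    """Stack-based history: a list of pages plus a pointer to the displayed one."""

    def __init__(self):
        self.stack = []    # bottom of the stack first, top last
        self.pointer = -1  # index of the currently displayed page

    def load(self, url):
        # Assumption: pages above the pointer are lost when a new page loads.
        del self.stack[self.pointer + 1:]
        self.stack.append(url)  # duplicates are kept
        self.pointer = len(self.stack) - 1

    def back(self):
        # Recalling: only the pointer moves, the stack is unchanged.
        if self.pointer > 0:
            self.pointer -= 1
        return self.current()

    def forward(self):
        if self.pointer < len(self.stack) - 1:
            self.pointer += 1
        return self.current()

    def reload(self):
        # Revisiting: no effect on the stack or the pointer.
        return self.current()

    def current(self):
        return self.stack[self.pointer] if self.stack else None


history = StackHistory()
for page in ["a", "b", "c"]:
    history.load(page)
history.back()     # now displaying "b"; "c" is still reachable via forward()
history.load("d")  # "c" is discarded under the assumption above
print(history.stack, history.current())
```

The silent loss of "c" is one plausible reason users find this model hard to understand.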


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Hierarchically Structured
  – Recency ordered hyperlink sublists
    • Like recency w/ latest position saved
    • Each URL has its own sublist of links from that page
    • Helps with common linking paths
    • Easier to understand
  – Context-sensitive web subspace
    • Somewhat of a combination of the above-mentioned and stack-based approaches
    • Gives user better understanding of context of his/her searches
    • May be difficult to remember where a certain URL was
    • I THINK this approach would be a great tool


History Mechanisms (in browsers)

Revisitation Patterns in… (Tauscher & Greenberg, 1997)

• Do users actually use history mechanisms?
  – Less than 1% of navigation
  – 3% involve favorites
  – 30% of navigation was back button usage


How do we cater to the people?

• Inter-site browsing strategies are not easy to tackle. How would you control that?
• Why should we attempt to understand user behavior and search strategies?
  – Formulate general design principles (e.g. 3 level depth)
  – Design for multiple searching personalities
  – Understand how to survey your intended users or get feedback most appropriately
  – Identify importance of all aspects of the development process and allocate resources accordingly


How do we cater to the people?

Some Bright Ideas

• Personalized search
  – Learning systems – You might also like…
  – www.a9.com (history, favorites, personalized interface)
  – But what about changing for different types of user behavior based on the user’s path history on your server?
    • Researched since 1995 and earlier!
    • What has resulted?
    • Microsoft ASP.net 2.0 – Web Parts


What resources are out there?

• xMosaic 2.6 download, for those of you so excited
• Architecture of the World Wide Web
  http://www.w3.org/TR/webarch/
• Sum Sun Sug Gestions
  http://www.sun.com/980713/webwriting/
• Jakob Nielsen – research on content usability,
  http://useit.com/alertbox/9710a.html


Research

• Vox Populi: The Public Searching Of The Web (2001)
  – Compares statistics from two studies
  – Shows how public searching changed from 1997 to 1999

• Usage Patterns of a Web-Based Library Catalog (2001), Michael D. Cooper

• Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web (2000), Jansen, Spink & Saracevic

• Redefining the Browser History in Hypertext Terms (), Mark Ollerenshaw