Secondary Evidence for User Satisfaction With Community Information Systems. Gregory B. Newby, University of North Carolina at Chapel Hill. ASIS Midyear Meeting 1999


Page 1: Secondary Evidence for User Satisfaction With Community Information Systems

Secondary Evidence for User Satisfaction With Community Information Systems

Gregory B. Newby
University of North Carolina at Chapel Hill

ASIS Midyear Meeting 1999

Page 2: Secondary Evidence for User Satisfaction With Community Information Systems

What do we want to know?

Who are information seekers; users?
What are their needs?
Are their needs being met?
Context: the goals and missions of the community net

Page 3: Secondary Evidence for User Satisfaction With Community Information Systems

What else do we want to know?

Are people viewing sponsorship information?
Reading policy documents?
Displaying images?
Using search engines or indexes?
Local or remote?
Browsing or reading?

Page 4: Secondary Evidence for User Satisfaction With Community Information Systems

Possible sources of evidence

Content analysis: what’s available on the system(s)? Questions asked.
Sociological research: talk to people, look at what they use the net for, etc.
Psychological research: evaluate cognitive change in user knowledge, etc.
Market research: broad data collection from multiple potential audiences

Page 5: Secondary Evidence for User Satisfaction With Community Information Systems

More possible sources of evidence

Secondary data: artifacts generated by information system use
Today’s focus: analysis of log file entries
– Web usage statistics
– Instrumenting online menu systems
– Login or call history
– Other system logs (email, FTP)

Page 6: Secondary Evidence for User Satisfaction With Community Information Systems

What questions may be asked of secondary data?

What content is accessed, with what frequency?
What paths are followed to content?
Are entry points, policy documents, or other front-end material bypassed?
Is content read, skimmed or skipped through?
What subsets of content are viewed by individuals (patterns of use)?

Page 7: Secondary Evidence for User Satisfaction With Community Information Systems

What’s wrong with Web server logs?

Aggregate level access to content: not the whole story!
What are SESSIONS like (a sequence of accesses by a single person)?
What are paths from item to item (transcends a single “referrer” log)?
Are data used linearly (following hyperlinks)?
How long is spent on a document?

Page 8: Secondary Evidence for User Satisfaction With Community Information Systems

More analysis is feasible. Sample: Web server logs

Single line entries for each “hit” (HTTP “GET” or similar request)
Separate file for errors, referrers
Sample entry:

56kdial52.absi.net - - [22/May/1999:20:12:45 -0500] "GET /index.html HTTP/1.0" 200 6353
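A minimal sketch of reading one such entry, assuming the log follows the Common Log Format layout seen in the sample (host, two placeholder fields, bracketed timestamp, quoted request, status, byte count). The names `CLF_PATTERN` and `parse_clf_line` are illustrative, not part of any server or tool mentioned in the talk.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # %b expects English month abbreviations (default C locale);
    # %z parses a numeric UTC offset such as -0500.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    # Split the request into method and path, dropping the protocol version.
    method, _, rest = fields["request"].partition(" ")
    fields["method"] = method
    fields["path"] = rest.rsplit(" ", 1)[0] if " " in rest else rest
    return fields

sample = ('56kdial52.absi.net - - [22/May/1999:20:12:45 -0500] '
          '"GET /index.html HTTP/1.0" 200 6353')
entry = parse_clf_line(sample)
print(entry["host"], entry["method"], entry["path"], entry["status"], entry["size"])
```

Lines that do not fit the assumed layout come back as `None`, so they can be counted and inspected by eye rather than silently dropped.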

Page 9: Secondary Evidence for User Satisfaction With Community Information Systems

Sources of complexity:

Multiple types of servers might be on a single system (e.g., RealServer, database server, search engine)
A Web page visit might involve many files
Frames and other authoring techniques can confuse
More than one person might use the same remote computer

Page 10: Secondary Evidence for User Satisfaction With Community Information Systems

Question: Can we get the “story” of a session?

Yes! Just track through all the “hits” from the same host within a narrow time period
– Challenge: how narrow a time period?
– Challenge: some hosts support multiple simultaneous users (but not many)
– Challenge: lots of files per page might confuse things (but narrow +/- a few second time frames can help)
– Challenge: what is structure of site?
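The tracking idea above can be sketched as follows. This is one possible approach, not the method used in the talk: hits are grouped by host, and a new session starts whenever the gap between consecutive hits exceeds a cutoff. The 30-minute default is an assumption (the slide leaves "how narrow" open), and `split_sessions` is a hypothetical name. The sample timestamps are four of the hits from host 116.33.237.26 shown later in the deck.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def split_sessions(hits, max_gap=timedelta(minutes=30)):
    """Group (host, time, path) hits into per-host sessions.

    A new session begins when two consecutive hits from the same host
    are more than max_gap apart. The cutoff is an assumed parameter.
    """
    by_host = defaultdict(list)
    for host, when, path in hits:
        by_host[host].append((when, path))

    sessions = []
    for host, items in by_host.items():
        items.sort()
        current = [items[0]]
        for previous, following in zip(items, items[1:]):
            if following[0] - previous[0] > max_gap:
                sessions.append((host, current))
                current = []
            current.append(following)
        sessions.append((host, current))
    return sessions

hits = [
    ("116.33.237.26", datetime(1999, 5, 9, 0, 44, 45), "/~gbnewby/index_top.html"),
    ("116.33.237.26", datetime(1999, 5, 9, 11, 43, 31), "/gbnewby/forms"),
    ("116.33.237.26", datetime(1999, 5, 9, 12, 6, 30), "/gbnewby/forms/"),
    ("116.33.237.26", datetime(1999, 5, 9, 16, 36, 6), "/~gbnewby"),
]
sessions = split_sessions(hits)
for host, items in sessions:
    print(host, [path for _, path in items])
```

With this cutoff the two `forms` requests, about 23 minutes apart, land in one session and the other two hits stand alone. A host shared by several simultaneous users would still be merged into one "story", which is the second challenge listed above.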

Page 11: Secondary Evidence for User Satisfaction With Community Information Systems

Sample “GET” might include multiple files

203.87.57.76 - - [20/May/1999:18:44:48 -0400] "GET /~gbnewby/inls80/explore2.html HTTP/1.1" 200 9681
203.87.57.76 - - [20/May/1999:18:44:50 -0400] "GET /~gbnewby/inls80/octo.gif HTTP/1.1" 200 12053
203.87.57.76 - - [20/May/1999:18:44:53 -0400] "GET /~gbnewby/inls80/pmail.gif HTTP/1.1" 200 593
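One way to keep the extra files from inflating the "story" is to fold image requests into the page that preceded them. The sketch below assumes that embedded files can be recognized by file suffix and that they arrive within a short window of the page hit; both the suffix list and the 10-second window are assumptions, and `collapse_page_views` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Assumption: these suffixes mark files embedded in a page rather than pages.
EMBEDDED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")

def collapse_page_views(hits, window=timedelta(seconds=10)):
    """Fold embedded-file hits into the preceding page view.

    hits: time-sorted list of (time, path) for a single host.
    Returns a list of (time, page_path, [embedded_paths]).
    """
    views = []
    for when, path in hits:
        is_embedded = path.lower().endswith(EMBEDDED_SUFFIXES)
        if is_embedded and views and when - views[-1][0] <= window:
            views[-1][2].append(path)
        else:
            views.append((when, path, []))
    return views

hits = [
    (datetime(1999, 5, 20, 18, 44, 48), "/~gbnewby/inls80/explore2.html"),
    (datetime(1999, 5, 20, 18, 44, 50), "/~gbnewby/inls80/octo.gif"),
    (datetime(1999, 5, 20, 18, 44, 53), "/~gbnewby/inls80/pmail.gif"),
]
views = collapse_page_views(hits)
for when, page, embedded in views:
    print(when.time(), page, len(embedded), "embedded files")
```

Frames complicate this, since a frameset loads several HTML files at once; a suffix test alone will not fold those together.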

Page 12: Secondary Evidence for User Satisfaction With Community Information Systems

Here’s a “story” (gbn’s pages)

116.33.237.26 - - [08/May/1999:09:30:59 -0400] "GET /~gbnewby/index_top.html HTTP/1.0" 200 7030
116.33.237.26 - - [09/May/1999:00:44:45 -0400] "GET /~gbnewby/index_top.html HTTP/1.0" 200 7030
116.33.237.26 - - [09/May/1999:11:43:31 -0400] "GET /gbnewby/forms HTTP/1.0" 301 186
116.33.237.26 - - [09/May/1999:12:06:30 -0400] "GET /gbnewby/forms/ HTTP/1.0" 200 1837
116.33.237.26 - - [09/May/1999:16:36:06 -0400] "GET /~gbnewby HTTP/1.0" 301 181
116.33.237.26 - - [09/May/1999:17:44:47 -0400] "GET /~gbnewby/ HTTP/1.0" 200 1355
116.33.237.26 - - [10/May/1999:06:20:22 -0400] "GET /gbnewby/review2.html HTTP/1.0" 200 5178
116.33.237.26 - - [10/May/1999:09:33:51 -0400] "GET /gbnewby/vita.html HTTP/1.0" 200 29487
116.33.237.26 - - [10/May/1999:13:33:30 -0400] "GET /gbnewby/inls80/explore1.html HTTP/1.0" 200 3977
116.33.237.26 - - [11/May/1999:02:43:15 -0400] "GET /gbnewby/inls80/explore2.html HTTP/1.0" 200 9681
116.33.237.26 - - [11/May/1999:09:21:56 -0400] "GET /~gbnewby/vita.html HTTP/1.0" 200 29487
116.33.237.26 - - [11/May/1999:10:05:31 -0400] "GET /gbnewby/presentations/security.html HTTP/1.0" 200 11270
116.33.237.26 - - [11/May/1999:13:35:27 -0400] "GET /gbnewby/index_top.html HTTP/1.0" 200 7030

Page 13: Secondary Evidence for User Satisfaction With Community Information Systems

Question: What are entry points for particular documents?

You’re on easy street with httpd “referrer” logs, but these are often not kept (for efficiency)
Otherwise, you don’t know where someone came from unless it was from YOUR site
By looking through a session “story” you can see the path people take to particular pages. Analyze finding aids!
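Once sessions have been reduced to ordered lists of page paths, the entry-point question can be sketched as a pair of small counts. The helper names are hypothetical. The first session below is abridged from the docsouth requests in this deck; the second is an invented session added only to show a contrasting entry point.

```python
from collections import Counter

def entry_points(sessions, target):
    """Count the first page of every session that reaches target."""
    return Counter(pages[0] for pages in sessions if target in pages)

def reached_via(sessions, target, finding_aid):
    """Count sessions that requested finding_aid before reaching target."""
    count = 0
    for pages in sessions:
        if target in pages and finding_aid in pages[:pages.index(target)]:
            count += 1
    return count

sessions = [
    # Abridged from the docsouth path in the slides (pages only).
    ["/docsouth/dasmain.html", "/docsouth/search.html",
     "/docsouth/southlit/southlit.html", "/docsouth/neh/neh.html",
     "/docsouth/neh/texts.html", "/docsouth/harriet/menu.html",
     "/docsouth/harriet/harriet.html"],
    # Hypothetical session: someone arriving directly at the menu page.
    ["/docsouth/harriet/menu.html", "/docsouth/harriet/harriet.html"],
]
target = "/docsouth/harriet/harriet.html"
starts = entry_points(sessions, target)
via_search = reached_via(sessions, target, "/docsouth/search.html")
print(starts)
print(via_search, "session(s) passed through the search page first")
```

This only sees arrivals from within the site; where a session began elsewhere, the first logged page is the best available stand-in unless referrer logs were kept.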

Page 14: Secondary Evidence for User Satisfaction With Community Information Systems

Here’s a path, including searching and reading

128.22.40.142 - - [20/May/1999:11:08:34 -0400] "GET /docsouth HTTP/1.0" 301 307
128.22.40.142 - - [20/May/1999:11:08:45 -0400] "GET /docsouth/dasmain.html HTTP/1.0" 200 2705
128.22.40.142 - - [20/May/1999:11:08:46 -0400] "GET /docsouth/dasnav.html HTTP/1.0" 200 679
128.22.40.142 - - [20/May/1999:11:08:46 -0400] "GET /docsouth/images/greensquare.gif HTTP/1.0" 200 55
128.22.40.142 - - [20/May/1999:11:08:56 -0400] "GET /docsouth/search.html HTTP/1.0" 200 3778

Page 15: Secondary Evidence for User Satisfaction With Community Information Systems

(Part II. This is via metalab.unc.edu)

128.22.40.142 - - [20/May/1999:11:08:57 -0400] "GET /docsouth/images/greenarrow.gif HTTP/1.0" 200 113
128.22.40.142 - - [20/May/1999:11:19:58 -0400] "GET /docsouth/southlit/southlit.html HTTP/1.0" 200 3685
128.22.40.142 - - [20/May/1999:11:20:07 -0400] "GET /docsouth/southlit/southlitmain.html HTTP/1.0" 200 2583
128.22.40.142 - - [20/May/1999:11:20:07 -0400] "GET /docsouth/southlit/southlitnav.html HTTP/1.0" 200 789

Page 16: Secondary Evidence for User Satisfaction With Community Information Systems

(Part III.)

128.22.40.142 - - [20/May/1999:11:38:40 -0400] "GET /docsouth/neh/neh.html HTTP/1.0" 200 3539
128.22.40.142 - - [20/May/1999:11:38:45 -0400] "GET /docsouth/neh/nehmain.html HTTP/1.0" 200 2743
128.22.40.142 - - [20/May/1999:11:38:45 -0400] "GET /docsouth/neh/nehnav.html HTTP/1.0" 200 759
128.22.40.142 - - [20/May/1999:11:39:21 -0400] "GET /docsouth/neh/specialneh.html HTTP/1.0" 200 16549
128.22.40.142 - - [20/May/1999:11:39:51 -0400] "GET /docsouth/neh/texts.html HTTP/1.0" 200 11999
128.22.40.142 - - [20/May/1999:11:40:16 -0400] "GET /docsouth/harriet/menu.html HTTP/1.0" 200 2085
128.22.40.142 - - [20/May/1999:11:40:27 -0400] "GET /docsouth/harriet/small.gif HTTP/1.0" 200 43701
128.22.40.142 - - [20/May/1999:11:41:01 -0400] "GET /docsouth/harriet/harriet.html HTTP/1.0" 200 217418
128.22.40.142 - - [20/May/1999:11:41:07 -0400] "GET /docsouth/harriet/harrietcva.gif HTTP/1.0" 200 85180
128.22.40.142 - - [20/May/1999:11:41:11 -0400] "GET /docsouth/harriet/harriettpa.gif HTTP/1.0" 200 77742

Page 17: Secondary Evidence for User Satisfaction With Community Information Systems

Question: Where do people go from a particular location?

Again, your “story” logs can track this
Again, caching is a particular challenge. For example, a user might follow hyperlinks, but the logs show discontinuities (because they went via a cached document)
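A rough sketch of the "where next?" count, assuming each session has already been reduced to an ordered list of page paths (embedded images removed). `next_pages` is a hypothetical helper; the sample session uses the page requests from host 4blah18.blahinc.com in this deck.

```python
from collections import Counter

def next_pages(sessions, source):
    """Count which page is requested immediately after source in each session."""
    counts = Counter()
    for pages in sessions:
        for here, there in zip(pages, pages[1:]):
            if here == source:
                counts[there] += 1
    return counts

sessions = [
    ["/mrm/father.html", "/index.html", "/directory.html",
     "/directory/culture.html", "/prairienations/index.htm",
     "/directory/nature.html"],
]
after_index = next_pages(sessions, "/index.html")
print(after_index.most_common())
```

Because of caching, "immediately after" in the log is not always "immediately after" for the reader: a jump such as `/prairienations/index.htm` to `/directory/nature.html` may hide a return to a cached directory page, so these counts are best treated as approximate.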

Page 18: Secondary Evidence for User Satisfaction With Community Information Systems

Sample: going from specifics, to index, to sub-index

4blah18.blahinc.com - - [22/May/1999:00:21:01 -0500] "GET /mrm/father.html HTTP/1.0" 200 1760
4blah18.blahinc.com - - [22/May/1999:00:21:03 -0500] "GET /mrm/bluegrass.gif HTTP/1.0" 200 26959
4blah18.blahinc.com - - [22/May/1999:00:27:48 -0500] "GET /index.html HTTP/1.0" 200 6216
4blah18.blahinc.com - - [22/May/1999:00:27:51 -0500] "GET /beige_pale.gif HTTP/1.0" 200 2085
4blah18.blahinc.com - - [22/May/1999:00:27:53 -0500] "GET /pnetlogo.gif HTTP/1.0" 200 3861
4blah18.blahinc.com - - [22/May/1999:00:28:07 -0500] "GET /directory.html HTTP/1.0" 302 216
4blah18.blahinc.com - - [22/May/1999:00:28:16 -0500] "GET /directory/culture.html HTTP/1.0" 200 2980
4blah18.blahinc.com - - [22/May/1999:00:28:18 -0500] "GET /directory/buggy.jpg HTTP/1.0" 200 8213
4blah18.blahinc.com - - [22/May/1999:00:28:38 -0500] "GET /prairienations/index.htm HTTP/1.0" 200 9136
4blah18.blahinc.com - - [22/May/1999:00:30:23 -0500] "GET /directory/nature.html HTTP/1.0" 200 6865

Page 19: Secondary Evidence for User Satisfaction With Community Information Systems

Question: How long is spent on a document?

Easy: inter-click time from a session
You could even make an “average time per document” for some gateway documents (such as user agreements). Or, infer AT/D by tracking those sessions that “seem” to be contiguous. This is challenging: what if someone goes to another site, or takes a nap?
Caching is still a problem
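The inter-click idea can be sketched as below, under two assumptions: the session contains page hits only, and any gap longer than a cutoff is treated as "went elsewhere or took a nap" and discarded. The 30-minute cutoff and the name `average_time_per_document` are illustrative. The timestamps are the first four page requests from host 4blah18.blahinc.com in this deck.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def average_time_per_document(session, max_gap=timedelta(minutes=30)):
    """Estimate mean seconds spent per page from inter-click times.

    session: time-sorted list of (time, path) page hits for one visitor.
    Gaps longer than max_gap are ignored as probable breaks (assumed cutoff).
    The last page of a session has no following click, so it gets no estimate.
    """
    durations = defaultdict(list)
    for (start, path), (end, _next_path) in zip(session, session[1:]):
        gap = end - start
        if gap <= max_gap:
            durations[path].append(gap.total_seconds())
    return {path: sum(values) / len(values) for path, values in durations.items()}

session = [
    (datetime(1999, 5, 22, 0, 21, 1), "/mrm/father.html"),
    (datetime(1999, 5, 22, 0, 27, 48), "/index.html"),
    (datetime(1999, 5, 22, 0, 28, 7), "/directory.html"),
    (datetime(1999, 5, 22, 0, 28, 16), "/directory/culture.html"),
]
averages = average_time_per_document(session)
for path, seconds in averages.items():
    print(path, seconds)
```

Feeding many sessions' worth of gaps for a single gateway document into the same averaging gives the "AT/D" figure the slide mentions; cached page views never reach the log, so every estimate is an upper bound on some gaps.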

Page 20: Secondary Evidence for User Satisfaction With Community Information Systems

Analysis of other secondary sources of data

See Newby & Bishop 1997 for instrumentation of menu systems
– Log choices of menu options
– Correlate with basic user demographics (collected online)
– Problem: most modern systems are not login-based, they’re Web-based
Access logs: are people coming in from dial-up lines, academic locations, etc? Dial-up = watch graphics!
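A crude sketch of sorting access-log hosts into rough categories. The rules here are guesses based on hostname conventions (a `.edu` suffix, substrings such as `dial` or `ppp`), not a reliable classification, and `classify_host` is a hypothetical helper. The sample hostnames all appear elsewhere in this deck.

```python
import re

def classify_host(host):
    """Guess a connection category from a logged hostname (heuristic only)."""
    host = host.lower()
    # A bare IPv4 address means the server did not resolve a name.
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host):
        return "unresolved"
    # Assumption: .edu hostnames indicate academic locations.
    if host.endswith(".edu"):
        return "academic"
    # Assumption: these substrings suggest a dial-up modem pool.
    if re.search(r"dial|ppp|modem", host):
        return "dial-up"
    return "other"

hosts = ["56kdial52.absi.net", "203.87.57.76", "metalab.unc.edu",
         "4blah18.blahinc.com"]
labels = {host: classify_host(host) for host in hosts}
for host, label in labels.items():
    print(host, label)
```

A high share of dial-up hosts would be one reason to keep image sizes small, which is the point of "watch graphics".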

Page 21: Secondary Evidence for User Satisfaction With Community Information Systems

Conclusions

The “easy” automated tools for Web log analysis are insufficient
They could be extended with some programming effort or utilities
“Eyeballing” the logs is still useful
Be cautious about privacy - both your own site’s policy, and the problems of posting some log data
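One precaution before sharing log excerpts is to replace the host field with a keyed pseudonym, so sessions can still be followed without exposing the original address. This is a sketch of that single step, not a complete privacy solution: paths and timestamps can still identify people, and the helper name and key below are placeholders.

```python
import hashlib
import hmac

def pseudonymize_host(line, secret_key):
    """Replace the leading host field of a log line with a keyed hash.

    The same host and key always give the same token, so a session
    "story" stays intact. secret_key must be bytes and kept private.
    """
    host, separator, rest = line.partition(" ")
    digest = hmac.new(secret_key, host.encode("utf-8"), hashlib.sha256).hexdigest()
    return "host-" + digest[:12] + separator + rest

key = b"replace-with-a-private-random-key"  # placeholder value
line = ('203.87.57.76 - - [20/May/1999:18:44:48 -0400] '
        '"GET /~gbnewby/inls80/explore2.html HTTP/1.1" 200 9681')
masked = pseudonymize_host(line, key)
print(masked)
```

Whether pseudonymized logs may be posted at all still depends on the site's own policy.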