Secondary Evidence for User Satisfaction With Community Information Systems

Gregory B. Newby, University of North Carolina at Chapel Hill. ASIS Midyear Meeting 1999.

  • Secondary Evidence for User Satisfaction With Community Information Systems

    Gregory B. Newby, University of North Carolina at Chapel Hill

    ASIS Midyear Meeting 1999

  • What do we want to know?
    • Who are information seekers; users?
    • What are their needs?
    • Are their needs being met?
    • Context: the goals and missions of the community net

  • What else do we want to know?
    • Are people viewing sponsorship information?
    • Reading policy documents?
    • Displaying images?
    • Using search engines or indexes?
    • Local or remote?
    • Browsing or reading?

  • Possible sources of evidence
    • Content analysis: what's available on the system(s)? Questions asked.
    • Sociological research: talk to people, look at what they use the net for, etc.
    • Psychological research: evaluate cognitive change in user knowledge, etc.
    • Market research: broad data collection from multiple potential audiences

  • More possible sources of evidence
    • Secondary data: artifacts generated by information system use
    • Today's focus: analysis of log file entries
    • Web usage statistics
    • Instrumenting online menu systems
    • Login or call history
    • Other system logs (email, FTP)

  • What questions may be asked of secondary data?
    • What content is accessed, with what frequency?
    • What paths are followed to content?
    • Are entry points, policy documents, or other front-end material bypassed?
    • Is content read, skimmed, or skipped through?
    • What subsets of content are viewed by individuals (patterns of use)?

  • What's wrong with Web server logs?
    • Aggregate-level access to content: not the whole story!
    • What are SESSIONS like (a sequence of accesses by a single person)?
    • What are the paths from item to item (this transcends a single referrer log)?
    • Are data used linearly (following hyperlinks)?
    • How long is spent on a document?

  • More analysis is feasible. Sample: Web server logs
    • Single-line entries for each hit (HTTP GET or similar request)
    • Separate files for errors and referrers
    • Sample entry:

    56kdial52.absi.net - - [22/May/1999:20:12:45 -0500] "GET /index.html HTTP/1.0" 200 6353
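
A minimal sketch of splitting one such entry into named fields, assuming the log follows the Common Log Format shown in the sample entry. The names `LOG_PATTERN` and `parse_entry` are illustrative and do not come from the presentation.

```python
import re
from datetime import datetime

# One Common Log Format entry: host, ident, user, [time], "request", status, size.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_entry(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps look like 22/May/1999:20:12:45 -0500 (English month names assumed).
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # Servers write "-" when no body was sent; treat that as zero bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

sample = ('56kdial52.absi.net - - [22/May/1999:20:12:45 -0500] '
          '"GET /index.html HTTP/1.0" 200 6353')
entry = parse_entry(sample)
print(entry["host"], entry["path"], entry["status"], entry["size"])
```

Lines that do not fit the pattern come back as `None`, so a caller can count or inspect them rather than silently dropping them.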

  • Sources of complexity:
    • Multiple types of servers might be on a single system (e.g., RealServer, database server, search engine)
    • A Web page visit might involve many files
    • Frames and other authoring techniques can confuse
    • More than one person might use the same remote computer

  • Question: Can we get the story of a session?
    • Yes! Just track through all the hits from the same host within a narrow time period
    • Challenge: how narrow a time period?
    • Challenge: some hosts support multiple simultaneous users (but not many)
    • Challenge: lots of files per page might confuse things (but narrow time frames of +/- a few seconds can help)
    • Challenge: what is the structure of the site?
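
The "same host within a narrow time period" idea can be sketched as follows. This is an illustration under stated assumptions, not the presenter's tool: hits are assumed to be already parsed into `(host, time, path)` tuples, the function name `build_sessions` is hypothetical, and the 30-minute default gap is an arbitrary choice, since the slide leaves "how narrow" open.

```python
from datetime import datetime, timedelta

def build_sessions(hits, max_gap=timedelta(minutes=30)):
    """Group (host, time, path) hits into per-host sessions.

    A new session starts whenever a host has been silent for longer
    than max_gap. The 30-minute default is an assumption.
    """
    sessions = []
    open_session = {}  # host -> the session list currently being extended
    for host, time, path in sorted(hits, key=lambda h: (h[1], h[0])):
        current = open_session.get(host)
        if current is None or time - current[-1][1] > max_gap:
            current = []
            sessions.append(current)
            open_session[host] = current
        current.append((host, time, path))
    return sessions

# Four hits taken from the docsouth sample in this deck.
hits = [
    ("128.22.40.142", datetime(1999, 5, 20, 11, 8, 34), "/docsouth"),
    ("128.22.40.142", datetime(1999, 5, 20, 11, 8, 45), "/docsouth/dasmain.html"),
    ("128.22.40.142", datetime(1999, 5, 20, 11, 19, 58), "/docsouth/southlit/southlit.html"),
    ("128.22.40.142", datetime(1999, 5, 20, 11, 38, 40), "/docsouth/neh/neh.html"),
]
print(len(build_sessions(hits)))                                 # one session
print(len(build_sessions(hits, max_gap=timedelta(minutes=15))))  # two sessions
```

The same four hits form one session or two depending on the gap, which is the "how narrow?" challenge in miniature. Grouping by host alone cannot separate several people behind one shared host.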

  • Sample GET might include multiple files

    203.87.57.76 - - [20/May/1999:18:44:48 -0400] "GET /~gbnewby/inls80/explore2.html HTTP/1.1" 200 9681
    203.87.57.76 - - [20/May/1999:18:44:50 -0400] "GET /~gbnewby/inls80/octo.gif HTTP/1.1" 200 12053
    203.87.57.76 - - [20/May/1999:18:44:53 -0400] "GET /~gbnewby/inls80/pmail.gif HTTP/1.1" 200 593
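
One way to fold a page and its embedded files into a single page view is sketched below. It assumes hits are `(host, time, path)` tuples, that image files can be recognised by suffix, and that an image belongs to the most recent page requested by the same host within a few seconds. The suffix list, the 10-second window, and the name `page_views` are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

EMBEDDED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")  # assumed list of inline files

def page_views(hits, window=timedelta(seconds=10)):
    """Attach embedded files to the page the same host requested just before."""
    views = []
    last_view = {}  # host -> most recent page view for that host
    for host, time, path in sorted(hits, key=lambda h: (h[1], h[0])):
        view = last_view.get(host)
        is_embedded = path.lower().endswith(EMBEDDED_SUFFIXES)
        if is_embedded and view is not None and time - view["time"] <= window:
            view["files"].append(path)
        else:
            # Anything else starts a new page view.
            view = {"host": host, "time": time, "page": path, "files": []}
            views.append(view)
            last_view[host] = view
    return views

hits = [
    ("203.87.57.76", datetime(1999, 5, 20, 18, 44, 48), "/~gbnewby/inls80/explore2.html"),
    ("203.87.57.76", datetime(1999, 5, 20, 18, 44, 50), "/~gbnewby/inls80/octo.gif"),
    ("203.87.57.76", datetime(1999, 5, 20, 18, 44, 53), "/~gbnewby/inls80/pmail.gif"),
]
for view in page_views(hits):
    print(view["page"], view["files"])
```

The three log entries collapse into one view of `explore2.html` with two attached images. Framesets would still appear as several pages under this heuristic.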

  • Here's a story (gbn's pages)

    116.33.237.26 - - [08/May/1999:09:30:59 -0400] "GET /~gbnewby/index_top.html HTTP/1.0" 200 7030
    116.33.237.26 - - [09/May/1999:00:44:45 -0400] "GET /~gbnewby/index_top.html HTTP/1.0" 200 7030
    116.33.237.26 - - [09/May/1999:11:43:31 -0400] "GET /gbnewby/forms HTTP/1.0" 301 186
    116.33.237.26 - - [09/May/1999:12:06:30 -0400] "GET /gbnewby/forms/ HTTP/1.0" 200 1837
    116.33.237.26 - - [09/May/1999:16:36:06 -0400] "GET /~gbnewby HTTP/1.0" 301 181
    116.33.237.26 - - [09/May/1999:17:44:47 -0400] "GET /~gbnewby/ HTTP/1.0" 200 1355
    116.33.237.26 - - [10/May/1999:06:20:22 -0400] "GET /gbnewby/review2.html HTTP/1.0" 200 5178
    116.33.237.26 - - [10/May/1999:09:33:51 -0400] "GET /gbnewby/vita.html HTTP/1.0" 200 29487
    116.33.237.26 - - [10/May/1999:13:33:30 -0400] "GET /gbnewby/inls80/explore1.html HTTP/1.0" 200 3977
    116.33.237.26 - - [11/May/1999:02:43:15 -0400] "GET /gbnewby/inls80/explore2.html HTTP/1.0" 200 9681
    116.33.237.26 - - [11/May/1999:09:21:56 -0400] "GET /~gbnewby/vita.html HTTP/1.0" 200 29487
    116.33.237.26 - - [11/May/1999:10:05:31 -0400] "GET /gbnewby/presentations/security.html HTTP/1.0" 200 11270
    116.33.237.26 - - [11/May/1999:13:35:27 -0400] "GET /gbnewby/index_top.html HTTP/1.0" 200 7030

  • Question: What are entry points for particular documents?
    • You're on easy street with httpd referrer logs, but these are often not kept (for efficiency)
    • Otherwise, you don't know where someone came from unless it was from YOUR site
    • By looking through a session story you can see the path people take to particular pages. Analyze finding aids!

  • Here's a path, including searching and reading

    128.22.40.142 - - [20/May/1999:11:08:34 -0400] "GET /docsouth HTTP/1.0" 301 307
    128.22.40.142 - - [20/May/1999:11:08:45 -0400] "GET /docsouth/dasmain.html HTTP/1.0" 200 2705
    128.22.40.142 - - [20/May/1999:11:08:46 -0400] "GET /docsouth/dasnav.html HTTP/1.0" 200 679
    128.22.40.142 - - [20/May/1999:11:08:46 -0400] "GET /docsouth/images/greensquare.gif HTTP/1.0" 200 55
    128.22.40.142 - - [20/May/1999:11:08:56 -0400] "GET /docsouth/search.html HTTP/1.0" 200 3778

  • (Part II. This is via metalab.unc.edu)

    128.22.40.142 - - [20/May/1999:11:08:57 -0400] "GET /docsouth/images/greenarrow.gif HTTP/1.0" 200 113
    128.22.40.142 - - [20/May/1999:11:19:58 -0400] "GET /docsouth/southlit/southlit.html HTTP/1.0" 200 3685
    128.22.40.142 - - [20/May/1999:11:20:07 -0400] "GET /docsouth/southlit/southlitmain.html HTTP/1.0" 200 2583
    128.22.40.142 - - [20/May/1999:11:20:07 -0400] "GET /docsouth/southlit/southlitnav.html HTTP/1.0" 200 789

  • (Part III.)

    128.22.40.142 - - [20/May/1999:11:38:40 -0400] "GET /docsouth/neh/neh.html HTTP/1.0" 200 3539
    128.22.40.142 - - [20/May/1999:11:38:45 -0400] "GET /docsouth/neh/nehmain.html HTTP/1.0" 200 2743
    128.22.40.142 - - [20/May/1999:11:38:45 -0400] "GET /docsouth/neh/nehnav.html HTTP/1.0" 200 759
    128.22.40.142 - - [20/May/1999:11:39:21 -0400] "GET /docsouth/neh/specialneh.html HTTP/1.0" 200 16549
    128.22.40.142 - - [20/May/1999:11:39:51 -0400] "GET /docsouth/neh/texts.html HTTP/1.0" 200 11999
    128.22.40.142 - - [20/May/1999:11:40:16 -0400] "GET /docsouth/harriet/menu.html HTTP/1.0" 200 2085
    128.22.40.142 - - [20/May/1999:11:40:27 -0400] "GET /docsouth/harriet/small.gif HTTP/1.0" 200 43701
    128.22.40.142 - - [20/May/1999:11:41:01 -0400] "GET /docsouth/harriet/harriet.html HTTP/1.0" 200 217418
    128.22.40.142 - - [20/May/1999:11:41:07 -0400] "GET /docsouth/harriet/harrietcva.gif HTTP/1.0" 200 85180
    128.22.40.142 - - [20/May/1999:11:41:11 -0400] "GET /docsouth/harriet/harriettpa.gif HTTP/1.0" 200 77742
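
Reading a path like the one above by hand can be approximated in code. The sketch below assumes each session has already been reduced to an ordered list of page paths (images and frame parts removed). The function names are hypothetical, the first session is a shortened version of the docsouth path, and the second session is invented purely for illustration.

```python
from collections import Counter

def paths_to(sessions, target):
    """For each session that reaches target, return the pages visited before it."""
    found = []
    for pages in sessions:
        if target in pages:
            found.append(tuple(pages[:pages.index(target)]))
    return found

def entry_points(sessions, target):
    """Count the first page of every session that reaches target."""
    return Counter(pages[0] for pages in sessions if target in pages)

sessions = [
    # Shortened from the docsouth path shown in the slides.
    ["/docsouth/dasmain.html", "/docsouth/search.html",
     "/docsouth/southlit/southlit.html", "/docsouth/neh/neh.html",
     "/docsouth/neh/texts.html", "/docsouth/harriet/menu.html",
     "/docsouth/harriet/harriet.html"],
    # Hypothetical second visitor who starts at the text list.
    ["/docsouth/neh/texts.html", "/docsouth/harriet/menu.html",
     "/docsouth/harriet/harriet.html"],
]
target = "/docsouth/harriet/harriet.html"
for path in paths_to(sessions, target):
    print(len(path), "pages before target, starting at", path[0])
print(entry_points(sessions, target))
```

Without referrer logs, only on-site paths are visible: a session's first page is where the visitor entered the site, not where they came from.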

  • Question: Where do people go from a particular location?
    • Again, your story logs can track this
    • Again, caching is a particular challenge. For example, a user might follow hyperlinks, but the logs show discontinuities (because they went via a cached document)

  • Sample: going from specifics, to index, to sub-index

    4blah18.blahinc.com - - [22/May/1999:00:21:01 -0500] "GET /mrm/father.html HTTP/1.0" 200 1760
    4blah18.blahinc.com - - [22/May/1999:00:21:03 -0500] "GET /mrm/bluegrass.gif HTTP/1.0" 200 26959
    4blah18.blahinc.com - - [22/May/1999:00:27:48 -0500] "GET /index.html HTTP/1.0" 200 6216
    4blah18.blahinc.com - - [22/May/1999:00:27:51 -0500] "GET /beige_pale.gif HTTP/1.0" 200 2085
    4blah18.blahinc.com - - [22/May/1999:00:27:53 -0500] "GET /pnetlogo.gif HTTP/1.0" 200 3861
    4blah18.blahinc.com - - [22/May/1999:00:28:07 -0500] "GET /directory.html HTTP/1.0" 302 216
    4blah18.blahinc.com - - [22/May/1999:00:28:16 -0500] "GET /directory/culture.html HTTP/1.0" 200 2980
    4blah18.blahinc.com - - [22/May/1999:00:28:18 -0500] "GET /directory/buggy.jpg HTTP/1.0" 200 8213
    4blah18.blahinc.com - - [22/May/1999:00:28:38 -0500] "GET /prairienations/index.htm HTTP/1.0" 200 9136
    4blah18.blahinc.com - - [22/May/1999:00:30:23 -0500] "GET /directory/nature.html HTTP/1.0" 200 6865
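
The "where do people go from here" question amounts to counting which page follows a given page within each session. A minimal sketch, assuming sessions are ordered lists of page paths with images already removed; `next_pages` is a hypothetical name, and the single session is the page sequence from the sample above.

```python
from collections import Counter

def next_pages(sessions, source):
    """Count the pages requested immediately after source, across sessions."""
    counts = Counter()
    for pages in sessions:
        for here, there in zip(pages, pages[1:]):
            if here == source:
                counts[there] += 1
    return counts

sessions = [
    ["/mrm/father.html", "/index.html", "/directory.html",
     "/directory/culture.html", "/prairienations/index.htm",
     "/directory/nature.html"],
]
print(next_pages(sessions, "/index.html"))
print(next_pages(sessions, "/directory/culture.html"))
```

Because of caching, a "next" page in the log is not always one link away from the previous one. The jump from `/prairienations/index.htm` to `/directory/nature.html` may well have passed through a cached copy of the directory page.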

  • Question: How long is spent on a document?
    • Easy: inter-click time from a session
    • You could even compute an average time per document (AT/D) for some gateway documents (such as user agreements). Or, infer AT/D by tracking those sessions that seem to be contiguous. This is challenging: what if someone goes to another site, or takes a nap?
    • Caching is still a problem
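
Inter-click time can be computed as the gap between one page request and the next in the same session. The sketch below is one possible reading of AT/D, not the presenter's method. It assumes sessions are ordered lists of `(time, path)` pairs for pages only, and it ignores gaps longer than an arbitrary 30-minute cutoff to guard against the "nap" case. The function name and cutoff are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def average_time_per_document(sessions, cutoff=timedelta(minutes=30)):
    """Average the gap between a page request and the next request in its session."""
    totals = defaultdict(lambda: [timedelta(0), 0])  # path -> [summed gap, count]
    for session in sessions:
        for (t0, path), (t1, _next_path) in zip(session, session[1:]):
            gap = t1 - t0
            if gap <= cutoff:  # longer gaps are treated as breaks, not reading
                totals[path][0] += gap
                totals[path][1] += 1
    return {path: total / count for path, (total, count) in totals.items()}

# Page requests taken from Part III of the docsouth sample.
session = [
    (datetime(1999, 5, 20, 11, 39, 21), "/docsouth/neh/specialneh.html"),
    (datetime(1999, 5, 20, 11, 39, 51), "/docsouth/neh/texts.html"),
    (datetime(1999, 5, 20, 11, 40, 16), "/docsouth/harriet/menu.html"),
    (datetime(1999, 5, 20, 11, 41, 1), "/docsouth/harriet/harriet.html"),
]
for path, average in average_time_per_document([session]).items():
    print(path, average.total_seconds())
```

The last page of a session never gets a time, because nothing in the log marks when the reader left it.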

  • Analysis of other secondary sources of data
    • See Newby & Bishop 1997 for instrumentation of menu systems
    • Log choices of menu options
    • Correlate with basic user demographics (collected online)
    • Problem: most modern systems are not login-based, they're Web-based
    • Access logs: are people coming in from dial-up lines, academic locations, etc.? Dial-up = watch graphics!
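
A rough sketch of sorting remote hosts into dial-up, academic, and other, based only on the hostname. The rules here are guesses for illustration: the keywords `dial`, `ppp`, and `modem`, the `.edu` suffix test, and the name `classify_host` are all assumptions, and real hostnames will often defeat them.

```python
import re

def classify_host(host):
    """Guess the kind of connection from a hostname (a heuristic, not a fact)."""
    name = host.lower()
    if re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", name):
        return "unresolved"  # bare IP address: no name to inspect
    if name.endswith(".edu"):
        return "academic"
    if re.search(r"dial|ppp|modem", name):
        return "dial-up"
    return "other"

for host in ["56kdial52.absi.net", "metalab.unc.edu",
             "128.22.40.142", "4blah18.blahinc.com"]:
    print(host, classify_host(host))
```

A high share of dial-up visitors would support the slide's advice to go easy on graphics.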

  • Conclusions
    • The easy automated tools for Web log analysis are insufficient
    • They could be extended with some programming effort or utilities
    • Eyeballing the logs is still useful
    • Be cautious about privacy: both your own site's policy, and the problems of posting some log data
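
On the privacy point, one precaution before posting log excerpts is to replace each host with a stable pseudonym. This is a sketch, assuming the host is the first space-delimited field of each line. The function name `pseudonymize` and the `host-` label format are hypothetical, and the secret key must be kept private. It does not make a log anonymous, since request paths and timing can still identify people.

```python
import hashlib
import hmac

def pseudonymize(line, secret):
    """Replace the leading host field with a keyed hash so sessions stay linkable."""
    host, _, rest = line.partition(" ")
    digest = hmac.new(secret, host.encode("utf-8"), hashlib.sha256).hexdigest()
    return "host-" + digest[:12] + " " + rest

secret = b"replace-with-a-private-key"  # placeholder value for the example
line = ('56kdial52.absi.net - - [22/May/1999:20:12:45 -0500] '
        '"GET /index.html HTTP/1.0" 200 6353')
print(pseudonymize(line, secret))
```

The same host always maps to the same label under one key, so session stories can still be followed in the posted data.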