WWW Challenges: Supporting Users in Search and Navigation

Natasa Milic-Frayling
Microsoft Research, Cambridge UK

SOFSEM 2004
January 28, 2004



Introduction

Research:
• Web usage and interfaces
• Optimization of service architectures
• Text Classification – support for document classification, routing, filtering

Presentation Focus: WWW challenges in designing effective services and applications.

Intersection:
• Browser Interface – Internet, Intranet, services, local drives
• Devices and applications: TabletPC, PDA, eBook
• Services: MSN Portal and Search - on-line searching, reading, and browsing


Characteristics of the Web

• Highly distributed: distributed data and processes
• Highly dynamic
• Evolving content, with still inadequate content publishing practice.

IMPLICATIONS

On-line Experience

Web access is a combination of search and navigation
• Search to find URL of relevant pages
• Navigation to explore result space
• Reading on devices of various display sizes.

Only limited “context” in both activities preserved and exposed
• Ineffective search
• Lost in hyperspace
• Lost within a document, on small screen devices.

‘Diagnoses’

Three aspects of the Web
• Separation of search and document delivery
• Separation of document authoring and generation of metadata about the documents required by services and applications
• Lack of generic publishing format to support flexible display of content across devices.

Part I

Separation of search and document delivery

Ineffective Search

MIDAS - SiteExplorer

[Diagram. Labels: User’s Information Need; Query; Search Engine; URLs; HTTP Request; Web Server; Search processes; Web page delivery; MS READ Service]

Highlighting - How is it done?

[Diagram. Labels: User’s Information Need; Topic Description; Query; Search Engine; URLs; HTTP Request; Web Server; MS READ Service]

Processing steps listed:
• Query syntactic Analysis
• Semantic Expansion
• Highlighting Regime
• Thumbnail Creation
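The first three steps named on this slide can be illustrated with a minimal Python sketch. It is not the MS READ implementation: the tokenizer, the `EXPANSIONS` table, and the `[[...]]` marker are assumptions introduced only for illustration.

```python
import re

# Hypothetical expansion table standing in for "semantic expansion".
EXPANSIONS = {"car": ["automobile", "vehicle"]}

def analyse_query(query):
    """Syntactic analysis (assumed): lowercase and split into word tokens."""
    return re.findall(r"\w+", query.lower())

def expand_terms(terms, expansions=EXPANSIONS):
    """Add related terms for each query term, keeping first-seen order."""
    expanded = []
    for term in terms:
        for candidate in [term] + expansions.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

def highlight(text, terms, marker="[[{}]]"):
    """Wrap whole-word, case-insensitive matches of any term in a marker."""
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: marker.format(m.group(0)), text)

topic = expand_terms(analyse_query("Car rental"))
print(highlight("Rental of an automobile or car in Cambridge", topic))
```

Because the topic is kept as a plain list of terms, the same list can be reapplied to every page the user opens, which is the sense in which the topic is "preserved" in this sketch.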

Link Evaluation - How is it done?

[Diagram. Labels: MS Read; Web Server; HTTP Requests for Text Only; Download Text Only; Mark Links for Relevance; Topic Storage: Topic 1, Topic 2, Topic 3, Topic 4]

• NLP
• Indexing
• Search Over Local Index
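The flow on this slide (download text only, index locally, search the local index, mark links for relevance) might look roughly like the following Python sketch. The pages are supplied as an in-memory dictionary instead of being fetched over HTTP, and the scoring rule (number of distinct topic terms found) is an assumption, not the method used by MS Read.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase word tokens; a stand-in for the NLP step."""
    return re.findall(r"\w+", text.lower())

def build_local_index(pages):
    """Map each term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index

def mark_links(index, topic_terms):
    """Score each URL by the number of distinct topic terms it contains."""
    scores = defaultdict(int)
    for term in set(tokenize(" ".join(topic_terms))):
        for url in index.get(term, ()):
            scores[url] += 1
    # Highest score first; ties broken by URL for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Stand-in for the text-only downloads of pages linked from the current page.
linked_pages = {
    "a.htm": "Hotels and car rental in Cambridge",
    "b.htm": "Car repair services",
    "c.htm": "Weather forecast",
}
index = build_local_index(linked_pages)
print(mark_links(index, ["car", "rental"]))
```

Links that match no topic term are simply absent from the result, so a client could leave them unmarked.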

MSRead – Supporting search

• Users have difficulty locating relevant parts of a Web page while reviewing search results (MSN Search Diary and Field Interviews)
• Users have difficulty evaluating search results and refining their search (Anne Cohen-Kiel’s ethnographic study in Spain, UK and Canada; MSN Search Diary Study and Site Interviews).

Solution:
• Preserve user’s topic of interest and provide highlighting of topic terms on the pages that the user is viewing.
• Allow the users to enhance the topic by adding new query terms or resources (lists of concepts, entities, etc.) and perform search over the page content
• Allow the user to search the content of the pages that are linked to the current page.
  – When the page is the search result page, this is equivalent to refining the search over the previous top N search results.

Part II

Separation of document authoring and generation of metadata about the documents required by services and applications

User lost in the hyperspace

MIDAS and SiteExplorer

Problem

• Crawling - Services, such as search engines, collect the data and create metadata but do not deliver the content
  – Out of sync with the data on the Web servers
  – ‘broken links’
• Services can perform only basic analysis of the context
  – No information about structure of information resources
  – No sophisticated linguistic process.

Solution: MIDAS Framework

Distributed metadata generation

• Generate & store meta-information alongside contents
  – At authoring or publishing time
  – Synchronised with publishing
• Deliver metadata upon request
• In case of centralized services
  – Services do not crawl for data but only for metadata
  – Obtain data through ‘push’ by authors/web servers.
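As a rough illustration of generating metadata at publishing time, the Python sketch below derives a small XML record from a page's HTML: its most frequent terms and its outgoing links. The element names (`page`, `terms`, `term`, `links`, `link`) and the choice of statistics are hypothetical and are not taken from the MIDAS framework.

```python
from collections import Counter
from html.parser import HTMLParser
import re
import xml.etree.ElementTree as ET

class _PageScanner(HTMLParser):
    """Collects text content and outgoing link targets from an HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)

def generate_metadata(url, html, top_n=3):
    """Build an XML metadata record for one page (element names are made up)."""
    scanner = _PageScanner()
    scanner.feed(html)
    words = re.findall(r"[a-z]+", " ".join(scanner.text_parts).lower())
    record = ET.Element("page", {"url": url})
    terms = ET.SubElement(record, "terms")
    for term, count in Counter(words).most_common(top_n):
        ET.SubElement(terms, "term", {"count": str(count)}).text = term
    links = ET.SubElement(record, "links")
    for href in scanner.links:
        ET.SubElement(links, "link", {"href": href})
    return ET.tostring(record, encoding="unicode")

print(generate_metadata(
    "index.htm", '<p>photo photo album</p><a href="photo.htm">photo</a>'
))
```

A publishing tool could store such a record next to the page and regenerate it on every publish, which would keep the metadata synchronised with the content.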

METADATA:
• Site structure
• Page structure
• Linguistic analysis
• Statistical analysis
• Visual representation

[Diagram. Labels: AUTHOR; CLIENT; SERVER; Web Server; Web Content]

<dxf:views>
  <dxf:view title="Main">
    <dxf:node url="index.htm">
      <dxf:node url="aboutme.htm" />
      <dxf:node url="interest.htm" />
      <dxf:node url="favorite.htm" />
      <dxf:node url="photo.htm" />
      <dxf:node url="feedback.htm" />
    </dxf:node>
  </dxf:view>
</dxf:views>
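A client could turn site-structure metadata of this shape into an indented outline, as in the Python sketch below. The slide does not show a namespace declaration for the `dxf` prefix, so the URI used here is a placeholder, and the sample is shortened to two child nodes.

```python
import xml.etree.ElementTree as ET

# The slide omits the namespace declaration; this URI is a placeholder.
DXF_NS = "urn:example:dxf"

SITE_XML = f"""
<dxf:views xmlns:dxf="{DXF_NS}">
  <dxf:view title="Main">
    <dxf:node url="index.htm">
      <dxf:node url="aboutme.htm" />
      <dxf:node url="photo.htm" />
    </dxf:node>
  </dxf:view>
</dxf:views>
"""

def outline(element, depth=0):
    """Yield (depth, url) for every node below element, in document order."""
    for node in element.findall(f"{{{DXF_NS}}}node"):
        yield depth, node.get("url")
        yield from outline(node, depth + 1)

root = ET.fromstring(SITE_XML)
for view in root.findall(f"{{{DXF_NS}}}view"):
    print(view.get("title"))
    for depth, url in outline(view):
        print("  " * (depth + 1) + url)
```

The depth value is enough to render the hierarchy as a collapsible tree or an indented site map.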

[Diagram. Labels: AUTHOR; CLIENT; SERVER; Web Server; Metadata Server; Web Content; Automatically Generated Metadata; Author generated metadata; Web metadata (XML); FrontPage Site Template and Structure in XML Format; SiteExplorer]

MIDAS is NOT…

…an element of the Semantic Web
• Not adding “knowledge” explicitly into the Web
• Simple metadata
  – Easily authored/easily computable at authoring/publishing time
  – Presently available but dismissed

Problems addressed
• Users have difficulty choosing the right website from the result set
• Users want overviews of sites in a list of search results (Anne Cohen-Kiel’s ethnographic study in Spain, UK and Canada)
• Users have difficulty evaluating search results and refining their search (MSN Search Diary Study and Site Interviews)
• Users have difficulty locating relevant information within a destination site once they get to the site (MSN Search Diary Study and Site Interviews)

Site Explorer’s Solutions:
• Providing users with an overview of the site content as interactive sitemap
• Supporting exploration of the site through local search

“Anyone who has been to a shopping mall knows the value of the ‘you are here’ dot on the map …

Site maps must become more aware of users’ website navigation…”

Jakob Nielsen, Site Map Usability January 6, 2002

External studies

SiteExplorer Bar

• Search Box
• Site Overview
• Site Structure
• Page details


Part III

Lack of generic publishing format to support flexible display of content across devices

Ineffective reading on mobile devices

SmartView and SearchMobil: Viewing Web on PDAs and Mobile Phones

Lost in Hyperspace - Small

• Complex pages on small screens
• Overview
  – none provided at the moment
• Extensive horizontal/vertical scrolling

Lost in Hyperspace - Small

• Location of search hits on result page
  – Difficulty even on desktop screens
• Reason: disassociation of search service and document delivery

SmartView

SmartView Prototype

SmartView Plus

SearchMobil

SearchMobil Web Service
• Collection of search results – “booklet” of Web pages
• Creation of the “local” full text index
• Search within a designated set of pages
• Annotated booklets (hit highlighting)

SearchMobil Features

Web Search
• On-line search: Google
• Automatic download of pages
• Processing of pages – structure discovery and content indexing
• Creation of a booklet of overviews
• Indicators of search hits
• Indicator of the best region – scroll down the ‘red’ section
• Select the region and access the detailed view
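One way to picture the search-hit indicators and the "best region" indicator is the Python sketch below. It assumes each page in the booklet has already been split into text regions by some structure-discovery step, and it takes the best region to be the one with the most query-term hits. Both the data layout and that rule are assumptions, not the SearchMobil algorithm.

```python
import re

def count_hits(text, terms):
    """Count whole-word, case-insensitive occurrences of the query terms."""
    words = re.findall(r"\w+", text.lower())
    wanted = {t.lower() for t in terms}
    return sum(1 for w in words if w in wanted)

def best_region(regions, terms):
    """Return (index, hits) of the region with most hits, or None if no hits."""
    hits = [count_hits(region, terms) for region in regions]
    if not hits or max(hits) == 0:
        return None
    top = max(hits)
    return hits.index(top), top

def booklet_overview(booklet, terms):
    """Per-page hit totals and best regions for a booklet {url: [regions]}."""
    overview = {}
    for url, regions in booklet.items():
        total = sum(count_hits(r, terms) for r in regions)
        overview[url] = {"hits": total, "best": best_region(regions, terms)}
    return overview

# Hypothetical booklet: each page is a list of region texts.
booklet = {
    "p1.htm": [
        "Navigation menu",
        "Mobile search on small screens",
        "Search tips: search locally",
    ],
    "p2.htm": ["Contact us"],
}
print(booklet_overview(booklet, ["search"]))
```

The per-page totals can drive a relevance indicator at the booklet level, while the best-region index tells the viewer which part of a page overview to mark.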

Web Search – Detail View


SearchMobil Features – Cont.

Local Search
• Local search – focussed on the set of pages in the booklet
• Indicators of relevance at the page and the booklet level

SearchMobil Prototype

Summary

Simple proposition:
• Save metadata about structure and content generated by authoring applications

Benefits on the client side:
• Rich context for search and navigation
• Interactive download of document elements and metadata for small devices

Benefit for services:
• Metadata collected and in s
• Opportunity for new services based on rich metadata
• Opportunity for push based services – reduce the need for crawling.

Thank you!