Content Analysis Techniques to Ease Browsing with Handhelds

Jalal Mahmud

Yevgen Borodin

I.V. Ramakrishnan

Department of Computer Science, State University of New York at Stony Brook

Stony Brook, NY 11794

Outline

Browsing with Handhelds:

Content Analysis Techniques:

- Model-directed Web Transaction

- Merchant-Side Web Transaction

- Context Browsing with Mobile

- Context-directed Web Transaction

Evaluation:

Future Work:

Browsing with Handheld

User needs to do a lot of scrolling to get to the relevant content

Using PDA

Relevant Content

Problems

Small Screens Offer Narrow Interaction Bandwidth.

Unable to convey the Richness of the Web content.

Involves a Lot of Horizontal and Vertical Scrolling.

Tedious to Get to the Pertinent Content in a Page.

This is worse when one is interested in Web transactions (e.g. buying books, paying utility bills).

Our Approach

Relevant content

Irrelevant content

Filter Away Irrelevant Content and Only Present Relevant Content

First Present the Relevant Content.

Model-directed Web Transaction

Web Transaction Examples:

- Buying a CD Player from Bestbuy

- Paying Utility Bills Online

Web Transaction Characteristics:

- A Sequence of Steps

- Each Step is Based on a User-Selected Operation

Two aspects of a Web transaction:

- Semantic Concept

- Process Model

Semantic Concepts

[Figure: semantic concepts Search Results, Taxonomy, Add to Cart and Product Details, shown with the operations item_select and submit_searchform.]

Process Model

[Figure: process model drawn as a state machine with states 1 to 6 (1 - START STATE, 6 - FINAL STATE). Concept labels shown: TAXONOMY CONCEPT, SEARCH FORM CONCEPT, SEARCH RESULT CONCEPT. Transition labels: select_item_category, item_select, submit_searchform, show_item_detail, add_to_cart, check_out, continue_shopping, view_shoppingcart, update_shoppingcart.]

Model-driven transaction

[Figure: example transaction annotated with the item_select and submit_searchform operations.]
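One way to read the process model is as a finite automaton whose transitions are user operations. The sketch below is a minimal illustration of that idea, not the authors' model: the operation names and the start and final states (1 and 6) come from the slides, but the edges of the diagram are not recoverable from the text, so the transition table `TRANSITIONS` and the helper names are assumptions.

```python
# Hypothetical transition table: operation names and the start/final states
# come from the slides; which states each edge connects is assumed.
START_STATE, FINAL_STATE = 1, 6

TRANSITIONS = {
    (1, "select_item_category"): 1,
    (1, "submit_searchform"): 2,
    (2, "submit_searchform"): 2,
    (2, "item_select"): 3,
    (3, "show_item_detail"): 3,
    (3, "add_to_cart"): 4,
    (4, "continue_shopping"): 1,
    (4, "view_shoppingcart"): 5,
    (5, "update_shoppingcart"): 5,
    (4, "check_out"): 6,
    (5, "check_out"): 6,
}


def allowed_operations(state):
    """Return the operations the model permits in `state`, sorted by name."""
    return sorted(op for (s, op) in TRANSITIONS if s == state)


def run_transaction(operations, state=START_STATE):
    """Follow a sequence of user operations and return the state reached.

    Raises ValueError if an operation is not permitted in the current state.
    """
    for op in operations:
        key = (state, op)
        if key not in TRANSITIONS:
            raise ValueError(f"{op!r} is not allowed in state {state}")
        state = TRANSITIONS[key]
    return state


def completes(operations):
    """True if the sequence drives the model from the start to the final state."""
    try:
        return run_transaction(operations) == FINAL_STATE
    except ValueError:
        return False
```

With a table like this, a handheld client only needs to show the page segments for the operations returned by `allowed_operations` in the current state, which is the filtering idea described earlier in the talk.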

Process Model

Built using Automata Learning Techniques

Training Data: Over 200 Transaction Sequences Collected from over 30 Sites

Evaluation Results (Recall / Precision):

- 90% / 96% for Books domain

- 86% / 88% for Consumer Electronics domain

- 84% / 92% for Office Supplies domain
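The recall and precision pairs reported for the process model can be combined into a single F-measure, the same metric that appears in the CMo charts later in the talk. The sketch below assumes the balanced form (the harmonic mean of the two values); the slides do not say which weighting was used, and the names `f_measure` and `REPORTED` are illustrative.

```python
def f_measure(recall, precision):
    """Harmonic mean of recall and precision (the balanced F1 score)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Recall / precision pairs as reported on the slide.
REPORTED = {
    "Books": (0.90, 0.96),
    "Consumer Electronics": (0.86, 0.88),
    "Office Supplies": (0.84, 0.92),
}

for domain, (recall, precision) in REPORTED.items():
    print(f"{domain}: F1 = {f_measure(recall, precision):.3f}")
```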

Concept Extraction

[Figure: a page's LOGICAL TREE (nodes such as Sort Results By, Select Box, Image, Browse, Brand and Best Matches, with labels such as Insignia, Sony, Case Logic, Camera, Software and Electronics) mapped to a CONCEPT TREE with the concepts Taxonomy, Search Result and Search Form (Search Phrase, Select Box, Go Button, Entire Site).]

Developed a Statistical Model for Each Concept using Machine Learning Techniques

Training Data: Used Labeled Concepts from Over 100 Pages Collected from Two Dozen Sites
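The slides do not name the statistical model or its features. As a rough illustration of learning one model per concept from labeled segments, the sketch below uses a multinomial Naive Bayes classifier over the words of a segment. The technique is a stand-in, and the class name, training examples and concept labels are invented for the example.

```python
import math
from collections import Counter, defaultdict


class ConceptClassifier:
    """Multinomial Naive Bayes over the words of a page segment.

    A stand-in model: the slides do not name the learning technique used.
    """

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # concept -> word frequencies
        self.segment_counts = Counter()          # concept -> number of segments
        self.vocabulary = set()

    def train(self, labeled_segments):
        """Update counts from (segment_text, concept) pairs."""
        for text, concept in labeled_segments:
            words = text.lower().split()
            self.word_counts[concept].update(words)
            self.segment_counts[concept] += 1
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the concept with the highest posterior log-probability."""
        total_segments = sum(self.segment_counts.values())
        best_concept, best_score = None, -math.inf
        for concept, n_segments in self.segment_counts.items():
            counts = self.word_counts[concept]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_segments / total_segments)
            for word in text.lower().split():
                # Laplace (add-one) smoothing handles unseen words.
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_concept, best_score = concept, score
        return best_concept


# Toy training data, invented for illustration only.
TRAINING = [
    ("search go entire site", "SearchForm"),
    ("search keyword go", "SearchForm"),
    ("add to cart buy now", "AddToCart"),
    ("add item to cart", "AddToCart"),
    ("browse cameras software electronics", "Taxonomy"),
    ("browse brand sony insignia", "Taxonomy"),
]

classifier = ConceptClassifier()
classifier.train(TRAINING)
print(classifier.classify("add to cart"))
```

A real extractor would presumably also use structural features of the logical tree (for example, whether a segment contains a select box or a button), which this word-only sketch ignores.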

Evaluation Results: Concept Extraction

[Figure: bar chart "Recall for Concept Extraction" on a 0% to 100% axis, one group per concept (Search Form, Search Result, Item Taxonomy, Item List, Item Detail, Shopping Cart, Add to Cart, Edit Cart, Continue Shopping, Checkout); series: Books, Electronics, Office Supplies.]

Model-directed Web Transaction on Handheld: Guide-O-Mobile

Client-Side Process Modeling: Problems

Client-Side Process Modeling in Guide-O-Mobile.

Process Model is Stored on the Client Side.

Separate Process Model Needed for Each Domain.

Performance Largely Depends on Concept Extraction.

Merchant-Side Process Modeling

Labeled Web Content with Semantic Annotations.

Content Providers will Label their Web Content.

XHTML will be Used to:

- Label Relevant Content in the Web Sites

- Describe Process Models Specific to the Sites.

Mobile Users will Use the System to:

- Easily Identify Relevant Information.

- Perform On-Line Transactions.

Prototype Implementation

XHTML tags:

<log in>, <continue shopping>, <add to cart>, <edit cart>, <search form>, <search result>, <item>, <item taxonomy>, <item list>, <item detail>, <item description>, and <checkout>.
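A client could pick out merchant-labeled content with an ordinary markup parser. The sketch below uses Python's standard `html.parser` and is only an illustration: the slides list the tag names with spaces, which would not parse as single element names, so the hyphenated spellings in `CONCEPT_TAGS` are an assumption, and the sample page and the `ConceptExtractor` class are hypothetical.

```python
from html.parser import HTMLParser

# Hyphenated spellings of the tags listed above (an assumption: names with
# spaces such as <add to cart> would not parse as single element names).
CONCEPT_TAGS = {
    "log-in", "continue-shopping", "add-to-cart", "edit-cart", "search-form",
    "search-result", "item", "item-taxonomy", "item-list", "item-detail",
    "item-description", "checkout",
}


class ConceptExtractor(HTMLParser):
    """Collect the text found directly inside each annotated element."""

    def __init__(self):
        super().__init__()
        self.open_concepts = []  # stack of (tag, text parts) currently open
        self.segments = []       # (concept, text) pairs, in closing order

    def handle_starttag(self, tag, attrs):
        if tag in CONCEPT_TAGS:
            self.open_concepts.append((tag, []))

    def handle_data(self, data):
        # Attribute text to the innermost open concept element, if any.
        if self.open_concepts and data.strip():
            self.open_concepts[-1][1].append(data.strip())

    def handle_endtag(self, tag):
        if self.open_concepts and self.open_concepts[-1][0] == tag:
            concept, parts = self.open_concepts.pop()
            self.segments.append((concept, " ".join(parts)))


def extract_concepts(markup):
    """Return the (concept, text) pairs found in `markup`."""
    parser = ConceptExtractor()
    parser.feed(markup)
    parser.close()
    return parser.segments


# Hypothetical annotated page, invented for illustration.
PAGE = """
<html><body>
  <div>Free shipping on orders over $25!</div>
  <search-form><form>Search <input name="q"></form></search-form>
  <item-list>
    <item>Sony CD Player</item>
    <item>Insignia CD Player</item>
  </item-list>
</body></html>
"""

for concept, text in extract_concepts(PAGE):
    print(concept, "->", text)
```

Unlabeled content such as the promotional banner never reaches `segments`, so the handheld can skip it without any client-side concept extraction.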

Context Browsing with Mobile

On Following a Link:

- Collect Context of the Link

- Identify the Relevant Section on the Next Page Using the Context

- Present the Relevant Section.

Context Browsing:

- Reduces Information Overload

- Makes Mobile Browsing Faster.
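The steps of context browsing can be sketched as a small matching routine. The slides do not say how relevance is computed, so the code below substitutes a simple word-overlap (Jaccard) score; the function names, the stop-word list and the sample sections are assumptions made for illustration.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "with"}


def words(text):
    """Lowercase word set of `text`, without common stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def collect_context(link_text, surrounding_text):
    """Context of a link: its own words plus the words around it."""
    return words(link_text) | words(surrounding_text)


def find_relevant_section(context, sections):
    """Return the index of the section that best matches the link context.

    Scoring is Jaccard similarity between word sets, a simple stand-in for
    whatever relevance model the real system uses.
    """
    if not sections:
        raise ValueError("no sections to choose from")

    def score(section):
        section_words = words(section)
        union = context | section_words
        return len(context & section_words) / len(union) if union else 0.0

    return max(range(len(sections)), key=lambda i: score(sections[i]))


# Hypothetical example: the user follows an "MP3 Players" link.
context = collect_context(
    "MP3 Players", "Electronics: audio, MP3 players and accessories"
)
sections = [
    "Sign in to your account or register",
    "MP3 players: portable audio players and accessories from Sony",
    "Customer service, store locator, privacy policy",
]
best = find_relevant_section(context, sections)
print(sections[best])
```

The section at index `best` is the one the handheld would display first; the remaining sections stay reachable by scrolling.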

Context-directed Browsing

How Do We Find Relevant Content?

Finding What is Important on a Web Page:

- Is Subjective on Any Distinct Page

- Can be Inferred in a Sequence of Pages

[Figure sequence, repeated for each link followed:]

- Click on the “MP3 Players” Link

- Collect Context of the Link

- Find Relevant Section Using Context

Context Browsing with Mobile: CMo Prototype

Product Search Using CMo

Context-directed Web Transaction

No Process Model

Contextual Browsing with a Domain-Dependent Knowledge-Base

Relevant Segment Identification Using Contextual Browsing

Concept Segment Identification Using Knowledge-Base and Heuristic Algorithms

Context-directed Web Transaction: Prototype System

The Online Shopping Knowledge-Base Consists of the Following Few Concepts:

SearchForm, AddToCart, Taxonomy, ShoppingCart, Checkout, etc.
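Since the prototype is described as work in progress, the following is only a guess at how a small knowledge base with a matching heuristic might label a segment. The concept names come from the slide; the cue phrases, the scoring rule and the function names are invented for illustration.

```python
import re

# Hypothetical knowledge base: concept names come from the slide, the cue
# phrases are invented for illustration.
KNOWLEDGE_BASE = {
    "SearchForm": ["search", "find", "go"],
    "AddToCart": ["add to cart", "add to basket", "buy now"],
    "Taxonomy": ["browse", "categories", "departments"],
    "ShoppingCart": ["your cart", "subtotal", "quantity"],
    "Checkout": ["checkout", "proceed to checkout", "place order"],
}


def count_phrase(phrase, text):
    """Number of whole-word occurrences of `phrase` in `text`."""
    return len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))


def identify_concept(segment_text, knowledge_base=KNOWLEDGE_BASE):
    """Return the concept whose cue phrases occur most often, or None."""
    text = segment_text.lower()
    hits = {
        concept: sum(count_phrase(phrase, text) for phrase in phrases)
        for concept, phrases in knowledge_base.items()
    }
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


print(identify_concept("Your Cart: Quantity 2, Subtotal $19.99"))
```

Because the cue phrases live in data rather than code, moving to another domain would mean swapping the knowledge base, which matches the "domain-dependent knowledge-base" wording on the earlier slide.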

Implementing the Prototype is a Work in Progress.

Evaluation: Guide-O-Mobile Experimental Set-Up

Guide-O-Mobile: Client-Server Model

- Client: 400 MHz iPaq with 64 MB RAM

- Server: Core Guide-O System on a 1.2 GHz desktop with 256 MB RAM

Evaluation:

- Over two dozen CS graduate students

- Over 30 Web sites spanning the Books, Consumer Electronics and Office Supplies domains

Evaluation: Guide-O-Mobile: Overall Time Performance

[Figure: bar chart "Overall Time", Time (sec) on a 0 to 600 axis, for Books, Electronics and Office Supplies; series: Original Page in Handheld, Guide-O-Mobile.]

Evaluation: Guide-O-Mobile: Overall Time Performance (with standard deviation)

[Figure: bar chart "Overall Time" with standard deviation, Time (sec) on a 0 to 600 axis, for Books, Electronics and Office Supplies; series: Original Page in Handheld, Guide-O-Mobile.]

Evaluation: Guide-O-Mobile: Interaction Time

[Figure: bar chart "Interaction Time", Time (sec) on a 0 to 250 axis, for Books, Electronics and Office Supplies; series: Original Page in Handheld, Guide-O-Mobile.]

Evaluation: Guide-O-Mobile: Interaction Time Performance (with standard deviation)

[Figure: bar chart "Interaction Time" with standard deviation, Time (sec) on a 0 to 300 axis, for Books, Electronics and Office Supplies; series: Original Page in Handheld, Guide-O-Mobile.]

Evaluation: CMo Experimental Set-Up

Client-Server Model

- Client: iPAQ Pocket PC running the Microsoft Pocket PC operating system, with wireless Internet connectivity.

- Server: Core CMo System

Evaluation: 8 CS graduate students completing 8 tasks (8 times each) on 8 Web sites from the News and Shopping domains.

Evaluation: CMo: Performance of Context Identification

[Figure: bar chart of Accuracy on a 0% to 100% axis by Domains (News, Books, Electronics, Office, Informational); series: Recall, Precision, F-measure.]

Evaluation: CMo: Relevant Information Identification

[Figure: bar chart of Accuracy on a 0% to 100% axis by Domains; series: Recall, Precision, F-measure.]

Browsing Efficiency with CMo

Conclusion and Future Work

Port all the Server Steps to the Handheld.

Extend Mozilla's Minimo Mobile Browser with CMo Functionalities.

Mining Transactional Models from Contextual Information.

Questions?