VLDB2005
CMS-ToPSS: Efficient Dissemination of RSS Documents
Milenko Petrovic, Haifeng Liu, Hans-Arno Jacobsen
University of Toronto
VLDB05 2
Information Dissemination
Easy-to-use web publishing tools (blogs, wikis) are fueling the increase in the number of web publishers
RSS is frequently used to disseminate updates to interested users, e.g. by CNN.com, Yahoo! News, Amazon.com, MSN Search (beta)
[Diagram: RSS publishers, RSS aggregator, RSS readers]
Problem: polling-based architecture
Interaction Model: Publish/Subscribe
[Diagram: a Broker between Publishers and Subscribers. Labels: RSS feeds; Queries over all RSS; Matching RSS feeds]
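To make the interaction model concrete, here is a minimal sketch of a broker that pushes matching items to subscribers instead of having readers poll. The `Broker` class, its method names, and the dictionary-shaped feed items are hypothetical illustrations, not the CMS-ToPSS implementation.

```python
class Broker:
    """Toy publish/subscribe broker (illustrative only)."""

    def __init__(self):
        # Each subscription pairs a query predicate with a delivery callback.
        self._subscriptions = []

    def subscribe(self, query, deliver):
        """Register a query; matching items are passed to deliver(item)."""
        self._subscriptions.append((query, deliver))

    def publish(self, item):
        """Push item to every subscriber whose query matches it.

        Returns the number of deliveries made.
        """
        delivered = 0
        for query, deliver in self._subscriptions:
            if query(item):
                deliver(item)
                delivered += 1
        return delivered


# Usage: one subscriber interested in items whose title mentions "VLDB".
inbox = []
broker = Broker()
broker.subscribe(lambda item: "VLDB" in item["title"], inbox.append)
broker.publish({"title": "VLDB 2005 program"})  # delivered to inbox
broker.publish({"title": "weather"})            # matches no subscription
```

This sketch checks every subscription on every publication, so the work per item grows linearly with the number of subscriptions; that cost is what an efficient matching algorithm has to avoid.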
Research challenges
1. Need a subscription (query) language suitable for filtering RSS documents
2. Need an efficient matching algorithm based on a graph representation
   • Structural matching
   • Constraint matching
3. Scalability to a large number of subscriptions and a high publishing rate
[Figure: matching time (ms) vs. number of subscriptions, varying the number of subscriptions from 5,000 to 100,000; y-axis 0 to 160 ms]
Subscription Scalability
Memory Scalability
[Figure: memory size (M) vs. number of subscriptions, from 5,000 to 100,000 subscriptions; y-axis 0 to 800]
Matching Semantics
Publication
  PAPER17
    AUTHOR: “Arno Jacobsen”
    CONFERENCE: SIGMOD
    LOCATION: “California”
    YEAR: “2001”
Subscription
  ?y (?y <= Publication)
    AUTHOR: “Arno Jacobsen”
    CONFERENCE: SIGMOD
    YEAR: ?z (?z > 2000)
Data Model (RSS Documents)
Publications are represented as directed graphs with node and edge labels
Node labels are typed
  Literal value
  Class
Edge labels are typed
  Class
Classes can be related using a multiple-inheritance ontology
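As a rough illustration of this data model, the sketch below stores a publication as labeled edges between typed nodes and checks class relationships in a multiple-inheritance ontology. All class names in `ONTOLOGY`, the `Node` and `Graph` types, and the `is_subclass` helper are assumptions made for the example; the slides do not specify them.

```python
from dataclasses import dataclass, field

# Hypothetical ontology: each class maps to its direct superclasses.
# "Paper" has two parents to show multiple inheritance.
ONTOLOGY = {
    "Paper": ["Publication", "PeerReviewedWork"],
    "Publication": ["Document"],
    "PeerReviewedWork": [],
    "Document": [],
}


def is_subclass(cls, ancestor, ontology):
    """Return True if cls is ancestor or reaches it via superclass links."""
    stack, seen = [cls], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(ontology.get(current, []))
    return False


@dataclass(frozen=True)
class Node:
    label: str
    kind: str  # "literal" or "class", the two node label types on the slide


@dataclass
class Graph:
    # Each edge is (source node, edge label, target node); edge labels are class names.
    edges: list = field(default_factory=list)

    def add_edge(self, source, label, target):
        self.edges.append((source, label, target))


# Usage: a one-edge publication graph.
doc = Graph()
doc.add_edge(Node("Paper", "class"), "AUTHOR", Node("Arno Jacobsen", "literal"))
```

Storing only direct superclasses keeps the ontology compact; the traversal in `is_subclass` follows every parent, which is what makes multiple inheritance work here.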
Query Language (GQL)
Queries are represented as directed graph patterns with node and edge labels
Node labels are variables
Variables can be constrained by
  Classes
  Class instances and literal values
Edge labels are class instances
Mapping (matching) semantics
  A pattern graph maps to a data graph if the topology (structure) of the two graphs matches and all variable constraints are satisfied
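The mapping semantics can be sketched with a naive backtracking matcher, using the PAPER17 example from the Matching Semantics slide. This is not the CMS-ToPSS matching algorithm, only an illustration of the definition. The `INSTANCE_OF` and `SUPERCLASSES` tables, the helper names, and storing the year as an integer are assumptions made for the example.

```python
# Data graph as (source, edge label, target) triples.
publication = [
    ("PAPER17", "AUTHOR", "Arno Jacobsen"),
    ("PAPER17", "CONFERENCE", "SIGMOD"),
    ("PAPER17", "LOCATION", "California"),
    ("PAPER17", "YEAR", 2001),  # stored as an integer for simplicity (assumption)
]

# Hypothetical ontology: PAPER17 is a Paper, and Paper is a kind of Publication.
INSTANCE_OF = {"PAPER17": "Paper"}
SUPERCLASSES = {"Paper": ["Publication"], "Publication": []}


def is_a(node, cls):
    """True if node is an instance of cls or of one of its subclasses."""
    stack, seen = [INSTANCE_OF.get(node)], set()
    while stack:
        current = stack.pop()
        if current is None or current in seen:
            continue
        if current == cls:
            return True
        seen.add(current)
        stack.extend(SUPERCLASSES.get(current, []))
    return False


# Pattern graph: every node is a variable; constraints restrict the variables.
pattern = [("?y", "AUTHOR", "?a"), ("?y", "CONFERENCE", "?c"), ("?y", "YEAR", "?z")]
constraints = {
    "?y": lambda v: is_a(v, "Publication"),          # ?y <= Publication
    "?a": lambda v: v == "Arno Jacobsen",
    "?c": lambda v: v == "SIGMOD",
    "?z": lambda v: isinstance(v, int) and v > 2000,  # ?z > 2000
}


def match(pattern, constraints, data, binding=None):
    """Return a variable binding that maps the pattern onto data, or None."""
    binding = binding or {}
    if not pattern:
        return binding
    (src, label, dst), rest = pattern[0], pattern[1:]
    for d_src, d_label, d_dst in data:
        if d_label != label:
            continue  # structure: edge labels must agree
        new, ok = dict(binding), True
        for var, value in ((src, d_src), (dst, d_dst)):
            if var in new:
                ok = new[var] == value  # variable already bound elsewhere
            else:
                ok = constraints.get(var, lambda v: True)(value)
                new[var] = value
            if not ok:
                break
        if ok:
            result = match(rest, constraints, data, new)
            if result is not None:
                return result
    return None


result = match(pattern, constraints, publication)
```

Here `result` binds `?y` to `PAPER17` and `?z` to `2001`. The extra LOCATION edge in the publication does not prevent a match, because the pattern only has to map into the data graph.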