Results on Data Delivery (WP3)

DBGlobe IST-2001-32645

1st Review, Paphos, January 31, 2003

Future and Emerging Technologies (FET): The roots of innovation

Proactive initiative on: Global Computing (GC)

2

WP3 Outline

Co-ordination/Data Delivery:

Task 3.1 Data delivery among the system components. Derive adaptive data delivery mechanisms considering various modes of delivery such as push (transmission of data without an explicit request) and pull, periodic and aperiodic, multicast and unicast delivery.

Task 3.2 Model the co-ordination of the mobile entities using workflow management (and transactional workflows) and techniques used in the multi-agent community.

DBGlobe, 1st Annual Review Paphos, Jan 2003

3

Timeline …

[Gantt chart: WP3 tasks 3.1 Data Delivery, 3.2 Coordination, and 3.3 Performance plotted over project months 3 to 24 (Year 1 and Year 2).]

Deliverables

D8: Data Delivery Mechanisms (Oct 2002)

D9: Modeling Coordination Through Workflows (April 2003)

D10: Data Delivery and Querying (August 2003)

4

Outcomes of WP3 so far:

D8: Data Delivery Mechanisms

• A taxonomy of mechanisms

• An outline of potential use within the DBGlobe architecture

A number of specific results in data delivery:

• Coherent Push-based Data Delivery

• Adaptive Multi-version Broadcast Data Delivery

• Efficient Publish-Subscribe Data Delivery

5

In this presentation:

A note on the different modes

Summary of technical results

1. Coherent Data Delivery

2. Adaptive Multi-version Broadcast Data Delivery

3. Efficient Publish-Subscribe Data Delivery

6

Client Pull vs. Server Push

pull-based: transfer of information is initiated by the client

push-based: server-initiated; servers send information to clients without any specific request. Push is scalable, but clients may receive irrelevant data

hybrid scheme: hot data are pushed and cold data are pulled
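The hybrid scheme can be pictured as a partition of the items by popularity. The sketch below is only an illustration: the access counts, the `hot_threshold` cut-off, and the function name `plan_delivery` are assumptions and do not come from the DBGlobe design.

```python
def plan_delivery(access_counts, hot_threshold):
    """Split items into a pushed (hot) set and a pulled (cold) set.

    access_counts: dict mapping item name -> observed number of requests
    hot_threshold: hypothetical cut-off; items at or above it count as hot
    """
    push = sorted(item for item, n in access_counts.items() if n >= hot_threshold)
    pull = sorted(item for item, n in access_counts.items() if n < hot_threshold)
    return push, pull


# Example: two popular items go on the push channel, the rest stay on demand.
push, pull = plan_delivery({"stock": 90, "archive": 3, "weather": 40}, hot_threshold=40)
```

Hot items are sent to everyone without a request, while cold items are served only when a client asks for them.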

Aperiodic vs. Periodic

aperiodic delivery: usually event-driven; a data request (for pull) or a transmission (for push) is triggered by an event (e.g., a user action for pull or a data update for push).

periodic delivery: performed according to some pre-arranged schedule

D8: Taxonomy of Different Modes of Data Delivery

Data Delivery Modes

7

Unicast vs 1-N

Unicast: data sent from a data source (server) to a single client

1-to-N: data sent by a source is received by multiple clients (multicast and broadcast)

Data vs. Query Shipping

Based on the unit of interaction between clients and data sources

Depends on whether the data sources have data processing capabilities

Query shipping may result in reducing the communication load, since only relevant data sets are delivered to the client.
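The difference in communication load can be sketched by counting how many records cross the network in each case. This is a toy model under stated assumptions: the data source is an in-memory list, the "network" is just a count of shipped records, and both function names are hypothetical.

```python
def data_shipping(source, predicate):
    """The client fetches everything and filters locally."""
    shipped = list(source)                       # whole data set is transferred
    result = [r for r in shipped if predicate(r)]
    return result, len(shipped)


def query_shipping(source, predicate):
    """The source evaluates the query and ships only matching records."""
    result = [r for r in source if predicate(r)]  # evaluated at the data source
    return result, len(result)


records = list(range(100))
wanted = lambda r: r % 10 == 0
same_a, cost_data = data_shipping(records, wanted)
same_b, cost_query = query_shipping(records, wanted)
```

Both calls return the same answer; only the number of shipped records differs, and query shipping is available only when the source can process the query.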

D8: Taxonomy of Different Modes of Data Delivery

8

D8: Taxonomy of Different Modes of Data Delivery

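[Figure: taxonomy of delivery modes along three axes: pull vs. push, periodic vs. aperiodic, and unicast vs. 1-N (broadcast). Example mechanisms placed in the taxonomy:]

• Request/Response (aperiodic, unicast, pull)

• Polling (periodic, unicast, pull)

• Email list (aperiodic, unicast, push)

• Publish/subscribe (aperiodic, 1-N, push)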
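The taxonomy can also be written down as a small lookup table. The sketch below encodes only the four example mechanisms named on the slide; the `Mode` type and the `mechanisms` helper are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Mode(NamedTuple):
    timing: str       # "periodic" or "aperiodic"
    fanout: str       # "unicast" or "1-N"
    initiation: str   # "push" or "pull"


# The four examples from the taxonomy slide.
EXAMPLES = {
    "email list": Mode("aperiodic", "unicast", "push"),
    "publish/subscribe": Mode("aperiodic", "1-N", "push"),
    "polling": Mode("periodic", "unicast", "pull"),
    "request/response": Mode("aperiodic", "unicast", "pull"),
}


def mechanisms(**criteria):
    """Return the example mechanisms matching every given axis value."""
    return sorted(
        name
        for name, mode in EXAMPLES.items()
        if all(getattr(mode, axis) == value for axis, value in criteria.items())
    )
```

For instance, `mechanisms(initiation="pull")` selects polling and request/response.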

9

Outline:

A note on the different modes

Summary of technical results

1. Coherent Push-based Data Delivery

2. Adaptive Multi-version Broadcast Data Delivery

3. Efficient Publish-Subscribe Data Delivery

10

The Data Broadcast Push Model

[Figure: a server sends data over a broadcast channel to the clients.]

• The server broadcasts data from a database to a large number of clients

• Push mode with no direct communication with the server (stateless server, e.g., sensors), hence “client-side” protocols

• Data updates at the server

• Periodic updates for the values on the channel

• Efficient way to disseminate information to large client populations with similar interests

• Physical support in wireless networks (satellite, cellular)

• Various other applications: sensor networks, data streams

Coherent Data Delivery

11

Coherent Data Delivery

Our Goal

Ensure that clients receive temporally coherent (e.g., current) and semantically coherent (transaction-wise) data

1. Provide a model for temporal and semantic coherency

2. Show what type of coherency we get if there are no additional protocols

3. Show what type of coherency is achieved by a number of protocols proposed in the literature (and their extensions)

12

Temporal Coherency: Model

(Currency Interval of an Item) CI(x, R), the currency interval of x in the readset of R, is [cb, ce), where cb is the time instance at which the value of x read by R was stored in the database and ce is the time instance of the next change of this value in the database. If the value read by R has not been changed subsequently, ce is infinity.

Currency properties of the readset (the set of items read and their values) are based on the currency of the items in the readset.

Based on CI(x, R), there are two types of currency of the readset of a transaction R:

• Overlapping

• Oldest-value

13

(Overlapping Currency) R is overlapping current if there is an interval of time that is included in the currency interval of all items in R's readset, that is, if the intersection of CI(x, R) over all (x, u) ∈ RS(R) is non-empty. Say this intersection is [cb, ce). Then the overlapping currency of R is

Overlap(R) = ce⁻, if ce is not infinity
Overlap(R) = current_time, otherwise

(Oldest-Value Currency) In general, the oldest-value currency of a transaction R, denoted OV(R), is ce⁻, where ce is the smallest among the endpoints of the CI(x, R), for every x with (x, u) ∈ RS(R).

If R is overlapping current, Overlap(R) = OV(R)
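The two currency notions can be computed directly from the currency intervals. The sketch below is a minimal illustration under stated assumptions: intervals are `(cb, ce)` pairs with `math.inf` for an unchanged value, the superscript minus on ce is not modeled (the functions report the endpoint ce itself), and the sample intervals are hypothetical rather than read off the slides' figures.

```python
import math


def common_interval(intervals):
    """Intersect half-open intervals [cb, ce); return None if it is empty."""
    cb = max(b for b, _ in intervals)
    ce = min(e for _, e in intervals)
    return (cb, ce) if cb < ce else None


def oldest_value_endpoint(intervals):
    """Smallest endpoint ce among the currency intervals (basis of OV(R))."""
    return min(e for _, e in intervals)


def overlap_endpoint(intervals, current_time):
    """Endpoint used for Overlap(R), or None if R is not overlapping current."""
    common = common_interval(intervals)
    if common is None:
        return None
    _, ce = common
    return current_time if math.isinf(ce) else ce


# Hypothetical currency intervals for a readset of four items.
ci = [(2, 10), (4, 8), (6, 12), (3, math.inf)]
```

With these sample intervals the common interval is [6, 8), so both functions report the endpoint 8, in line with Overlap(R) = OV(R) for an overlapping current transaction.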

Temporal Coherency: Model

14

If not overlapping, we want to measure the discrepancy among the database states seen by a transaction: temporal spread

(Temporal Spread of a Readset) Let min_ce be the smallest among the endpoints and max_cb the largest among the begin-points of the CI(x, R) for x in the readset of a transaction R. Then

temporal_spread(R) = max_cb - min_ce, if max_cb > min_ce
temporal_spread(R) = 0, otherwise.

For an overlapping current transaction, the temporal spread is zero.
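The definition translates almost literally into code. In this sketch the currency intervals are assumed to be `(cb, ce)` pairs, and the sample values are hypothetical.

```python
import math


def temporal_spread(intervals):
    """max_cb - min_ce if the intervals do not all overlap, else 0."""
    max_cb = max(cb for cb, _ in intervals)
    min_ce = min(ce for _, ce in intervals)
    return max_cb - min_ce if max_cb > min_ce else 0


# One value stopped being current at 8, another only became current at 9.
spread = temporal_spread([(2, 8), (9, 14)])
```

When all intervals share a common point, max_cb is smaller than min_ce and the function returns 0.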

Temporal Coherency: Model

15

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1); R1 reads x1, x2, x3, x4.]

Overlapping current with Overlap(R) = 8 and temporal_spread(R) = 0

16

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1); R1 reads x1, x2, x3, x4.]

Not overlapping, but OV(R) = 8 and temporal_spread(R) = 9 – 8 = 1, where min_ce = 8 (the oldest value read) and max_cb = 9 (the most current value).

17

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1); R1 reads x1, x2, x3, x4.]

Not Overlapping, but OV(R) = 8

and temporal_spread(R) = 15 – 8 = 9

Oldest value read (min_ce)

max_cb (most current)

18

Besides discrepancy, we also measure currency (how old the values seen are).

(Transaction-Relative Currency) R is relative overlapping current with respect to time instance t if t ∈ CI(x, R) for every x read by R. R is relative oldest-value current with respect to time instance t if t ≤ OV(R).

(Temporal Lag) Let tc be the largest t ≤ tcommit_R with respect to which R is relative (overlapping or oldest-value) current. Then temporal_lag(R) = tcommit_R - tc.

The smaller the temporal lag and the temporal spread, the higher the temporal coherency of a read transaction. The best temporal coherency is achieved when R is relative overlapping current with respect to tcommit_R (both the temporal lag and the temporal spread are zero).

Temporal Coherency: Model
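The temporal lag can be sketched as follows. The function takes the currency of R (Overlap(R) or OV(R)) as an input rather than deriving it, and it assumes that the commit time is not earlier than the start of the common currency interval; both the function name and this simplification are assumptions of the sketch.

```python
def temporal_lag(t_commit, currency):
    """Lag between commit time and the latest instant R is current for.

    currency: Overlap(R) or OV(R), the latest time instance with respect
    to which R is (relative) current.
    """
    t_c = min(t_commit, currency)   # largest t <= t_commit with t <= currency
    return t_commit - t_c


lag_at_commit_12 = temporal_lag(t_commit=12, currency=8)
```

A transaction that commits no later than its currency has lag 0, while one with currency 8 that commits at 12 has lag 12 - 8 = 4.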

19

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1), and the position of transaction R1.]

Overlapping current with Overlap(R) = 8

temporal_spread(R) = 0

temporal_lag(R) = 0

20

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1), and the position of transaction R1.]

Overlapping current with Overlap(R) = 8

temporal_spread(R) = 0

temporal_lag(R) = 12 – 8 = 4

21

Example (Temporal Coherency: Model)

[Figure: timeline with ticks from 2 to 20 showing the currency intervals CI(x1, R1), CI(x2, R1), CI(x3, R1), and CI(x4, R1), and the position of transaction R1.]

Overlapping current with Overlap(R) = 8

temporal_spread(R) = 0

temporal_lag(R) = 19 – 8 = 11

22

What is the coherency of R (temporal lag and spread) if R just reads items from the broadcast?

Let tlastread_R be the time instance at which R performs its last read.

temporal_lag(R) ≤ tcommit_R - begin_cycle(tbegin_R) and temporal_spread(R) ≤ tlastread_R - begin_cycle(tbegin_R)

(Tight bounds) There are cases in which we get the worst lag and spread.

• If pu = 0 (immediate updates), best (worst) lag and spread

• If all items from the same cycle, spread is 0, and lag = pu
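The two bounds above are easy to evaluate once begin_cycle is fixed. The sketch below assumes that update periods of length pu start at time 0, so that begin_cycle(t) rounds t down to a multiple of pu; this alignment, the function names, and the sample times are assumptions made for illustration.

```python
def begin_cycle(t, pu):
    """Start of the update period containing t (periods assumed to start at 0)."""
    if pu <= 0:
        raise ValueError("this sketch needs a positive update period")
    return (t // pu) * pu


def coherency_bounds(t_begin, t_lastread, t_commit, pu):
    """Upper bounds on temporal lag and spread for a plain broadcast reader."""
    start = begin_cycle(t_begin, pu)
    return {
        "max_lag": t_commit - start,
        "max_spread": t_lastread - start,
    }


bounds = coherency_bounds(t_begin=23, t_lastread=41, t_commit=45, pu=10)
```

With these sample times the transaction begins in the period that started at 20, so the lag is at most 25 and the spread at most 21.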

Temporal Coherency: Protocols

23

Basic Techniques

Protocols fall into two broad categories:

• invalidation (which corresponds to broadcasting the endpoints (ce) of the currency interval for each item)

• versioning (which corresponds to broadcasting the begin-points (cb) of the currency interval for each item)

plus a hybrid protocol that combines versioning and invalidation.

Temporal Coherency: Protocols

24

Invalidation

Periodically broadcast an invalidation report (IR): a list of the items that have been updated since the broadcast of the previous IR.

In the paper: variations that give transactions with different values of temporal spread and lag.

Versioning

With each item, broadcast a timestamp (version) indicating when its value was created.

Again in the paper: variations that give transactions with different values of temporal lag (spread is always 0).

Temporal Coherency: Protocols
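A client-side view of the two techniques might look like the sketch below. It is only one possible reading: the slides do not spell out the client rules, so the abort-on-invalidated-read rule, the rule of accepting only versions no newer than the cycle of the first read, and all names (`ClientTransaction`, `on_invalidation_report`, `read_versioned`) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ClientTransaction:
    read_items: Set[str] = field(default_factory=set)
    first_read_cycle: Optional[float] = None
    aborted: bool = False

    def on_invalidation_report(self, updated_items):
        """Invalidation: abort if the report lists an item already read."""
        if not self.read_items.isdisjoint(updated_items):
            self.aborted = True

    def read_versioned(self, item, version_ts, cycle_start):
        """Versioning: accept a value only if it is no newer than the
        cycle in which the transaction performed its first read."""
        if self.first_read_cycle is None:
            self.first_read_cycle = cycle_start
        if version_ts > self.first_read_cycle:
            self.aborted = True
            return False
        self.read_items.add(item)
        return True


tx = ClientTransaction()
tx.read_versioned("x", version_ts=5, cycle_start=10)   # accepted
tx.read_versioned("y", version_ts=12, cycle_start=20)  # too new: abort
```

Under the assumed versioning rule every accepted value was already current at the first-read cycle, which is consistent with the slide's remark that the spread is always 0.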

25

Definitions of Semantic Coherency (Consistency)

C0

C1 RS(R) ⊆ DS (the readset is a subset of a consistent database state DS)

C2 R is serializable with the set of server transactions that wrote values read (directly or indirectly) by R

C3 R is serializable with all server transactions

C4 R is serializable with all server transactions, and the serializability order of the server transactions that R observes is consistent with the commit order of transactions at the server

Semantic Coherency: Model

Rigorous schedules: commit order compatible with the serialization order
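The C1 condition is a plain subset test. The sketch below assumes, unrealistically, that the candidate consistent database states are available as explicit sets of (item, value) pairs; a real client would not have them, and the function name `is_c1` is hypothetical.

```python
def is_c1(readset, consistent_states):
    """True if the readset is a subset of at least one consistent state."""
    return any(readset <= state for state in consistent_states)


# Two hypothetical consistent states, e.g. the database at two cycles.
state_1 = {("x", 1), ("y", 1)}
state_2 = {("x", 2), ("y", 2)}

same_cycle = {("x", 1), ("y", 1)}      # both values from state_1
mixed_cycles = {("x", 1), ("y", 2)}    # values from two different states
```

Here `same_cycle` passes the test and `mixed_cycles` does not, since no single state contains both of its values.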

26

Relating Semantic and Temporal Coherency

(Currency Interval of an Item)

CI(x, R): currency interval of x in the readset of R = [cb, ce)

where cb is the commit time of the transaction that wrote the value of x read by R, and ce is the commit time of the transaction that updated x immediately after, or infinity

27

Reading from a single cycle

If transaction R reads all items from the same cycle, it is C1 but not necessarily C2

Semantic Coherency: Protocols

If the server schedule is rigorous and R reads all items from the same cycle, it is C4

28

Read Test Theorem

Semantic Coherency: Protocols

It suffices to check for violation of C2, C3, and C4 by a client transaction R when R reads a data item if and only if the server schedule is rigorous

In the paper: various read-tests (based on testing the serializability graph) for attaining various Ci-consistency degrees and their relationships to proposed approaches in the literature

29

Future Work

Coherency in Broadcast-Based Dissemination

• Multiple servers: what semantic and temporal coherency does the client get?

• Performance Evaluation of the various types of coherency

Reference

E. Pitoura, P. K. Chrysanthis and K. Ramamritham. “Characterizing the Temporal and Semantic Coherency of Broadcast-based Data Dissemination”. Proc. of the 9th International Conference on Database Theory (ICDT03), January 2003, Siena, Italy.

30

Outline:

A note on the different modes

Summary of technical results

1. Coherent Push-based Data Delivery

2. Adaptive Multi-version Broadcast Data Delivery

3. Efficient Publish-Subscribe Data Delivery

31

Multi-version Broadcast

Similar Model BUT

The server (data source) at each cycle sends not just one value per item but instead multiple versions per item

Applications:

Multiple data servers share the channel (multi-sensors networks)

Enhance consistency at the server (similar to multi-version schemes in traditional client-server systems)

32

Multi-version Broadcast

Issues

How should the broadcast be organized?

What are appropriate client-cache protocols?

Adaptability

Performance depends on client access patterns

Historical queries

Random queries
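For a historical query, a client listening to a multi-version broadcast has to pick, among the versions of an item, the one that was current at the time of interest. The sketch below assumes each item's versions arrive as a list of (timestamp, value) pairs sorted by timestamp; this layout and the name `select_version` are assumptions, not the broadcast organization studied in the referenced papers.

```python
from bisect import bisect_right


def select_version(versions, as_of):
    """Return the value of the latest version created at or before as_of.

    versions: list of (timestamp, value) pairs sorted by timestamp
    Returns None if no version existed yet at time as_of.
    """
    timestamps = [ts for ts, _ in versions]
    i = bisect_right(timestamps, as_of)
    return versions[i - 1][1] if i else None


# Three versions of one item as they might appear in a single cycle.
versions_of_x = [(1, "a"), (5, "b"), (9, "c")]
value_at_6 = select_version(versions_of_x, as_of=6)
```

In this example the version created at time 5 is the one current at time 6, so the lookup yields "b".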

33

References

E. Pitoura and P. K. Chrysanthis. “Multiversion Data Broadcast”, IEEE Transactions on Computers 51(10):1224-1230, October, 2002

O. Shigiltchoff, P. K. Chrysanthis and E. Pitoura. “Multi-version Data Broadcast Organizations”. In Proc. of the 6th East European Conference on Advances in Databases and Information Systems (ADBIS), September 2002, Bratislava, Slovakia

O. Shigiltchoff, P. K. Chrysanthis and E. Pitoura. “Adaptive Multi-version Data Broadcast Organizations”, In preparation for journal publication

Multi-version Broadcast

34

Outline:

A note on the different modes

Summary of technical results

1. Coherent Push-based Data Delivery

2. Adaptive Multi-version Broadcast Data Delivery

3. Efficient Publish-Subscribe Data Delivery

35

Extra Slides

36

Extra Slides (coherency)

37

The Model

Client

Server

Broadcast Channel

• The server repetitively pushes data from a database to a large number of clients

• sequential client access

• asymmetry:

• large number of clients

• transmission capabilities

• Client-site protocols

• The server is stateless

• Data updates at the server

Coherent Data Delivery

38

Updates

Coherency in Broadcast-Based Dissemination

Data are updated at the server

What is the value broadcast at time instance t?

We assume periodic updates with an update period of pu, meaning that the value placed on the broadcast at time t is the value of the item at the beginning of the update period, denoted begin_cycle(t)

For periodic broadcast, usually pu is equal to the broadcast period

39

Preliminary Definitions

Database state: set of (data item, value) pairs

Readset of a transaction R, RS(R): set of (data item, value) pairs that R read

BSc: the content of the broadcast at the cycle that starts at time instance c (again a set of (data item, value) pairs)

R may read items from different broadcast cycles, thus items in RS(R) may correspond to different database states

Coherent Data Delivery

40

Variations: instead of a single client transaction, a set S of client transactions

Example

C3-site: all transactions of a client are serializable with all server transactions

C3-all

Semantic Coherency: Model

41

Relating Semantic and Temporal Coherency

If R is overlapping current, then it is C1 consistent

Assumptions:

• Server schedules are serializable

• Broadcast only committed values

42

Relating Semantic and Temporal Coherency

(Currency Interval of an Item)

CI(x, R): currency interval of x in the readset of R = [cb, ce)

where cb is the commit time of the transaction that wrote the value of x read by R, and ce is the commit time of the transaction that updated x immediately after, or infinity

Note:

• overlapping currency is similar to vintage transactions (server schedules are serializable: ce-vintage)

• semantic currency similar to t-bound, if OV(R) = to, then to-bound

43

Previous Work

• Datacycle [Bowen et al, CACM92] – hardware for detecting changes

Extended for multiple servers [Banerjee&Li, JCI94]

• Certification reports [Barbara, ICDCS97]

• F-Matrix for (update (C2) consistency) [Shanmugasundaram, SIGMOD99]

• SGT Graph (for serializability) [Pitoura, ER-Workshop98], [Pitoura, DEXA-Workshop98], [Pitoura&Chrysanthis, ICDCS99]

• Multiple Versions [Pitoura&Chrysanthis, VLDB99] [Pitoura&Chrysanthis, IEEE TOC 2003]

• cache consistency

(e.g., [Barbara&Imielinski, SIGMOD95, Acharya et al, VLDB1996])

Coherency in Broadcast-Based Dissemination

44

DBGlobe IST-2001-32645