Distributed Top-K Monitoring

Distributed Top-K Monitoring. Brian Babcock & Chris Olston. Presented by Yuval Altman. To be presented at ACM SIGMOD 2003 International Conference on Management of Data. The problem: continuously report the k largest values obtained from distributed data streams.

1

Distributed Top-K Monitoring

Brian Babcock & Chris Olston

Presented by Yuval Altman

To be presented at ACM SIGMOD 2003 International Conference on Management of Data

2

The problem

Continuously report the k largest values obtained from distributed data streams.

3

Motivation

Google is the most popular search engine in the world.

Servers in multiple sites in the world handle millions of queries an hour.

What are the top 20 search terms?

4

The problem

Continuously report the k largest values obtained from distributed data streams.

Multiple sources, physically far apart: communication is expensive, so transmitting large amounts of data is inefficient.

Streaming model: values change over time.

Approximation may be sufficient.

5

Motivation: detecting DDoS attacks

6

Formal problem definition

m+1 nodes:

Monitor nodes: N1, N2, …, Nm

Coordinator node: N0

Set of n data objects U = {O1, O2, …, On}, e.g. search terms, IP addresses

Objects are associated with real values V1, V2, …, Vn, e.g. number of requests, or DNS queries for an IP address in the last 15 minutes

7

Distributed streaming model

Updates to values arrive as a sequence of tuples $\langle O_i, N_j, \Delta \rangle$, where:

$N_j$ detects a change $\Delta$ in the value $V_i$ of $O_i$.

The change is not seen by the other nodes $N_k$ ($k \neq j$).

For each node $N_j$, define partial values $V_{1,j}, V_{2,j}, \ldots, V_{n,j}$:

$V_{i,j} = \sum_{\langle O_i, N_j, \Delta \rangle} \Delta$

The value $V_i$ for an object $O_i$: $V_i = \sum_j V_{i,j}$

8

Model example

U = {O1, O2, O3, O4}

N1: <O1, N1, 2> <O2, N1, 3> <O4, N1, 4> <O3, N1, 2> <O1, N1, 1>

N2: <O2, N2, 3> <O4, N2, 5> <O4, N2, -2> <O3, N2, 4> <O3, N2, 5>

N3: <O2, N3, -1> <O3, N3, 4> <O2, N3, 2> <O3, N3, 3> <O2, N3, 5>

N1: V1,1 = 3, V2,1 = 3, V3,1 = 2, V4,1 = 4

N2: V1,2 = 0, V2,2 = 3, V3,2 = 9, V4,2 = 3

N3: V1,3 = 0, V2,3 = 6, V3,3 = 7, V4,3 = 0

V1 = 3, V2 = 12, V3 = 18, V4 = 7
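The model example can be reproduced with a few lines of Python. This is only a sketch: the names `UPDATES`, `partial_values` and `total_values` are illustrative and do not come from the slides, and the tuples are written as (object, node, delta).

```python
from collections import defaultdict

# The fifteen update tuples from the model example, as (object, node, delta).
UPDATES = [
    ("O1", "N1", 2), ("O2", "N1", 3), ("O4", "N1", 4), ("O3", "N1", 2), ("O1", "N1", 1),
    ("O2", "N2", 3), ("O4", "N2", 5), ("O4", "N2", -2), ("O3", "N2", 4), ("O3", "N2", 5),
    ("O2", "N3", -1), ("O3", "N3", 4), ("O2", "N3", 2), ("O3", "N3", 3), ("O2", "N3", 5),
]


def partial_values(updates):
    """Fold update tuples into partial[node][obj], i.e. the partial value V_{i,j}."""
    partial = defaultdict(lambda: defaultdict(int))
    for obj, node, delta in updates:
        partial[node][obj] += delta
    return partial


def total_values(partial):
    """Sum the partial values over all nodes: V_i = sum over j of V_{i,j}."""
    totals = defaultdict(int)
    for per_node in partial.values():
        for obj, value in per_node.items():
            totals[obj] += value
    return dict(totals)


partial = partial_values(UPDATES)
totals = total_values(partial)
print(totals)
```

The totals agree with the slide: V1 = 3, V2 = 12, V3 = 18, V4 = 7. In the distributed setting no single node ever sees all of `UPDATES`; each monitor only holds its own row of `partial`.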

9

Using the model

Top-k IP addresses in the last 15 minutes: emit <IPAddr, Router, 1> when receiving a request for an IP address, and a cancelling <IPAddr, Router, -1> 15 minutes afterwards.

A different strategy can be adopted: emit <IPAddr, Router, 15> when receiving a request, then <IPAddr, Router, -1> once a minute, 15 times.

10

The problem

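The coordinator node N0 must report a set $T \subseteq U$, $|T| = k$, that represents the top-k data objects.

The set must be correct within $\epsilon$.

Formally, if $O_t \in T$ and $O_s \in U - T$:

$V_t + \epsilon \geq V_s$

Example

$\epsilon = 5$

Values: 100, 97, 95, 92, 90, 88, 87, 83, 80, 75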
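The correctness condition can be checked mechanically: a reported set is acceptable when its smallest value, plus the error tolerance, is at least the largest value left out. A minimal sketch follows; the function name `is_valid_top_k` and the object labels "a" to "e" are hypothetical, and the values are the first five from the example.

```python
def is_valid_top_k(values, top_set, epsilon):
    """Return True if V_t + epsilon >= V_s for every t in top_set and s outside it."""
    inside = [v for obj, v in values.items() if obj in top_set]
    outside = [v for obj, v in values.items() if obj not in top_set]
    if not inside or not outside:
        return True  # nothing to compare against
    # Comparing the weakest member with the strongest non-member covers all pairs.
    return min(inside) + epsilon >= max(outside)


values = {"a": 100, "b": 97, "c": 95, "d": 92, "e": 90}

exact = is_valid_top_k(values, {"a", "b", "c"}, epsilon=0)
approx = is_valid_top_k(values, {"a", "b", "d"}, epsilon=5)
too_far = is_valid_top_k(values, {"a", "b", "e"}, epsilon=4)
print(exact, approx, too_far)
```

With a tolerance of 5, the set containing 92 instead of 95 is still an acceptable answer, because 92 + 5 >= 95; with a tolerance of 4, a set containing 90 instead of 95 is not.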

11

Related work

One-time distributed top-k calculation: Bruno, Gravano, Marian 2002; Fagin, Lotem, Naor 2001

Much better than transmitting all the values to the coordinator node.

Not streaming: there is no means to detect changes to the data, and running the algorithm continuously is very expensive.

Monitor nodes have limited query capabilities: sorted access (GetNext) and random access (GetValue).

12

Related work

Streaming top-k monitoring from a single source: Charikar, Chen, Farach-Colton 2002; Manku, Motwani 2002; Gibbons, Matias 1998

Randomized algorithms. Focus on minimizing space.

Reminder: The objective is to minimize communication costs

13

Overview of algorithm

Initialize a top-k set at the coordinator node.

Set arithmetic constraints at the monitor nodes. The constraints depend on the current top-k set.

Constraints valid: no communication.

Constraints invalidated: resolution, possibly a new top-k set, and reallocation of constraints.

14

Choosing the constraints

Ideally, data is distributed evenly across the monitor nodes, so that the local top-k sets are all the same. In this case, the global top-k set matches the local top-k sets, and it suffices that the local constraints remain valid.

N1 (US): Money=100, Sex=98, Health=94, Mail=92

N2 (Germany): Sex=30, Money=20, Mail=5, Health=3

N3 (Japan): Money=50, Sex=5, Mail=4, Health=1

Global list: Money=170, Sex=133, Mail=101, Health=98

15

Adjustment factors

In real life, data is not distributed evenly

Updates: <Sex, N1, -8>, <Health, N3, 5>

N1 (US): Money=100, Health=94, Mail=92, Sex=90

N2 (Germany): Sex=30, Money=20, Mail=5, Health=3

N3 (Japan): Money=50, Health=6, Sex=5, Mail=4

Global list: Money=170, Sex=125, Health=103, Mail=101

Local constraints are invalidated, but the global top-k set is still valid.

16

Adjustment factors

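For each node $N_j$ and object $O_i$, associate an adjustment factor $\delta_{i,j}$.

Constraints are evaluated after adding the adjustment factors. If $O_t \in T$ and $O_s \in U - T$: $V_{t,j} + \delta_{t,j} \geq V_{s,j} + \delta_{s,j}$

Adjustment factors for each object sum to zero: $\sum_j \delta_{i,j} = 0$. This ensures that the adjusted partial values still sum to $V_i$.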

17

Adjustment factors example

Before adjustment:

N1 (US): Money=100, Health=94, Mail=92, Sex=90

N2 (Germany): Sex=30, Money=20, Mail=5, Health=3

N3 (Japan): Money=50, Health=6, Sex=5, Mail=4

Global list: Money=170, Sex=125, Health=103, Mail=101

Adjustment factors: $\delta_{Sex,1} = 10$, $\delta_{Sex,2} = -15$, $\delta_{Sex,3} = 5$

After adjustment:

N1 (US): Money=100, Sex=100, Health=94, Mail=92

N2 (Germany): Money=20, Sex=15, Mail=5, Health=3

N3 (Japan): Money=50, Sex=10, Health=6, Mail=4

Global list: Money=170, Sex=125, Health=103, Mail=101
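The effect of the adjustment factors in this example can be verified with a short sketch. The names `PARTIAL`, `DELTA`, `adjusted` and `local_constraints_hold` are illustrative, the top-k set is taken to be {Money, Sex} (k = 2), and N2's Money is taken as 20, which is what the global total of 170 implies.

```python
# Raw partial values per monitor node, before any adjustment.
PARTIAL = {
    "N1": {"Money": 100, "Health": 94, "Mail": 92, "Sex": 90},
    "N2": {"Sex": 30, "Money": 20, "Mail": 5, "Health": 3},
    "N3": {"Money": 50, "Health": 6, "Sex": 5, "Mail": 4},
}
# Adjustment factors from the slide; only Sex is adjusted, and the factors sum to zero.
DELTA = {"N1": {"Sex": 10}, "N2": {"Sex": -15}, "N3": {"Sex": 5}}
TOP = {"Money", "Sex"}  # assumed current top-k set


def adjusted(node):
    """Partial values of one node with its adjustment factors added."""
    return {obj: v + DELTA[node].get(obj, 0) for obj, v in PARTIAL[node].items()}


def local_constraints_hold(node_values, top):
    """Every top-k object must be at least as large as every other object."""
    lowest_top = min(v for obj, v in node_values.items() if obj in top)
    highest_rest = max(v for obj, v in node_values.items() if obj not in top)
    return lowest_top >= highest_rest


for node in PARTIAL:
    print(node, local_constraints_hold(PARTIAL[node], TOP),
          local_constraints_hold(adjusted(node), TOP))
```

Without adjustment the constraints fail at N1 and N3; with the three factors applied they hold at every node, while the global totals are unchanged because the factors cancel.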

18

Coordinator adjustment factor

For each object $O_i$, add an adjustment factor $\delta_{i,0}$ at the coordinator node. The factors for each object $O_i$ must still sum to 0.

To allow error, if $O_t \in T$ and $O_s \in U - T$: give $O_t$ values a "bonus" of $\epsilon$.

Let $V_{t,0} = V_{s,0} = 0$.

The constraint: $\delta_{t,0} + \epsilon \geq \delta_{s,0}$

19

Allowing error – example

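Update: <Health, N3, 40>, with $\epsilon = 5$

N1 (US): Money=100, Sex=98, Health=94, Mail=92

N2 (Germany): Sex=30, Money=20, Mail=5, Health=3

N3 (Japan): Money=50, Health=41, Sex=5, Mail=4

Global list: Money=170, Health=138, Sex=133, Mail=101

$\delta_{sex,1} = -4$, $\delta_{sex,2} = -25$, $\delta_{sex,3} = 29$

$\delta_{health,2} = 2$, $\delta_{health,3} = -7$

The trick: $\delta_{health,0} = 5$, so that $\delta_{sex,0} + 5 \geq \delta_{health,0}$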

20

Why do adjustment factors work?

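For $O_t \in T$ and $O_s \in U - T$:

As long as, for each node $N_j$, the adjusted constraints and the coordinator constraint are valid:

$V_{t,j} + \delta_{t,j} \geq V_{s,j} + \delta_{s,j}$

$\delta_{t,0} + \epsilon \geq \delta_{s,0}$

we can sum over the nodes and the error constraint and get: $V_t + \epsilon \geq V_s$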
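The summation step can be spelled out as follows. This derivation is a reconstruction from the definitions on the earlier slides (partial values sum to the value, and the factors of each object, including the coordinator's, sum to zero), not a quotation from the presentation.

```latex
\begin{align*}
\sum_{j=1}^{m} \left( V_{t,j} + \delta_{t,j} \right)
  &\geq \sum_{j=1}^{m} \left( V_{s,j} + \delta_{s,j} \right)
  && \text{sum of the monitor constraints} \\
V_t - \delta_{t,0} &\geq V_s - \delta_{s,0}
  && \text{since } \textstyle\sum_{j=0}^{m} \delta_{i,j} = 0 \\
V_t + \epsilon &\geq V_s + \left( \delta_{t,0} + \epsilon - \delta_{s,0} \right) \\
V_t + \epsilon &\geq V_s
  && \text{since } \delta_{t,0} + \epsilon \geq \delta_{s,0}
\end{align*}
```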

21

Algorithm details

Coordinator node N0 maintains: the current approximate top-k set, and all adjustment factors $\delta_{i,j}$.

Each monitor node Nj maintains: the current approximate top-k set and, for each object $O_i$, the partial value $V_{i,j}$ and the relevant adjustment factor $\delta_{i,j}$.

22

Algorithm details

Initialization. Coordinator: computes the approximate top-k set once, chooses adjustment factors, and sends the adjustment factors and the top-k set to the monitors.

Monitor node constraints: for $O_t \in T$ and $O_s \in U - T$: $V_{t,j} + \delta_{t,j} \geq V_{s,j} + \delta_{s,j}$

Adjustment factor constraints: for each object $O_i$: $\sum_j \delta_{i,j} = 0$

For objects $O_t \in T$ and $O_s \in U - T$: $\delta_{t,0} + \epsilon \geq \delta_{s,0}$

23

Algorithm for monitor node Nj

While (1)

    Read tuple $\langle O_i, N_j, \Delta \rangle$

    $V_{i,j} = V_{i,j} + \Delta$

    Check constraints: for $O_t \in T$ and $O_s \in U - T$: $V_{t,j} + \delta_{t,j} \geq V_{s,j} + \delta_{s,j}$

    If invalid, initiate resolution.

End

To check constraints: use two heaps (or Fibonacci heaps)
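The monitor loop can be sketched as a small class. This is an illustration under stated assumptions: the class name `MonitorNode` and its methods are hypothetical, and for brevity the check scans all pairs instead of keeping the two heaps mentioned on the slide (a min-heap over the adjusted top-k values and a max-heap over the rest would presumably let the check compare only the two heap tops).

```python
class MonitorNode:
    """One monitor node: partial values, adjustment factors and the current top-k set."""

    def __init__(self, top_set, deltas):
        self.top_set = set(top_set)
        self.deltas = dict(deltas)   # adjustment factor per object at this node
        self.values = {}             # partial value per object at this node

    def adjusted(self, obj):
        return self.values.get(obj, 0) + self.deltas.get(obj, 0)

    def violated_pairs(self):
        """All (t, s) with t in the top-k set, s outside it, and adjusted(t) < adjusted(s)."""
        known = set(self.values) | set(self.deltas) | self.top_set
        inside = [o for o in known if o in self.top_set]
        outside = [o for o in known if o not in self.top_set]
        return sorted((t, s) for t in inside for s in outside
                      if self.adjusted(t) < self.adjusted(s))

    def apply(self, obj, delta):
        """Process one update tuple; a non-empty result means resolution must start."""
        self.values[obj] = self.values.get(obj, 0) + delta
        return self.violated_pairs()


node = MonitorNode(top_set={"Money", "Sex"}, deltas={})
history = [node.apply(obj, delta)
           for obj, delta in [("Money", 50), ("Sex", 5), ("Mail", 4), ("Mail", 6)]]
print(history[-1])
```

Only the last update, which lifts Mail above Sex, produces a violated pair; the first three updates require no communication at all.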

24

Resolution – phase 1

First, Nj sends a message to N0 with:

F: the set of objects involved in violated constraints

All partial values for objects in the resolution set $R = F \cup T$

The border value $B_j$: the maximum adjusted value of an object not in the resolution set

Example, N3 (Japan): Money=50, Mail=10, Sex=5, Health=1, Love=0

F3 = {Mail, Sex}, R3 = {Money, Mail, Sex}

Vmoney,3 = 50, Vmail,3 = 10, Vsex,3 = 5, B3 = 1
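The contents of the phase 1 message can be computed as in the sketch below. The function name `phase1_message` and the dictionary layout are assumptions made for illustration; the top-k set is taken to be {Money, Sex}, and all adjustment factors at N3 are taken to be zero, which matches the numbers in the example.

```python
def phase1_message(values, deltas, top_set):
    """Build the message a monitor sends to the coordinator when a constraint breaks."""
    def adjusted(obj):
        return values.get(obj, 0) + deltas.get(obj, 0)

    objects = set(values) | set(top_set)
    # F: every object that takes part in at least one violated constraint.
    violated = set()
    for t in top_set:
        for s in objects - set(top_set):
            if adjusted(t) < adjusted(s):
                violated |= {t, s}
    resolution = violated | set(top_set)          # R = F union T
    rest = [adjusted(o) for o in objects - resolution]
    border = max(rest) if rest else None          # None when nothing lies outside R
    return {
        "F": violated,
        "R": resolution,
        "partial": {o: values.get(o, 0) for o in resolution},
        "border": border,
    }


msg = phase1_message(
    values={"Money": 50, "Mail": 10, "Sex": 5, "Health": 1, "Love": 0},
    deltas={},
    top_set={"Money", "Sex"},
)
print(msg)
```

The result matches the slide: F = {Mail, Sex}, R = {Money, Mail, Sex}, and the border value is 1 (Health).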

25

Resolution – phase 2

The coordinator N0 attempts to resolve the constraints using the $\delta_{*,0}$ slack.

For each violated constraint, N0 tests:

$V_{t,j} + \delta_{t,j} + \delta_{t,0} + \epsilon \geq V_{s,j} + \delta_{s,j} + \delta_{s,0}$

If all tests succeed, the top-k set is still valid and there is no need to communicate with other nodes. N0 reallocates adjustment factors, and resolution is over.

If at least one test fails, proceed to phase 3.

26

Phase 2 resolution example

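$\epsilon = 5$, all adjustment factors $\delta_{*,*} = 0$

Before the update:

N1: Money=100, Sex=98, Mail=96, Health=92

N2: Money=35, Sex=20, Mail=5, Health=3

N3: Money=50, Sex=5, Mail=4, Health=1

Global list: Money=185, Sex=123, Mail=105, Health=96

Update: <Mail, N2, 17>

N1: Money=100, Sex=98, Mail=96, Health=92

N2: Money=35, Mail=22, Sex=20, Health=3

N3: Money=50, Sex=5, Mail=4, Health=1

Global list: Money=185, Sex=123, Mail=122, Health=96

To fix: $\delta_{sex,0} = -2$, $\delta_{sex,2} = 2$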

27

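Phase 2 resolution failure

Starting from $\delta_{sex,0} = -2$, $\delta_{sex,2} = 2$.

Update: <Sex, N2, 5>

N1: Money=100, Sex=98, Mail=96, Health=92

N2: Money=35, Sex=27, Mail=22, Health=3 (Sex is shown adjusted: 25 + 2)

N3: Money=50, Sex=5, Mail=4, Health=1

Global list: Money=185, Sex=128, Mail=122, Health=96

Update: <Mail, N3, 5>

N1: Money=100, Sex=98, Mail=96, Health=92

N2: Money=35, Sex=27, Mail=22, Health=3

N3: Money=50, Mail=9, Sex=5, Health=1

Global list: Money=185, Sex=128, Mail=127, Health=96

Can't "loan" 4 from $\delta_{sex,0}$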
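Both phase 2 examples can be replayed with a small test function. This is a sketch: the name `phase2_resolves` and its argument layout are hypothetical, and the test adds the error tolerance to the top-k side, which is the reading that reproduces the outcomes of both examples.

```python
def phase2_resolves(violated, values, node_delta, coord_delta, epsilon):
    """Coordinator test for phase 2.

    violated    -- list of (t, s) pairs reported by the failing node
    values      -- that node's partial values for the objects involved
    node_delta  -- that node's adjustment factors
    coord_delta -- the coordinator's own adjustment factors (the slack)
    """
    def side(obj):
        return values.get(obj, 0) + node_delta.get(obj, 0) + coord_delta.get(obj, 0)

    return all(side(t) + epsilon >= side(s) for t, s in violated)


# Example 1: Mail jumps to 22 at N2, all factors zero, tolerance 5.
first = phase2_resolves([("Sex", "Mail")],
                        {"Sex": 20, "Mail": 22}, {}, {}, epsilon=5)

# Example 2: Mail reaches 9 at N3 after the coordinator already lent 2 for Sex.
second = phase2_resolves([("Sex", "Mail")],
                         {"Sex": 5, "Mail": 9}, {}, {"Sex": -2}, epsilon=5)
print(first, second)
```

The first test passes (20 + 5 >= 22), so two messages suffice. The second fails (5 - 2 + 5 = 8, which is less than 9): Sex would need 4 more at N3 but only 3 of slack remain, so resolution continues to phase 3.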

28

Resolution – phase 3

The coordinator N0 contacts all the nodes Ni excluding Nj, requesting: partial values for objects in $R = F \cup T$, and border values $B_i$.

N0 sums the partial values and sorts them to compute new top-k list T’

N0 reallocates new adjustment factors for T’

N0 sends T’ and adjustment factors to all nodes

29

Resolution – summary

Phase 1: Nj detects failed constraints and notifies N0. Initiates resolution for $R = F \cup T$.

Phase 2: N0 attempts to resolve the constraints using $\delta_{*,0}$, the "bank". On success, reallocate adjustment factors and stop.

Phase 3: N0 requests all updated partial values for R, sorts them, and computes the new top-k list. Reallocate adjustment factors.

30

Resolution Performance

Means to measure algorithm performance: the number of messages. Messages are usually small, since only the resolution set $R = F \cup T$ is involved.

Two-phase resolution: initiation + reallocation. Only two messages.

Three-phase resolution: initiation + query + reallocation. 1 + 2(m-1) + m = 3m - 1 messages.
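As a worked instance of the three-phase count, assume m = 4 monitor nodes (an illustrative choice). Reading the middle term as one request and one reply for each of the other m - 1 nodes is an interpretation of the formula, not something the slide states.

```latex
\begin{align*}
\underbrace{1}_{\text{initiation}}
  + \underbrace{2(m-1)}_{\text{query}}
  + \underbrace{m}_{\text{reallocation}} &= 3m - 1 \\
m = 4: \quad 1 + 2 \cdot 3 + 4 &= 11
\end{align*}
```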

31

Adjustment factor reallocation

Input: top-k list T', partial values in the resolution set R, border values.

Output: new adjustment factors $\delta_{i,j}$.

Method, for each object: meet the border value constraints, calculate the leeway, distribute the leeway evenly.

Example node: Money=50, Mail=10, Sex=5, Health=1, Love=0

F = {Mail, Sex}, R = {Money, Mail, Sex}

Vmoney = 50, Vmail = 10, Vsex = 5, B = 1

32

Leeway computation

For each object in R, compute the leeway $\lambda_i$: the slack above the sum of border values.

Define:

Sum of border values: $B = \sum_j B_j$

Computed values: $V_i = \sum_j V_{i,j}$

$V_{i,0} = 0$; $B_0 = \max \delta_{i,0}$ over $O_i$ not in R

If $O_i \in T'$: $\lambda_i = V_i - B + \epsilon$

Otherwise: $\lambda_i = V_i - B$

33

Leeway computation example

N1 (US): Money=100, Sex=98, Health=94, Mail=92, Love=85

N2 (Germany): Sex=30, Money=20, Mail=5, Love=5, Health=3

N3 (Japan): Money=50, Mail=10, Sex=5, Health=1, Love=0

Global list: Money=170, Sex=133, Mail=107, Health=98, Love=90

$\epsilon = 0$

$B = 94 + 5 + 1 = 100$

$\lambda_{money} = 170 - B = 70$

$\lambda_{sex} = 133 - B = 33$

$\lambda_{mail} = 107 - B = 7$
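The leeway example can be recomputed with the sketch below. The helper names `border_value` and `leeways` are hypothetical; the resolution set is R = {Money, Sex, Mail}, the new top-k set is assumed to be {Money, Sex}, the current adjustment factors are assumed to be zero (so raw partial values serve as adjusted values), and the coordinator's border value is taken as 0.

```python
PARTIAL = {
    "N1": {"Money": 100, "Sex": 98, "Health": 94, "Mail": 92, "Love": 85},
    "N2": {"Sex": 30, "Money": 20, "Mail": 5, "Love": 5, "Health": 3},
    "N3": {"Money": 50, "Mail": 10, "Sex": 5, "Health": 1, "Love": 0},
}
RESOLUTION = {"Money", "Sex", "Mail"}
NEW_TOP = {"Money", "Sex"}  # assumed new top-k set T'


def border_value(node_values, resolution):
    """Largest value at one node among the objects outside the resolution set."""
    return max(v for obj, v in node_values.items() if obj not in resolution)


def leeways(partial, resolution, new_top, epsilon, coord_border=0):
    """Return (B, leeway per object in R); top-k objects also receive epsilon."""
    total_border = coord_border + sum(
        border_value(values, resolution) for values in partial.values())
    result = {}
    for obj in resolution:
        total = sum(values.get(obj, 0) for values in partial.values())
        result[obj] = total - total_border + (epsilon if obj in new_top else 0)
    return total_border, result


B, lam = leeways(PARTIAL, RESOLUTION, NEW_TOP, epsilon=0)
print(B, lam)
```

This yields B = 100 and leeways of 70, 33 and 7 for Money, Sex and Mail, as on the slide. With a non-zero tolerance, only Money and Sex would gain the extra amount.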

34

Leeway distribution

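Initialization, to meet the constraints: $\delta_{i,j} = B_j - V_{i,j}$

For $O_i \in T'$, $j = 0$: $\delta_{i,0} = B_0 - \epsilon$

Leeway distribution: $\delta_{i,j} = \delta_{i,j} + \lambda_i / m$

Correctness: $V_{t,j} + \delta_{t,j} \geq V_{s,j} + \delta_{s,j}$

If $O_s \notin R$: follows from $V_{t,j} + \delta_{t,j} \geq B_j$

If $O_s \in R$: follows from $\lambda_t \geq \lambda_s$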

35

Leeway distribution example

N1 (US): Money=100, Sex=98, Health=94, Mail=92, Love=85

N2 (Germany): Sex=30, Money=20, Mail=5, Love=5, Health=3

N3 (Japan): Money=50, Mail=10, Sex=5, Health=1, Love=0

Global list: Money=170, Sex=133, Mail=107, Health=98, Love=90

$\lambda_{sex} = 33$

$\delta_{sex,1} = B_1 - V_{sex,1} + 33/3 = 94 - 98 + 11 = 7$

$\delta_{sex,2} = B_2 - V_{sex,2} + 33/3 = 5 - 30 + 11 = -14$

$\delta_{sex,3} = B_3 - V_{sex,3} + 33/3 = 1 - 5 + 11 = 7$

36

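Leeway distribution example

$\lambda_{money} = 70$ (split into integer shares 24, 23, 23)

$\delta_{money,1} = B_1 - V_{money,1} + 70/3 = 94 - 100 + 24 = 18$

$\delta_{money,2} = B_2 - V_{money,2} + 70/3 = 5 - 20 + 23 = 8$

$\delta_{money,3} = B_3 - V_{money,3} + 70/3 = 1 - 50 + 23 = -26$

$\lambda_{mail} = 7$ (split into integer shares 3, 2, 2)

$\delta_{mail,1} = B_1 - V_{mail,1} + 7/3 = 94 - 92 + 3 = 5$

$\delta_{mail,2} = B_2 - V_{mail,2} + 7/3 = 5 - 5 + 2 = 2$

$\delta_{mail,3} = B_3 - V_{mail,3} + 7/3 = 1 - 10 + 2 = -7$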

37

Reallocation Results

Partial values before reallocation:

N1 (US): Money=100, Sex=98, Health=94, Mail=92, Love=85

N2 (Germany): Sex=30, Money=20, Mail=5, Love=5, Health=3

N3 (Japan): Money=50, Mail=10, Sex=5, Health=1, Love=0

Global list: Money=170, Sex=133, Mail=107, Health=98, Love=90

Adjusted values after reallocation:

N1 (US): Money=118, Sex=105, Mail=97, Health=94, Love=85

N2 (Germany): Money=28, Sex=16, Mail=7, Love=5, Health=3

N3 (Japan): Money=24, Sex=12, Mail=3, Health=1, Love=0

Global list: Money=170, Sex=133, Mail=107, Health=98, Love=90
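The reallocation result can be reproduced with the sketch below. It is an illustration only: the function name `reallocate` is hypothetical, the coordinator is given no share of the leeway (as in the example), and the leeway is split into integer shares with the remainder going to the first nodes, which is an assumption chosen because it matches the 24/23/23 and 3/2/2 splits used in the example.

```python
NODES = ["N1", "N2", "N3"]
PARTIAL = {
    "N1": {"Money": 100, "Sex": 98, "Mail": 92},
    "N2": {"Money": 20, "Sex": 30, "Mail": 5},
    "N3": {"Money": 50, "Sex": 5, "Mail": 10},
}
BORDERS = {"N1": 94, "N2": 5, "N3": 1}          # border value B_j per node
LEEWAY = {"Money": 70, "Sex": 33, "Mail": 7}    # leeway per object in R


def reallocate(partial, borders, leeway, nodes):
    """New factor for object i at node j: B_j - V_ij + (share of the leeway of i)."""
    m = len(nodes)
    deltas = {}
    for obj, lam in leeway.items():
        base, extra = divmod(lam, m)   # integer shares; first `extra` nodes get one more
        deltas[obj] = {
            node: borders[node] - partial[node].get(obj, 0) + base + (1 if idx < extra else 0)
            for idx, node in enumerate(nodes)
        }
    return deltas


deltas = reallocate(PARTIAL, BORDERS, LEEWAY, NODES)
adjusted = {
    node: {obj: PARTIAL[node][obj] + deltas[obj][node] for obj in LEEWAY}
    for node in NODES
}
print(deltas)
print(adjusted)
```

The factors of each object sum to zero, and the adjusted values equal those listed for N1, N2 and N3 after reallocation. Every adjusted value in the resolution set sits at its node's border value plus a share of the leeway, which is what keeps the local constraints valid.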

38

Leeway distribution to N0

Leeway is also distributed to the coordinator node N0. $\epsilon$ is added to the leeway computation for $O_t \in T'$. The initial value of $\delta_{t,0}$ for $O_t \in T'$ is $B_0 - \epsilon$. Any addition can be "loaned" to monitor nodes.

Amount distributed to N0:

Higher ($\lambda_i / 2$): less chance of reaching phase 3 in resolution.

Lower (0): fewer resolutions (more leeway for the monitor nodes).

39

Proportional leeway distribution

Allocate more leeway to monitor nodes that are updated more often.

Top-k likely to change more.

Good for monitor nodes that exhibit characteristic behavior: Google locations, enterprise routers.

40

Experiments

Query 1: FIFA '98. Servers at 4 locations throughout the world. Top 20 Web pages by hit statistics.

Query 2: Most loaded server in a cluster. Single value per monitor node.

Query 3: Berkeley-to-world WAN link, with 4 monitor points. Top 20 destination hosts by number of outgoing TCP packets.

41

Results – Query 1

42

Results – Query 2

43

Results – query 3

44

Analysis of results

Allowing error improves results dramatically.

Leeway for N0 is the dominant factor.

Low $\epsilon$: give half the leeway to N0. With low $\epsilon$ there is little leeway, so resolutions are bound to happen; make them less expensive.

High $\epsilon$: no leeway to N0.

45

Analysis of results

Even versus proportional leeway distribution: the better choice depends on the query.

Server load: proportional.

Berkeley WAN: monitor nodes were simulated, so even distribution is better.

FIFA: proportional for lower $\epsilon$, even for higher $\epsilon$.

46

Comparison to alternative

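Caching: the coordinator holds cached partial data values. A monitor must send an update to the coordinator when a partial value deviates from its cached copy by $\epsilon / 2m$.

The coordinator will therefore always have each value correct within $\epsilon / 2$ (m partial values, each within $\epsilon / 2m$).

The top-k list is always correct within $\epsilon$.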
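The monitor side of the caching alternative can be sketched as follows. The class name `CachingMonitor` and its interface are hypothetical, and the sketch assumes an update is sent as soon as the deviation from the cached copy exceeds the threshold.

```python
class CachingMonitor:
    """Monitor node for the caching baseline: push a value only when it drifts too far."""

    def __init__(self, epsilon, m):
        self.threshold = epsilon / (2 * m)   # allowed drift per partial value
        self.values = {}                     # true partial values at this node
        self.cached = {}                     # what the coordinator currently holds

    def apply(self, obj, delta):
        """Process one update; return the message to send, or None if none is needed."""
        self.values[obj] = self.values.get(obj, 0) + delta
        if abs(self.values[obj] - self.cached.get(obj, 0)) > self.threshold:
            self.cached[obj] = self.values[obj]
            return (obj, self.values[obj])
        return None


monitor = CachingMonitor(epsilon=12, m=3)    # threshold is 12 / (2 * 3) = 2
sent = [monitor.apply("a", d) for d in (1, 1, 1, -2)]
print(sent)
```

Only the third update crosses the threshold and triggers a message. Unlike the constraint-based scheme, every object is tracked independently of whether it is anywhere near the top-k boundary.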

47

Results: (note the log scale!)

48

Summary

Problem: find the top-k set within error $\epsilon$. Distributed: multiple sources. Streaming: frequent updates.

Naive approach: transmit the streams to the coordinator node. If error is allowed, transmit only when the deviation from the cached value threatens correctness.

The new approach offers a dramatic improvement over the naive approach for low to medium $\epsilon$.

49

Summary

Use adjustment factors to establish constraints.

A monitor node initiates resolution when a constraint gets broken.

Resolution: attempt to use the coordinator node leeway. If successful, fix the constraints by adjustment factor reallocation. Otherwise, get partial values for the resolution set from all nodes, compute a new top-k set, and reallocate leeway to all nodes.

Reallocation: distribute leeway evenly between monitor nodes. Distribute leeway to the coordinator node on low $\epsilon$.

50

Questions?
