
DNA Research Group

Growth Codes: Maximizing Sensor Network Data Persistence

Abhinav Kamra, Vishal Misra, Dan Rubenstein
Department of Computer Science, Columbia University

Jon Feldman
Google Labs

ACM Sigcomm 2006

Outline

Problem Description
Solution Approach: Growth Codes
Experiments and Simulations
Conclusions and Ongoing work


Background: A generic sensor network

Sensor nodes hold sensed data
Data follows multi-hop path to sink(s)

[Figure: sensor nodes holding sensed data x1 to x13, forwarding it hop by hop toward the sink(s).]

A few node failures can break the data flow

Generic Aim: Collect data from all nodes at sink(s)


Specific Context: Disaster Scenarios

e.g., Monitoring earthquakes, fires, floods, war zones

Problems in this setting
  Congestion near sink(s)
    All nodes simultaneously forward data
    Overwhelm sink(s) capacity

[Figure: congestion near sink; virtual queue.]


Specific Context: Disaster Scenarios - 2

Problems in this setting
  Network collapsing: nodes failing rapidly
    Pre-computed routes may fail
    Data from failed nodes can be lost
    Data recovery from a subset of nodes is acceptable


Challenges

Networking challenges:
  Disaster scenarios: feedback often infeasible
  Frequent disruptions to the routing tree, if one is set up
  Difficult to predict node failures: sink locations unknown, surviving routes unknown
  Difficult to synchronize nodes' clocks

Coding challenges:
  Data source distributed (among all sensor nodes)
  Prior approaches (Turbo codes, LDPC codes) aim at fast complete recovery
  Sensor nodes have very limited memory, CPU, bandwidth


Objectives

Maximize data persistence
  Preserve data from failed sensor nodes
  Deliver data to sink(s) as fast as possible

Data persistence: fraction of data that eventually reaches the sink(s)

[Figure: example network in which 6 of 10 symbols reach the sink. Persistence = 60%.]


Limitations of Previous Work

Channel coding based (e.g. Turbo Codes [Anderson-ISIT94], LT Codes [Luby02])
  Aim for complete recovery in minimum time
  Difficult to implement with distributed sources

Routing based (e.g. Directed Diffusion [Govindan00], Cougar [Yao-SIGMOD02])
  Conjecture: too fragile (disrupted easily) for disaster scenarios


Our Approach

Two main ideas
  Randomized routing and replication
    Avoid actively maintaining routes
    Replicate data to increase data survival
  Distributed channel codes (Growth Codes)
    Expedite data delivery and survivability
    First (to our knowledge) distributed channel codes


Outline

Problem Description
Our Solution: Growth Codes
Experiments and Simulations
Conclusions and Ongoing work


Network Assumptions

N-node sensor network
Limited storage: each node stores a small number of data units
Large storage at sink(s): sink receives codewords from random node(s)
All sensed data assumed independent (no source coding)

[Figure: example network with sensor nodes 1 to 7 and sinks marked S.]


High Level View of the Protocol

[Figure: nodes 1 to 4 exchanging data.]

Nodes send data at random times

(Current implementation: exponentially distributed timers)
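The random send times can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `send_times`, the `rate` parameter (mean transmissions per time unit) and the `horizon` are assumptions made for the example.

```python
import random


def send_times(rate, horizon, seed=None):
    """Return the times at which one node transmits, up to `horizon`.

    Gaps between transmissions are drawn from an exponential
    distribution with mean 1/rate (hypothetical parameter), so nodes
    need no coordination to spread their transmissions over time.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(rate)  # exponentially distributed timer
        if t > horizon:
            return times
        times.append(t)


if __name__ == "__main__":
    # One node sending on average twice per time unit for 5 time units.
    print(send_times(rate=2.0, horizon=5.0, seed=1))
```

Each node would run its own independent timer, which is one simple way to avoid all nodes transmitting at the same instant.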


High Level View of the Protocol (2)

After time K1, nodes start sending degree 2 codewords
  Sender picks a random symbol and XORs it with its own symbol
  Even if node 3 fails, node 3's data survives

[Figure: nodes 1 to 4 exchanging symbols (degree 1 codewords) and a degree 2 codeword; timeline marked 0, K1, K2, K3.]
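The degree 2 step can be sketched as below. This is an illustration under assumptions: symbols are modeled as small integers, a codeword is a pair (set of node ids, XOR of their symbols), and the names `make_degree2_codeword` and `recover` are hypothetical.

```python
import random


def make_degree2_codeword(own_id, own_value, stored, rng=random):
    """XOR the sender's own symbol with one randomly chosen stored symbol.

    `stored` maps other nodes' ids to the symbols this node holds
    (assumed not to contain `own_id`).
    """
    other_id = rng.choice(sorted(stored))
    return frozenset({own_id, other_id}), own_value ^ stored[other_id]


def recover(codeword, known_id, known_value):
    """Recover the other symbol of a degree 2 codeword from a known one."""
    ids, payload = codeword
    (missing_id,) = ids - {known_id}
    return missing_id, payload ^ known_value


if __name__ == "__main__":
    # Node 1 holds its own symbol and a copy of node 3's symbol.
    cw = make_degree2_codeword(1, 0x5A, {3: 0x3C})
    # Node 3 may have failed, but its symbol is recoverable from the
    # codeword once node 1's symbol is known.
    print(recover(cw, known_id=1, known_value=0x5A))
```

Because XOR is its own inverse, anyone who knows one of the two symbols can recover the other, which is why node 3's data can outlive node 3.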


High Level View of the Protocol (3)

After time K1, nodes start sending degree 2 codewords
After time K2, nodes start sending degree 3 codewords
...
After time Ki, nodes start sending degree i+1 codewords

(Times Ki can be out of sync at different nodes)
Note: no need to tightly synchronize clocks

[Figure: timeline marked 0, K1, K2, K3.]

What are good values for {Ki}? Please refer to our paper
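The time-to-degree rule can be sketched as a simple lookup. The threshold values used here are placeholders chosen for illustration only; the actual values of Ki are given in the paper, not on these slides. The function name `codeword_degree` is also an assumption, as is the choice to switch degree exactly at time Ki.

```python
from bisect import bisect_right


def codeword_degree(elapsed, thresholds):
    """Return the codeword degree a node uses `elapsed` time units in.

    `thresholds` is the ascending list [K1, K2, ...]. Before K1 the
    degree is 1; from Ki onward it is i + 1.
    """
    return bisect_right(thresholds, elapsed) + 1


if __name__ == "__main__":
    placeholder_k = [10, 25, 45]  # hypothetical K1, K2, K3
    for t in (0, 9, 10, 30, 100):
        print(t, codeword_degree(t, placeholder_k))
```

Since each node only compares its own elapsed time with the thresholds, nodes whose clocks differ slightly simply switch degree at slightly different moments.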


The Intuition behind Growth Codes

When very few symbols are decoded
  Easy to decode low degree codewords

[Figure: codewords arriving over time and the set of symbols decoded at the sink.]
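One way to picture decoding at the sink is an iterative procedure that solves any codeword with exactly one unknown symbol and repeats. The slides do not spell out the decoder, so this is a sketch under that assumption; codewords are modeled as (set of node ids, XOR of their symbols) and the name `decode` is hypothetical.

```python
def decode(codewords):
    """Iteratively decode XOR codewords; return {node_id: symbol}.

    A codeword is solved as soon as all but one of its symbols are
    known. Codewords with no unknown symbols left are redundant and
    are dropped.
    """
    decoded = {}
    pending = [(set(ids), payload) for ids, payload in codewords]
    progress = True
    while progress:
        progress = False
        still_pending = []
        for ids, payload in pending:
            # Remove every symbol the sink already knows.
            for i in [i for i in ids if i in decoded]:
                ids.discard(i)
                payload ^= decoded[i]
            if len(ids) == 1:
                decoded[ids.pop()] = payload
                progress = True
            elif len(ids) > 1:
                still_pending.append((ids, payload))
        pending = still_pending
    return decoded


if __name__ == "__main__":
    x1, x2, x3 = 5, 9, 12
    received = [({1, 2}, x1 ^ x2), ({2, 3}, x2 ^ x3), ({1}, x1)]
    print(decode(received))
```

With nothing decoded yet, only the degree 1 codeword is immediately useful; once it is in hand, the degree 2 codewords unlock in turn.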


The Intuition behind Growth Codes (2)

When a significant number of symbols are decoded
  Low degree codewords often redundant
  Higher degree codewords more likely to be useful

[Figure: codewords and the set of symbols decoded at the sink.]
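The intuition can be checked numerically under a simplifying assumption that is not stated on the slides: a degree d codeword is the XOR of d distinct symbols chosen uniformly at random out of N. If the sink has already decoded r symbols, such a codeword yields a new symbol right away exactly when d - 1 of its symbols are known and one is not, which happens with probability C(r, d-1)(N - r) / C(N, d). The function names below are hypothetical.

```python
from math import comb


def p_new_symbol(n, r, d):
    """Probability that a random degree-d codeword decodes one new symbol
    immediately, given r of n symbols are already decoded."""
    if d > n:
        return 0.0
    return comb(r, d - 1) * (n - r) / comb(n, d)


def best_degree(n, r, max_degree):
    """Degree in 1..max_degree that maximizes p_new_symbol."""
    return max(range(1, max_degree + 1), key=lambda d: p_new_symbol(n, r, d))


if __name__ == "__main__":
    n = 10
    for r in (0, 3, 5, 8):
        print(r, best_degree(n, r, max_degree=n))
```

Under this model, degree 1 is best while few symbols are decoded, and degree 2 overtakes it once r exceeds (N - 1) / 2; the best degree keeps growing as r approaches N.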


Outline

Problem Description
Growth Codes
Simulations and Experiments
Conclusions and Ongoing work


Simulations/Experiments: compare data persistence of various approaches

1. Simulations:
  Centralized setting: compare GC with other channel coding schemes
  Distributed simulation: assess large-scale performance of coding vs no coding

2. Experiments on motes:
  Compare time of complete recovery for GC vs routing
  Measure resilience to node failures


Centralized Simulation
(to compare with other channel coding schemes for which only centralized versions exist)

Single source, single sink
Source generates random codewords according to coding scheme (GC, Soliton)
Zero failure rate

[Figure: comparison with various coding schemes (N = 1500); a single source sending to a single sink.]

No coding is fast in the beginning: the slowdown is explained by the Coupon Collector's problem
Soliton/R-Soliton: poor partial recovery (reason: high degree codewords sent too early)
Growth Codes closest to theoretical upper bound (reason: right degree at the right time)


Distributed Simulation
(to assess the performance gain of coding)

N sources, single sink
Random graph topology (avg degree 10)
Sink receives 1 codeword per time unit

[Figure: Growth Codes vs no coding (varying N).]

Complete recovery takes:
  O(N log N) time without coding (Coupon Collector's effect)
  Linear time with Growth Codes

Soliton/R-Soliton: cannot compare in a distributed setup
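The O(N log N) cost of uncoded collection follows from the Coupon Collector's problem. As a sketch, assume (this is a simplification, not the simulated topology) that the sink receives one uncoded symbol per time unit, each chosen uniformly at random from the N sources. The expected time to see all N is then N(1 + 1/2 + ... + 1/N), which grows as N log N. Function names are hypothetical.

```python
import random


def expected_draws(n):
    """Expected uniform draws needed to see all n symbols: n * H_n."""
    return n * sum(1.0 / k for k in range(1, n + 1))


def simulate_draws(n, rng):
    """Draw uniformly random symbols until all n have been seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 1000):
        mean = sum(simulate_draws(n, rng) for _ in range(200)) / 200
        print(n, round(expected_draws(n), 1), round(mean, 1))
```

Most of the time is spent waiting for the last few missing symbols, which matches the observation that uncoded delivery starts fast and then slows down.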


Experiments with (micaz) motes
(to measure data persistence with time)

GC vs TinyOS's "MultiHop" routing protocol
No routing state at time 0 (scenario where sensor nodes are deployed rapidly)

"MultiHop" for persistence: takes a long time to complete route setup
Comparison with GC simulator validates simulator performance

[Figure: experimental topology with sink S.]


Motes experiments: Resilience to node failures

Nodes generate data every 300 seconds
3 nodes fail just after 3rd data generation

[Figure: timeline (0, 300, 600, 900) marking when nodes generate data, "MultiHop" sets up routing, nodes send data to sink, 3 random nodes fail, and "MultiHop" repairs routes; experimental topology with sink S.]


Motes experiments: Resilience to node failures

1st generation: GC faster, MH takes time to set up routes
2nd generation: routing already set up, MH very fast
3rd generation: MH needs to repair routes

[Figure: timeline (0, 300, 600, 900) marking when nodes generate data, "MultiHop" sets up routing, nodes send data to sink, 3 random nodes fail, and "MultiHop" repairs routes.]


Other Results: Please refer to our paper

Good values for K1, K2, …
More simulations/experiments
  Various topologies
  Other failure scenarios
Implementation details:
  Memory usage at sensor nodes: how it affects performance
  How to handle periodic data generation
  How to reduce overhead of coefficients


Conclusions

Data persistence in sensor networks:
  First distributed channel codes (GC)
  Protocol requires minimal configuration
  Is robust to node failures

Simulations and experiments on micaz motes show (compared to prior coding and routing methods):
  GC achieves complete recovery faster
  GC recovers more partial data at any time


Ongoing Work

Adapt Growth Codes to scenarios where sensor data is correlated

Take advantage of any available routing information (e.g. before a disaster)

Estimate network size on the fly to use in Growth Codes


Thanks for your patience!

For more information
DNA Research Lab, Columbia University

http://dna-wsl.cs.columbia.edu/
