ECSE 6962 lecture presentation
Compression in Correlated Data Aggregation
Zhenzhen Ye, Oct. 24th, 2005





Outline

Background
Information Fidelity
Compression via Coding

Data Aggregation with Compression - two basic strategies:
Aggregation with Distributed Source Coding
Aggregation with Explicit Communication

An Example

Conclusion & Future Work


Background

Data Aggregation in Wireless Sensor Networks reduce energy cost, prolong network lifetime, etc.

Aggregation Functions: simple functions (max, min, average, etc.); field reconstruction - much more challenging!

Example [1]: Measuring, conveying and reproducing a temperature field

(Figure: a temperature field over region G)


Data Correlation

A Motivation for Compression: spatial correlation & temporal correlation; similar to image compression, but in a distributed fashion.

Correlation Structure Model: a stationary random process Y(x, t);

Example: one-dimensional Gaussian random field [2]

E[Y(x_1, t_1) Y(x_2, t_2)] = \exp\left[-c\left((x_1 - x_2)^2 + \theta (t_1 - t_2)^2\right)^k\right]

where c measures the intensity of correlation, \theta is the scaling factor of temporal correlation, and k \in \{1/2, 1\}: k = 1/2 gives the Gauss-Markov field model, k = 1 the squared distance model.
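As a concrete check, this correlation model can be evaluated directly. The sketch below is illustrative only: the function name and the default parameter values are assumptions, not values from [2].

```python
import math

def covariance(x1, t1, x2, t2, c=0.01, theta=1.0, k=1.0):
    """E[Y(x1,t1)Y(x2,t2)] = exp(-c((x1-x2)^2 + theta(t1-t2)^2)^k).

    k = 1/2 gives the Gauss-Markov field model, k = 1 the squared
    distance model; larger c means faster decay (weaker correlation).
    """
    return math.exp(-c * ((x1 - x2) ** 2 + theta * (t1 - t2) ** 2) ** k)

# Correlation is 1 at zero lag and decays with spatial distance.
same_point = covariance(0.0, 0.0, 0.0, 0.0)
near = covariance(0.0, 0.0, 1.0, 0.0)
far = covariance(0.0, 0.0, 10.0, 0.0)
```

With the squared distance model (k = 1) the decay over distance is much sharper than with the Gauss-Markov model (k = 1/2), which matters later for how well "local" approximations of the correlation structure work.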


Information Fidelity

Fidelity Measure - distortion
Original field: X; reconstructed field: \hat{X}

Distortion d(X, \hat{X}): the difference between the two field values; example: mean-square error E[(X - \hat{X})^2]

Distortion Sources [1, 2, 3]:
Finite number of sensors - interpolation error, spatial distortion;
Conversion of continuous-magnitude samples to a discrete format;
Lossy coding of the discretized samples;
Delay - time distortion in real-time applications.


Information Fidelity (Cont’)

Total distortion [1]

D = \frac{1}{|G|} \int_G d[S(x, y), \hat{S}(x, y)] \, dx \, dy

where S(x, y) (\hat{S}(x, y)) is the value at position (x, y) in the original (reconstructed) field. When sensors are dense (i.e., the Nyquist sampling theorem is satisfied), the distortion due to interpolation error is negligible and

D \approx \frac{1}{N} \sum_{n=1}^{N} d(S_n, \hat{S}_n)

i.e., the total distortion is approximately determined by the data processing at each sensor (if time distortion is not considered).
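When only per-sensor samples are available, the dense-sensor approximation of D is just an average per-sensor distortion. A minimal sketch with squared-error distortion follows; the sample values and the rounding "reconstruction" are made up for illustration.

```python
def mean_distortion(original, reconstructed):
    """Approximate the total distortion D as the average per-sensor
    squared error: (1/N) * sum of (S_n - S_hat_n)^2."""
    assert len(original) == len(reconstructed)
    n = len(original)
    return sum((s - s_hat) ** 2
               for s, s_hat in zip(original, reconstructed)) / n

samples = [20.1, 20.4, 21.0, 19.8]        # hypothetical temperature readings
rounded = [round(s) for s in samples]      # a crude "reconstruction"
d = mean_distortion(samples, rounded)      # approximately 0.0525
```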


Compression via Coding

Question: what is the minimal data rate for a given distortion constraint D?

Rate-Distortion Function [4]

R(D) = \min_{p(\hat{S}|S):\ E[d(S, \hat{S})] \le D} I(S; \hat{S})

where the minimization is over all conditional distributions p(\hat{S}|S) for which the joint distribution p(S, \hat{S}) satisfies the expected distortion constraint, and the average mutual information between S and \hat{S} is

I(S; \hat{S}) = \int p(S, \hat{S}) \log \frac{p(S, \hat{S})}{p(S)\, p(\hat{S})} \, dS \, d\hat{S}
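For a memoryless Gaussian source under mean-square error, R(D) has the well-known closed form (1/2) log2(sigma^2 / D), which makes the rate-distortion tradeoff easy to illustrate; the function below is a sketch of that special case, not of the general minimization.

```python
import math

def rate_distortion_gaussian(sigma2, D):
    """R(D) for a memoryless Gaussian source with variance sigma2 under
    mean-square error: R(D) = (1/2) log2(sigma2 / D) for 0 < D < sigma2,
    and 0 beyond (no bits are needed once D reaches the variance)."""
    if D >= sigma2:
        return 0.0
    return 0.5 * math.log2(sigma2 / D)

# Halving the allowed distortion costs half a bit per sample.
one_bit = rate_distortion_gaussian(1.0, 0.25)   # 1.0 bit/sample
```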


Compression via Coding (Cont’)

Quantization + Entropy Coding
The Rate-Distortion Theorem provides a lower bound on the achievable information rate - the "ideal encoder". A more practical way:

X(t) → Sampler (samp/sec) → X(k) → Quantizer → I_k → Entropy coder

where I_k is the index of the quantization cell in which X(k) lies. Entropy coding is lossless on the quantized data; the size of the quantization cell should satisfy the distortion requirement.
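The sampler-quantizer-entropy-coder chain can be sketched as follows. The uniform quantizer and the empirical entropy estimate are illustrative simplifications: a real entropy coder (e.g., arithmetic coding) would approach this estimated rate, and the sample values are made up.

```python
import math
from collections import Counter

def quantize(x, step):
    """Uniform quantizer: return the index I_k of the cell containing x."""
    return math.floor(x / step)

def reconstruct(index, step):
    """Midpoint reconstruction; the error is at most step/2."""
    return (index + 0.5) * step

def empirical_entropy(indices):
    """Estimate the entropy of the index stream in bits per sample; an
    ideal lossless entropy coder approaches this rate."""
    counts = Counter(indices)
    n = len(indices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

samples = [0.05, 0.12, 0.13, 0.91, 0.95, 0.52]
step = 0.25                                # cell size sets the distortion
idx = [quantize(x, step) for x in samples]
rate = empirical_entropy(idx)              # bits/sample after entropy coding
```

Shrinking `step` lowers the quantization distortion but raises the index entropy, which is exactly the rate-distortion tradeoff of the previous slide.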


Lossless Coding

Three types of lossless coding schemes for data aggregation in wireless sensor networks [1]:
Independent encoding and decoding;
Conditional encoding and decoding;
Distributed source coding (DSC) - Slepian-Wolf [4].


Lossless Coding (Cont’)

Independent encoding and decoding
Each sensor encodes its quantization index independently;
Simplest, but no compression gain (blind to the correlation);
Number of generated bits from all sensors to be sent to the sink:

T_N^{(IC)} = H(\hat{S}_1) + H(\hat{S}_2) + \cdots + H(\hat{S}_N)

which is the same for any routing structure.


Lossless Coding (Cont’)

Conditional encoding and decoding
Each sensor encodes its local quantization index conditioned on the received side information (indices) from its descendants in the routing tree;
Compression gain depends on the routing structure;
Explicit communication among nodes;
Partially takes advantage of the correlation structure;
Number of generated bits from all sensors to be sent to the sink:

T_N^{(EC)} = \sum_{n=1}^{N} H(\hat{S}_n \mid \text{the set of } \hat{S}_i \text{ known at node } n)


Lossless Coding (Cont’)

Distributed Source Coding - Slepian-Wolf [4, 5]
Theorem [4]: Let (X_{1i}, X_{2i}, \ldots, X_{mi}) be jointly ergodic sources with distribution p(x_1, x_2, \ldots, x_m). Then the set of rate vectors achievable for distributed source coding with separate encoders and a common decoder is defined by

R(U) \ge H(X(U) \mid X(U^c)) \quad \text{for all } U \subseteq \{1, 2, \ldots, m\}

where R(U) = \sum_{i \in U} R_i and X(U) = \{X_j : j \in U\}.

Example: 2-source case

R_1 \ge H(X_1 \mid X_2), \quad R_2 \ge H(X_2 \mid X_1), \quad R_1 + R_2 \ge H(X_1, X_2)
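The 2-source corner point can be checked numerically. The doubly symmetric binary pair below (X2 equals X1 flipped with probability p) is an illustrative model, not one from the slides; at the corner point R1 = H(X1), R2 = H(X2|X1), the sum rate meets the joint entropy exactly.

```python
import math

def H(probs):
    """Entropy in bits of a distribution given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Doubly symmetric binary sources: X2 equals X1 flipped with prob p.
p = 0.1
joint = {(0, 0): (1 - p) / 2, (0, 1): p / 2,
         (1, 0): p / 2, (1, 1): (1 - p) / 2}

H12 = H(list(joint.values()))        # joint entropy H(X1, X2)
H1 = H([0.5, 0.5])                   # H(X1) = 1 bit (uniform marginal)
H2_given_1 = H12 - H1                # chain rule: H(X2 | X1)

# Corner point of the Slepian-Wolf region: R1 = H(X1), R2 = H(X2 | X1).
R1, R2 = H1, H2_given_1
```

Here R2 is only the binary entropy of p (about 0.47 bits for p = 0.1), even though the second encoder never sees X1 - that is the whole surprise of the Slepian-Wolf result.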


Lossless Coding (Cont’)

Distributed Source Coding - Slepian-Wolf [4, 5]
Each sensor encodes its index without knowing the other sensors' indices, but with the assumption that the decoder will know the other sensors' indices at the time of decoding;
Full knowledge of the correlation structure is required;
Number of generated bits from all sensors to be sent to the sink:

T_N^{(SW)} = H(\hat{S}_1) + H(\hat{S}_2 \mid \hat{S}_1) + \cdots + H(\hat{S}_N \mid \hat{S}_1, \ldots, \hat{S}_{N-1}) = H(\hat{S}_1, \ldots, \hat{S}_N)

No redundancy in the generated traffic! T^{(SW)} is independent of the choice of routing structure, and T^{(SW)} \le T^{(EC)} \le T^{(IC)};
High complexity in encoding; distributed implementation?


Aggregation with Compression

Based on the different coding schemes, there are two types of aggregation (routing) strategies:
Aggregation with Distributed Source Coding (Slepian-Wolf) [5];
Aggregation with Explicit Communication (EC) [6, 7]:
Routing Driven Compression (RDC);
Compression Driven Routing (CDR).


Aggregation with DSC

If sensor nodes have perfect knowledge about their correlation, they can encode/compress data so as to avoid transmitting redundant information; each source can then send its encoded data to the sink along the shortest path without the need for intermediate aggregation.


Aggregation with EC

Routing Driven Compression (RDC) [6]: the sensors do not have any knowledge about their correlation and send data along the shortest paths to the sink, while allowing for OPPORTUNISTIC aggregation wherever the paths overlap.

Compression is not the objective, only an opportunity.


Aggregation with EC

Compression Driven Routing (CDR) [6]: the sensors have no knowledge of the correlations, but the data is aggregated close to the sources and initially routed so as to allow for the maximum possible aggregation at each hop. Eventually, the collected data is sent to the sink along the shortest possible path.

Compression is the objective, but the CDR scheme in [6] is only an extreme case - no optimality in the transmission structure!


An Example

Objective: minimize the total transmission cost of transporting the information collected by the sources to the sink.

Optimization (joint): the transmission (routing) structure from the sources to the sink; the (information) rate allocation at each source.

Strategies: DSC (Slepian-Wolf); Explicit Communication (EC).

Assumptions: distortion is handled by quantization; snapshot aggregation (no temporal correlation is considered).


Problem Formulation [5]

Graph G = (V, E), |V| = N+1; sources V_s = {1, 2, ..., N}; sink {N+1}; edge e = (i, j) with weight w_e. Jointly find the rate allocation and the transmission structure:

\{R_i^*, d_i^*\}_{i=1}^{N} = \arg\min_{\{R_i, d_i\}} \sum_{i=1}^{N} F(R_i)\, d_i

where R_i is the net traffic (i.e., rate) generated at source i and d_i = \sum_{e \in E_i} w_e is the total weight of the path E_i from source i to the sink.
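The path weights d_i are single-source shortest-path distances to the sink, so they can be computed with Dijkstra's algorithm; below is a self-contained sketch on a made-up 4-node graph (node 4 is the sink).

```python
import heapq

def path_weights(adj, sink):
    """Shortest-path weight d_i from every node to the sink, i.e. the
    minimum sum of edge weights w_e over paths to the sink (Dijkstra)."""
    dist = {sink: 0.0}
    heap = [(0.0, sink)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Undirected example: sources 1..3, sink 4.
adj = {1: [(2, 1.0), (4, 5.0)], 2: [(1, 1.0), (3, 1.0)],
       3: [(2, 1.0), (4, 1.0)], 4: [(1, 5.0), (3, 1.0)]}
d = path_weights(adj, 4)
```

Note that source 1 reaches the sink more cheaply via 2 and 3 (cost 3) than over its direct edge (cost 5), so d_1 = 3.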


Strategy 1 - DSC

Corollary 1 - Optimality of the shortest path tree (SPT) for the single-sink data aggregation problem [5]: When there is a single sink in the data aggregation problem and Slepian-Wolf coding is used, the SPT is optimal, in terms of minimizing the total flow cost, for any given rate allocation.

The joint optimization problem is separated:
First, optimize the transmission structure by building the SPT;
Second, optimize the rate allocation for the given SPT.


Strategy 1 - DSC (Cont’)

When the SPT is found, assume the path weights from the sources to the sink are ordered as

d_{SPT}(X_1) \le d_{SPT}(X_2) \le \cdots \le d_{SPT}(X_N)

The rate allocation problem becomes

\min_{\{R_i\}} \sum_{i=1}^{N} F(R_i)\, d_{SPT}(i) \quad \text{s.t.}\ \sum_{i \in Y} R_i \ge H(Y \mid Y^c) \ \ \forall\, Y \subseteq V_s

For a linear cost function F(R), the optimal rate allocation is

R_1^* = H(X_1),\ R_2^* = H(X_2 \mid X_1),\ \ldots,\ R_N^* = H(X_N \mid X_{N-1}, \ldots, X_1)

i.e., most of the load goes to nodes close to the sink, and small rates to nodes at the extremity of the network.
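For sources forming a Markov chain, the conditional entropies collapse to one-step terms, which makes this allocation easy to sketch. The binary symmetric chain below (each X_i flips its predecessor with probability q) is an illustrative model, not the Gaussian field of the slides.

```python
import math

def hb(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def slepian_wolf_allocation(n, q):
    """Rate allocation R_i* = H(X_i | X_{i-1}, ..., X_1) for a binary
    symmetric Markov chain with flip probability q: the node closest to
    the sink sends its full entropy, the rest only a conditional term."""
    rates = [1.0]                  # R_1* = H(X_1) = 1 bit (uniform start)
    rates += [hb(q)] * (n - 1)     # Markov: H(X_i | past) = hb(q)
    return rates

rates = slepian_wolf_allocation(5, 0.05)
total = sum(rates)                 # equals the joint entropy H(X_1,...,X_5)
```

The total equals the joint entropy, matching the "no redundancy" property of Slepian-Wolf coding, while every node past the first sends far less than one bit.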


Strategy 1 - DSC (Cont’)

Difficulties for distributed implementation:
The order of the path weights is required as a-priori knowledge to allocate rates (i.e., global knowledge of the SPT);
Global knowledge of the correlation structure is required for a node to calculate its conditional entropy.

Approximated Slepian-Wolf coding [7]
Assumption - correlation decays fast with distance;
Algorithm - locally order the path weights:
Find the SPT;
For each node i: find in the neighborhood N(i) the set C_i of nodes that are closer to the sink, on the SPT, than node i;
Coding rate R_i = H(X_i | C_i) for its quantization index.


Strategy 1 - DSC (Cont’)

Performance loss of approximated Slepian-Wolf [7]:
Gaussian random field (squared distance model);
Area size: 100*100; 50 nodes, uniformly distributed;
Intensity of correlation c = 0.001 (high) ~ 0.01 (low).


Strategy 2 - EC

Conditional encoding is used;
Rate allocation and transmission structure selection are NOT separable;
The SPT is not necessarily the optimal transmission structure. Example:
link weight = 1; Shortest Path Tree (SPT, left) vs. Traveling Salesman Path (TSP, right); if r < 0.5R, the TSP is better than the SPT.


Strategy 2 - EC (Cont’)

Problem formulation: find the spanning tree ST = {T, L} with T (non-terminal nodes) and L (leaves), T ∪ L = V, such that

ST^* = \arg\min_{ST} \left[\sum_{l \in L} d_{ST}(l) + (1 - \rho) \sum_{i \in V \setminus L} d_{ST}(i)\right]

where a simple correlation model is assumed: the rate at the leaves is R and is r at non-terminal nodes; the correlation coefficient is \rho = 1 - r/R, with 0 \le \rho \le 1.

NP-complete for general \rho [7].


Strategy 2 - EC (Cont’)

Approximation Algorithms [7]
SPT: the reference scheme;
Greedy algorithm: start from an initial subtree containing only the sink; then successively add the node that causes the least cost increment to the existing subtree;
Leaves deletion algorithm: start from the SPT; check for possible cost improvements by making leaf nodes change their parent to some other leaf node in their neighborhood;
Balanced SPT/TSP: build the SPT up to a radius away from the root, then build TSPs starting from the leaves of the SPT in their respective sub-regions.
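The greedy step can be sketched as follows. Here the cost increment of attaching a node is simplified to just the connecting edge weight (the full algorithm in [7] weighs the increment by rates), which reduces the greedy step to Prim-style tree growth; the graph is made up for illustration.

```python
def greedy_tree(adj, sink):
    """Simplified greedy construction: grow a subtree from the sink,
    each step attaching the outside node with the cheapest connecting
    edge. Returns parent pointers defining the spanning tree."""
    in_tree = {sink}
    parent = {sink: None}
    nodes = set(adj)
    while in_tree != nodes:
        best = None                       # (weight, inside node, new node)
        for u in in_tree:
            for v, w in adj[u]:
                if v not in in_tree and (best is None or w < best[0]):
                    best = (w, u, v)
        w, u, v = best
        parent[v] = u
        in_tree.add(v)
    return parent

# Same made-up graph as before; node 4 is the sink.
adj = {1: [(2, 1.0), (4, 5.0)], 2: [(1, 1.0), (3, 1.0)],
       3: [(2, 1.0), (4, 1.0)], 4: [(1, 5.0), (3, 1.0)]}
tree = greedy_tree(adj, 4)
```

On this graph the greedy tree is the chain 1-2-3-4, which aggregates data close to the sources, as the CDR-style strategies aim for.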


Strategy 2 - EC (Cont’)

Performance of Approximation Algorithms [7]


Comparison: DSC vs. EC

One-dimensional case

                                        DSC         EXPLICIT COMM
Coding Complexity                       High        Low
Optimal Route Design                    Simple      Hard
Full Knowledge of
  Correlation Structure                 Required    Not required
Generated Bits                          DSC <= EC
Total Flow Cost                         ?


Comparison: DSC vs. EC (Cont’)

Cost ratio [5]:

\rho(N) = \frac{\mathrm{cost}_{SW}(N)}{\mathrm{cost}_{EC}(N)}

If the entropy rate > 0: \lim_{N \to \infty} \rho(N) = 1.

If the entropy rate = 0 and H(X_i \mid X_{i-1}, \ldots, X_1) \sim \Theta(1/i^p):
if p \in (0, 1), \lim_{N \to \infty} \rho(N) = 1 - p;
if p \ge 1, \lim_{N \to \infty} \rho(N) = 0.


Conclusion & Future Work

A fundamental tradeoff exists between information fidelity and compression gain;

For the given distortion, different coding schemes combined with routing strategies can achieve different gains;

The joint optimization for routing and rate allocation (coding) is generally difficult;

DSC (Slepian-Wolf) elegantly separates the joint optimization problem, but its distributed implementation is still under investigation [8, 9];

Aggregation with EC is practical, but good approximation algorithms are needed for finding the optimal transmission structure for general correlated sources;


Conclusion & Future Work

The multiple-sink case is more complicated: [5] shows that the problem of finding the optimal transmission structure is NP-complete even with Slepian-Wolf coding;

Multiple-rate problem at a single node - increasing coding complexity;
Lossy compression;
More interesting optimization problems (e.g., sensor density, placement, etc.).


References

[1] D. Neuhoff, “Field-Gathering Sensor Networks, Distributed Encoding and Oversampling”, Canadian Workshop on Information Theory, May 2003.

[2] R. Cristescu and M. Vetterli, “On the Optimal Density for Real-time Data Gathering of Spatio-Temporal Processes in Sensor Networks”, in the Proc. of ACM IPSN’05, 2005.

[3] A. Scaglione and S. Servetto, “On the Interdependence of Routing and Data Compression in Multi-hop Sensor Networks”, to appear in ACM/Kluwer Journal on Mobile Networks and Applications (MONET); see also the short version in MobiCom 2002.

[4] T. Cover and J. Thomas, Elements of Information Theory, John Wiley and Sons, Inc., 1991.

[5] R. Cristescu, B. Beferull-Lozano and M. Vetterli, “Networked Slepian-Wolf: Theory, Algorithms and Scaling Laws”, to appear in IEEE Trans. on Information Theory, 2005.

[6] S. Pattem, B. Krishnamachari and R. Govindan, “The Impact of Spatial Correlation on Routing with Compression in Wireless Sensor Networks”, in the Proc. of ACM IPSN’04, Apr 2004.

[7] R. Cristescu, B. Beferull-Lozano and M. Vetterli, “On Network Correlated Data Gathering”, in the Proc. of IEEE Infocom’04, 2004.

[8] S. Pradhan, J. Kusuma and K. Ramchandran, “Distributed Compression in a Dense Microsensor Network”, IEEE Signal Processing Magazine, pp. 51-60, Mar 2002.

[9] A. Aaron and B. Girod, “Compression with Side Information Using Turbo Codes”, in the Proc. of IEEE Data Compression Conference (DCC’02), Snowbird, UT, pp. 252-261, Apr 2002.


Thank You!


Backup

Optimal SPT radius for Balanced SPT/TSP algorithm