1.The Impact Of Data Aggregation in Wireless Sensor Networks. 2.The ACQUIRE Mechanism for Efficient Querying In Sensor Networks. By: Kinnary Jangla Rishi

1.The Impact Of Data Aggregation in Wireless Sensor Networks.

2.The ACQUIRE Mechanism for Efficient Querying In Sensor Networks.

By:Kinnary Jangla

Rishi Kant Sharda

Paper By:- Bhaskar Krishnamachari- Deborah Estrin- Stephen Wicker

Presented By:- Kinnary Jangla- Rishi Kant Sharda

Basic Idea.. To exploit the data redundancy

Packets from different nodes, are combined in – network.

Implementation Who carries the data with redundancy

Data-centric routing

Differences Data-centric routing

Based on contents of the packets. Address-centric routing

Routing based on an end-to-end manner.

The Impact Of Data Aggregation On Wireless Sensor Networks

Sensor Network Models: Event-Radius Model Random Source Models

Impact of: Source-Destination Placements Communication Network Density

On :

- Energy Costs

- Delay

Overview

(Cont..)

Data Centric routing - Significant Performance

Gain

Complexity of Data Aggregation NP-Hard Problem.


Sub - Titles: Introduction. Routing Models.

AC DC

Data-Aggregation Optimal – Suboptimal Aggregation Sensor Network Models

Energy Savings Theoretical Results Simulation Results

Delay


Introduction.

Concepts. Sensor Network ? Sensor Node ? Unattended Operation ? Data Aggregation ?

Data Redundancy !

Wireless Sensor Network. Applications. Network Topology of a Sensor Network.

???

?

?

Introduction cont..

Network Topology of a Wireless Sensor Network.

(cont..)

Data Aggregation in WSN ?

- Address-centric approach

- Data-centric approach


Routing Models

Address Centric Approach


Data – Centric Approach


Data Aggregation

Result 1:

- The optimum number of transmissions required per datum for the DC protocol is equal to the number of edges in the minimum steiner tree in the network which contains the node set (s1, …. , Sk, D).- Hence, assuming an arbitrary placement of sources and a general network graph G, the task of doing DC routing with optimal data aggregation is NP-Hard.

{- Steiner Tree?- NP-Hard Problem?

}


Optimal Data Aggregation

The optimal data aggregation problem is NP-Hard. An optimal multicast problem

A well-known problem A minimum Steiner tree problem: NPC

So…NO optimal Solution Thus, sub-optimal solutions.

Data Aggregation

Section 1: 3 – suboptimal Schemes:

Center at Nearest Source: Aggregation center: nearest node to the

sink.

Shortest Paths Tree Shortest path routing with data aggregation

in the overlap nodes.

Greedy Incremental Tree Node closest to the tree connects to the path

and forms a new tree until all the source nodes are vertices.


(cont..)

Section 2:

Sensor Network Models:- for source placement.

{ Factors affecting the performance gains of sensor network..

1. Position of the sources

2. communication network topology.}

Event Radius Model. Random Sources Model.


Event Radius Model.

Location of an event.

Sensing Range, S. (Pi)*S^2*n –

average number of sources.


Random Sources Model. Sources not

clustered. K random nodes, that

are not sinks,are chosen to be sources


Energy Savings due to data aggregation

Notations:

di : the distance of the shortest path from source i tothe sink

NA: the total number of transmissions required for the optimaladdress-centric protocol

ND: the total number of transmissions required for the optimal data-centric protocol

X: the diameter of the graph formed by a set of connected nodes

K: the number of the sources in the RS model R: communication range S: sensing range in the ER model

Energy Savings Due to Data Aggregation

Main performance gain When sources are far away from the sink. NA = d1 + d2 + …. Dk = sum (di) Diameter X = max of pairwise shortest paths.

Theoretical Results:

Result 2:If the source nodes S1, S2, … , Sk have

a diameter X >= 1. The total number of transmissions (Nd) required for the optimal DC protocol satisfies the following bounds:

ND < = (k-1)X + min(di) …… X >= 1 ND >= min(di) + (k-1) …… X = 1

Corollary If diameter X < min(di), then ND < NA.


Proof:

data aggregation tree consists of(k − 1) sources sending their packets to the remainingsource which is nearest to the sink. This tree has no more than (k−1)X +min(di) edges,

Next result is obtained by considering the smallestpossible Steiner tree which would happen if thediameter were 1. The shortest path from the source node at min(di) must be part ofthe minimum Steiner tree, and there is exactlyone edge from each of the other source nodes tothis node.

Conclusion: The optimum data-centric protocol will perform strictly better than the Address-centric protocol.

Result 3:

Cont…

ND/NA = 1/k

- DC Protocol gives k-fold savings.

Cont… Result 4:

If the subgraph G” of the communication graph G induced by the set of source nodes (S1……Sk) is connected, the optimal data aggregation tree can be formed in polynomial time.

Corollary: In the ER model, when R > 2S, the optimal data aggregation

tree can be formed in polynomial time.


Proof:

The tree is initialized with the path from the sink to the nearest source.

At each additional step of the GIT, the next source to be connectedto the tree is always exactly one step away (such a source is guaranteed to exist since G is connected).

At the end of the construction, the number of edges in the tree is thereforedmin + (k − 1).

Therefore, the GIT construction runs in polynomial time w.r.t. the number ofnodes .

Summary:

Result 1: The number of transmissions for the DC protocol =

number of edges in the minimum Steiner tree.

Result 2: Nd <= (k-1)X + min(di) Nd >= (k-1) + min(di)

Result 3:

Result 4: The optimal data aggregation tree can be formed in

polynomial time.

ND/NA = 1/k

Simulation Results:

Figure 1:- Comparison of Energy costs versus R in the ER model.

Figure 2:

- Comparison of energy costs versus R in the RS model


Figure 3: Comparison of

energy costs versus S in the ER model

Figure 4: - Comparison of energy costs versus k in the RS model.


Sensing Range

Energy Savings.

Summary of experiments:

Energy Savings due to data aggregation can be quite significant, particularly when there are a lot of sources – (large S or large k) that are many hops from the sink - (small R).


Delay due to Data Aggregation

Tradeoff:

Greater Delay !!

Data from sources have to be held back at an intermediate node in order to be aggregated.

Worst Case:- Latency due to aggregation will be proportional to the number of hops between sink and

the farthest source.


Figure 5: Max(di) and

Min(di) versus R in the ER Model

Figure 6: Max(di) and

Min(di) versus S in the ER Model.


Conclusions:

The formation of an optimal data aggregation tree is NP – Hard.

Energy Gains possible with data aggregation.Large when - number of sources large- Sources located close to each.

Other and far from sink

Aggregation Latency (Delay) non-negligible


The ACQUIRE Mechanism for Efficient Querying in Sensor Networks

Written By:Narayanan Sadagopan

Bhaskar KrishnamachariAhmed Helmy

Presented By:

Rishi Kant ShardaKinnary Jangla

The Basics

A sensor network is a computer network of many, spatially distributed devices using sensors to monitor conditions at different locations, such as temperature, sound, vibration, pressure, motion or pollutants.

Each device is equipped with a radio transceiver, a small microcontroller, and an energy source, usually a battery. The devices use each other to transport data to a monitoring computer.

Usually these devices are small and inexpensive, so that they can be produced and deployed in large numbers, and so their resources in terms of energy, memory, computational speed and bandwidth are severely constrained.

Therefore not feasible to collect all measurements from each device for centralized processing.

Introduction

Best to view them as distributed databases. Central querier/data sink issues queries. Due to energy constraints it is desirable for

much of the data processing to be done in-network.

This leads to the concept of data centric information routing i.e. queries and responses are for named data.

Categories of Queries

Continuous Queries e.g Report the measured temperature for the next 7 days with a frequency of 1 measurement

per hour. One-Shot Queriese.g Is the current temperature higher than 70°? Aggregate Queriese.g Report the calculated average temperature of all nodes in region X. Non-Aggregate Queriese.g What is the temperature measured by node x? Complex Queriese.g What are the values of the following variables: X, Y , Z? Simple Queriese.g What is the value of the variable X? Queries for Replicated datae.g Has a target been observed anywhere in the area? Queries for Unique data

Flooding-based query mechanisms: (Directed Diffusion data-centric routing scheme)

Expanding Ring Search

Why ACQUIRE?

Earlier Flooding-based query methods such as “Directed Diffusion data-centric routing scheme” are well suited only for continuous-aggregate queries.

One-size-fits-all approach unlikely to provide efficient solutions for other types.

If it is not continuous then flooding can dominate the costs associated with querying.

Similarly in data aggregation duplicate responses can lead to suboptimal data collection in terms of energy costs.

Example: Bird Habitat Monitoring

Example: Continued

Task: “Obtain sample calls for the following birds in the reserve: Blue jay, Nightingale, Cardinal, Warbler”

Complex One-shot For replicated data

ACQUIRE

LEGEND

Active Query

Complete Response

Update Messages

Sensor

Analysis of ACQUIRE

Basic Model and Notation Local update Forward

Steps to Query Completion Local Update Cost Total Energy Cost Optimal Look Ahead

Basic Model and Notation

X number of sensors. V = {V1,V2,…VN} are the N variables tracked. Q = {Q1,Q2,…QM} consisting of M sub-queries, 1 < M ≤

N and for all i : i < M, Qi Є V. Let SM be the average number of steps taken to resolve

a query consisting of M sub-queries. d – Look ahead parameter Size of a sensors neighborhood f(d) Assumed that all queries Q are resolvable by this

network. x* be the querier which issues the query Q.

ACQUIRE Process

Local Update : If current information not up-to-date, x sends request to all

sensors d hops away. Request forwarded hop-by-hop. Sensors who get the request then forward their

information to x. Let the energy consumed in this phase be Eupdate

Forward : After answering the query based on information received. x forwards the remaining query to a randomly chosen

node d hops away.

ACQUIRE Process 2

Since updates are triggered only when the information is not fresh, it makes sense to try and quantify how often such updates will be triggered.

We model this as amortization factor c. An update is likely to occur at any given node only

once every c queries. c such that 0 < c ≤ 1. e.g if on average an update has

to be done once every 100 queries, c = 0.01. α denotes the expected number of hops from the

node where the query is completely resolved to x*

ACQUIRE Process 3

The average energy consumed to answer the query of size M with look-ahead d can be expressed as:

Case: d=D , where D is the diameter of the network.

Case: d too small. SM ↓ when d ↑ Eupdate ↑ when d ↑

Steps to Query Completion

If there are M queries to be resolved the probability of success in each trial is: p = M/N and failure is p = (N-M)/N.

Expected number of trials till 1st success 1/p=N/M. The whole experiment can be repeated with one

less query and time to answer another query is N/(M-1) and so on.

Let σM be the number of trials till M successes i.e complete resolution. Then:

Steps to Query Completion 2

H(M) is the sum of the first M terms of the harmonic series.

H(M) ≈ ln(M) + γ, where γ = 0.57721 Euler’s constant, thus:

and

Local Update Cost

Eupdate : Energy spent in updating the information at each active node.

The number of transmissions needed to forward this request is the no. of nodes within d-1 hops, f(d-1).

N(i) Number of nodes at hop i.

Total Energy Cost

If the response is returned along the reverse path i.e α <= dSM

Special case: d = 0 –Random Walk. E(σM) steps to resolve and return the

query.

Optimal Look-ahead

Ignoring boundary effects, it can be shown that N(i) = 4i and

f(d) = (2d(d+1))+1 for a grid of sensors, each node having 4 immediate neighbors.

Combining expression for SM, Eupdate, Eavg , N(i) and f(d) we get:

Optimal Look-ahead 2

We determine the value of the look-ahead parameter which minimizes this energy cost by taking the derivative with respect to d and set it equal to 0, we get d* by:

In general the lower c is, higher will be the look ahead parameter d*



0

500

1000

1500

2000

2500

3000

3500

4000

1 3 5 7 9 11 13 15 17 19 21 23 25 27

Look-ahead Parameter (d) [N=1000, M=200]

Ave

rag

e E

ne

rgy

pe

r Q

ue

ry

c=0.01

c=0.02

c=0.03

c=0.04

c=0.05

c=0.06

c=0.07

COMPARISON

Conclusions

Proposed ACQUIRE as a scalable protocol for complex, one-shot queries for replicated data in sensor networks.

Developed an analytical comparison of ACQUIRE, FBQ and ERS.

With optimal parameter settings ACQUIRE outperforms all other schemes for complex, one-shot queries.

Optimal ACQUIRE performs many orders of magnitude better than flooding-based schemes.

Can reduce energy consumption by more than 60%.

Future Work

The efficiency of ACQUIRE can also be improved if the neighborhoods of the successive active nodes in the query trajectory have minimal overlap.

Guided trajectories may also be helpful in dealing with non-uniform data distributions

Taking into account that receptions can also influence energy consumption. This is the case especially for broadcast messages.

THANK YOU

Documents

1.The Impact Of Data Aggregation in Wireless Sensor Networks. 2.The ACQUIRE Mechanism for Efficient Querying In Sensor Networks. By: Kinnary Jangla Rishi