INT 598 Data Management for Sensor Networks
Silvia Nittel
Spatial Information Science & Engineering, University of Maine
Fall 2006

INT598: Sensor System Foundation © Dr. Silvia Nittel, NCGIA, University of Maine, 2006

Overview

• Data collection and aggregation
• Programming sensor networks
• In-network data aggregation
• In-network query processing
• In-network data storage and indexing
• Multi-resolution storage

Data Collection Scenarios

• Embed numerous distributed devices to monitor and interact with the physical world

• Exploit spatially and temporally dense, in situ, sensing and actuation

• Network these devices so that they can coordinate to perform higher-level identification and tasks.

• Requires robust distributed systems of hundreds or thousands of devices.

Deborah Estrin, UCLA

Indoor Applications

• Intel cubicle space
• Sensing: light and sound sensors on the ceiling or cubicle walls
• Actuation: detecting occupied cubicles and disturbing conversations outside of cubicles

Outdoor Applications

• Napa Valley vineyard
• Sensing: humidity and temperature sensors at the vines
• Actuation: ventilators to remove fog, and localized heaters
• Queries: monitoring micro-climates at the vines

Example A

“Macroscope” in the Redwoods.

ACM Sensys 2005

Observation over ca. 60 days.

Database Management System

• Database: collection of stored and streamed data
• Database Management System (DBMS): software to store, manage, access, and query the data
  • With a simple-to-use user interface and query language
• Database System: both together

[Figure: a Data Base (DB) managed by a Data Base Management System (DBMS)]

Viewing SN as DBS

Viewing a SN as a DBS

• Assumption: a sensor network (SN) can be viewed as a distributed database system (DBS)
  • Each sensor node is a database system that can
    • accept, process, and answer queries
    • participate in the execution of global, distributed queries
• The user poses declarative queries to the SN as a whole
  • The DBMS figures out how to process the query
• Tiny (small-footprint) database management systems (DBMS) that run on sensor nodes are available
  • Examples: TinyDB (UC Berkeley), Cougar (Cornell University)
• However: constrained computing environment
  • Adapting existing database technology
  • In-network data storage, data aggregation, and query processing

Sensor Network DBMS

• Objectives:
  • Users model the application data and data needs
  • No low-level programming of the sensor nodes or of the data gathering details
  • “What should be done”, not “how should it be done”
• Approach:
  • Declarative SQL-style queries
  • Intelligent query processing
  • Fault mitigation

SELECT MAX(temperat)
FROM sensors
WHERE temperat > thresh
SAMPLE PERIOD 64ms

[Figure: App, TinyDB, and Sensor Network, connected by arrows labeled “Query, Trigger” and “Data”]

© S. Madden, 2005.

Declarative Queries

Sensor schema:

Epoch  Nodeid  Light  Temp  Accel  Sound
0      1       455    x     x      x
0      2       389    x     x      x
1      1       422    x     x      x
1      2       405    x     x      x

Epoch  Nodeid  AVG(sound)  Temp  Sound
0      1       360         x     x
0      2       520         x     x
1      1       370         x     x
1      2       520         x     x

Examples:

SELECT nodeid, light
FROM sensors
WHERE light < 400
EPOCH DURATION 1s

SELECT roomId, AVG(sound)
FROM sensors
GROUP BY roomId
HAVING AVG(sound) > 400
EPOCH DURATION 1s

© S. Madden, 2005.
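
To make the semantics of the two example queries concrete, here is a minimal Python sketch that evaluates them over the readings of a single epoch. It is an illustration only, not how TinyDB executes queries: the readings, the `room_id` attribute, and the function names are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical readings for one epoch; room_id is an assumed attribute.
readings = [
    {"nodeid": 1, "room_id": "A", "light": 455, "sound": 360},
    {"nodeid": 2, "room_id": "A", "light": 389, "sound": 520},
    {"nodeid": 3, "room_id": "B", "light": 310, "sound": 380},
    {"nodeid": 4, "room_id": "B", "light": 405, "sound": 390},
]


def select_dim_nodes(rows, limit=400):
    """SELECT nodeid, light FROM sensors WHERE light < limit"""
    return [(r["nodeid"], r["light"]) for r in rows if r["light"] < limit]


def loud_rooms(rows, limit=400):
    """SELECT roomId, AVG(sound) FROM sensors
    GROUP BY roomId HAVING AVG(sound) > limit"""
    groups = defaultdict(list)
    for r in rows:
        groups[r["room_id"]].append(r["sound"])
    averages = {room: sum(v) / len(v) for room, v in groups.items()}
    # HAVING filters whole groups after aggregation.
    return {room: avg for room, avg in averages.items() if avg > limit}
```

With EPOCH DURATION 1s, both functions would be re-run once per second on that epoch's readings.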

Queries over Sensor Networks

• Query types:
  • “Snapshot” queries
    • “Report the current temperature reading of sensor node #1.”
  • Continuous queries
    • “Report the temperature readings of sensor nodes #1 to #10 for the next 10 minutes at 1-minute intervals.”
  • Event queries
    • “Report when temperature values are above threshold 1.”
• Most common: spatio-temporal queries
  • Point queries (“report temperature in room 324”)
  • Spatial window queries (“report temperature values from region A”)

Queries over Sensor Networks

• Most common (cont.): spatio-temporal (ST) aggregation (average, max, min, etc.)
  • Temporal aggregation (“max temperature value in the last 24h”)
  • Spatial aggregation (“average temperature value of all sensors on the first floor”)
• Basic aggregation: min, max, average, sum, count, etc.
• Holistic aggregates: estimation

In-Network Data Aggregation

[Figure: a query routed over sensor nodes A, B, C, D, E, and F; partial state records labeled {D,E,F}, {B,D,E,F}, and {A,B,C,D,E,F}]

Data stream processing:
• Sample rate
• Temporal aggregation
• Stream mining for local events
• Uncertainty/inaccuracy

Each sensor node:
• production of a data stream
• local processing of the data stream
• processing of aggregated data
• minimizing communication

Computation is pushed to the data collection points: local and locally coordinated processing of data “in the network”.

Execution of Aggregates

• Flexible, ad-hoc communication topology (network level)
• Aggregation computation over sensor networks consists of two phases:
  • a (query) distribution phase, in which aggregate queries are pushed down into the network, and
  • a (data) collection phase, in which the aggregate values are continually routed up from children to parents.
• Query semantics partition time into epochs of a given duration; in each epoch, a single aggregate value (when not grouping) must be produced that combines the readings of all devices in the network during that epoch.

Data Distribution Phase

1. When a sensor node n receives a request r to aggregate (e.g. max(temp)), it awakens, synchronizes its clock according to timing information in the message, and prepares to participate in aggregation.
  • In the tree-based routing scheme, n chooses the sender s of the message as its parent.
  • In addition, the query r includes the interval in which the sender s expects to hear partial state records from n.

Data Distribution Phase

2. n then forwards the query request r down the network, setting the delivery interval for its children to be slightly before the time its parent expects to see n’s partial state record.
  • In the tree-based approach, this forwarding consists of a broadcast of r, so that any nodes that did not hear the previous round are included as children of n (if it has any).
  • These nodes continue to forward the request in this manner until the query has been propagated throughout the network (n-coverage).
  • Special cases: geo-routing
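
The distribution phase can be sketched as a flood over the radio neighborhood graph. The Python below is a simplified simulation, not the actual protocol code: the topology, the epoch length, and the fixed `slack_ms` by which each level's deadline is moved earlier are assumptions chosen for illustration.

```python
from collections import deque


def distribute_query(neighbors, root, epoch_ms=1000, slack_ms=100):
    """Flood a query from root; return (parent, deadline) maps.

    deadline[n] is the time within the epoch by which n must deliver its
    partial state record to its parent. Children always get a deadline
    slightly earlier than that of the node that forwarded the query.
    """
    parent = {root: None}
    deadline = {root: epoch_ms}
    queue = deque([root])
    while queue:
        sender = queue.popleft()
        for node in neighbors[sender]:
            if node in parent:       # already heard the query: ignore
                continue
            parent[node] = sender    # first sender heard becomes the parent
            deadline[node] = deadline[sender] - slack_ms
            queue.append(node)
    return parent, deadline


# Hypothetical radio neighborhoods for six nodes A to F.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E", "F"],
    "E": ["D", "F"],
    "F": ["D", "E"],
}
parent, deadline = distribute_query(neighbors, "A")
```

In this run D picks B as its parent and must report 100 ms before B does, so B has time to merge D's record into its own before its own transmission slot.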

Data Collection Phase

3. During the epoch after query propagation, each sensor node listens for messages from its children during the interval it specified when forwarding the query.
  • It then computes a partial state record that combines any child values it heard with its own local sensor readings (aggregation).
  • Finally, during the transmission interval requested by its parent, the mote transmits this partial state record up the network.
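
The collection phase can be illustrated with a small Python sketch in which every node merges the partial state records of its children with its own reading. The routing tree and the readings are hypothetical; the (sum, count) pair used for AVG is one common choice of partial state record, shown here as an assumption rather than as the slides' exact encoding.

```python
def merge_avg(a, b):
    """Merge two (sum, count) partial state records for AVG."""
    return (a[0] + b[0], a[1] + b[1])


def collect(node, children, reading, init, merge):
    """Return the partial state record for the subtree rooted at node."""
    psr = init(reading[node])                  # start from the local reading
    for child in children.get(node, []):
        # Combine with the record each child transmitted up the tree.
        psr = merge(psr, collect(child, children, reading, init, merge))
    return psr


# Hypothetical routing tree (parent -> children) and one epoch of readings.
children = {"A": ["B", "C"], "B": ["D"], "D": ["E", "F"]}
reading = {"A": 20, "B": 22, "C": 19, "D": 25, "E": 24, "F": 28}

total, count = collect("A", children, reading, lambda v: (v, 1), merge_avg)
average = total / count
maximum = collect("A", children, reading, lambda v: v, max)
```

Each node sends exactly one small record per epoch regardless of the size of its subtree, which is where the communication savings of in-network aggregation come from.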

In-Network Data Estimation Query Processing

Types of phenomena:
• discrete
  • Example: did a truck pass or not?
• continuous (fields)
  • Example: temperature field, toxic clouds

Types of spatial queries:
• window or point queries
  • Discrete sensor readings
  • Estimation over continuous phenomena

Computation is pushed to the nodes. Still: computationally complex and expensive.

[Figure: spatial window query]

Detecting and Tracking Continuous Phenomena

[Figure: spatial window query over a toxic plume]

Tracking continuous phenomena

Network configuration based on qualitative characteristics of a phenomenon

Collaboration with M. Worboys

In-Network Query Processing

• Key: acquisitional query processing
• Traditional query processing: query processing on stored data
• Sensor network query processing: acquiring the data from sensors
• The acquisitional query processor controls when, where, and with what frequency data is collected

Acquisitional Query Processing

• Basic strategies:
  • Continuous queries, with rates or lifetimes
    “SELECT temperature FROM sensors WHERE location=ESRB AND EPOCH=1h UNTIL DATE=11/25/06”
  • Events for asynchronous triggering
• Optimization strategies:
  • E.g. avoiding unnecessary acquisition
  • Sampling as a query operator
  • Choosing where to sample via co-acquisition
  • Index-like data structures
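
One way to picture “avoiding unnecessary acquisition” is to treat each sensor read as a costly operation and to evaluate a conjunction of predicates cheapest sensor first, stopping as soon as one fails. The Python sketch below assumes made-up energy costs and sensor values, and orders by cost alone; a real optimizer would typically also weigh predicate selectivity.

```python
def evaluate_lazily(predicates, acquire, cost):
    """Evaluate a conjunction of predicates, sampling cheap attributes first.

    predicates: {attribute: test function}
    acquire:    function that reads one attribute from its sensor
    cost:       {attribute: energy per sample}
    Returns (result, energy_spent). Sampling stops at the first predicate
    that fails, so expensive sensors are often never read.
    """
    spent = 0.0
    for attr, test in sorted(predicates.items(), key=lambda kv: cost[kv[0]]):
        spent += cost[attr]
        if not test(acquire(attr)):
            return False, spent
    return True, spent


# Hypothetical per-sample energy costs (arbitrary units) and sensor values.
cost = {"light": 1.0, "accel": 50.0}
values = {"light": 120, "accel": 0.9}
predicates = {"accel": lambda v: v > 0.5, "light": lambda v: v > 400}

result, spent = evaluate_lazily(predicates, values.__getitem__, cost)
```

Here the light predicate fails after one cheap sample, so the accelerometer is never powered up.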

Lifetime Queries

• Lifetime vs. sample rate

SELECT … EPOCH DURATION 10 s

SELECT … LIFETIME 30 days

• Extra: allow a MAX SAMPLE PERIOD
  • Discard some samples
  • Sampling is cheaper than transmitting
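
The trade-off between lifetime and sample rate can be worked through with simple energy arithmetic. The Python sketch below is a back-of-the-envelope model, not the estimator used by any particular system: it assumes a fixed energy budget, fixed per-sample and per-transmission costs, and ignores idle and listening energy. All numbers and names are hypothetical.

```python
def plan_lifetime_query(lifetime_s, battery_j, sample_j, transmit_j,
                        max_sample_period_s=None):
    """Return (sample_period_s, fraction_of_samples_transmitted)."""
    per_epoch_j = sample_j + transmit_j
    # Spend the budget evenly: (lifetime / period) epochs, each per_epoch_j.
    period = lifetime_s * per_epoch_j / battery_j
    if max_sample_period_s is None or period <= max_sample_period_s:
        return period, 1.0
    # The budget alone would sample too rarely. Keep sampling at the maximum
    # allowed period and transmit only a fraction of the samples, which
    # works because sampling is cheaper than transmitting.
    period = max_sample_period_s
    samples = lifetime_s / period
    budget_for_tx = battery_j - samples * sample_j
    fraction = max(0.0, budget_for_tx / (samples * transmit_j))
    return period, min(1.0, fraction)


THIRTY_DAYS_S = 30 * 24 * 3600

# Hypothetical node: 10 kJ battery, 1 mJ per sample, 20 mJ per transmission.
unconstrained = plan_lifetime_query(THIRTY_DAYS_S, 10_000, 0.001, 0.02)
constrained = plan_lifetime_query(THIRTY_DAYS_S, 10_000, 0.001, 0.02,
                                  max_sample_period_s=2)
```

Under these assumptions the node can sample and transmit roughly every 5.4 s for 30 days; with a MAX SAMPLE PERIOD of 2 s it still samples every 2 s but can afford to transmit only about a third of the samples.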

Adaptive & Decentralized Operator Placement

• Main idea:
  • Place operators near data sources
  • Greater operator sample rate: place operator closer
• For each operator:
  • Explore candidate neighbors
  • Migrate to lower-cost placements
  • Via extra messages

[Figure: operator placement between data sources with Rate A and Rate B]
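
The “explore neighbors, migrate to lower cost” loop can be sketched as a greedy local search. The Python below is a simplified stand-in, not the algorithm from the slides: it assumes the placement cost is the sum of data rate times hop count over all of the operator's inputs and outputs, and the topology and rates are invented for illustration.

```python
from collections import deque


def hops_from(graph, start):
    """Breadth-first hop counts from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def placement_cost(graph, node, flows):
    """flows: {endpoint: rate}; cost = sum of rate * hops(endpoint, node)."""
    dist = hops_from(graph, node)
    return sum(rate * dist[endpoint] for endpoint, rate in flows.items())


def migrate(graph, start, flows):
    """Move the operator to the cheapest neighbor until none is cheaper."""
    current = start
    while True:
        best = min(graph[current],
                   key=lambda n: placement_cost(graph, n, flows))
        if (placement_cost(graph, best, flows)
                >= placement_cost(graph, current, flows)):
            return current
        current = best


# Hypothetical topology: srcA - n1 - n2 - srcB, with the sink attached to n1.
graph = {
    "srcA": ["n1"],
    "n1": ["srcA", "n2", "sink"],
    "n2": ["n1", "srcB"],
    "srcB": ["n2"],
    "sink": ["n1"],
}
# Source A produces 10 tuples/s, source B 1 tuple/s; 1 tuple/s goes to the sink.
flows = {"srcA": 10, "srcB": 1, "sink": 1}

final_node = migrate(graph, "n2", flows)
```

Starting at n2, the operator drifts toward the high-rate source, which matches the rule of thumb that a greater sample rate pulls the operator closer. A local search like this can stop in a local minimum on less regular topologies.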

“Adaptivity” in Databases

• Adaptivity: changing query plans on the fly
  • Typically at the physical level
    • Where the plan runs
    • Ordering of operators
    • Instantiations of operators, e.g. hash join vs. merge join
• Non-traditional
  • Conventionally, complete plans are built prior to execution
  • Using cost estimates (collected from history)
• Important in volatile or long-running environments
  • Where a priori estimates are unlikely to be good
  • E.g., sensor networks

In-Network Data Storage

• Storage challenges:
  • Method: transmit all measurements to a central database (db) for storage
  • Advantage: unconstrained search on historic data
  • Disadvantage: high power consumption

[Figure: centralized storage (db) and hierarchical in-network storage; queries on different levels of detail]

Summary

• Declarative query processing
  • Simplifies data collection in sensornets
  • In-network processing, query optimization for performance
• Acquisitional query processing
  • Focus on costs associated with sampling data
  • New challenge of sensornets, other streaming systems?
• Adaptive join placement
  • In-network optimization
  • Some benefit, but practicality unclear
  • Operator pushdown still a good idea