Quasar Group
HiPC 2003 Tutorial System Support for Sensor Networks
Speakers:
Sharad Mehrotra, Univ. of California, Irvine
Nalini Venkatasubramanian, Univ. of California, Irvine
Rajesh Gupta, Univ. of California, San Diego
Acknowledgements (for Slides)
• Nesime Tatbul, Kevin Hoeschele, Anurag Shakti Maskey (AURORA team)
• Jennifer Widom, Rajeev Motwani (STREAM)
• Sam Madden (TinyDB)
• Anantha Chandrakasan (MIT uAMPS team)
• Qi Han, Iosif Lazaridis, Xingbo Yu (QUASAR team)
• Srini Seshan (Irisnet)
• Slides for tutorial available at
  – http://www.ics.uci.edu/~quasar/tutorial/hipc.ppt
Sensor Networks
Various Sensor Applications
Battlefield Monitoring
Habitat Monitoring
Earthquake Monitoring
Oceanographic current monitoring
Medical Condition Monitoring
Traffic Congestion Detection
Target Tracking & Detection
Intrusion Detection
Video surveillance
Taxonomy of Applications (1)
• Data access needs of applications
  – Historical data
    • Analysis to better understand the physical world
  – Current data
    • Monitoring and control to optimize the processes that drive the physical world
  – Future data
    • Forecasting trends in data for decision making
Taxonomy of Applications (2)
• Predictability of data access
  – Fixed
    • Data access needs of applications known a priori
  – Unpredictable (ad hoc)
    • Data access needs of applications not known at any instant of time
  – Predictable (continuous)
    • Data access needs of applications can be predicted for some time in the future with high probability
Application Landscape
[Figure: example queries placed on two axes]
• Predictability of data access: no knowledge, some knowledge, full knowledge
• Temporal property of data accessed: the past, the present, the future
Example queries:
• Each evening at 8pm predict the temperature for the next 5 days
• Notify me immediately when there is a forest fire
• Every month, calculate the average humidity in California for the last 30 days
• Did the temperature rise above 40°C in the last year?
• Is Mr. Doe’s newly proposed weather model accurate for 1996-2000?
• How much snow is there in Aspen?
• I’m going surfing on Sep. 30! Will it be windy?
• Visualize current humidity with Mrs. Doe’s new interpolation scheme.
• Predict noise levels around the airport if runway 2 becomes operational
Basic architecture of sensor nodes
Sensor Properties – Different Capabilities
• Storage
  – Built-in memory
• Sensing
• Computing
  – Micro-processor or micro-controller
• Communication
  – Short range radio for wireless communication
Sensor Properties – Resource Constraints
• Lower transmission distances (< 10m)
• Lower bit rates (typically < kbps)
• Limited battery capacity
Radio mode    Power consumption (mW)
Transmit      14.88
Receive       12.50
Idle          12.36
Sleep         0.016
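The table suggests why time spent asleep matters far more than time spent idle. A minimal sketch of that arithmetic follows; the power figures come from the table, while the two duty cycles are hypothetical values chosen only for illustration.

```python
# Radio power per mode in mW, copied from the table above.
RADIO_POWER_MW = {"transmit": 14.88, "receive": 12.50, "idle": 12.36, "sleep": 0.016}

def average_power_mw(time_fractions):
    """Average radio power for a dict mapping mode -> fraction of time in that mode."""
    total = sum(time_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(RADIO_POWER_MW[mode] * frac for mode, frac in time_fractions.items())

# Hypothetical duty cycles (assumed for illustration, not from the tutorial).
always_listening = {"transmit": 0.01, "receive": 0.01, "idle": 0.98, "sleep": 0.0}
mostly_asleep = {"transmit": 0.01, "receive": 0.01, "idle": 0.0, "sleep": 0.98}

print(round(average_power_mw(always_listening), 4))  # idle time dominates
print(round(average_power_mw(mostly_asleep), 4))     # sleep time is almost free
```

Under these assumed duty cycles, replacing idle time with sleep lowers the average radio power by more than an order of magnitude.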
Sensor Devices today
• MIT uAMPS
  – 59 MHz to 206 MHz processor
  – 2 radios, capable of transmitting at 1 Mbps
  – 4 KB RAM
• Berkeley Mica motes
  – 8-bit, 4 MHz processor
  – 40 kbit/s CSMA radio
  – 4 KB RAM
  – TinyOS based
• A series of sensor nodes developed
Sensor OS Concepts
• Constrained Scheduling
  – Event-based(?)
• Constrained Storage Model
  – frame per component, shared stack, no heap
• Very lean multithreading
• Efficient Layering
[Figure: Messaging Component with its commands and events]
• Interface labels: init, power(mode), send_msg(addr, type, data), msg_rec(type, data), msg_send_done
• Interface labels: init, Power(mode), TX_packet(buf), TX_packet_done(success), RX_packet_done(buffer)
• Inside the component: internal state, internal thread
Sensor Network Properties
small-scale sensor nodes
restricted resources
environmental influence
prone to failure
depleted battery
unattended operation
frequent topology changes and network partitions
node mobility
dense deployment in large numbers
scalability issues
heterogeneity issues
concurrency issues
fixed vs. mobile sensor grids
infrastructure based vs. ad-hoc communication
Controversies with sensor networks
• How is this different from mobile ubiquitous computing?
• Network-centric vs. edge-centric architecture?
  – Passive sensors vs. smart sensors
• A new class of algorithms?
  – Traditional deterministic vs. probabilistic vs. epidemic
Wireless Networked Embedded Systems Characteristics
• Wireless
  – limited bandwidth, high latency (3ms-100ms)
  – variable link quality and link asymmetry due to noise, interference, disconnections
  – easier snooping
  --> need for more signal and protocol processing
• Mobility
  – causes variability in system design parameters: connectivity, b/w, security domains, location awareness
  --> need for more protocol processing
• Portability
  – limited capacities (battery, CPU, I/O, storage, dimensions)
  --> need for energy efficient signal and protocol processing
Capacity of Wireless Sensor Networks
• Sensor Networks– nodes can sense (actuate), compute, communicate
• at the next level, these nodes and networks can infer, track, correlate and correspond
– when such nodes can be composed, the application possibilities can be wildly imaginative
• highly intelligent real-time distributed systems
• However, there are fundamental limits to scaling that have to do with the ad hoc nature of such networks
– nodes building links and communicating (including relaying, setup and discovery) without a central control
Communication in Sensor Networks
• Questions we seek to answer
  – How much information can wireless sensor networks transport?
    • What can be done to maximize this transport?
  – What is the right power level for transport?
    • Where is this control (best) exercised?
  – What is the appropriate network configuration?
    • Direct communication (single-hop)
    • Multi-hop communication
      – Directed diffusion, LAR, GF
    • Cluster-based communication
      – LEACH
Challenges for Sensor Networks
[Figure: challenges arranged around "Sensor Networks"]
• Services for localization, discovery, storage, agreement
• Injection of application knowledge into sensor network infrastructure
• Integration of communication and application specific data processing
• Quality of data/service
• Guarantees under resource constraints
• Automatic configuration & error handling
• Time & location management
Projects on Sensor Networks
[Figure: map of sensor network projects and participating institutions]
• Sensor OS: UC-Berkeley, MIT muOS
• Stabilization: Ohio State, Univ. of Iowa, Michigan State Univ., UT-Arlington, Kent State Univ.
• QoS in Surveillance and Control: UIUC, Univ. of Virginia, CMU
• Network related: ISI, UCLA, USC
• NEST
• WebDust: Rutgers
• Cougar: Cornell
• Quasar: UC-Irvine
• Aurora: Brown, MIT, Brandeis Univ.
• SensIT: MIT, Duke Univ., Univ. of Hawaii, Univ. of Wisconsin, Northwestern Univ., Penn State Univ., Auburn Univ.
• SmartDust: UC-Berkeley
• Xerox
• TinyDB: UC-Berkeley
What are the Choices?
[Figure: design choices shown as opposing pairs]
• Sensor networks vs. Wireless networks
• Specialized infrastructure vs. COTS infrastructure
• Smart sensors vs. Passive sensors
• Probabilistic guarantees vs. Deterministic solutions
This tutorial – systems perspective
• Layered approach
  – Device level
    • Challenges in design of sensor devices and OSs
  – Distributed sensor networks
    • Challenges in managing large networks of sensors to meet application requirements
  – Sensor Database Management
    • Challenges in Query Processing over sensor networks
Design of sensor nodes
• Sensor Node Components
  – Computation/communication tradeoff
• Energy Management within a sensor
  – Computation/communication tradeoff
• Power-aware OS design for sensors
Distributed Computing Infrastructure for Sensors
• Designing Distributed Sensor Architectures
  – Server oriented -- data migrates to server from sensors
    • Store or not store (stream)
    • When should data migrate?
    • How should data migrate: in its original raw form or in some aggregated form?
  – Distributed approach
    • Data does not migrate; requests/queries migrate
    • TinyDB approach, Dimensions approach
• Designing Middleware Support for Sensor Networks
  – Energy efficiency
  – Real-time
  – Fault tolerance
Query Processing in Sensor Networks
• Query Processing over Sensor Databases
  – Taxonomy of queries
    • Lifetime queries, aggregation queries, approximate queries, set-based queries
  – Where do queries arise?
    • At the server, fully distributed at any node
  – Query semantics
    • What does a query mean? Exact semantics not very clear.
  – Query processing techniques
    • Answering approximate queries over approximate representation
    • Answering queries in the network
    • Distributed query answering
• Data Stream processing & Dynamic Data
Design Issues in Sensor Devices
HiPC 2003, Hyderabad, India
Energy Availability Growth limited to 2-3% per year
[Figure: improvement compared to year 0 (1x to 16x) versus time (0 to 6 years) for processor (MIPS), hard disk (capacity), memory (capacity), and battery (energy stored)]
Need to be energy efficient at all levels and in all tasks.
J. Rabaey, BWRC
Computational Efficiency
Processor              MHz   Year   SPECint-95   Watts
P54VRT (Mobile)        150   1996   4.6          3.8
P55VRT (Mobile MMX)    233   1997   7.1          3.9
PowerPC 603e           300   1997   7.4          3.5
PowerPC 604e           350   1997   14.6         8
PowerPC 740 (G3)       300   1998   12.2         3.4
PowerPC 750 (G3)       300   1998   14           3.4
Mobile Celeron         333   1999   13.1         8.6
• Speed power efficiency has indeed gone up
  – 10x / 2.5 years for µPs and DSPs in 1990s
    • from 100 mW/MIPS to 1 mW/MIPS since 1990
  – IC processes have provided 10x / 8 years since 1965
  – rest from power conscious IC design in recent years
• Lower power for a given function & performance
  – e.g. 1.6x / year reduction since early 80s for DSPs (source: TI)
• Most optimistic projections at best stop at 60 pJ/op (about 20X)
• However, circuit gains are nearing a plateau
  – circuit tricks & voltage scaling provided a large part of the gains
• while energy needs (functionality, speed) continue to climb
  – 10x increases: in gate count (7 years); in frequency (9 years)
Efficiency in Communications
• Power Efficiency (or Energy Efficiency) $\eta_P = E_b/N_0$
  – ratio of signal energy per bit to noise power spectral density required at the receiver for a certain BER
  – high power efficiency requires a low $E_b/N_0$ for a given BER
• Bandwidth Efficiency $\eta_B$ = bit rate / bandwidth = $R_b/W$ bps/Hz
  – ratio of throughput data rate to bandwidth occupied by the modulated signal (typically ranges from 0.33 to 5)
• Often a trade-off between the two
  – e.g. for a given BER
    • adding FEC reduces $\eta_B$ but reduces required $\eta_P$
    • modulation schemes with larger # of bits per symbol have higher $\eta_B$ but also require higher $\eta_P$
Communication vs. Computation
• Computation cost (2004 projected): 60 pJ/op
• Minimum thermal energy for communications:
  – 20 nJ/bit @ 1.5 GHz for 100 m
    • equivalent of 300 ops
  – 2 pJ/bit @ 1.5 GHz for 10 m
    • equivalent of 0.03 ops
--> significant processing versus communication tradeoff
J. Rabaey, BWRC
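The "equivalent ops" figures are simply the per-bit communication energy divided by the per-operation computation energy. The sketch below redoes that arithmetic. It assumes the 10 m figure is 2 pJ/bit, which is the value that the stated 0.03-op equivalence implies at 60 pJ/op; the variable names are illustrative.

```python
E_OP_PJ = 60.0  # projected computation cost from the slide, in pJ per operation

def ops_per_bit(energy_per_bit_pj):
    """Number of operations that cost as much energy as sending one bit."""
    return energy_per_bit_pj / E_OP_PJ

e_100m_pj = 20_000.0  # 20 nJ/bit at 100 m, from the slide
e_10m_pj = 2.0        # assumed 2 pJ/bit at 10 m (implied by the 0.03-op figure)

print(round(ops_per_bit(e_100m_pj)))    # roughly 333, quoted on the slide as 300
print(round(ops_per_bit(e_10m_pj), 3))  # roughly 0.033, quoted as 0.03

# Under this assumption a 10x change in distance changes the energy by 10^4,
# which is what a d^4 path-loss law would give.
print(e_100m_pj / e_10m_pj)
```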
The Need
• Power consumption, energy efficiency is a system level design concern
  – efficiency in computation, communication and networking subsystems
• The energy/power tradeoffs cut across
  – all system layers: circuit, architecture, software, algorithms
  – need to choose the right metric
• Power awareness goes beyond low power concerns
  – make tradeoffs against performance, quality measures against application constraints
Where does the Power Go?
[Figure: power flow through a node]
• Power Supply: Battery, DC-DC Converter
• Communication: Radio Modem, RF Transceiver
  – Signaling protocols, choice of modulation, TX/RX architecture, RF/IF circuits
  – Baseband DSP
• Processing: Programmable µPs & DSPs (apps, protocols etc.), Memory, ASICs
• Peripherals: Disk, Display
Example 1: Power Measurements on Rockwell WINS Node
Capabilities: vibration, acoustic, accelerometer, magnetometer, temperature sensing
Processor   Seismic Sensor   Radio            Power (mW)
Active      On               Rx               751.6
Active      On               Idle             727.5
Active      On               Sleep            416.3
Active      On               Removed          383.3
Active      Removed          Removed          360.0
Active      On               Tx (36.3 mW)     1080.5
Active      On               Tx (27.5 mW)     1033.3
Active      On               Tx (19.1 mW)     986.0
Active      On               Tx (13.8 mW)     942.6
Active      On               Tx (10.0 mW)     910.9
Active      On               Tx (3.47 mW)     815.5
Active      On               Tx (2.51 mW)     807.5
Active      On               Tx (1.78 mW)     799.5
Active      On               Tx (1.32 mW)     791.5
Active      On               Tx (0.955 mW)    787.5
Active      On               Tx (0.437 mW)    775.5
Active      On               Tx (0.302 mW)    773.9
Active      On               Tx (0.229 mW)    772.7
Active      On               Tx (0.158 mW)    771.5
Active      On               Tx (0.117 mW)    771.1
Summary
• Processor = 360 mW
  – doing repeated transmit/receive
• Sensor = 23 mW
• Processor : Tx = 1 : 2
• Processor : Rx = 1 : 1
• Total Tx : Rx = 4 : 3 at maximum range
[Figure: node block diagram. Communication subsystem: radio, modem, GPS, micro controller. Rest of the node: CPU, sensor]
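The summary figures can be recovered by differencing rows of the measurement table. The sketch below shows one way to do that; the subtraction approach is an assumption about how the summary was obtained, and the dictionary keys are illustrative names.

```python
# Total node power in mW for (processor, seismic sensor, radio) configurations,
# copied from the Rockwell WINS measurements above.
POWER_MW = {
    ("active", "on", "rx"): 751.6,
    ("active", "on", "removed"): 383.3,
    ("active", "removed", "removed"): 360.0,
    ("active", "on", "tx_max"): 1080.5,  # Tx at 36.3 mW output power
}

processor = POWER_MW[("active", "removed", "removed")]
no_radio = POWER_MW[("active", "on", "removed")]
sensor = no_radio - processor
radio_rx = POWER_MW[("active", "on", "rx")] - no_radio
radio_tx = POWER_MW[("active", "on", "tx_max")] - no_radio

print(f"sensor          = {sensor:.1f} mW")
print(f"Tx : processor  = {radio_tx / processor:.2f}")   # close to 2
print(f"Rx : processor  = {radio_rx / processor:.2f}")   # close to 1
total_ratio = POWER_MW[("active", "on", "tx_max")] / POWER_MW[("active", "on", "rx")]
print(f"total Tx : Rx   = {total_ratio:.2f}")            # roughly 4 : 3
```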
Power Consumption Notables
• Differences in radio “sleep” versus “shutdown” can be significant
  – need power management strategies at module/subsystem level
• Generally RX power is less than TX power.
• However, as TX moves to lower power modes, under some circumstances it may consume less than RX
  – particularly true in “sensor” type nodes
  – need protocols that minimize the listening needed
  – need very low power “paging” channels for wakeup
• Processing can be a significant fraction of total power
  – 30-50%
Metrics for Power
• Absolute power (mW)
  – sets battery life in hours
  – problem: power $\propto$ frequency (slow the system!)
• uW/MHz
  – average energy consumed by the system
• Energy per operation
  – fixes obvious problem with the power metric
  – but can cheat by doing stuff that will slow the chip
  – Energy/op = Power * Delay/op
• Metric should capture both energy and performance: e.g. Energy/Op * Delay/Op
• Energy*Delay = Power*(Delay/Op)^2
• Therefore:
  – uW/MIPS: average energy per instruction
  – uW/MIPS^2: normalizes uW/MIPS with the architectural performance
    • useful for comparing architectures for power efficiency.
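To see why the choice of metric matters, the sketch below evaluates the three metrics for two design points. The numbers are hypothetical and chosen only so that the second design is the first one clocked at half speed with everything else unchanged.

```python
def energy_per_op_j(power_w, ops_per_s):
    """Energy/op = Power * Delay/op, with Delay/op = 1 / (ops per second)."""
    return power_w / ops_per_s

def energy_delay_product(power_w, ops_per_s):
    """Energy*Delay = Power * (Delay/op)^2."""
    return power_w / ops_per_s ** 2

# Hypothetical design points (assumed numbers, for illustration only).
fast = {"power_w": 1.0, "ops_per_s": 100e6}
slow = {"power_w": 0.5, "ops_per_s": 50e6}  # same chip at half the clock rate

for name, d in (("fast", fast), ("slow", slow)):
    print(name,
          d["power_w"],               # absolute power: 'slow' looks twice as good
          energy_per_op_j(**d),       # energy per op: the two are identical
          energy_delay_product(**d))  # energy*delay: 'slow' is twice as bad
```

Absolute power rewards simply slowing the system down, energy per operation is blind to the slowdown, and only the energy-delay product penalizes it.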
Node Level Power Management
• Choices: H/W, Firmware, OS, Application, Users
• Hardware & firmware
  – don’t know the global state and application-specific knowledge
• Users
  – don’t know component characteristics, and can’t make frequent decisions
• Applications
  – operate independently
  – and the OS hides machine information from them
• OS is a reasonable place, but…
  – OS should incorporate application information in power management
  – OS should expose power state and events to applications for them to adapt.
Operating System Directed Power Management
• Significant opportunities in power management lie with application-specific “knobs”
– quality of service, timing criticality of various functions
• OS plays an important role in allocation and sharing of critical resources
– it is a logical place for dynamic power management
– application-specific constraints and opportunities for saving energy that can be known only at that level
• Needs of applications are driving force for OS power management functions & power-based API
– collaboration between applications and the OS in setting “energy use policy”
• OS helps resolve conflicts and promote cooperation
Slowdown by reducing supply voltage – Dynamic Voltage Scaling
• Reduction in supply voltage reduces speed
• Reduce supply voltage when
  – slower speed can be tolerated
  – or use architectural techniques to combat slow operation
    • e.g. concurrency, pipelining via compiler techniques
[Figure: normalized delay (1.0 to 7.0) versus supply voltage (1.0 to 3.0 V)]
Shutdown for Energy Saving
• Shutdown attractive for many wireless applications due to low duty cycle of many subsystems
• Issues:
  – Cost of restarting: latency vs. power trade-off
    • increase in latency (response time)
    • increase in power consumption due to startup
  – When to shut down:
    • Optimal vs. Idle Time Threshold vs. Predictive
  – When to wake up:
    • Optimal vs. On-demand vs. Predictive
• Two main approaches: (Reactive versus Predictive)
  – “Go to Reduced Power Mode after the user has been idle for a few seconds/minutes, and restart on demand”
  – “Use computation history to predict whether $T_{block}[i]$ is large enough ($T_{block}[i] \ge T_{cost}$)”
[Figure: timeline alternating between Active (“On”) for $T_{active}$ and Blocked (“Off”) for $T_{block}$]
ideal improvement = 1 + $T_{block}/T_{active}$
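A minimal sketch of the two quantities on this slide: the ideal improvement ratio and a predictive shutdown decision. The slide only says that computation history is used to predict the blocked time; the exponential average used here as the predictor, and the `alpha` parameter, are assumptions of this sketch.

```python
def ideal_improvement(t_block, t_active):
    """Ideal improvement from shutting down for the whole blocked period."""
    return 1.0 + t_block / t_active

def predictive_shutdown(history, t_cost, alpha=0.5):
    """Decide whether to shut down for the next blocked period.

    history: past blocked-period lengths, oldest first.
    t_cost:  blocked time below which restarting costs more than it saves.
    The exponential average is an assumed predictor, not the tutorial's.
    """
    if not history:
        return False  # no history yet: stay on
    predicted = history[0]
    for t in history[1:]:
        predicted = alpha * t + (1 - alpha) * predicted
    return predicted >= t_cost

print(ideal_improvement(t_block=9.0, t_active=1.0))
print(predictive_shutdown([4.0, 4.0, 4.0], t_cost=2.0))
print(predictive_shutdown([0.1, 0.1], t_cost=2.0))
```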
To Shutdown or Reduce Voltage?
• Observation:
  – better to lower voltage than to shut down in the case of digital logic
• Example: task with 100ms deadline, requires 50ms CPU time at full speed
  – normal system gives 50ms computation, 50ms idle/stopped time
  – half speed/voltage system gives 100ms computation, 0ms idle
  – same number of CPU cycles but energy reduced to 1/4
• Voltage gets dictated by the tightest (critical) timing constraint both on throughput and latency --> dynamically change voltage
  – Use voltage to control the operating point on the power vs. speed curve
    • i.e., power and clock frequency are functions of voltage
  – Main challenge here is algorithmic:
    • one has to schedule the voltage variation as well!
      – via compiler or OS or hardware
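The factor of 1/4 in the example follows from switching energy scaling with the square of the supply voltage. The sketch below works through the example under simplifying assumptions that are not stated on the slide: energy per cycle is C·V², clock frequency scales linearly with voltage, and idle or leakage power is ignored. The full-speed clock rate is a hypothetical value.

```python
def dynamic_energy(cycles, voltage, capacitance=1.0):
    """Switching energy of CMOS logic: C * V^2 per cycle (arbitrary units)."""
    return capacitance * voltage ** 2 * cycles

FULL_FREQ_HZ = 1e9             # hypothetical full-speed clock
cycles = 0.050 * FULL_FREQ_HZ  # the task needs 50 ms of CPU time at full speed

e_full = dynamic_energy(cycles, voltage=1.0)  # run fast, then idle for 50 ms
e_half = dynamic_energy(cycles, voltage=0.5)  # half voltage, half speed
time_half_s = cycles / (FULL_FREQ_HZ / 2)

print(e_half / e_full)  # fraction of the full-speed energy
print(time_half_s)      # still meets the 100 ms deadline
```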
Current OSPM - ACPI
• Advanced Configuration and Power Interface (ACPI)
  – OS visible (SCI-based) as opposed to OS invisible (SMI-based)
  – OS/drivers/BIOS are in sync regarding power states
• Standard way for the system to describe its device config. & power control h/w interface to the OS
  – register interface for common functions
    • system control events, processor power and clock control, thermal management, and resume handling
• Info on devices, resources, & control mechanisms
  – Description Tables, linked in a "table of tables"
  – description data for each device:
    • Power management capabilities and requirements
    • Methods for setting and getting the power state
    • Hardware resource settings
    • Methods for setting hardware resources
New power-aware interfaces required
• Provide ways by which Application, Operating System and Hardware can exchange energy/power and performance related information efficiently.
• Facilitate the continuous dialogue / adaptation between OS / Applications.
• Facilitate the implementation of power aware OS services by providing a software interface to low power devices
– A power-aware API to the end user that enables one to implement energy-efficient RTOS services and applications
Power-aware API
The application interface provides the following services:
• The application is able to
  – tell RT information to OS (period, deadlines, WCET, hardness)
  – create new threads
  – tell OS time predicted to finish a given task instance
    • depending on the conditions of the environment (application dependent and not yet implemented)
• OS must be able to predict and tell applications the time estimated to finish the task
  – depends on the scheduling scheme used
• A hard task must be killed if its deadline is missed.
Power Management in Communication Subsystems
[Figure: power management across subsystems]
• Computation Subsystem: e.g. Dynamic Voltage/Freq. Scaling
• Communication Subsystem: Modulation, coding
• OS/Middleware/Application
• Power-aware Task Scheduling
• Power-aware Packet Scheduling
Tiny OS Concepts
• Scheduler + Graph of Components
  – constrained two-level scheduling model: threads + events
• Component:
  – Commands
  – Event Handlers
  – Frame (storage)
  – Tasks (concurrency)
• Constrained Storage Model
  – frame per component, shared stack, no heap
• Very lean multithreading
• Efficient Layering
[Figure: Messaging Component with its commands and events]
• Interface labels: init, power(mode), send_msg(addr, type, data), msg_rec(type, data), msg_send_done
• Interface labels: init, Power(mode), TX_packet(buf), TX_packet_done(success), RX_packet_done(buffer)
• Inside the component: internal state, internal thread
Application = Graph of Components
[Figure: component graph spanning hardware (HW) and software (SW)]
• Components: RFM, Radio byte, Radio Packet, UART, Serial Packet, ADC, Temp, photo, Active Messages, clocks
• Levels: bit, byte, packet, application
• Application: Route map, router, sensor appln
Example: ad hoc, multi-hop routing of photo sensor readings
3450 B code, 226 B data
Graph of cooperating state machines on shared stack
Part 2: Distributed Computing Infrastructure for Sensor Applications
**Supported in part by a collaborative NSF ITR grant entitled “real-time data capture, analysis, and querying of dynamic spatio-temporal events” in collaboration with UCLA, U. Maryland, U. Chicago
Managing Distributed Sensor Infrastructures
• A data collection and management middleware infrastructure that
  – provides seamless access to data dispersed across a hierarchy of sensors, servers, and archives
  – supports multiple concurrent applications of diverse types
  – adapts to changing application needs
• Fundamental Issues:
  – Where to store data?
    • do not store, at the producers, at the servers
  – Where to compute?
    • At the client, server, data producers
Outline of this section
• Sensor network architectures
• Sensor application needs
  – Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
  – Services that can be customized to address needs
• Case studies
  – Accuracy/cost tradeoffs in collection
  – Accuracy/cost/timeliness tradeoffs in collection
  – Storage/accuracy tradeoffs in archival
Architectural Configurations
• Server-centric
• Streams
• Hierarchical
• Distributed
Sensor Network Architectures – 1: (server centric)
• Traditional data management
  – client-server architecture
  – efficient approaches to data storage & querying
  – query shipping versus data shipping
  – data changes with explicit update
• Limitations
  – Sensors generate continuously changing data
    • Producers must be considered as “first class” entities
  – Does not exploit the storage, processing, and communicating capabilities of sensors
[Figure: data producers, server, and client, connected by data/query requests and data/query results]
Sensor Network Architectures – 2: streams
• Stream model
  – Data streams through the server but is not stored
  – Continuous queries evaluated against streaming data
  – Deals with problems due to dynamic data on the server side
• Limitations
  – Does not conserve sensor resources (e.g., power)
  – Does not exploit the storage and processing capabilities of sensors
  – Geared towards continuous monitoring and not archival applications
[Figure: data streams and continuous queries enter a stream processing engine, which keeps a synopsis in memory and returns an (approximate) answer]
Sensor Network Architectures – 3: hierarchical
• Hierarchical architecture (e.g. Quasar)
  – data flows from producers to server to clients periodically
  – queries flow the other way:
    • If client cache does not suffice, then query routed to appropriate server
    • If server cache does not suffice, then access current data at producer
  – This is a logical architecture
    • producers could also be clients
    • A server may be a base station or a (more) powerful sensor node
    • Servers might themselves be hierarchically organized
    • The hierarchy might evolve over time
[Figure: client with client cache, server with server cache and archive, producer with its cache; arrows labeled QUERY FLOW and DATA FLOW run in opposite directions]
Sensor Network Architectures - 4: Fully Distributed P2P
• Distributed architecture (e.g. Dimensions)
  – Store data at sensor nodes
  – Construct distributed load-balanced quad-tree hierarchy of lossy wavelet-compressed summaries corresponding to different resolutions and spatio-temporal scales.
  – Queries drill down from root of hierarchy to focus search on small portions of the network.
  – Progressively age summaries for long-term storage and graceful degradation of query quality over time.
[Figure: summary hierarchy with Level 0, Level 1, Level 2, …, annotated “progressively age” and “progressively lossy”]
Balancing Tradeoffs in Application Requirements
• Accuracy
  – More accurate context results in better application performance
  – Very high accuracy may not be needed
• Cost
  – Minimize resources consumed
    • Network (messaging)
    • Energy
    • Storage
• Timeliness
  – Late data may be useless
• Reliability
  – Wrong/missing data may cause problems
Data Representation
• Instantaneous value
• Range-based
  – Static interval
  – Dynamic range-based
• Probabilistic distribution
  – (mean, stdev) with decay
• Compressed formats
  – wavelet
  – histograms
  – sketches
What is accuracy?
• Resolution
  – Temporal (Aurora)
    • 1 value for a sliding window of size 5
    • Load-shedding, subsetting
  – Spatial (ask Iosif about wkshp paper)
    • 1 value for a given region of dimension [x.y]
• Value laxity (Quasar)
  – Value represented as an interval
    • 9 represented as [6,12]
  – Value represented as a probability distribution
Tasks of a Sensor Management Framework
• Translation: mapping application quality requirements to data quality requirements
  – Examples:
    • Target tracking: quality of track --> accuracy of data
    • Aggregation Queries: accuracy of results --> accuracy of data
  – Strategy should adapt to expected application load
• Collection
  – Minimize sensor resource consumption while guaranteeing required data quality
• Storage
• Dissemination/Delivery
Middleware Components
[Figure: adaptive middleware layered between applications and the distributed sensor environment]
• Applications: mobile target tracking, activity monitoring, location based service, ....
• Adaptive Middleware: Server Side Components and Sensor Side Components
• Component labels: sensor data management, sensor database, sensor state management, sensor selection, fault tolerance, AQ-DQ translation, precision driven adaptation, adaptive precision setting, prediction module (shown twice)
• Distributed Sensor Environment
Adaptive Tracking of mobile objects
[Figure: a mobile object moves through a wireless sensor grid; wireless links connect the grid to base stations 1, 2 and 3, which feed a server that provides track visualization]
Query at the server: “Show me the approximate track of the object with precision $\epsilon_{track}$”
Tracking Architecture
• A network of wireless acoustic sensors arranged as a grid, transmitting via a base station to the server
Objective
• Track a mobile object at the server such that the track deviates from the real trajectory by no more than a user defined error threshold $\epsilon_{track}$, with minimum communication overhead.
Basic Triangulation Algorithm
P: source object power, $I_i$ = intensity reading at the $i$th sensor
$(x - x_1)^2 + (y - y_1)^2 = P / (4\pi I_1)$
$(x - x_2)^2 + (y - y_2)^2 = P / (4\pi I_2)$
$(x - x_3)^2 + (y - y_3)^2 = P / (4\pi I_3)$
Solving we get $(x, y) = f(x_1, x_2, x_3, y_1, y_2, y_3, P, I_1, I_2, I_3)$
[Figure: sensors at $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ around the object at $(x, y)$]
• More complex approaches to amalgamate more than three sensor readings are possible
• Those are based on numerical methods and do not provide a closed form equation between sensor reading and tracking location!
• Server can use simple triangulation to convert track quality to sensor intensity quality tolerances and use a more complex approach to track.
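One way to obtain the closed form for (x, y) is to subtract the first circle equation from the other two, which cancels the quadratic terms and leaves a 2x2 linear system. The sketch below does this, assuming the inverse-square model r_i^2 = P / (4π I_i); the function name and the test geometry are illustrative.

```python
import math

def triangulate(sensors, intensities, power):
    """Locate a source from three (x, y) sensor positions and intensity readings.

    Assumes r_i^2 = P / (4 * pi * I_i). Subtracting the first circle equation
    from the other two leaves a 2x2 linear system in (x, y), solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = sensors
    # Squared distance from the source to each sensor.
    d1, d2, d3 = (power / (4 * math.pi * i) for i in intensities)
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 - d2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1 - d3 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("sensors are collinear; position is not unique")
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

# Synthetic check: place a source at (3, 4) and generate ideal readings.
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
source, power = (3.0, 4.0), 100.0
readings = [power / (4 * math.pi * ((source[0] - sx)**2 + (source[1] - sy)**2))
            for sx, sy in sensors]
print(triangulate(sensors, readings, power))
```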
Track quality --> data quality
[Figure: intensities $I_1$, $I_2$, $I_3$ versus time over the interval $t_i$ to $t_{i+1}$, and the resulting track in the X (m), Y (m) plane]
Case 1 (power constant)
Let $I_i$ be the intensity value of sensor $i$.
If $|\Delta I_i| \le \xi I_i^2 / (1 + \xi I_i)$ then track quality is guaranteed to be within $\epsilon_{track}$,
where $\xi$ is determined by $C$ and $\epsilon_{track}$ (expression garbled in the source: “Ctrack /2”), and C is a constant derived from the known locations of the sensors and the power of the object.
Case 2 (power varies between [$P_{min}$, $P_{max}$])
If [condition on $|\Delta I_i|$ in terms of $C'$, $\epsilon_{track}$, $I_i$, $P_{min}$ and $P_{max}$; garbled in the source] then track quality is guaranteed to be within $\epsilon_{track}$,
where $C' = C / P^2$ and is a constant.
The above constraint is a conservative estimate. Better bounds are possible.
Components of an Information Collection Framework
[Figure: information collection framework]
• Information Source: source, source, source, ……
• Information Mediator (DS)
• Information Consumer: consumer, consumer, ……
• Arrow labels: source update, request, consumer request
Sensor Model
Wireless sensors: battery operated, energy constrained
[Figure: sensor state diagram]
• S0: monitor (processor on, sensor on, radio off)
• S1: active (processor on, sensor on, radio on)
• S2: quasi-active (processor on, sensor on, radio intermittent)
• Transition labels: intensity above threshold; get error bound from server; removed from “active list” (twice)
Data Collection Protocols
Sensor-Side protocol:
• When not in use:
– tell server to remove it from “active list”, switch to monitor mode S0
• Upon external event:
– if in S0, change to active mode S1, and update every time instant
– if in S2, update only when error bound violated
Server-Side protocol:
• If sensor state changes to S1
– add it to “active list”
– compute an error bound for it, and send to the sensor
• else, when value received, update server cache if the sensor is in “active list”
Data Collection Problem
• Let P = <p[1], p[2], …, p[n]> be a sequence of environmental measurements (time series) generated by the producer, where n = now
• Let S = <s[1], s[2], …, s[n]> be the server side representation of the sequence
• A within-$\epsilon$ quality data collection protocol guarantees that
  for all i: error(p[i], s[i]) < $\epsilon$
• $\epsilon$ is derived from application quality tolerance
[Figure: sensor time series …p[n], p[n-1], …, p[1]]
Answering Queries
• If query quality tolerance satisfied at server (more than $\epsilon$)
  – Answer query at the server
• Else
  – Probe the sensor
  – Sensor guaranteed to respond within a bounded time
• Approach guarantees quality tolerance of queries
[Figure: queries Q1 (A1) … Qm (Am) arrive at the server, which holds an imprecise data representation $[l_i, u_i]$ for sensor $s_i$; the sensor sends sensor-initiated updates (sensor time series: …p[n], p[n-1], …, p[1]); the server can send a probe and receive a probe result]
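The decision rule above can be sketched for a single sensor whose value is cached at the server as an interval. In this sketch the answer is taken as the interval midpoint and the tolerance is compared against half the interval width; both choices, and the `probe` callable standing in for the probe message exchange, are assumptions made for illustration.

```python
def answer_query(interval, tolerance, probe):
    """Answer a point query from a cached interval, probing the sensor if needed.

    interval:  (low, high) bounds cached at the server for one sensor.
    tolerance: largest acceptable deviation of the answer from the true value.
    probe:     zero-argument callable returning the sensor's current value
               (hypothetical stand-in for the probe/probe-result exchange).
    Returns (answer, probed).
    """
    low, high = interval
    midpoint = (low + high) / 2
    if (high - low) / 2 <= tolerance:
        return midpoint, False  # cached bounds are tight enough
    return probe(), True        # cached bounds too loose: ask the sensor

# The value 9 cached as the interval [6, 12].
print(answer_query((6, 12), tolerance=3, probe=lambda: 9.4))
print(answer_query((6, 12), tolerance=1, probe=lambda: 9.4))
```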
Simple Data Collection Protocol
• Sensor logic (at time step n)
    Let p’ = last value sent to server
    if error(p[n], p’) > $\epsilon$ or on timeout
        send p[n] to server (sensor switches radio on, if need be)
• Server logic (at time step n)
    if new update p[n] received at step n
        s[n] = p[n]
    else
        s[n] = last update sent by sensor
  – guarantees maximum error at server less than or equal to $\epsilon$
[Figure: sensor time series …p[n], p[n-1], …, p[1]]
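The sensor and server logic above can be simulated together in a few lines. This is a sketch under stated assumptions: error() is taken to be the absolute difference, the timeout rule is omitted, and messages are assumed to arrive instantly and reliably.

```python
def collect(readings, epsilon):
    """Run the simple collection protocol over a list of sensor readings.

    Returns (server_values, messages_sent). Assumes error(a, b) = |a - b| and
    leaves out the timeout rule.
    """
    server_values, sent = [], 0
    last_sent = None
    for p in readings:
        # Sensor side: transmit only when the server's copy drifts too far.
        if last_sent is None or abs(p - last_sent) > epsilon:
            last_sent = p
            sent += 1
        # Server side: s[n] is the most recent update received.
        server_values.append(last_sent)
    return server_values, sent

readings = [20.0, 20.4, 20.9, 22.0, 22.3, 21.2]
values, sent = collect(readings, epsilon=1.0)
print(values, sent)
print(max(abs(p - s) for p, s in zip(readings, values)))  # never exceeds epsilon
```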
Exploiting Prediction Models
• Producer and server agree upon a prediction model (M, $\theta$)
• Let spred[i] be the predicted value at time i based on (M, $\theta$)
• Sensor logic (at time step n)
    if error(p[n], spred[n]) > $\epsilon$
        send p[n] to server
• Server logic (at time step n)
    if new update p[n] received at step n
        s[n] = p[n]
    else
        s[n] = spred[n] based on model (M, $\theta$)
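A sketch of the prediction-based variant with one concrete choice of model. The linear extrapolation, and the rule of refitting the slope from the last two transmitted points, are assumptions of this sketch rather than the tutorial's method. Because the parameters change only when the sensor transmits, both sides can compute the same spred[n] without extra messages.

```python
def collect_with_prediction(readings, epsilon):
    """Prediction-based collection with a linear model shared by sensor and server.

    Returns (server_values, messages_sent). Assumes error(a, b) = |a - b|.
    """
    server_values, sent = [], 0
    base, slope, base_step = None, 0.0, 0
    for n, p in enumerate(readings):
        predicted = None if base is None else base + slope * (n - base_step)
        if predicted is None or abs(p - predicted) > epsilon:
            if base is not None:
                slope = (p - base) / (n - base_step)  # refit from last two updates
            base, base_step = p, n
            sent += 1
            server_values.append(p)          # server stores the update
        else:
            server_values.append(predicted)  # server falls back on the model
    return server_values, sent

ramp = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(collect_with_prediction(ramp, epsilon=0.5))
```

On a steadily rising series like this one, only the first two readings are transmitted, whereas a protocol that caches the last value would transmit at every step.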
Challenges in Prediction
• Simple versus complex models?
– Complex and more accurate models require more parameters (that will need to be transmitted).
– Goal is to minimize cost not necessarily best prediction
• How is a model M generated?
– static -- one out of a fixed set of models
– dynamic -- dynamically learn a model from data
• When should a model M or parameters be changed?
– immediately on model violation:
• too aggressive: violation may be a temporary phenomenon
– never changed:
• too conservative: data rarely follows a single model
Challenges in Prediction (cont.)
• who updates the model?
– Server
• long-haul prediction models possible, since server maintains history
• might not predict recent behavior well since server does not know the exact P sequence; server has only samples
• extra communication to inform the producer
– Producer
• better knowledge of recent history
• long haul models not feasible since producer does not have history
• producers share computation load
– Both
• server looks for new models, sensor performs parameter fitting given
existing models.
Experiment (error tolerance 20m)
A restricted random motion : the object starts at (0,d) and moves from one node to another randomly chosen node until it walks out of the grid.
Models used: static and linear
Energy Savings
• Total energy consumption over all sensor nodes for random mobility model with varying track error $\epsilon_{track}$.
• Significant energy savings using adaptive precision protocol over non-adaptive tracking (constant line in graph)
• For a random model, prediction does not work well!
Energy Savings
• Total energy consumption over all sensor nodes for random mobility model with varying base station distance from sensor grid.
• As the base station moves away, one can expect energy consumption to increase since transmission cost varies as $d^n$ ($n = 2$)
• Better results with increasing base station distance
Quasar Group
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– Accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Quasar Group
Accuracy/Cost Tradeoff
• Applications can tolerate errors in sensor data
– applications may not require exact answers:
• small errors in location during tracking, or errors in the answer to a query, may be OK
– data cannot be precise due to measurement errors, transmission delays, etc.
• Cost
– Communication bandwidth
– Energy drain
• Quasar Approach
– exploit application error tolerance to reduce communication between producer and server and/or to conserve energy
– Two approaches
• Minimize resource usage given quality constraints
• Maximize quality given resource constraints
Quasar Group
Modeling cost as communication bandwidth (e.g., TRAPP)
• Goal: Minimize network usage while meeting application-specific precision requirements
• Our solution:
– Caches store approximations of exact source values
– Queries have precision constraints
[Figure: precision versus performance spectrum, from a stale cache to an exact cache; "you decide" where to operate]
Quasar Group
Modeling energy costs in sensors
• How should sensor state be managed to minimize energy consumption in maintaining data at the required quality?
– Sensor state: error precision, power states
• Power consumption of sensors
Sensor state | Radio mode | Power consumption (mW)
active | Tx | 14.88
listening | Rx | 12.50
listening | idle | 12.36
sleeping | off | 0.016
Quasar Group
Energy Efficient Sensor State Management
Active-Listening-Sleeping Model (ALS):
• States: active, listening, sleeping
• Transitions (labels from the state diagram):
– leave active: Ta after processing the last sensor-initiated update or probe
– listening to active: upon the first sensor-initiated update or probe
– leave listening: after Tl without traffic
– leave sleeping: upon the first sensor-initiated update, or after Ts
Other Models:
• Always-Active (AA) [Ta is infinite]
• Active-Listening (AL) [Tl is infinite]
• Active-Sleeping (AS) [Tl is 0]
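The timer-driven transitions above can be sketched as a small state machine. This is only an illustrative sketch: the class and method names are hypothetical, time is an abstract unit, and the targets of the timeouts (active to listening, listening to sleeping, sleeping to active) are assumptions read off the slide labels rather than a confirmed specification.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    LISTENING = "listening"
    SLEEPING = "sleeping"


# Assumed timeout targets: active -> listening -> sleeping -> active.
NEXT_STATE = {
    State.ACTIVE: State.LISTENING,
    State.LISTENING: State.SLEEPING,
    State.SLEEPING: State.ACTIVE,
}


class ALSSensor:
    """Hypothetical ALS sensor driven by the timers Ta, Tl and Ts."""

    def __init__(self, ta, tl, ts):
        self.timeout = {State.ACTIVE: ta, State.LISTENING: tl, State.SLEEPING: ts}
        self.state = State.ACTIVE
        self.elapsed = 0.0  # time since entering the state or the last request

    def on_request(self, is_probe):
        """Handle a server probe (is_probe=True) or a sensor-initiated update."""
        if self.state is State.SLEEPING and is_probe:
            return False  # radio is off, so the probe is not received
        self.state, self.elapsed = State.ACTIVE, 0.0
        return True

    def tick(self, dt):
        """Advance time by dt and apply any timeout transitions."""
        self.elapsed += dt
        for _ in range(3):  # allows zero-length states (e.g. Tl = 0) to be skipped
            if self.elapsed < self.timeout[self.state]:
                break
            self.state, self.elapsed = NEXT_STATE[self.state], 0.0
```

Setting `ta=float("inf")` gives the AA model, `tl=float("inf")` the AL model, and `tl=0` the AS model.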
Quasar Group
Issues in Energy Efficient Data Collection
• Issues
– How to maintain the precision range for each sensor
• A larger range increases the possibility of expensive probes
• A smaller range wastes communication due to sensor-initiated updates
– When to transition between sensor states (i.e., how to set Ta, Tl, Ts)
• Powering down might not be optimal if we have to power up immediately
• Powering down may increase query response time
• Objective
– set values for Ta, Tl, Ts, and the precision range that minimize energy cost
– normalized energy cost = energy consumed at each state + state transition energy
Quasar Group
Addressing Accuracy/Energy Tradeoffs
• We solve the energy optimization problem by solving two sub-problems
– Optimize energy consumption by adjusting the range size, under the assumption that the state transitions are fixed
• I.e., Ta, Tl, and Ts have been optimally set
– Optimize energy consumption by adapting sensor states, while assuming that the precision range for the sensor is fixed
Quasar Group
Range size Adjustment for the AA/AL Model
• The optimal precision range δ that minimizes E occurs at a particular ratio of sensor-initiated update probability to probe probability
– The optimal range can be realized by maintaining this probability ratio
– Can be done at the sensor
• Let ρ be the ratio of sensor-initiated update probability to probe probability, and α an adjustment factor:
– for a sensor-initiated update: with probability min{ρ, 1}, set δ' = δ(1 + α)
– for a probe: with probability min{1/ρ, 1}, set δ' = δ/(1 + α)
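The probabilistic adjustment rule can be sketched as follows. The names `delta` (current range size), `rho` (ratio of sensor-initiated update probability to probe probability) and `alpha` (growth factor) are hypothetical labels chosen for this sketch, and the injectable `rng` exists only to make the behavior testable.

```python
import random


def adjust_range(delta, event, rho, alpha, rng=random):
    """Return the new range size after one 'update' or 'probe' event.

    A sensor-initiated update suggests the range is too small, so it may grow;
    a probe suggests the range is too large, so it may shrink.
    """
    if event == "update":
        if rng.random() < min(rho, 1.0):
            return delta * (1.0 + alpha)
    elif event == "probe":
        if rng.random() < min(1.0 / rho, 1.0):
            return delta / (1.0 + alpha)
    else:
        raise ValueError("event must be 'update' or 'probe'")
    return delta
```

With `rho = 1` both adjustments always fire, so a range of 10 grows to 15 on an update and shrinks back to 10 on the next probe when `alpha = 0.5`.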
Quasar Group
Range Size Adjustment for the AS/ALS Model
• Sensor side
– Keep track of the number of state transitions over the last k updates
– Piggyback the probability of state transitions on the k-th update
• Server side
– Keep track of the number of sensor-initiated updates and probes over the last k updates
– Upon receiving the k-th update from the sensor
• Compute the optimal precision range
• Inform the sensor about the new precision range
Quasar Group
Adaptive State Management
• Consider the AS model for the derivation of the optimal Ta that minimizes energy consumption
– Given the probability of receiving a request at each time instant t, the expected energy consumption E for a single silent period can be derived
– E is minimized when Ta = 0 if requests are uniformly distributed in the interval [0, Ta+Ts]
• In practice, learn the request probability at runtime and select Ta adaptively
– Choose a window size w in advance
– Keep track of the last w silent period lengths and summarize this information in a histogram
– Periodically use the histogram to generate a new Ta
Quasar Group
Adaptive State Management (Cont.)
• ci: the number of silent periods in bin i among the last w silent periods
• estimate the request probability by the distribution that generates a silent period of length ti with probability ci/w
• Ta is chosen to be the value tm that minimizes the expected energy consumption under this estimated distribution
[Figure: histogram with bins 0 to n-1, delimited by t0, t1, t2, ..., tn = Ta+Ts and holding counts c0, c1, c2, ..., cn-1]
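A minimal sketch of histogram-based Ta selection is given below. The slide's energy expression is not reproduced here, so the per-period energy model in `expected_energy` is an assumption: the sensor stays active for up to Ta at power `p_active`, then pays a one-off `e_transition` and sleeps at power `p_sleep`. All names are hypothetical.

```python
def choose_ta(bin_lengths, counts, p_active, p_sleep, e_transition):
    """Pick the candidate Ta with the lowest expected energy per silent period.

    bin_lengths[i] is the representative silent-period length t_i of bin i and
    counts[i] is c_i, so length t_i is assumed to occur with probability c_i / w.
    """
    w = sum(counts)
    if w == 0:
        raise ValueError("no silent periods recorded")

    def expected_energy(ta):
        total = 0.0
        for t, c in zip(bin_lengths, counts):
            if t <= ta:
                energy = p_active * t  # request arrives before powering down
            else:
                energy = p_active * ta + e_transition + p_sleep * (t - ta)
            total += (c / w) * energy
        return total

    candidates = [0.0] + list(bin_lengths)  # Ta = 0 means sleep immediately
    return min(candidates, key=expected_energy)
```

Under this assumed model, a free transition makes Ta = 0 optimal, while a very expensive transition pushes Ta toward the longest observed silent period.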
Quasar Group
System Performance Comparison
[Figure: Query Response Time Comparison. Average query response time (us) for the AA, AL, AS, and ALS models]
[Figure: Sensor Energy Consumption Comparison. Normalized sensor energy consumption (uJ) for the AA, AL, AS, and ALS models]
Quasar Group
Impact of Ta adaptation on System Performance
[Figure: Impact of Ta Selection on Query Response Time. Average query response time (us) for static Ta(0) versus adaptive Ta]
[Figure: Impact of Ta Selection on Sensor Energy Consumption. Normalized sensor energy consumption (uJ) for static Ta(0) versus adaptive Ta]
Quasar Group
Impact of Range Size Adaptation on System Performance
[Figure: Impact of Range Size Adjustment on Query Response Time. Average query response time (ms) for fixed(0), average accuracy constraint, adaptive adjustment, and fixed(large)]
[Figure: Impact of Range Size Adjustment on Sensor Energy Consumption. Normalized sensor energy consumption (uJ) for fixed(0), average accuracy constraint, adaptive adjustment, and fixed(large)]
Quasar Group
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– Accuracy/cost tradeoffs in collection
– Accuracy/cost/timeliness tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Quasar Group
Accuracy/Cost/Timeliness Tradeoffs
• Continuous stream of fast changing source data
• Diverse user requirements in terms of data accuracy and service timeliness
• Effective utilization of underlying computation, communication and storage resources
• Competing goals of timeliness, accuracy, and cost-effectiveness
Quasar Group
Real-time Communication for sensors
• John A. Stankovic, Tarek Abdelzaher, Chenyang Lu, Lui Sha, Jennifer Hou, "Real-Time Communication and Coordination in Embedded Sensor Networks," Proceedings of the IEEE, 91(7): 1002-1022, July 2003. (invited paper)
• SPEED: a stateless protocol (ICDCS’03)
• RAP (RTAS’02)
Quasar Group
Real-time Data Processing
• Supporting transaction timeliness and data freshness in databases
– STRIP (STanford Real-time Information Processor)
– ARCS (databases for Active Rapidly Changing data Systems)
– QMF (QoS sensitive approach for Miss Ratio and Freshness guarantees)
Quasar Group
Modeling Application Timeliness Needs
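[Figure: the source value lies within a stored range [L, U]]
• Precision of a range: $PREC(L, U) = \frac{1}{U - L}$
• timeliness requirements = (source ID, request issue time, periodicity, urgency, relative deadline)
• source update request = timeliness requirements + current value
• consumer request = timeliness requirements + (accuracy requirement, bias)
• bias: 0 = no preference, 1 = favoring timeliness, 2 = favoring accuracy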
Quasar Group
QoS as a metric of user satisfaction
• Answer fidelity: $AFidelity(cr, A) = 1$ if $L \le V(cr.s, cr.t) \le U$, and $0$ otherwise
• Timeliness satisfaction = the deadline is met: $TT \le RDL$
• Accuracy satisfaction = the answer precision is at least the required precision, $PREC_{answer} \ge PREC_{req}$, and the answer fidelity is 1
• QoS = w1 · QoS (for requests favoring timeliness) + w2 · QoS (for requests favoring accuracy) + w3 · QoS (for requests without bias)
– timeliness satisfaction: $w_1 \cdot |\{ cr_j : cr_j.Bias = 1,\ TT_{cr_j} \le RDL_{cr_j} \}|$
– accuracy satisfaction: $w_2 \cdot |\{ cr_j : cr_j.Bias = 2,\ PREC_A \ge PREC_{cr_j},\ AFidelity(cr_j, A) = 1 \}|$
– timeliness & accuracy satisfaction: $w_3 \cdot |\{ cr_j : cr_j.Bias = 0,\ TT_{cr_j} \le RDL_{cr_j},\ PREC_A \ge PREC_{cr_j},\ AFidelity(cr_j, A) = 1 \}|$
Quasar Group
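Quality of Data Characterization
• DS Fidelity (DS vs. source value)
– fidelity of s at time instant t: $FI_{ds}(s, t) = 1$ if $L \le v \le U$, and $0$ otherwise
– probability of accessing a faithful s value during $T = [t_i, t_j]$: $p_{fi}(s, T) = FI_{ds}(s, T) = FI_{ds}(s, [t_i, t_j]) = \frac{1}{T} \int_{t_i}^{t_j} FI_{ds}(s, t)\, dt$
– aggregate DS fidelity: $AFI_{ds}(S, T) = \sum_{s_i \in S} p_{access}(s_i)\, p_{fi}(s_i, T) = \sum_{s_i \in S} p_{access}(s_i)\, FI_{ds}(s_i, T)$
• DS Validity (DS vs. consumer needs)
– validity for one consumer request: $VA(cr_i(s, t)) = 1$ if $PREC(L, U) \ge PREC_{cr_i}$, and $0$ otherwise
– validity of s at time instant t over k consumer requests: $VA_{ds}(s, t) = \frac{1}{k} \sum_{i=1}^{k} VA(cr_i(s, t))$
– validity of s during $T = [t_i, t_j]$: $p_{va}(s, T) = VA_{ds}(s, T) = VA_{ds}(s, [t_i, t_j]) = \frac{1}{k_s} \sum_{t_v = t_i}^{t_j} \sum_{u=1}^{k} VA(cr_u(s, t_v))$
– aggregate DS validity: $AVA_{ds}(S, T) = \sum_{s_i \in S} p_{access}(s_i)\, p_{va}(s_i, T) = \sum_{s_i \in S} p_{access}(s_i)\, VA_{ds}(s_i, T)$
• Overall QoD: combines $AFI_{ds}(S, T)$ and $AVA_{ds}(S, T)$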
Quasar Group
Objectives of real-time data collection
• Given a set of sources S = {s1, …, sl} and an input instance I, which is a collection of m source update requests and n consumer requests, I = SR ∪ CR = {sr1, …, srm; cr1, …, crn}, our goal is to
– Maximize QoS
– Maximize QoD
– Minimize Cost
Quasar Group
Joint optimization of QoS, QoD and Cost
• Dynamicity
– Highly dynamic system and network conditions
– Unpredictable application workload
– Frequently changing information sources
• The inter-relationship between QoS and QoD is not straightforward
– Prioritizing source update requests
• raises QoD, but increases the deadline miss ratio, which lowers QoS and leads to missed opportunities
– Prioritizing consumer requests
• raises QoS, but leaves stale data, which lowers QoD and leads to wrong decisions
Quasar Group
One approach
• Frame the tradeoffs as two sub-problems
– Manipulate QoS via a scheduling algorithm, assuming DS is well maintained (QoD)
– Adjust QoD via a DS maintenance algorithm, assuming an efficient scheduling algorithm is applied (QoS)
Quasar Group
Design of the Information Mediator
[Figure: Information Mediator architecture. Information sources send updates to the mediator and receive probes from it; information consumers send requests and receive answers. Inside the mediator: a scheduler fed by a source update request queue and a consumer request queue; a request servicer that checks values against the stored range in the DS and issues consumer-initiated probes or consumer-initiated source update requests; and a DS maintainer that adjusts the DS based on feedback.]
Quasar Group
Design of the Scheduling Algorithm
• Issues
– Decide on an ordering of the incoming source update requests
• The most recent update will be processed first
– Decide on a relative ordering of source update and consumer requests
Quasar Group
Scheduling Strategies
• CF (Consumer request First)
• SF (Source update request First)
• SU (Split Update)
– Updates from popular data are assigned higher priority than consumer requests
• OD (On-Demand Update)
– Only when consumer requests encounter stale data will the corresponding source update requests be applied
Quasar Group
Timeliness-Accuracy Balanced Scheduling (TABS)
• Absolute deadline (ADL) assignment
– periodic requests arriving at time t: ADL = t + PER or ADL = t + RDL, whichever is earlier, i.e. $ADL = t + \min\{PER, RDL\}$
– processor utilization of periodic requests: $U_P = \sum_{i=1}^{n_p} \frac{E_i}{\min\{PER_i, RDL_i\}}$
– aperiodic requests arriving at time t: $ADL_i = \max(t, ADL_{i-1}) + E_i / U_{AP}$, with $U_{AP} = 1 - U_P$
• Apply Earliest-Deadline-First
• TABS schedulability: given a set of $n_p$ periodic requests with processor utilization $U_P$ and a TB server with processor utilization $U_{AP}$, the whole task set is schedulable if $U_P + U_{AP} \le 1$.
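The deadline assignment can be illustrated with a short sketch. It assumes that a periodic request released at time t receives the deadline t + min(PER, RDL) and that aperiodic requests are served by a total-bandwidth (TB) server with utilization U_AP = 1 - U_P; the function and class names are hypothetical.

```python
def periodic_deadline(t, per, rdl):
    """Absolute deadline of a periodic request released at time t."""
    return t + min(per, rdl)


def periodic_utilization(tasks):
    """U_P for tasks given as (E, PER, RDL) tuples."""
    return sum(e / min(per, rdl) for e, per, rdl in tasks)


class TotalBandwidthServer:
    """Assigns ADL_i = max(t, ADL_{i-1}) + E_i / U_AP to aperiodic requests."""

    def __init__(self, u_ap):
        if not 0.0 < u_ap <= 1.0:
            raise ValueError("U_AP must be in (0, 1]")
        self.u_ap = u_ap
        self.last_adl = 0.0

    def assign(self, t, e):
        self.last_adl = max(t, self.last_adl) + e / self.u_ap
        return self.last_adl


def edf_order(requests):
    """Order (absolute_deadline, name) pairs by Earliest-Deadline-First."""
    return [name for _, name in sorted(requests)]
```

For example, two periodic tasks (E, PER, RDL) = (1, 4, 10) and (2, 10, 8) give U_P = 0.5, which leaves U_AP = 0.5 for the TB server, so U_P + U_AP = 1 satisfies the stated schedulability condition.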
Quasar Group
Minimized Cost Directory Service Maintenance (MC)
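• Analyze the cost involved in the collection process
• Range adjustment
– Consumer-initiated update: shrink the range
– Source-initiated update: curve fitting
• fit a curve to the source value over each monitoring window (w-1, w) and compare slopes
• mw > mw-1: increase range size
• mw < mw-1: decrease range size
[Figure: source value versus time with the fitted curve over monitoring windows w-1 and w; mw-1 is the slope in window w-1]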
Quasar Group
Experiments
• Performance metrics
– QoS, QoD, Cost (the number of messages exchanged)
– Efficiency of System: EoS = QoS × QoD / Cost
• Experiments
– Evaluation of all the possible policy combinations in terms of the overall EoS
– Evaluation of system heterogeneity in terms of source capabilities and deadline variations
– Evaluation of the benefits of adding intelligence to each sub-component of the mediator
Quasar Group
Benefits of Intelligent Policies
[Figure: EoS versus the number of sources (25, 50, 100, 150) for FCFS+SS, TABS+SS, and TABS+MC]
The EoS improves as more intelligence is added to each component
• TABS ensures fairness among the requests
• MC decreases the DS maintenance overhead
Quasar Group
Fusing Energy Efficient Data Collection and In-network Aggregation
• Issues
– Hierarchical precision range adjustment
– Cluster forming and dynamic maintenance
[Figure: sensor clusters reporting to multiple access points]
Quasar Group
Value update -- 1
(a) State at the access point (AP)
C1: {200 - 20, 200 + 20}
n1: {100 - 10, 100 + 10}
n2: {100 - 10, 100 + 10}
(b) n1 reports 112
C1: {212 - 20, 212 + 20}
n1: {112 - 10, 112 + 10}
n2: {100 - 10, 100 + 10}
Range shown in the figure: {212 - 10, 212 + 10}
Quasar Group
Value update -- 2
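(c) Values shown in the figure: 112, 112, 224
State at the access point (AP)
C1: {224 - 20, 224 + 20}
n1: {112 - 10, 112 + 10}
n2: {112 - 10, 112 + 10}
(d) Values shown in the figure: 112, 85, 113.7, 86.3
State at the access point (AP)
C1: {200 - 20, 200 + 20}
n1: {113.7 - 10, 113.7 + 10}
n2: {86.3 - 10, 86.3 + 10}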
Quasar Group
Error Adjustment
• When?
– (fmax - fmin)/fmax >= rth
• How?
– dfmax = a* dfmax +(1-a)*(dfmax + dfmin)*(fmax /(fmax + fmin))
– dfmin = a* dfmin +(1-a)*(dfmax + dfmin)*(fmin /(fmax + fmin))
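The trigger and the two update formulas can be written out directly. The slide does not define the symbols, so reading fmax and fmin as the highest and lowest update frequencies, and dfmax and dfmin as the error bounds of the corresponding nodes, is an assumption; the function names are hypothetical. Both new values are computed from the old ones, so the second formula does not see the already-updated dfmax.

```python
def needs_adjustment(f_max, f_min, r_th):
    """Trigger condition: (fmax - fmin) / fmax >= rth."""
    return (f_max - f_min) / f_max >= r_th


def adjust_errors(df_max, df_min, f_max, f_min, a):
    """Return (new_df_max, new_df_min) using smoothing weight a in [0, 1]."""
    total = df_max + df_min
    f_sum = f_max + f_min
    new_df_max = a * df_max + (1 - a) * total * (f_max / f_sum)
    new_df_min = a * df_min + (1 - a) * total * (f_min / f_sum)
    return new_df_max, new_df_min
```

A property of these formulas is that dfmax + dfmin stays constant: the total error budget is redistributed in proportion to fmax and fmin, not enlarged.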
Quasar Group
Fault Tolerance Issues
• Communication
– Routing
• SPIN: disseminate data to all the sensors
• Braided Diffusion: maintain multiple braided paths as backup
• GRAB (Gradient Broadcast): controlled mesh forwarding
– Transport protocol
• PSFQ (pump slowly, fetch quickly): store-and-forward, multi-hop forwarding
• ESRT (event to sink reliable transmission): adjust source reporting frequency to avoid congestion and maintain enough reliability
• RMST (reliable multi-segment transport): MAC layer
• Storage
– R-DCS (Resilient Data Centric Storage): store event data at the closest R replica nodes
Quasar Group
Outline of this section
• Sensor network architectures
• Sensor application needs
– Accuracy, timeliness, cost, reliability
• Tasks of a middleware framework
– Services that can be customized to address needs
• Case studies
– Accuracy/cost tradeoffs in collection
– Timeliness/accuracy/cost tradeoffs in collection
– Storage/accuracy tradeoffs in archival
Quasar Group
Archiving Sensor Data
• Often sensor-based applications are built with only the real-time
utility of time series data.
– Values at time instants <<n are discarded.
• Archiving such data consists of maintaining the entire S sequence,
or an approximation thereof.
• Importance of archiving:
– Discovering large-scale patterns
– Once-only phenomena, e.g., earthquakes
– Discovering “events” detected post facto by “rewinding” the time series
– Future usage of data which may be not known while it is being collected
Quasar Group
Quality Sensitive Archival
• Let P = < p[1], p[2], …, p[n] > be the sensor time series
• Let S = < s[1], s[2], …, s[n] > be the server side representation
• A within-ε_archive quality data archival protocol guarantees that error(p[i], s[i]) < ε_archive
• Trivial solution: modify the collection protocol to collect data at a quality guarantee of min(ε_archive, ε_collect)
– then the data collection protocol described earlier will provide an ε_archive quality data stream that can be archived
• Better solutions are possible since
– archived data is not needed for immediate access by real-time or forecasting applications (such as monitoring, tracking)
– compression can be used to reduce data transfer
Quasar Group
Addressing Cost/Quality Tradeoffs in Data Archival – Sample Protocol
• The sensor compresses the observed time series p[1:n] and sends a lossy compression to the server
• At time n:
– p[1:n-nlag] is at the server in compressed form s'[1:n-nlag], within-ε_archive
– s[n-nlag+1:n] is estimated via a predictive model M and its parameters
• the collection protocol guarantees that this remains within-ε_collect
– s[n+1:] can be predicted, but its quality is not guaranteed
• it is in the future, so the sensor has not observed these values
[Figure: the sensor memory buffer holds …p[n], p[n-1], ..; it feeds both the sensor updates for data collection and, through compression, the compressed representation for archiving]
• Processing at the sensor is exploited to reduce communication cost and hence battery drain
Quasar Group
Piecewise Constant Approximation (PCA)
• Given a time series Sn = s[1:n], a piecewise constant approximation of it is a sequence
PCA(Sn) = < (ci, ei) >
that allows us to estimate s[j] as:
scapt[j] = ci if j in [ei-1 + 1, ei]
scapt[j] = c1 if j < e1
[Figure: value versus time; a step function with constant values c1, c2, c3, c4 on segments ending at e1, e2, e3, e4]
Quasar Group
Online Compression using PCA
• Goal: given a stream of sensor values, generate a within-ε_archive PCA representation of a time series
• Approach (PMC-midrange)
– Maintain m, M as the minimum/maximum values of observed samples since the last segment
– On processing p[n], update m and M if needed
• if M - m > 2·ε_archive, output a segment ((m+M)/2, n)
[Figure: value versus time for samples at times 1 to 5; example with ε_archive = 1.5]
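A minimal sketch of midrange-based online compression follows. It assumes one particular reading of the rule above: when a new sample would stretch the min/max spread beyond twice the tolerance, the segment is closed at the previous sample using the min and max seen before the violation, and a new segment starts at the violating sample. The names `pmc_mr` and `reconstruct` are hypothetical.

```python
def pmc_mr(samples, eps):
    """Compress samples into (value, end_index) segments, 1-based end index."""
    segments = []
    lo = hi = None
    for n, x in enumerate(samples, start=1):
        if lo is None:
            lo = hi = x
            continue
        new_lo, new_hi = min(lo, x), max(hi, x)
        if new_hi - new_lo > 2 * eps:
            # Close the segment before the violating sample, then restart.
            segments.append(((lo + hi) / 2, n - 1))
            lo = hi = x
        else:
            lo, hi = new_lo, new_hi
    if lo is not None:
        segments.append(((lo + hi) / 2, len(samples)))  # flush the last segment
    return segments


def reconstruct(segments):
    """Expand (value, end_index) segments back into a per-sample series."""
    out, start = [], 1
    for value, end in segments:
        out.extend([value] * (end - start + 1))
        start = end + 1
    return out
```

Only the running minimum and maximum are kept between samples, and every reconstructed value is within `eps` of the original because each segment's spread never exceeds `2 * eps`.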
Quasar Group
Online Compression using PCA
• PMC-MR …
– guarantees that each segment compresses the corresponding time series segment to within-ε_archive
– requires O(1) storage
– is instance optimal
• no other PCA representation with fewer segments can meet the within-ε_archive constraint
• Variant of PMC-MR
– PMC-MEAN, which takes the mean of the samples seen thus far instead of the midrange
Quasar Group
Improving PMC using Prediction
• Observation
– Prediction models guarantee a within-ε_collect version of the time series at the server even before the compressed time series arrives from the producer
• Can the prediction model be exploited to reduce the overhead of compression?
– If ε_archive > ε_collect, no additional effort is required for archival: simply archive the predicted model
• Approach:
– Define an error time series E[i] = p[i] - spred[i]
– Compress E[1:n] to within-ε_archive instead of compressing p[1:n]
– The archive contains the prediction parameters and the compressed error time series
– The within-ε_archive version of E[i], together with the prediction model M and its parameters, can be used to reconstruct a within-ε_archive version of p
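The approach can be sketched by compressing the error series rather than the raw series. The compressor below is a compact midrange-based piecewise-constant scheme (an assumed variant, with segments closed just before the violating sample), and the predicted series is simply passed in as a list; all function names are hypothetical.

```python
def compress(series, eps):
    """Midrange piecewise-constant compression: list of (value, end_index)."""
    segments, lo, hi = [], None, None
    for n, x in enumerate(series, start=1):
        if lo is not None and max(hi, x) - min(lo, x) > 2 * eps:
            segments.append(((lo + hi) / 2, n - 1))
            lo = hi = None
        lo = x if lo is None else min(lo, x)
        hi = x if hi is None else max(hi, x)
    if lo is not None:
        segments.append(((lo + hi) / 2, len(series)))
    return segments


def expand(segments):
    """Inverse of compress: one value per original sample."""
    out, start = [], 1
    for value, end in segments:
        out.extend([value] * (end - start + 1))
        start = end + 1
    return out


def archive_with_prediction(actual, predicted, eps):
    """Compress E[i] = p[i] - spred[i] instead of p[i]."""
    errors = [p - s for p, s in zip(actual, predicted)]
    return compress(errors, eps)


def restore(predicted, error_segments):
    """Rebuild an approximation of p from the prediction plus compressed errors."""
    return [s + e for s, e in zip(predicted, expand(error_segments))]
```

When the prediction tracks the signal well, the error series is nearly flat and needs far fewer segments than the raw series, while the restored series stays within `eps` of the original.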
Quasar Group
Combining Compression and Prediction (Example)
[Figure: Predicted Time Series versus Actual Time Series]
[Figure: Actual Time Series versus Compressed Time Series (7 segments)]
[Figure: Compressed Error (2 segments), where Error = Actual - Predicted]
Quasar Group
Estimating Time Series Values
• Historical samples (before n-nlag) are maintained at the server within-ε_archive
• Recent samples (between n-nlag+1 and n) are maintained by the sensor and predicted at the server
• If an application requires ε_q precision, then:
– if ε_q ≥ ε_collect, it must wait in case a parameter refresh is en route
– if ε_q ≥ ε_archive but ε_q < ε_collect, it may probe the sensor or wait for a compressed segment
– otherwise, only probing meets the precision
• For future samples (after n), immediate probing is not available as an option
Quasar Group
Distributed Computing Infrastructure for Sensors
• Designing distributed architectures for sensor networks
– Server-oriented: data migrates to the server from sensors
• Store or not store (stream)
• Useful for all types of applications: archival, analysis, monitoring
• When should data migrate: periodically, or based on application quality requirements (the Quasar approach)
• Should data migrate in its original raw form or in some aggregated form?
– Distributed approach
• Data does not migrate to any single server but remains in the sensor network; queries migrate from the server to the network
• TinyDB approach, DIMENSIONS approach
• Real-time
• Fault tolerance
Quasar Group
Part 3: Query Processing in Sensor Applications
Quasar Group
Outline
• Need for a declarative query language for sensor applications
• Query taxonomy
• Issues impacting sensor query processing
– Sensor database research landscape
• Sample query processing techniques
Quasar Group
Programming Sensor Nets Is Hard
• Applications must be “energy aware”
– Naive implementations may drain the battery in days, while careful programming may conserve power for months
• interleave sleep with processing and transmission
– Recharging the battery frequently is not feasible
• Lossy, multi-hop, low-bandwidth, short range communication
– 20% loss @ 5m
– often desirable to trade computation for communication
– 200-800 instructions per bit transmitted!!
– applications must be “network aware”
• Highly distributed environments
• Once deployed, applications cannot be easily administered
• Limited development and debugging tools
High-Level Abstraction Is Needed!
Quasar Group
Declarative Queries
• Users specify the data they want
– Simple, SQL-like queries
– Using predicates, not specific addresses
• Challenge is to provide:
– Expressive & easy-to-use interface
– High-level operators
• Well-defined interactions
• “Transparent optimizations” that many programmers would miss
– Sensor-net specific techniques
– Power efficient execution framework
Quasar Group
Database View of Sensor Data
• Sensors viewed as a single table
– Columns are sensor data
– Rows are individual sensors
• The sensors table is an unbounded, continuous data stream
– Operations such as sort and symmetric join are not allowed on streams
– They are allowed on bounded subsets of the stream (windows)
• SQL (with minor extensions) can be used as a declarative query language
time | nodeid | location | value
0 | 1 | 17 | 455
0 | 2 | 25 | 389
1 | 1 | 17 | 422
1 | 2 | 25 | 405
“Find the sensors in bright nests.”
SELECT nodeid, nestNo, light
FROM sensors
WHERE light > 400
Quasar Group
Taxonomy of Queries
• Query generality
– Simple selection, aggregation, full-blown SQL
• Continuous queries
– query evaluated continuously on sensor data streams
– Issues:
• How long
– For a specified period, for the lifetime of the sensor
• How often
– adaptive rate (based on load/utility/value), fixed rate
• Event based queries
Quasar Group
Aggregation Queries
“Count the number of occupied nests in each loud region of the island.”
SELECT region, CNT(occupied), AVG(sound)
FROM sensors
GROUP BY region
HAVING AVG(sound) > 200
EPOCH DURATION 10s
Result (regions w/ AVG(sound) > 200):
Epoch | region | CNT(…) | AVG(…)
0 | North | 3 | 360
0 | South | 3 | 520
1 | North | 3 | 370
1 | South | 3 | 520
A simpler aggregate over all sensors:
SELECT AVG(sound)
FROM sensors
EPOCH DURATION 10s
Quasar Group
General SQL Query
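General: Is there anyone in the building?
SELECT roomid
FROM lightsensors as L, soundsensors as S
WHERE L.roomid = S.roomid
[Figure: query plan joining the light and sound sensor streams on RoomID = RoomID, with selections Value > 10lm and Value > 10dB on the inputs]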
Quasar Group
Event-Based Queries
• An alternative to continuous polling for data
• Example
ON EVENT bird-detector(loc):
SELECT AVG(light), AVG(temp), event.loc
FROM sensors AS s
WHERE dist(s.loc, event.loc) < 10m
SAMPLE INTERVAL 2s FOR 30s
Quasar Group
Lifetime Queries
• Lifetime query
SELECT …
LIFETIME 30 days
– May not be able to transmit all the data
– Estimate the sampling rate that achieves this lifetime
SELECT …
LIFETIME 10 days
MIN SAMPLE INTERVAL 1s
Adapted from slides ©Sam Madden
Quasar Group
Adapted from slides ©Sam Madden
Processing Lifetimes: Issues
• Provide formulas for estimating power consumption: set maximum per-node sampling rates
• What makes this difficult?
– multiple sensing types (temp, accel) with different drain
– estimating the selectivity of predicates
– amount transmitted by a node varies widely
– root is a bottleneck: all nodes' rates must correspond to it
– aggregation vs. sending individual values
– conditions change: multiple queries, burstiness, message losses
• What to do when we can’t transmit all the data
Quasar Group
Issues impacting Query Processing
• Where does data reside?
– sensor/server
• Where does the query originate?
– sensor/server
• Where should the results be delivered?
– sensor/server
• How is data represented?
– Continuous data streams require unbounded storage
• Represent data as synopses (spatial/temporal aggregation)
– Sliding windows, samples, sketches, histograms, wavelet representation
– Precise / approximate representation
• with or without error guarantees
• guarantees can be deterministic or probabilistic
Quasar Group
Sensor Database Research Landscape
Data & Query Location
• Server
• Sensor network
Data representation
• Precise representation
• Approximate value
• Specified spatial/temporal resolution
Type of query
• Aggregation
• Selection
• General SQL
• Continuous
• Event-based
Query Evaluation
• At server
• In network
• At both server and network
Quasar Group
Classification of Query Processing Techniques (1)
• Data and query @ server
– Data Stream Model
• Data streams from data sources to servers
• server maintains synopses
• continuous queries at server
Quasar Group
Stream Data Management
• Data streams through the server
• Load shedding
– at input: sampling
– at server: if load exceeds capacity
• Continuous queries evaluated against streaming data at sensor
• Data represented as synopses
– sliding window, sketches, histograms, wavelets, sampling
• Deals with problems due to dynamic data on the server side
• But
– Does not conserve sensor resources (e.g., power)
– Does not exploit the storage and processing capabilities of sensors
– Geared towards continuous monitoring and not archival applications
• Examples: Aurora (Brown/MIT), STREAM (Stanford), Hancock (AT&T), OpenCQ (Georgia), Tapestry (Xerox), Telegraph (Berkeley), ...
[Figure: data streams and continuous queries enter a stream processing engine, which keeps a synopsis in memory and produces an (approximate) answer]
Quasar Group
Classification of Query Processing Techniques (1)
• Data and query @ server
– Data Stream Model
• Data streams from data sources to servers
• server maintains synopses
• continuous queries at server
• Examples: Aurora (Brown/MIT), STREAM (Stanford), Hancock (AT&T), OpenCQ (Georgia), Tapestry (Xerox), Telegraph (Berkeley), …
– Quality-Aware Query answering
• quality aware data collection at the server
– attempts to minimize communication/energy consumption in the network during data collection
• Applications/queries have quality tolerance
– query tolerance converted to data quality requirement
• If the query’s error tolerance is met by data at the server, the query is computed @ server
• Else, either more accurate data is brought to the server, or servers and sensors collaborate to answer the query
• Error tolerance of applications exploited for minimizing resource utilization
• Examples: Quasar (UCI), TRAPP (Stanford)
– Quasar exploits in-network processing when the query cannot be answered at the server
Quasar Group
Classification of Query Processing Techniques (2)
• In network query processing
– Query originates and results needed at base station
• Two steps:
– Push query to sensor network
– Gather results
• Trades computation to reduce communication among sensors
• Examples: TinyDB (Berkeley), Cougar (Cornell)
– Query originates and results required anywhere in network
• Distributed query processing within sensor network
• Example: SURGE (UCI), research @ UCLA
Quasar Group
Quality Aware Queries (QaQ)
• Data represented at the server at a given error tolerance
– Actual sensor values: Pi = pi[1], pi[2], …, pi[n], … for sensor i
– Server representation: Si = si[1], si[2], …, si[n], … for sensor i
– Error guarantee: for all i, j: error(pi[j], si[j]) < ε_i for a given value of ε_i
• Queries have an associated level of error tolerance
• If the query quality tolerance is satisfied at the server (the tolerance is more than ε_i)
– Answer the query at the server
• Else
– Probe the sensor
– Sensor guaranteed to respond within a bounded time
• Approach guarantees quality tolerance of queries
[Figure: queries Q1 … Qm with tolerances A1 … Am arrive at the server, which holds an imprecise data representation [li, ui] for each sensor si; the sensor time series …p[n], p[n-1], …, p[1] reaches the server through sensor-initiated updates, and the server can send a probe and receive a probe result]
Quasar Group
Overview of QaQ Processing Research
• Mapping application quality requirements to data quality requirements
– Target tracking using acoustic sensors [MW ‘03]
– Spatial range queries [DEXA ‘03]
• Quality-based data collection
– General framework [DS Online ‘03]
– To support monitoring queries over current data [Qi+03]
– For sensor data archival [ICDE ‘03]
– With real-time constraints [RTSS ‘03]
– With support for in-network aggregation [Yu+03]
• Quality-cognizant query processing
– Aggregation queries [Quasar-1, Trap-1, Trap-2]
– Continuous aggregation queries [Trap-3]
– Selection queries [ICDE ‘04]
– General SQL queries (open problem)
Quasar Group
QaQ Selection: Problem Definition
• There is a collection T of imprecise objects
– E.g., { [1,3], [2,5], [4,9] } represents {2, 3, 5}
• The query is: “Retrieve objects from T which satisfy a given predicate”
– The query specifies quality requirements
– The system must return an approximate result that meets the quality requirements at minimum overall cost
Quasar Group
Impact of Data Imprecision
• Objects are classified as:
– a is a NO object
– b, f are MAYBE objects
– c, d, e are YES objects
• The exact set is E = {b, c, d, e}
[Figure: objects a to f plotted against the selection range; an imprecise object o is an interval, and the precise object o can be retrieved with a probe]
Quasar Group
Defining Quality
• Measures the accuracy of an approximate answer A
• Set-based quality
– Precision: p = |A ∩ E| / |A|
• E.g., p = 4/5 (if b, c, d, e, f are returned as answers)
– Recall: r = |A ∩ E| / |E|
• E.g., r = 4/4 = 1 (if b, c, d, e, f are returned as answers)
• Value-based quality
– Laxity of an object is l(o). E.g., l([2,3]) = 3 - 2 = 1
– Laxity of A is lmax = max over x in A of l(x)
• Query specifies bounds pq, rq, lqmax: a minimum precision, a minimum recall, and a maximum laxity
[Figure: objects a to f plotted against the selection range]
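The three quality measures are easy to compute once the answer set and the exact set are known. The sketch below uses Python sets for the object identifiers and (low, high) tuples for imprecise intervals; the function names are hypothetical, and non-empty sets are assumed.

```python
def precision(answer, exact):
    """p = |A ∩ E| / |A| (assumes a non-empty answer set)."""
    return len(answer & exact) / len(answer)


def recall(answer, exact):
    """r = |A ∩ E| / |E| (assumes a non-empty exact set)."""
    return len(answer & exact) / len(exact)


def laxity(interval):
    """Width of an imprecise object given as a (low, high) interval."""
    low, high = interval
    return high - low


def max_laxity(intervals):
    """Laxity of an answer: the largest laxity among its objects."""
    return max(laxity(i) for i in intervals)
```

With the slide's example, `precision(set("bcdef"), set("bcde"))` is 4/5 and `recall(set("bcdef"), set("bcde"))` is 1.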
Quasar Group
Evaluating QaQ Selection Operator
• Read an object and classify it as YES, NO, or MAYBE
• Possible actions on an object:
– Probe
– Forward
– Ignore
• Another possibility is to store the object and deal with it later
– Might be good under certain situations, based on available memory at the server
Quasar Group
The Decision Problem
• How should the QaQ selection operator decide
– When to probe
– When to forward
– When to ignore
• Objective:
– Meet the query quality requirement
– Minimize cost
Quasar Group
Constraints on the Decision
• Some decisions are fixed -- we have no choice!
• Objects with l(o) greater than the query tolerance lqmax must not be forwarded
• The precision guarantee pG must never be less than the query tolerance pq
– Otherwise, if no new YES objects are seen, this might lead to a pq violation
• If |A ∩ Y| / (|Y| + |Ms - A|) is less than the query tolerance rq, you can’t ignore an object
– Ignoring it might lead to an rq violation if no new YES objects are seen
Quasar Group
Two Naïve Approaches
• Two simple heuristics:
– STINGY avoids probes: it ignores MAYBE objects and objects exceeding the lqmax threshold
• STINGY is conservative, but sometimes it is forced to probe to meet the quality guarantees
– GREEDY forwards all MAYBE objects and probes all objects that exceed the lqmax threshold
• GREEDY tries to produce the result quickly by not ignoring objects, but sometimes it uses too many probes and forwards too many objects
Quasar Group
Impact of Probe, Forward, Ignore actions to quality
• + increase, - decrease, = remains the same
Quasar Group
The “decision” Plane (ICDE 2004)
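• s(o): probability of a MAYBE object satisfying the selection
[Figure: the decision plane. Horizontal axis: s(o), split into No (s(o) = 0), Maybe (0 < s(o) < 1, with thresholds s3 and s5), and Yes (s(o) = 1). Vertical axis: laxity l(o), split at lqmax. The plane is divided into regions 1 to 7, each labeled with an action: Ignore, Probe, Forward, "Forward with probability pfm or ignore", or "Probe with probability ppy or ignore".]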
Quasar Group
The Optimization Problem
• Free parameters: ppy, s3, s5, pfm
• Estimate:
– Number of YES/MAYBE/NO objects
– Number of YES/MAYBE objects exceeding the lqmax threshold
– Distribution of s(o)
• Minimize cost W in the parameter space (ppy, s3, s5, pfm) subject to precision, recall, and laxity guarantees
Quasar Group
Quality Aware Query Processing (Review)
• Quality aware data collection
• Queries have error tolerance
• QaQ query processing optimizes resource consumption while ensuring query quality requirement.
• A Dual problem: – optimize quality given resource constraints
• Aurora Stream Processing system explores such an approach
Quasar Group
AURORA in the Sensor Database Landscape
Data & Query Location
• Server
Data representation
• Time sampled
Type of query
• Continuous
Query Evaluation
• At server
Quasar Group
Aurora System Model
• Input Streams are unpredictable
  – If system processing capacity is reached, load must be dropped by invoking the Load Shedder
• The Output Streams must be useful to applications.
  – Specified by Quality of Service (QoS)
• The Goal: shed load intelligently so that
  – system operates within processing capacity
  – QoS of output streams is maximized
Quasar Group
Quality of Service
Types of QoS
• Latency: shows utility drop as answers take longer to achieve (handled by Scheduler)
• Value-based: shows which output values are most important (handled by Load Shedder)
• Loss-tolerance: shows how approximate answers affect a query (handled by Load Shedder)
Figure: Value-based QoS graph (utility vs. values; value ticks 0, 80, 120, 200; utility ticks 1.0 and 0.4) and Loss-tolerance QoS graph (utility vs. % delivery; delivery ticks 100, 50, 0; utility ticks 1.0 and 0.7)
Quasar Group
Key Questions
• How is load measured?
  – Via static load coefficients and dynamic monitoring of stream rates
• When to shed load?
  – When processing capacity does not suffice for handling the system load
• Where to shed load?
  – In which segments of the query processing graph?
• How much load to shed?
  – What fraction of tuples will be discarded?
• Which tuples to drop?
  – Do tuple values affect the decision of whether to drop them or not?
Quasar Group
How to Measure Load: Load Coefficients
Load Coefficients (L): the number of processor cycles required to push a single tuple through the network to the outputs
Figure: a chain of n operators from input I to output O, each labeled with its cost ci and selectivity si
• n operators
• ci = cost
• si = selectivity
Total Load (Load): depends on load coefficients Li and input stream rates
• m input streams
• ri = stream rate
Load = \sum_{i=1}^{m} L_i \times r_i
Quasar Group
Load Coefficient (Example)
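Figure: input I feeds operator 1; operator 1 feeds operators 2 and 4; operator 2 feeds operator 3, which produces output O1; operator 4 produces output O2
• Operator 1: c1 = 10, s1 = 0.5
• Operator 2: c2 = 10, s2 = 0.8
• Operator 3: c3 = 5, s3 = 1.0
• Operator 4: c4 = 10, s4 = 0.9
L1 = 10 + (0.5 * 10) + (0.5 * 0.8 * 5) + (0.5 * 10) = 22
L2 = 10 + (0.8 * 5) = 14
L3 = 5
L4 = 10
L(I) = 22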
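The arithmetic of this example can be checked with a short recursive sketch: the load coefficient of an operator is its own cost plus its selectivity times the load coefficients of the operators it feeds. The dictionary layout, function names, and the stream rate of 100 tuples per second are illustrative assumptions, not part of the Aurora system.

```python
# Operator -> (cost, selectivity, downstream operators), taken from the example.
OPS = {
    1: (10, 0.5, [2, 4]),
    2: (10, 0.8, [3]),
    3: (5, 1.0, []),
    4: (10, 0.9, []),
}

def load_coefficient(op: int) -> float:
    """Cycles needed to push one tuple entering `op` through to the outputs."""
    cost, selectivity, downstream = OPS[op]
    return cost + selectivity * sum(load_coefficient(d) for d in downstream)

def total_load(input_rates: dict) -> float:
    """Load = sum over inputs of L_i * r_i; keys are the entry operators."""
    return sum(load_coefficient(op) * rate for op, rate in input_rates.items())

if __name__ == "__main__":
    for op in OPS:
        print(f"L{op} = {load_coefficient(op):g}")
    # Hypothetical rate: 100 tuples/s arriving at operator 1.
    print("Load =", total_load({1: 100}))
```

With a processing capacity C, load shedding would be triggered whenever `total_load(...) > C`.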
Quasar Group
When to Shed Load
• N: network
• I: input streams
• C: processing capacity
Shed load when:
Load(N(I)) > C
Quasar Group
How to Shed Load: Drop Tuples
• Random Drop (Drop k %): drop tuples randomly
• Semantic Drop (Filter P(value)): drop tuples based on the utility of their value
Figure: a query network of σ, π and U operators with QoS graphs attached to its outputs
Modify N into N’ by inserting “drop” operators, such that:
Load(N’(I)) < H * C
Quasar Group
Where to Shed Load
Figure: query network with locations labeled 1, 2 and 3
• Usually at the inputs, but
  – Placing a drop in 1 relieves all three operators
  – QoS of both output streams is affected
Quasar Group
Random Drops
Greedy approach:
• Order drop locations in ascending Loss/Gain ratios
• Insert drops in the location with the minimum Loss/Gain ratio first; repeat until enough capacity has been retrieved
• The amount of the drop is in increments of STEP_SIZE
  – The drop operator has a cost: inserting a drop for < STEP_SIZE does not retrieve any processing capacity!
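A rough sketch of the greedy loop follows. It assumes each drop location has a fixed loss (utility lost) and gain (cycles recovered) per STEP_SIZE increment; the location names, the numbers, and the value of STEP_SIZE are hypothetical, and the cost of the drop operator itself is not modeled.

```python
STEP_SIZE = 10  # percent of tuples dropped per increment (assumed value)

def plan_random_drops(locations: dict, excess_load: float) -> dict:
    """locations: name -> (loss, gain) per STEP_SIZE increment.
    Returns name -> percent of tuples to drop at that location."""
    plan = {}
    recovered = 0.0
    # Cheapest loss per unit of recovered capacity first.
    by_ratio = sorted(locations, key=lambda n: locations[n][0] / locations[n][1])
    for name in by_ratio:
        _loss, gain = locations[name]
        percent = 0
        while recovered < excess_load and percent < 100:
            percent += STEP_SIZE
            recovered += gain
        if percent:
            plan[name] = percent
    return plan

if __name__ == "__main__":
    locs = {"A": (2.0, 100.0), "B": (1.0, 200.0)}
    print(plan_random_drops(locs, 500.0))
    print(plan_random_drops(locs, 2500.0))
```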
Quasar Group
Semantic Drops
Greedy approach:
• Each value interval has a frequency fi and a utility ui
• Start dropping from the interval with minimum ui
  – First drop from the interval with utility 0.2 and relative frequency 0.4
  – You can drop at most 40% of the tuples using the first interval
• If this suffices, drop as many as needed
• Else, choose the interval with the next minimum ui
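The interval-by-interval procedure can be sketched as below. Only the first interval (utility 0.2, relative frequency 0.4) comes from the slide; the interval bounds and the second interval are made-up values for illustration.

```python
def plan_semantic_drop(intervals, target_fraction):
    """intervals: list of (low, high, utility, relative_frequency).
    Returns a list of ((low, high), fraction_of_all_tuples_to_drop),
    taking the least useful intervals first."""
    plan = []
    remaining = target_fraction
    for low, high, _utility, freq in sorted(intervals, key=lambda iv: iv[2]):
        if remaining <= 1e-12:
            break
        dropped = min(freq, remaining)  # cannot drop more than the interval holds
        plan.append(((low, high), dropped))
        remaining -= dropped
    return plan

if __name__ == "__main__":
    intervals = [(0, 50, 0.2, 0.4), (51, 100, 1.0, 0.6)]  # bounds are hypothetical
    print(plan_semantic_drop(intervals, 0.3))  # first interval suffices
    print(plan_semantic_drop(intervals, 0.5))  # spills into the next interval
```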
Quasar Group
In network Query Processing
• Two steps:
  – Query dissemination
    • Exploit broadcast-based routing to disseminate the query to sensors
  – Query execution and result accumulation
    • Gather and compute results in-network en route to the root (base station)
• Pluses
  – In-network computation reduces periodic communication of raw results.
  – Trades computation for communication, a very worthwhile goal for sensor nets
    • 1 bit of communication is approximately equivalent to 800 instructions!
• Minuses
  – Query dissemination and execution synchronization overheads.
    • Benefit must exceed cost!
  – Applicable only when sensor data does not need to be archived.
  – Scalability to really large networks not studied.
• Examples
  – TinyDB (Berkeley)
    • TAG: in-network aggregation
    • AQP: in-network SQL
  – SURCH (UCI)
    • distributed in-network aggregation
Quasar Group
Query Propagation in TAG
Query: SELECT COUNT(*)…
Figure: routing tree of nodes 1 to 5; each epoch is divided into communication slots (Comm. Slot)
Broadcast based communication
Quasar Group
Basic Aggregation
• In each epoch:
  – Each node samples local sensors once
  – Generates partial state record (PSR)
    • local readings
    • readings from children
  – Outputs PSR during its comm. slot.
• At end of epoch, PSR for whole network output at root
• Many optimizations possible
  – grouping, pipelining
Figure: routing tree of nodes 1 to 5
Quasar Group
Illustration: Aggregation
Query: SELECT COUNT(*) FROM sensors
Figure (animation over Slot 1, Slot 2, Slot 3, Slot 4, then Slot 1 of the next epoch): the routing tree of nodes 1 to 5 next to a Sensor # vs. Slot # grid, which fills in with the partial counts transmitted in each slot
Quasar Group
Aggregation Framework
• As in extensible databases, TAG supports any aggregation function conforming to:
Aggn = {finit, fmerge, fevaluate}
finit{a0} → <a0>
fmerge{<a1>, <a2>} → <a12>
fevaluate{<a1>} → aggregate value
(Merge must be associative and commutative!)
Each <…> value is a Partial State Record (PSR)
Example: Average
AVGinit{v} → <v, 1>
AVGmerge{<S1, C1>, <S2, C2>} → <S1 + S2, C1 + C2>
AVGevaluate{<S, C>} → S/C
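The AVERAGE example translates directly into three small functions, with the PSR represented as a (sum, count) pair. The routing tree and sensor readings below are hypothetical, and the recursion stands in for the slot-by-slot communication of a real epoch.

```python
def avg_init(v):
    """init: one reading -> PSR <v, 1>."""
    return (v, 1)

def avg_merge(a, b):
    """merge: <S1, C1>, <S2, C2> -> <S1 + S2, C1 + C2>."""
    return (a[0] + b[0], a[1] + b[1])

def avg_evaluate(psr):
    """evaluate: <S, C> -> S / C."""
    return psr[0] / psr[1]

# Hypothetical routing tree (node -> children) and readings.
CHILDREN = {1: [2, 3], 2: [], 3: [4], 4: [5], 5: []}
READINGS = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0, 5: 50.0}

def psr_for(node):
    """PSR a node would send to its parent: own reading merged with children's PSRs."""
    psr = avg_init(READINGS[node])
    for child in CHILDREN[node]:
        psr = avg_merge(psr, psr_for(child))
    return psr

if __name__ == "__main__":
    root_psr = psr_for(1)
    print(root_psr, avg_evaluate(root_psr))
```

Because merge is associative and commutative, the result does not depend on the order in which children report.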
Quasar Group
Types of Aggregates
• SQL supports MIN, MAX, SUM, COUNT, AVERAGE
• Any function can be computed via TAG
• In-network benefit for many operations
  – E.g. standard deviation, top/bottom N, spatial union/intersection, histograms, etc.
  – Compactness of PSR
Quasar Group
Taxonomy of Aggregates
• TAG insight: classify aggregates according to various functional properties
  – Yields a general set of optimizations that can automatically be applied

Property | Examples | Affects
Partial State | MEDIAN: unbounded, MAX: 1 record | Effectiveness of TAG
Duplicate Sensitivity | MIN: dup. insensitive, AVG: dup. sensitive | Routing Redundancy
Exemplary vs. Summary | MAX: exemplary, COUNT: summary | Applicability of Sampling, Effect of Loss
Monotonic | COUNT: monotonic, AVG: non-monotonic | Hypothesis Testing, Snooping
Quasar Group
TAG Advantages
• Communication Reduction
  – Important for power and contention
• Continuous stream of results
  – Smooth transient faults across epochs
• Lots of optimizations
  – Via operator semantics
Quasar Group
Simulation Environment
• Evaluated via simulation
• Coarse grained event based simulator
  – Sensors arranged on a grid
  – Two communication models
    • Lossless: all neighbors hear all messages
    • Lossy: messages lost with probability that increases with distance
Quasar Group
Benefit of In-Network Processing
Figure: Total Bytes Xmitted vs. Aggregation Function. x-axis: Aggregation Function (EXTERNAL, MAX, AVERAGE, COUNT, MEDIAN); y-axis: Total Bytes Xmitted (0 to 100000)
Simulation Results
• 2500 Nodes
• 50x50 Grid
• Depth = ~10
• Neighbors = ~20
Some aggregates require dramatically more state!
Quasar Group
In-Network SQL Processing (Berkeley)
• Query disseminated to sensors
• Results gathered en route to the root (base station)
• Issues:
  – How should the query be processed?
    • Sampling as an operator, power-optimal ordering
    • Frequent events as joins
  – Which nodes have relevant data?
    • Semantic Routing Tree for effective pruning
    • Nodes that are queried together route together
  – Which samples should be transmitted?
    • Pick most “valuable”?
    • Adaptive transmission & sampling rates
Quasar Group
Power-Optimal Operator Ordering: Interleave Sampling + Selection
SELECT light, mag
FROM sensors
WHERE pred1(mag) AND pred2(light)
SAMPLE INTERVAL 1s
• Energy cost of sampling mag >> cost of sampling light
  – 1500 uJ vs. 90 uJ
• Candidate orderings:
  1. Sample light, sample mag, apply pred1, apply pred2
  2. Sample light, apply pred2, sample mag, apply pred1
  3. Sample mag, apply pred1, sample light, apply pred2
• Correct ordering (unless pred1 is very selective): 2
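A back-of-the-envelope sketch of the expected sampling energy per epoch for the three orderings is given below. The 1500 uJ and 90 uJ figures are from the slide; the selectivities (fraction of tuples passing each predicate) are hypothetical, and the cost of evaluating the predicates is assumed to be negligible.

```python
E_MAG = 1500.0   # uJ per mag sample (from the slide)
E_LIGHT = 90.0   # uJ per light sample (from the slide)

def expected_energy(sel_pred1: float, sel_pred2: float) -> dict:
    """Expected sampling energy (uJ) per epoch for each ordering.
    sel_pred1: fraction of tuples passing pred1(mag);
    sel_pred2: fraction of tuples passing pred2(light)."""
    return {
        1: E_LIGHT + E_MAG,              # sample both, then filter
        2: E_LIGHT + sel_pred2 * E_MAG,  # sample mag only if pred2 passes
        3: E_MAG + sel_pred1 * E_LIGHT,  # sample light only if pred1 passes
    }

if __name__ == "__main__":
    print(expected_energy(0.5, 0.5))
    # A very selective pred1 combined with an unselective pred2 favors ordering 3.
    print(expected_energy(0.01, 1.0))
```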
Adapted from slides ©Sam Madden
Quasar Group
Attribute Driven Topology Selection
• Observation: internal queries often over local area
  – Or some other subset of the network
    • E.g. regions with light value in [10,20]
• Idea: build topology for those queries based on values of range-selected attributes
  – For range queries
  – Relatively static trees
    • Maintenance cost
Adapted from slides ©Sam Madden
Quasar Group
Attribute Driven Query Propagation
Figure: nodes 1, 2 and 3, annotated with the intervals [1,10], [7,15] and [20,40], connected to node 4
Query: SELECT … WHERE a > 5 AND a < 12
• Precomputed intervals = Semantic Routing Tree (SRT)
• Early pruning
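Early pruning amounts to an interval-overlap test before forwarding the query. The sketch below assumes the three intervals summarize the attribute values reachable through nodes 1, 2 and 3; the function and variable names are illustrative only.

```python
def overlaps(interval, low, high) -> bool:
    """True if the closed interval [a, b] contains a value v with low < v < high."""
    a, b = interval
    return b > low and a < high

# Precomputed intervals of attribute `a` per node (from the slide).
SRT = {1: (1, 10), 2: (7, 15), 3: (20, 40)}

def nodes_to_forward(srt: dict, low, high) -> list:
    """Nodes whose interval can satisfy `a > low AND a < high`."""
    return [node for node, interval in srt.items() if overlaps(interval, low, high)]

if __name__ == "__main__":
    # WHERE a > 5 AND a < 12: node 3 is pruned.
    print(nodes_to_forward(SRT, 5, 12))
```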
Adapted from slides ©Sam Madden
Quasar Group
Attribute Driven Parent Selection
Figure: candidate parents 1, 2 and 3 with intervals [1,10], [7,15] and [20,40]; node 4 has interval [3,6]
[3,6] ∩ [1,10] = [3,6]
[3,6] ∩ [7,15] = ø
[3,6] ∩ [20,40] = ø
Even without intervals, expect that sending to the parent with the closest value will help
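The intersections above suggest a simple selection rule: pick the candidate parent whose interval overlaps the node's own interval the most. The sketch below is one possible reading of that rule; the tie-breaking and the use of overlap length as the score are assumptions.

```python
def intersect(a, b):
    """Intersection of two closed intervals, or None if they are disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None

def choose_parent(own, candidates: dict):
    """Return the candidate with the widest overlap with `own`, or None."""
    best, best_width = None, -1
    for node, interval in candidates.items():
        common = intersect(own, interval)
        if common is not None and common[1] - common[0] > best_width:
            best, best_width = node, common[1] - common[0]
    return best

if __name__ == "__main__":
    parents = {1: (1, 10), 2: (7, 15), 3: (20, 40)}
    print(choose_parent((3, 6), parents))  # node 1 is the only overlapping parent
```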
Adapted from slides ©Sam Madden
Quasar Group
Simulation Result
Figure: Nodes Visited vs. Query Range. x-axis: Query Size as % of Value Range (0.001, 0.05, 0.1, 0.2, 0.5, 1); y-axis: # of Nodes Visited (400 = Max). Series: Best Case (Expected), Closest Parent, Nearest Value, Snooping, Random Parent.
(Random value distribution, 20x20 grid, ideal connectivity to (8) neighbors)
Adapted from slides ©Sam Madden
Quasar Group
Acquisitional Query Processing
• How should the query be processed?
  – Sampling as an operator, power-optimal ordering
  – Frequent events as joins
• Which nodes have relevant data?
  – Semantic Routing Tree for effective pruning
    • Nodes that are queried together route together
• Which samples should be transmitted?
  – Pick most “valuable”?
  – Adaptive transmission & sampling rates
Adapted from slides ©Sam Madden
Quasar Group
Adaptive Transmission Rates
Figure: Sample Rate vs. Delivery Rate. x-axis: Samples Per Second (Per Mote), 0 to 16; y-axis: Aggregate Delivery Rate (Packets/Second), 0 to 8. Series: 1 mote; 4 motes; 4 motes, adaptive.
• Adaptive = 2x % Successful Xmissions
• TinyDB monitors channel contention & backs off as needed
Adapted from slides ©Sam Madden
Quasar Group
Prioritizing Data Delivery
• Score each item
• Send largest score
  – Out of order -> Priority Queue
• Discard or aggregate when buffer is full
Figure: queue holding the sample [1,2]
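A bounded priority queue captures the three bullets above. This is a minimal sketch, not TinyDB's implementation: the class name and capacity are hypothetical, and a full buffer simply discards the lowest-scoring item rather than aggregating it.

```python
import heapq

class DeliveryQueue:
    """Bounded buffer that sends the highest score first and,
    when full, discards the lowest-scoring item."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._heap = []   # min-heap of (score, sequence number, item)
        self._seq = 0     # tie-breaker so items are never compared directly

    def push(self, score, item):
        entry = (score, self._seq, item)
        self._seq += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        else:
            heapq.heappushpop(self._heap, entry)  # drops the smallest score

    def pop_best(self):
        best = max(self._heap)
        self._heap.remove(best)
        heapq.heapify(self._heap)
        return best[2]

if __name__ == "__main__":
    q = DeliveryQueue(capacity=2)
    q.push(4, "a")
    q.push(13, "b")
    q.push(1, "c")   # buffer full: the lowest score ("c") is discarded
    print(q.pop_best(), q.pop_best())
```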
Adapted from slides ©Sam Madden
Quasar Group
Choosing Data To Send
Delta encoding
Sample sent so far, as (time, value): [1,2]
Figure: Time vs. Value plot (Time 1 to 4, Value 0 to 16)
Adapted from slides ©Sam Madden
Quasar Group
Choosing Data To Send
Delta encoding
Already sent: [1,2]
Candidates: [2,6] [3,15] [4,1]
|2-6| = 4
|2-15| = 13
|2-4| = 2
Select which of the 3 to send
Figure: Time vs. Value plot (Time 1 to 4, Value 0 to 16)
Adapted from slides ©Sam Madden
Quasar Group
Choosing Data To Send
Delta encoding
Sent: [1,2] [3,15]
Candidates: [2,6] [4,1]
|2-6| = 4
|15-4| = 11
Keep selecting until hit max delivery rate
Figure: Time vs. Value plot (Time 1 to 4, Value 0 to 16)
Adapted from slides ©Sam Madden
Quasar Group
Choosing Data To Send
Delta encoding
Samples: [1,2] [2,6] [3,15] [4,1]
Figure: Time vs. Value plot (Time 1 to 4, Value 0 to 16)
Adapted from slides ©Sam Madden
Quasar Group
Choosing Data To Send
Delta encoding
Samples: [1,2] [2,6] [3,15] [4,1]
If we manage to send all
Figure: Time vs. Value plot (Time 1 to 4, Value 0 to 16)
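The selection walked through on these slides can be sketched as a greedy loop. The scoring rule used here, the absolute difference from the most recent earlier sample that has already been sent, is one reading of the deltas shown and is an assumption; the function names are illustrative.

```python
def delta_score(sample, sent):
    """|value - value of the latest already-sent sample that precedes it in time|."""
    t, v = sample
    earlier = [s for s in sent if s[0] < t]
    reference = max(earlier)  # tuples compare by time first
    return abs(v - reference[1])

def delta_order(sent, pending):
    """Order in which pending (time, value) samples would be transmitted."""
    sent, pending, order = list(sent), list(pending), []
    while pending:
        best = max(pending, key=lambda s: delta_score(s, sent))
        pending.remove(best)
        sent.append(best)
        order.append(best)
    return order

if __name__ == "__main__":
    # [1,2] has been sent; the other three samples are waiting.
    print(delta_order([(1, 2)], [(2, 6), (3, 15), (4, 1)]))
```

In practice the loop would stop once the maximum delivery rate is reached, so only a prefix of this order is actually sent.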
Adapted from slides ©Sam Madden
Quasar Group
Delta + Adaptivity
• 8 element queue
• 4 motes transmitting different signals
• 8 samples / sec / mote
Adapted from slides ©Sam Madden
Quasar Group
SURCH in the Sensor Database Landscape
Data & Query Location
• At sensors
Data representation
• Precise
Type of query
• ad hoc aggregation
Query Evaluation
• In network
• distributed
http://www.ics.uci.edu/~quasar
Quasar Group
SURCH Query Processing
• SURCH Query:
ON EVENT e
SELECT Attributes or Aggregates
FROM Sensors S
WHERE S.loc ∈ Region
DESTINATION nodeID
UPON Predicate
• Event based query: may initiate at any node in the network
• Results accumulated at a specified destination
• Region specifies selection on sensors
• In-network (fully distributed) query processing
Quasar Group
SURCH Query Processing
• Three Phases
  – Neighborhood discovery
    • broadcast based communication
  – Query propagation
    • a sensor propagates the query if its neighborhood contains sensors to which the query has not yet been propagated
  – Capture partial results and route to destination
    • a node holds partial results if it contains aggregate values that are not broadcast further
Figure labels: destination, generator, initiator1, initiator2, r1, r2, Q, result1, result2
Quasar Group
Neighborhood Discovery
– A node ns broadcasts the query (e.g. MAX) and the current result to all neighbors.
– Neighbor nni responds with its value vni after waiting for a time period (TTR) based on the fitness of its value
  • the node having data with the highest “fitness” value responds first.
– If the partial result changes, ns immediately rebroadcasts it to its neighbors
  • high likelihood that all neighbors learn the new MAX even without responding
Figure: node ns and neighbors nn1, nn2, …, nnk, with broadcast, response and re-broadcast messages
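One way to make the fittest neighbor respond first is to let the waiting time shrink as fitness grows. The linear mapping below is purely an assumption for a MAX query; the slides do not specify how TTR is derived from fitness, and the value range and `max_wait` are hypothetical.

```python
def time_to_respond(value, low, high, max_wait=1.0):
    """Assumed TTR for a MAX query: larger values wait less.
    `low`/`high` bound the expected value range (hypothetical)."""
    fitness = (value - low) / (high - low)
    return max_wait * (1.0 - fitness)

if __name__ == "__main__":
    values = {"nn1": 12.0, "nn2": 30.0, "nn3": 21.0}
    order = sorted(values, key=lambda n: time_to_respond(values[n], 0.0, 40.0))
    print(order)  # nn2 holds the largest value, so it answers first
```

Once ns rebroadcasts the new MAX, neighbors with smaller values can cancel their pending responses.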
Quasar Group
Query Propagation
• 1-Dimensional illustration for a MAX query
• ni initiates a query
• nr1 and nr2 hold partial results.
Figure (animation over three steps): sensor values plotted along a line of sensors, with the radio range of ni marked; numbered labels mark successive propagation steps outward from ni, ending at nr1 and nr2
Quasar Group
Capture Partial Results
• Which nodes hold the partial results?
  – Nodes whose results are not propagated further
    • boundary of the query region
    • irregular propagation frontier
  – Detected by remembering whether any neighbor propagates the query at the next level.
• The partial results are sent to a destination node for final processing.
Quasar Group
Issue in Query Propagation
• Which nodes should broadcast the query in the network?
• Choose the broadcasting nodes based on optimization goals:
  – minimal overall cost
    • minimum number of broadcasting nodes
    • minimum size connected dominating set
  – maximum network lifetime (uniform workload)
    • take into account energy level of individual node.
• Heuristics to achieve optimization goals
  – minimal overall cost
    • choose based on number of undiscovered neighbors
  – maximize lifetime
    • battery threshold
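The two heuristics can be combined in a few lines: filter candidates by a battery threshold, then prefer the one covering the most undiscovered neighbors. The data layout, the threshold value, and the way the two criteria are combined are assumptions made for illustration.

```python
def pick_broadcaster(candidates, neighbors, discovered, battery, threshold):
    """Choose the next broadcasting node.
    neighbors: node -> set of neighbor ids; discovered: set of nodes already reached;
    battery: node -> remaining energy; threshold: minimum energy to broadcast."""
    eligible = [n for n in candidates if battery[n] >= threshold]
    if not eligible:
        return None
    # Most undiscovered neighbors wins (ties go to the earliest candidate).
    return max(eligible, key=lambda n: len(neighbors[n] - discovered))

if __name__ == "__main__":
    neighbors = {"a": {"x", "y", "z"}, "b": {"x", "w"}, "c": {"u", "v", "w", "t"}}
    battery = {"a": 0.9, "b": 0.8, "c": 0.1}
    # "c" would cover the most new nodes but is below the battery threshold.
    print(pick_broadcaster(["a", "b", "c"], neighbors, {"x"}, battery, 0.2))
```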
Quasar Group
Simulation Results
• SURCH is very efficient at processing queries that do not need response from every node:
Quasar Group
Summary of Query Processing
• Queries provide an expressive and easy to use interface for programming sensors
  – Rapid application development
  – Transparent optimization
    • Application writers can focus on the application logic and not on how to optimize it for sensor networks
• Query processing in sensor networks is a difficult challenge
  – Highly dynamic data, energy/power constraints, and lossy, low bandwidth, broadcast based communication
  – The standard approach of layering and isolating functionality into relatively independent software components will not work. OS, middleware, network, and queries will need to be co-optimized
• Issues in query processing
  – Where data resides, how data is represented, where queries are initiated, where results need to be delivered, where queries are processed
Quasar Group
Future Work in Query Processing in Sensor Databases
• A rich sensor database research landscape
  – No clear winners yet
• Many important open issues
  – A formal semantics of query language
  – A scalable architecture for sensor data gathering and query processing
  – Fault-tolerance and real-time constraints in query processing
  – Integrating sensor data (and queries) with
    • other sensor data (sensor data fusion)
    • other relational information
  – XML and its role in sensor data
Quasar Group
Summary
• Sensor networks present a very wide range of system optimization opportunities for power, application quality and performance
• Energy efficiency is a system level concern that cuts across subsystem components, functionality layers and their implementations
• Key components
  – Low power sensor microarchitectures
  – Careful partitioning of functionality in distributed sensor network architecture
  – Energy aware operating systems
  – Query driven sensor data management
  – Dynamic power management that coordinates capabilities against application needs
    • Real-time, fault-tolerance, application quality needs
  – Energy efficient communications and networking
    • energy aware MAC, routing, transport
Quasar Group
Questions??