David Chu--UC Berkeley Amol Deshpande--University of Maryland Joseph M. Hellerstein--UC Berkeley...

Approximate Data Collection in Sensor Networks using Probabilistic Models
David Chu--UC Berkeley, Amol Deshpande--University of Maryland, Joseph M. Hellerstein--UC Berkeley, Intel Research Berkeley, Wei Hong--Arched Rock Corp.
ICDE 2006
klhsueh 09.11.03

Outline
Introduction
Ken architecture
Replicated Dynamic Probabilistic Model
Choosing the Prediction Model
Evaluation
Conclusion

Introduction
[Figure labels: sensing data; kept in sync]

Ken Operation
[Flowchart between source and sink] Are the expected values accurate enough? If no: find the attributes that are useful to the prediction.

Ken Operation: source (at time t)
1. Compute the probability distribution function (pdf).
2. Compute the expected values according to the pdf.
3. If the expected values are accurate enough, then stop.
4. Otherwise:
   a. Find the smallest set of attributes X such that the expected values according to the pdf conditioned on X are accurate enough.
   b. Send the values of attributes in X to the sink.

Ken Operation: sink (at time t)
1. Compute the probability distribution function p.
2. If the sink received from the source the values of attributes in X, then condition p using these values as described in the source's Step 4(a) above.
3. Compute the expected values of the attributes, and use them as the approximation to the true values.
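The source and sink steps can be sketched for a single attribute as follows. This is a minimal illustration, not the paper's implementation: the class `ReplicatedModel`, the function `run_ken`, and the scalar factor `alpha` are hypothetical names, and the toy model (next value = `alpha` times the last state) merely stands in for the pdf and its expected value.

```python
class ReplicatedModel:
    """Toy one-attribute prediction model (hypothetical, for illustration).

    The same class is instantiated at the source and at the sink, so both
    sides compute identical predictions as long as they see the same messages.
    """

    def __init__(self, initial, alpha=1.0):
        self.state = initial
        self.alpha = alpha

    def predict(self):
        # Expected value for the current time step.
        return self.alpha * self.state

    def advance(self, observed=None):
        # Condition on the communicated value if there is one;
        # otherwise keep the model's own prediction.
        self.state = self.predict() if observed is None else observed


def run_ken(readings, epsilon, initial, alpha=1.0):
    """Return (sink approximations, number of source-to-sink messages)."""
    source = ReplicatedModel(initial, alpha)
    sink = ReplicatedModel(initial, alpha)
    approximations = []
    messages = 0
    for x in readings:
        message = None
        # Source: send the reading only if the prediction is not accurate enough.
        if abs(source.predict() - x) > epsilon:
            message = x
            messages += 1
        # Both replicas apply the same update, so they stay in sync.
        source.advance(message)
        sink.advance(message)
        approximations.append(sink.state)
    return approximations, messages
```

With `alpha=1.0` the toy model assumes the value stays constant, so a message is sent only when a reading drifts more than `epsilon` from the last communicated value.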

Replicated Dynamic Probabilistic Model
Ex1: very simple prediction model. Assume that the data value remains constant over time.
Ex2: linear prediction model. It utilizes temporal correlations, but ignores spatial correlations.
To consider both correlations, Ken uses a dynamic probabilistic model.

Replicated Dynamic Probabilistic Model
A dynamic probabilistic model consists of:
A probability distribution function (pdf) for the initial state
A transition model
The pdf at time t+1 is computed from these, given the observations communicated to the sink.
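The formulas on this slide did not survive extraction. As a hedged reconstruction, the standard prediction step for a Markov model can be written as below. The notation is an assumption: $X_t$ denotes the attribute vector at time $t$ and $\mathrm{obs}^{1 \ldots t}$ the observations communicated to the sink up to time $t$.

```latex
p(X_{t+1} \mid \mathrm{obs}^{1 \ldots t})
  = \int p(X_{t+1} \mid x_t)\, p(x_t \mid \mathrm{obs}^{1 \ldots t})\, dx_t
```

Here $p(X_{t+1} \mid x_t)$ is the transition model, and the recursion starts from the pdf for the initial state.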

Replicated Dynamic Probabilistic Model
Ex3: 2-dimensional linear Gaussian model
Compute the expected values: not accurate! We only have to communicate one value to the sink because of spatial correlations.
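To illustrate why one value can be enough, the sketch below conditions a bivariate Gaussian on one attribute and checks whether the other attribute's conditional expectation falls within its error bound. The function names and all numbers are hypothetical; only the conditioning formula for a bivariate Gaussian is standard.

```python
def condition_on_first(mean, cov, x1):
    """Return E[X2 | X1 = x1] and Var[X2 | X1 = x1] for a bivariate Gaussian."""
    m1, m2 = mean
    (s11, s12), (_, s22) = cov
    cond_mean = m2 + s12 / s11 * (x1 - m1)
    cond_var = s22 - s12 * s12 / s11
    return cond_mean, cond_var


def smallest_subset_to_send(mean, cov, reading, eps):
    """Return the indices of the attributes the source must send.

    Tries the empty set, then each single attribute, then both.
    """
    x1, x2 = reading
    m1, m2 = mean
    # Nothing to send if the expected values are already accurate enough.
    if abs(m1 - x1) <= eps[0] and abs(m2 - x2) <= eps[1]:
        return []
    # Try sending X1 only: is X2 then predicted well enough?
    pred2, _ = condition_on_first(mean, cov, x1)
    if abs(pred2 - x2) <= eps[1]:
        return [0]
    # Try sending X2 only: swap the roles of the two attributes.
    swapped_cov = ((cov[1][1], cov[1][0]), (cov[0][1], cov[0][0]))
    pred1, _ = condition_on_first((m2, m1), swapped_cov, x2)
    if abs(pred1 - x1) <= eps[0]:
        return [1]
    return [0, 1]
```

For example, with means of 20 for both attributes, unit variances and a covariance of 0.9, observing 22 on the first attribute moves the expectation of the second to about 21.8, so a true second reading of 21.9 does not need to be sent under an error bound of 0.5.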

Choosing the Prediction Model
Total communication cost:
intra-source: checking whether the prediction is accurate.
source-sink: sending a set of values to the sink.

Choosing the Prediction Model
Ex3: Disjoint-Cliques Model
Reduces intra-source cost while utilizing spatial correlations between attributes.
Exhaustive algorithm for finding the optimal solution
Greedy heuristic algorithm
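The slide does not spell out the greedy heuristic, so the sketch below is not the paper's algorithm. It is a generic greedy partition under stated assumptions: `corr[i][j]` is a hypothetical pairwise correlation score between attributes, `k` caps the clique size, and `min_corr` is a hypothetical threshold below which an attribute is not worth adding to a clique.

```python
def greedy_disjoint_cliques(corr, k, min_corr=0.5):
    """Partition attribute indices into disjoint cliques of size <= k.

    Generic greedy sketch: seed a clique with the lowest unassigned index,
    then keep adding the unassigned attribute with the highest total
    correlation to the clique while its average correlation is >= min_corr.
    """
    unassigned = set(range(len(corr)))
    cliques = []
    while unassigned:
        seed = min(unassigned)
        unassigned.remove(seed)
        clique = [seed]
        while len(clique) < k and unassigned:
            # sorted() makes tie-breaking deterministic (lowest index wins).
            best = max(sorted(unassigned),
                       key=lambda j: sum(corr[i][j] for i in clique))
            average = sum(corr[i][best] for i in clique) / len(clique)
            if average < min_corr:
                break
            clique.append(best)
            unassigned.remove(best)
        cliques.append(clique)
    return cliques
```

A real cost function would weigh the intra-source cost of gathering a clique's readings against the source-sink savings; the correlation threshold here is only a stand-in for that trade-off.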

Choosing the Prediction Model
Ex4: Average Model

Evaluation
Real-world sensor network data:
Lab: Intel Research Lab in Berkeley, consisting of 49 mica2 motes.
Garden: UC Berkeley Botanical Gardens, consisting of 11 mica2 motes.
Three attributes: temperature, humidity, voltage.
Model: time-varying multivariate Gaussians.
We estimated the model parameters using the first 100 hours of data (training data), and used traces from the next 5000 hours (test data) for evaluating Ken.
Error bounds of 0.5 °C for temperature, 2% for humidity and 0.1 V for battery voltage.

Evaluation
Comparison Schemes
TinyDB: always reports all sensor values to the base station.
Approximate Caching (ApC): caches the last reported reading at the sink and source, and sources do not report if the cached reading is within the threshold of the current reading.
Ken with Disjoint-Cliques (DjC) and Average (Avg) models. A Greedy-k heuristic algorithm is used to find the Disjoint-Cliques model (DjCk).

Evaluation
Ken and ApC both achieve significant savings over TinyDB.
Average reports at a higher rate than Disjoint-Cliques with max clique size restricted to 2 (DjC2).
Capturing and modeling temporal correlations alone may not be sufficient to outperform caching.
Utilizing spatial correlations, the Garden dataset has more data reduction. (Figure annotations: 21%, 36%)

Evaluation
Disjoint-Cliques Models

Evaluation
Quantify the merit of various clique sizes.
Physical deployment may not have sufficiently strong spatial correlations.

Evaluation
Base station resides at the east end of the network.
The areas closer to the base station do not benefit from larger cliques.

Conclusion
We propose a robust approximate technique called Ken that uses replicated dynamic probabilistic models to minimize communication from sensor nodes to the network's PC base station.
