NoizCrowd: A Crowd-Based Data Gathering and Management System for Noise Level Data Mariusz Wisniewski, Gianluca Demartini, Apostolos Malatras, and Philippe Cudré-Mauroux University of Fribourg, Switzerland

DESCRIPTION

Presentation at MobiWIS 2013.

Page 1: NoizCrowd: A Crowd-Based Data Gathering and Management System for Noise Level Data

NoizCrowd: A Crowd-Based Data Gathering and Management System for Noise Level Data

Mariusz Wisniewski, Gianluca Demartini, Apostolos Malatras, and Philippe Cudré-Mauroux

University of Fribourg, Switzerland

Gianluca Demartini 2

Motivation - Big Data

• Large datasets are necessary to enable analytics and support decision making
  – Meteorological stations / car traffic

• Setting up a large-scale sensing infrastructure is costly and time-consuming

• Create a large amount of valuable data
  – Crowdsourcing
  – Data generation models
  – Smartphones as sensors
  – Big Data analytics

NoizCrowd

• A crowd-sensing approach to big data generation using commodity sensors

• Crowd-source noise levels in a geographic region
• Noise propagation models to generate data
• Array data management techniques to scale
• Results accessible via a visual interface

• Support decisions (e.g., where to live)

Outline

• Related approaches
• NoizCrowd Architecture Overview
  – Data Gathering
  – Storage
  – Modeling
  – Export and Visualization
• Data Models
• Performance Evaluation

Related Work

• Participatory Sensing vs Sensor Networks
  – Low cost / High cost
  – Mobile phones / Sensors
  – Distributed / Centralized management
  – Privacy, data quality

• Applications: Environment, vehicle routing

Related Work

• Noise Mapping Apps
  – NoiseTube: open source, widespread usage
  – NoiseMap: control over data
  – SoundSense: machine learning to classify sounds

• NoizCrowd
  – Data in RDF, linkable to other datasets (linkeddata.org)
  – Scalable storage: generate data by interpolation

NoizCrowd Architecture

Data Gathering

• By means of crowd-sourcing
  – GPS: location
  – Microphone: noise level
  – Internet connection: send data to server

• Microphone Calibration
  – Sound level meter
  – Sharing conversion table for smartphone models
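The shared conversion table for microphone calibration could look roughly like the sketch below. It assumes the table maps raw phone readings to sound-level-meter readings and that values in between are linearly interpolated; the table layout, the model name, the numbers, and the `calibrate` function are all illustrative assumptions, not taken from the slides.

```python
from bisect import bisect_right

# Hypothetical conversion table per phone model:
# (raw phone reading in dB, reference sound level meter reading in dB).
# The values are made up for illustration, not measured data.
CONVERSION_TABLES = {
    "example-phone": [(30.0, 35.0), (50.0, 52.0), (70.0, 71.0), (90.0, 88.0)],
}

def calibrate(raw_db, phone_model):
    """Map a raw phone reading to a calibrated dB value.

    Unknown models are returned unchanged; readings outside the table
    are clamped to the first or last calibrated value (an assumption).
    """
    table = CONVERSION_TABLES.get(phone_model)
    if table is None:
        return raw_db
    raws = [raw for raw, _ in table]
    if raw_db <= raws[0]:
        return table[0][1]
    if raw_db >= raws[-1]:
        return table[-1][1]
    # Linear interpolation between the two surrounding table entries.
    i = bisect_right(raws, raw_db)
    (r0, c0), (r1, c1) = table[i - 1], table[i]
    t = (raw_db - r0) / (r1 - r0)
    return c0 + t * (c1 - c0)
```

Sharing one such table per smartphone model means only one device of each model has to be compared against a sound level meter.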

Data Storage

• App sends median and peak dB values over a few seconds

• Spatio-temporal data: non-relational storage system (SciDB)
  – Durable storage
  – Retrieve data to build models
  – Export data for visualization

• Multi-dimensional array (space and time)
• Distributed storage
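As a rough illustration of what the app sends, the sketch below reduces a few seconds of dB samples to a median and a peak value, tagged with position and time. The `NoiseReport` record and the `summarize_window` function are hypothetical names; the slides do not describe the actual message format.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class NoiseReport:
    """Hypothetical record sent to the server for one sampling window."""
    lat: float
    lon: float
    timestamp: int      # seconds since the epoch (assumed encoding)
    median_db: float
    peak_db: float

def summarize_window(samples_db, lat, lon, timestamp):
    """Reduce the dB samples of one short window to median and peak."""
    if not samples_db:
        raise ValueError("at least one sample is required")
    return NoiseReport(lat, lon, timestamp, median(samples_db), max(samples_db))
```

Each report corresponds naturally to one cell of a multi-dimensional array indexed by space and time.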

Noise Modeling

• Data from the crowd is noisy and skewed/sparse
• Raw data is not shown to the end users
• Models to deal with
  – Overlapping data
  – Missing data

Data Export and Visualization

• Data from SciDB is
  – Converted to RDF
  – Stored in dipLODocus[RDF]
  – Available via SPARQL

• Visualization
  – Overlay noise level on a map
  – Additional chart for time evolution
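The conversion to RDF could be sketched as below, serializing one array cell as N-Triples lines. The `http://example.org/noizcrowd/` namespace and the `noiseLevelDb` property are hypothetical; the slides do not state which vocabulary is used. The W3C Basic Geo and Dublin Core terms are real vocabularies chosen here as an assumption.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"   # W3C Basic Geo vocabulary
DCT = "http://purl.org/dc/terms/"                  # Dublin Core terms
EX = "http://example.org/noizcrowd/"               # hypothetical namespace

def to_ntriples(cell_id, lat, lon, iso_time, noise_db):
    """Serialize one observation as a list of N-Triples lines.

    cell_id is assumed to contain only characters that are safe in an IRI.
    """
    subject = f"<{EX}observation/{cell_id}>"
    return [
        f'{subject} <{GEO}lat> "{lat}"^^<{XSD}double> .',
        f'{subject} <{GEO}long> "{lon}"^^<{XSD}double> .',
        f'{subject} <{DCT}date> "{iso_time}"^^<{XSD}dateTime> .',
        f'{subject} <{EX}noiseLevelDb> "{noise_db}"^^<{XSD}double> .',
    ]
```

Triples of this shape can be loaded into an RDF store and queried with SPARQL, for example by filtering on the latitude and longitude values of a map viewport.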

Data Models

• Spatial Interpolation
  – In the same time interval, data from different locations
  – Needs to be computationally simple (large volume)
  – Bi-dimensional range queries in space (SciDB)
  – K-nearest neighbor interpolation
  – Computed in parallel
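A minimal sketch of K-nearest neighbor interpolation for one time interval is shown below. The inverse-distance weighting and the plain averaging of dB values are assumptions made for simplicity; the slides only name the technique, and `knn_interpolate` is a hypothetical function.

```python
import math

def knn_interpolate(measurements, x, y, k=3):
    """Estimate the noise level at (x, y) from the k nearest measurements.

    measurements: list of (x, y, level_db) tuples from the same time interval.
    Neighbors are weighted by inverse distance (an assumed weighting scheme).
    """
    if not measurements:
        raise ValueError("no measurements available")
    nearest = sorted(
        measurements, key=lambda m: math.hypot(m[0] - x, m[1] - y)
    )[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for mx, my, level_db in nearest:
        d = math.hypot(mx - x, my - y)
        if d == 0.0:
            return level_db  # exact hit: use the measured value directly
        weight = 1.0 / d
        weighted_sum += weight * level_db
        weight_total += weight
    return weighted_sum / weight_total
```

Each target cell only depends on a handful of nearby measurements, which is what makes the computation cheap and easy to run in parallel over many cells.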

Data Models

• Temporal interpolation
  – Short ranges (minutes): like spatial interpolation in 3D
  – Long ranges: look for patterns and infer
    • E.g., every Monday at 11am we have 50dB and one Monday measurement is missing
    • E.g., same measurement (50dB) in the same area 2h ago and now
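The "every Monday at 11am" example can be sketched as a lookup over past readings in the same weekly slot. Using the median of those readings is an assumption, and `weekly_pattern_estimate` is a hypothetical helper; the slides do not say how patterns are detected.

```python
from datetime import datetime
from statistics import median

def weekly_pattern_estimate(history, when):
    """Infer a missing reading from earlier readings in the same weekly slot.

    history: list of (datetime, level_db) for one area.
    Returns the median of past readings taken on the same weekday and
    hour as `when`, or None if there is no such reading.
    """
    same_slot = [
        level_db
        for t, level_db in history
        if t < when and t.weekday() == when.weekday() and t.hour == when.hour
    ]
    return median(same_slot) if same_slot else None
```

For instance, with three earlier Monday 11am readings of 50dB, a missing Monday 11am value would be filled in as 50dB.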

Noise Propagation Models

• We adopt an existing model that takes into account:
  – Sound power
  – Distance from source
  – Directivity
  – Atmospheric absorption
  – Excess attenuation (we use meteo conditions)

• Difficult to measure with a smartphone
• Constant in a given region (and use GPS info)
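The slides do not give the formula of the adopted model. A common textbook form for a point source with spherical spreading combines the listed factors as L_p = L_w − 20·log10(r) − 11 + DI − A_abs − A_E (all in dB, r in meters). The sketch below assumes that form; the function and parameter names are hypothetical.

```python
import math

def sound_pressure_level(
    source_power_db,
    distance_m,
    directivity_index_db=0.0,
    absorption_db_per_m=0.0,
    excess_attenuation_db=0.0,
):
    """Sound pressure level (dB) at distance_m from a point source.

    Assumes free-field spherical spreading (the -20*log10(r) - 11 term),
    atmospheric absorption proportional to distance, and a single lumped
    excess attenuation term. This is an assumed form, not necessarily
    the exact model used by the authors.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return (
        source_power_db
        - 20.0 * math.log10(distance_m)
        - 11.0
        + directivity_index_db
        - absorption_db_per_m * distance_m
        - excess_attenuation_db
    )
```

Under these assumptions a 100dB source is heard at 69dB from 10 meters away, and every doubling of the distance lowers the level by about 6dB.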

Materialization of Models

• Data from models
  – Is computationally expensive to generate
  – May be voluminous, since we can cover any region

• We do late materialization
  – At query time
  – Only for the specific request
  – Cached and indexed for future requests
  – Incremental updates of views, if possible
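Late materialization can be sketched as a cache filled on demand: model values are computed only when a query asks for them and are reused afterwards. The `LateMaterializer` class and its key layout are hypothetical, and the coarse `invalidate` method stands in for the incremental view updates mentioned above.

```python
class LateMaterializer:
    """Compute model data at query time and cache it for later requests."""

    def __init__(self, compute_fn):
        # compute_fn(region, time_slot) is the expensive model evaluation.
        self._compute = compute_fn
        self._cache = {}

    def query(self, region, time_slot):
        """Return the modeled value, materializing it on first request only."""
        key = (region, time_slot)
        if key not in self._cache:
            self._cache[key] = self._compute(region, time_slot)
        return self._cache[key]

    def invalidate(self, region, time_slot):
        """Drop a cached value, e.g. after new measurements arrive."""
        self._cache.pop((region, time_slot), None)
```

Regions that nobody queries never cost any model computation or storage, which matters when the models can in principle cover any region.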

Performance Evaluation (1)

• 30 outdoor deployments
  – 2, 3, or 4 smartphones
  – Multiple noise sources
  – Urban setting, flat area of 50x50 meters

• Professional-grade noise level meter as gold standard measurement

• 85% of interpolated data within ±6dB error
• 63% of interpolated data within ±4dB error

Performance Evaluation (2)

• Sound propagation and source location
• 3 smartphones, 100dB source
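The slides do not explain how the source is located from the smartphone readings. One simple possibility, sketched below purely as an assumption, is a grid search over the test area: for each candidate position, every reading implies a source level under free-field spherical spreading, and the candidate where those implied levels agree best is chosen. The `locate_source` function, the grid step, and the propagation formula are all illustrative.

```python
import math

def implied_source_level(level_db, distance_m):
    """Source level implied by one reading, assuming spherical spreading."""
    return level_db + 20.0 * math.log10(distance_m) + 11.0

def locate_source(readings, size_m=50.0, step_m=1.0):
    """Grid search for the source position and level.

    readings: list of (x, y, level_db) from phones at known positions.
    Returns (x, y, estimated_source_level_db) for the grid point where
    the source levels implied by the readings agree most closely.
    """
    best = None
    steps = int(size_m / step_m)
    for i in range(steps + 1):
        for j in range(steps + 1):
            x, y = i * step_m, j * step_m
            dists = [math.hypot(x - px, y - py) for px, py, _ in readings]
            if min(dists) == 0.0:
                continue  # candidate coincides with a phone position
            levels = [
                implied_source_level(level_db, d)
                for (_, _, level_db), d in zip(readings, dists)
            ]
            spread = max(levels) - min(levels)
            if best is None or spread < best[0]:
                best = (spread, x, y, sum(levels) / len(levels))
    _, x, y, source_level_db = best
    return x, y, source_level_db
```

With noise-free synthetic readings this recovers the source exactly when it lies on the grid; real readings would only approximate it, and additional phones give the search more constraints.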

Performance Evaluation (3)

• Error on the source sound level
  – 16% with 3 measurements
  – 10% with 4 measurements
  – 9% with 5 measurements

• Source location
  – 3m error on average

NoizCrowd - Conclusions

• Large-scale data is key for decision making
• Crowd-source noise level data using mobiles
  – Scale-out using an array backend
  – Generate missing data and visualize

• Next steps
  – Android app
  – Data recording as background feature
  – Additional materialization strategies

http://exascale.info