Large-Scale Data Collection Using Redis
C. Aaron Cois, Ph.D. and Tim Palko
CMU Software Engineering Institute
© 2011 Carnegie Mellon University
About Us
C. Aaron Cois, Ph.D.
Software Architect, Team Lead
CMU Software Engineering Institute
Digital Intelligence and Investigations Directorate
Tim Palko
Senior Software Engineer
CMU Software Engineering Institute
Digital Intelligence and Investigations Directorate
@aaroncois
Overview
• Problem Statement
• Sensor Hardware & System Requirements
• System Overview
  – Data Collection
  – Data Modeling
  – Data Access
  – Event Monitoring and Notification
• Conclusions and Future Work
The Goal
Critical infrastructure/facility protection
via
Environmental Monitoring
Why?
Stuxnet
• Two major components:
  1) Send centrifuges spinning wildly out of control
  2) Record 'normal operations' and play them back to operators during the attack 1
• Environmental monitoring provides secondary indicators, such as abnormal heat/motion/sound
1 http://www.nytimes.com/2011/01/16/world/middleeast/16stuxnet.html?_r=2&
The Broader Vision
Quick, flexible out-of-band monitoring
• Set up monitoring in minutes
• Versatile sensors, easily repurposed
• Data communication is secure (P2P VPN) and requires no existing systems other than outbound networking
A CMU research project called Sensor Andrew
• Features:
  – Open-source sensor platform
  – Scalable and generalist system supporting a wide variety of applications
  – Extensible architecture
• Can integrate diverse sensor types
The Platform
[Diagram: Sensor Andrew; nodes report to Gateways, Gateways relay to a Server, the Server serves End Users]
Sensor Andrew Overview
Nodes
What is a Node?
A node collects data and sends it to a collector, or gateway

Environment Node Sensors
• Light
• Audio
• Humidity
• Pressure
• Motion
• Temperature
• Acceleration

Power Node Sensors
• Current
• Voltage
• True Power
• Energy

Radiation Node Sensors
• Alpha particle count per minute

Particulate Node Sensors
• Small Particle Count
• Large Particle Count
What is a Gateway?
• A gateway receives UDP data from all nodes registered to it
• An internal service:
  – Receives data continuously
  – Opens a server on a specified port
  – Continually transmits UDP data over this port
Requirements
We need to:

1. Collect data from nodes once per second
2. Scale to 100 gateways, each with 64 nodes
3. Detect events in real-time
4. Notify users about events in real-time
5. Retain all data collected for years, at least
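The scale targets above imply a sustained write load worth making explicit; a quick back-of-the-envelope sketch (assuming every node reports once per second):

```python
# Back-of-the-envelope write load implied by the requirements
# (assumption: every node reports one datapoint per second).
gateways = 100
nodes_per_gateway = 64
writes_per_second = gateways * nodes_per_gateway

seconds_per_year = 365 * 24 * 3600
datapoints_per_year = writes_per_second * seconds_per_year

print(writes_per_second)    # 6400 sustained writes/sec
print(datapoints_per_year)  # ~202 billion datapoints/year
```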
What Is Big Data?
"When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze and share it."
Problems

• Size
• Transmission Rate
• Storage
• Retrieval
Collecting Data

Problem: Data cannot remain on the nodes or gateways due to security concerns. Limited infrastructure.

Constraint: Store and retrieve immense amounts of data at a high rate.
[Diagram: Gateway streaming 8 GB/hour to an unknown store that must also support complex analytics]
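The 8 GB/hour ingest rate compounds quickly; a rough projection (decimal units, continuous collection assumed):

```python
# Rough storage growth implied by an 8 GB/hour ingest rate
# (decimal units, continuous collection assumed).
gb_per_hour = 8
gb_per_day = gb_per_hour * 24
tb_per_year = gb_per_day * 365 / 1000

print(gb_per_day)   # 192 GB/day
print(tb_per_year)  # ~70 TB/year
```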
We Tried PostgreSQL…
• Advantages:
  – Reliable, tested and scalable
  – Relational => complex queries => analytics
• Problems:
  – Performance problems reading while writing at a high rate; real-time event detection suffers
  – 'COPY FROM' doesn't permit horizontal scaling
Q: How can we decrease I/O load?

A: Read and write collected data directly from memory
Enter Redis
Redis is an in-memory NoSQL database, commonly used as a web application cache or pub/sub server
Redis
• Created in 2009
• Fully in-memory key-value store
  – Fast I/O: R/W operations are equally fast
  – Advanced data structures
• Publish/Subscribe functionality
  – In addition to data store functions
  – Separate from stored key-value data
Persistence
• Snapshotting
  – Data is asynchronously transferred from memory to disk
• AOF (Append Only File)
  – Each modifying operation is written to a file
  – Can recreate data store by replaying operations
  – Without interrupting service, will rebuild the AOF as the shortest sequence of commands needed to recreate the current dataset in memory
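Both persistence modes are enabled in redis.conf; a minimal sketch (the directive names are standard Redis configuration, the thresholds are illustrative):

```conf
# Snapshotting: dump the dataset to disk if at least
# 1000 keys changed within 60 seconds (illustrative thresholds)
save 60 1000

# Append Only File: log every write, fsync once per second
appendonly yes
appendfsync everysec
```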
Replication
• Redis supports master-slave replication
• Master-slave replication can be chained
• Be careful:
  – Slaves are writeable!
  – Potential for data inconsistency
• Fully compatible with Pub/Sub features
Redis Features: Advanced Data Structures

• List: [A, B, C, D]
• Set: {A, B, C, D}
• Sorted Set {value:score}: {C:1, D:2, A:3, B:4}
• Hash {key:value}: {field1:"A", field2:"B", …}
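The four structures map loosely onto familiar Python types; a rough in-process analogy using the slide's example values (illustration only, not real Redis calls):

```python
# Rough Python analogues of the four Redis data structures
# (illustration only; real Redis types live server-side).
redis_list = ["A", "B", "C", "D"]              # List: ordered, duplicates allowed
redis_set = {"A", "B", "C", "D"}               # Set: unordered, unique members
redis_zset = {"C": 1, "D": 2, "A": 3, "B": 4}  # Sorted Set: member -> score
redis_hash = {"field1": "A", "field2": "B"}    # Hash: field -> value

# A sorted set iterates its members by ascending score:
by_score = sorted(redis_zset, key=redis_zset.get)
print(by_score)  # ['C', 'D', 'A', 'B']
```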
Our Data Model
Constraints
Our data store must:
– Hold time-series data
– Be flexible in querying (by time, node, sensor)
– Allow efficient querying of many records
– Accept data out of order
Tradeoffs: Efficiency vs. Flexibility

One record per timestamp: motion, audio, light, pressure, humidity, acceleration, and temperature stored together

vs.

One record per sensor data type: motion, light, audio, pressure, temperature, humidity, and acceleration each stored separately
Our Solution: Sorted Set

Key: sensor:env:101
Score: 1357542004000
Value (datapoint): {"bat": 192, "temp": 523, "digital_temp": 216, "mac_address": "20f", "humidity": 22, "motion": 203, "pressure": 99007, "node_type": "env", "timestamp": 1357542004000, "audio_p2p": 460, "light": 820, "acc_z": 464, "acc_y": 351, "acc_x": 311}
Sorted Set

1357542004000: {"temp":523,..}
1357542005000: {"temp":523,..}
1357542006000: {"temp":527,..} <- fits nicely
1357542007000: {"temp":530,..}
1357542008000: {"temp":531,..}
1357542009000: {"temp":540,..}
1357542010000: {"temp":545,..}
…
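The "fits nicely" behavior can be sketched in plain Python by modeling the sorted set as a score-ordered list: a datapoint arriving late still lands in its correct chronological slot.

```python
import bisect

# Model a sorted set as a list of (score, member) kept in score order.
zset = [
    (1357542004000, '{"temp":523}'),
    (1357542005000, '{"temp":523}'),
    (1357542007000, '{"temp":530}'),
    (1357542008000, '{"temp":531}'),
]

# A datapoint arriving out of order (timestamp 1357542006000)
# is inserted at its correct position by score:
bisect.insort(zset, (1357542006000, '{"temp":527}'))

print([score for score, _ in zset])
# [1357542004000, 1357542005000, 1357542006000, 1357542007000, 1357542008000]
```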
Know your data structure! A set is still a set…

Score: 1357542004000
Value (datapoint): {"bat": 192, "temp": 523, "digital_temp": 216, "mac_address": "20f", "humidity": 22, "motion": 203, "pressure": 99007, "node_type": "env", "timestamp": 1357542004000, "audio_p2p": 460, "light": 820, "acc_z": 464, "acc_y": 351, "acc_x": 311}
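The caveat can be sketched with a plain dict standing in for ZADD semantics: members are unique, so re-adding an identical value only updates its score. Embedding the timestamp in the stored value keeps every member distinct.

```python
# Model ZADD as member -> score: re-adding a member updates its score
# rather than creating a new entry.
zset = {}

zset['{"temp": 523}'] = 1357542004000
zset['{"temp": 523}'] = 1357542005000   # same member: overwritten, no new entry
assert len(zset) == 1

# With the timestamp embedded in the value, each member is unique:
zset['{"temp": 523, "timestamp": 1357542004000}'] = 1357542004000
zset['{"temp": 523, "timestamp": 1357542005000}'] = 1357542005000
print(len(zset))  # 3
```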
Requirement Satisfied

[Diagram: Gateway → Redis]

There is a disturbance in the Force..

Collecting Data

[Diagram: Gateway → Redis]
“In Memory” Means Many Things
• The data store capacity is aggressively capped
  – Redis can only store as much data as the server has RAM
Collecting Big Data
[Diagram: Gateway → Redis]
We could throw away data…
• If we only cared about current values
• However, our data
  – Must be stored for 1+ years for compliance
  – Must be able to be queried for historical/trend analysis
We Still Need Long-term Data Storage
Solution? Migrate data to an archive with expansive storage capacity
Winning

[Diagram: Gateway → Redis → Archiver → PostgreSQL]
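The archiver's job can be sketched in Python: drain aged datapoints out of the in-memory sorted set and move them into long-term storage. The list arguments below are hypothetical stand-ins for a redis-py client and a PostgreSQL connection; the drain-and-delete loop is the point.

```python
def archive_old_datapoints(zset, archive, cutoff_ms):
    """Move every (score, member) pair older than cutoff_ms into archive."""
    old = [(s, m) for s, m in zset if s < cutoff_ms]       # like ZRANGEBYSCORE -inf cutoff
    archive.extend(old)                                     # like INSERT INTO postgres
    zset[:] = [(s, m) for s, m in zset if s >= cutoff_ms]   # like ZREMRANGEBYSCORE
    return len(old)

# Usage, with plain lists standing in for Redis and PostgreSQL:
redis_zset = [(1357542004000, "a"), (1357542005000, "b"), (1357542009000, "c")]
pg_rows = []
moved = archive_old_datapoints(redis_zset, pg_rows, cutoff_ms=1357542006000)
print(moved, len(redis_zset))  # 2 1
```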
Winning?

[Diagram: Gateway → Redis → Archiver → PostgreSQL, with Some Poor Client left unsure which store to query]
Yes, Winning

[Diagram: Gateway → Redis → Archiver → PostgreSQL, with an API in front of both stores serving Some Happy Client]

[Diagram: Gateway → Redis → Archiver → PostgreSQL, fronted by the API]
Best of both worlds
Redis allows quick access to real-time data, for monitoring and event detection
PostgreSQL allows complex queries and scalable storage for deep and historical analysis
We Have the Data, Now What?
Incoming data must be monitored and analyzed to detect significant events
What is “significant”?
What about new data types?
[Diagram: Gateway → Redis → Archiver → PostgreSQL; a Django App with its App DB sits behind the API]

New guy: provide a way to read the data and create rules

motion > x && pressure < y && audio > z
[Diagram: as above, adding Event Monitor services alongside the Django App, App DB, and API]

New guy: read the rules and data, trigger alarms

motion > x
pressure < y
audio > z

All true?
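The "all true?" check can be sketched as a rule evaluator: a rule set fires only when every condition holds for a datapoint. The thresholds below are illustrative, not values from the system.

```python
# Each rule is a predicate over one datapoint; thresholds are illustrative.
RULES = [
    lambda d: d["motion"] > 150,       # motion > x
    lambda d: d["pressure"] < 100000,  # pressure < y
    lambda d: d["audio_p2p"] > 400,    # audio > z
]

def rules_all_true(datapoint, rules=RULES):
    """Trigger an alarm only when every rule holds."""
    return all(rule(datapoint) for rule in rules)

datapoint = {"motion": 203, "pressure": 99007, "audio_p2p": 460}
print(rules_all_true(datapoint))  # True
```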
[Diagram: as above, with multiple Event Monitor instances]
Event monitor services can be scaled independently
Getting The Message Out
Considerations
• Event monitor already has a job, avoid re-tasking as a notification engine
• Notifications should be pushed to clients, rather than requiring them to poll
• Notification system should be generalized, e.g. SMTP, SMS
If only…
[Diagram: Gateway → Redis (Data + Pub/Sub); Pub/Sub feeds Notification Workers, which send via SMTP; Archiver → PostgreSQL; Django App, App DB, and API as before]

Pub/Sub with synchronized workers is an optimal solution for real-time event notifications.

No need to add another system: Redis offers pub/sub services as well!
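The worker pattern can be sketched in-process: a Queue stands in for a Redis pub/sub channel and a list for the SMTP send (a real deployment would use redis-py's pubsub() and smtplib).

```python
import queue
import threading

channel = queue.Queue()  # stand-in for a Redis pub/sub channel
sent = []                # stand-in for messages handed to SMTP

def notification_worker():
    """Block on the channel and turn each event into a notification."""
    while True:
        event = channel.get()   # blocks, like SUBSCRIBE
        if event is None:       # shutdown sentinel
            break
        sent.append(f"ALERT: {event}")

worker = threading.Thread(target=notification_worker)
worker.start()

channel.put("motion/pressure/audio rule fired on node 101")  # like PUBLISH
channel.put(None)
worker.join()
print(sent)
```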
Conclusions
• Redis is a powerful tool for collecting large amounts of data in real-time
• In addition to maintaining a rapid pace of data insertion, we were able to concurrently query, monitor, and detect events on our Redis data collection system
• Bonus: Redis also enabled a robust, scalable real-time notification system using pub/sub
Things to watch
• Data persistence
  – If Redis needs to restart, it takes 10-20 seconds per gigabyte to re-load all data into memory 1
  – Redis is unresponsive during startup
1 http://oldblog.antirez.com/post/redis-persistence-demystified.html
Future Work
• Improve scalability through:
  – Data encoding
  – Data compression
  – Parallel batch inserts for all nodes on a gateway
• Deep historical data analytics
Acknowledgements
• Project engineers Chris Taschner and Jeff Hamed @ CMU SEI
• Prof. Anthony Rowe & CMU ECE WiSE Lab: http://wise.ece.cmu.edu/
• Our organizations:
  CMU https://www.cmu.edu
  CERT http://www.cert.org
  SEI http://www.sei.cmu.edu
  CyLab https://www.cylab.cmu.edu
Thank You
Questions?
Slides of Live Redis Demo
A Closer Look at Redis Data
redis> keys *
1) "sensor:environment:f80"
2) "sensor:environment:f81"
3) "sensor:environment:f82"
4) "sensor:environment:f83"
5) "sensor:environment:f84"
6) "sensor:power:f85"
7) "sensor:power:f86"
8) "sensor:radiation:f87"
9) "sensor:particulate:f88"
A Closer Look at Redis Data
redis> keys sensor:power:*
1) "sensor:power:f85"
2) "sensor:power:f86"
A Closer Look at Redis Data
redis> zcount sensor:power:f85 -inf +inf
(integer) 3565958
(45.38s)
A Closer Look at Redis Data
redis> zcount sensor:power:f85 1359728113000 +inf
(integer) 47
A Closer Look at Redis Data

redis> zrange sensor:power:f85 -1000 -1
1) "{\"long_energy1\": 73692453, \"total_secs\": 6784, \"energy\": [49, 175, 62, 0, 0, 0], \"c2_center\": 485, \"socket_state\": 1, \"node_type\": \"power\", \"c_p2p_low2\": 437, \"socket_state1\": 0, \"mac_address\": \"103\", \"c_p2p_low\": 494, \"rms_current\": 6, \"true_power\": 1158, \"timestamp\": 1359728143000, \"v_p2p_low\": 170, \"c_p2p_high\": 511, \"rms_current1\": 113, \"freq\": 60, \"long_energy\": 4108081, \"v_center\": 530, \"c_p2p_high2\": 719, \"energy1\": [37, 117, 100, 4, 0, 0], \"v_p2p_high\": 883, \"c_center\": 509, \"rms_voltage\": 255, \"true_power1\": 23235}"
2)…
Redis Python API

import redis

pool = redis.ConnectionPool(host='127.0.0.1', port=6379, db=0)
r = redis.Redis(connection_pool=pool)

byindex = r.zrange("sensor:env:f85", -50, -1)
# ['{"acc_z":663,"bat":0,"gpio_state":1,"temp":663,"light":…

byscore = r.zrangebyscore("sensor:env:f85", 1361423071000, 1361423072000)
# ['{"acc_z":734,"bat":0,"gpio_state":1,"temp":734,"light":…

size = r.zcount("sensor:env:f85", "-inf", "+inf")  # 237327L
Recommended