Improving data center management operations using wireless sensor networks
The IEEE International Conference on Internet of Things November 2012, Besançon, France
Panagiotis Garefalakis and Kostas Magoutis
Institute of Computer Science (ICS)
Foundation for Research and Technology – Hellas (FORTH)
Heraklion, Greece
Motivation
• Challenges:
High complexity configuration
Hardware maintenance
Software changes
• Several systems proposed to reduce management complexity
AutoPilot (Microsoft), SmartFrog (HP), OpenView (HP), etc.
• Several problems remain unsolved, keeping the complexity and cost of running a data center (DC) high.
Goal
• Address three important problems:
Automatically determine the physical location of servers.
Notify administrators of any location changes.
Determine status of servers even if network is down.
• Our solution to these problems relies on:
Auto-configuring wireless sensor network.
Distributed monitoring and management system.
Wireless technology used
• Zigbee IEEE 802.15.4:
Up to 65536 Personal area networks with 16 channels each.
Specific roles for each device (coordinator, slave).
• Two types of messages:
Transparent mode (broadcast only, simple).
API communication mode (unicast, reliable, RSSI).
IEEE 802.15.4: low power, 250 kbit/s, range ~100 m
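The API mode mentioned above wraps each message in a framed packet. The sketch below shows how such a frame could be built and verified, assuming a Digi XBee-style layout (start delimiter 0x7E, two-byte big-endian length, frame data, one-byte checksum). The slides do not name the radio module, so this layout and the function names `build_api_frame` and `verify_api_frame` are assumptions made for illustration. Escaped API mode is not handled.

```python
START_DELIMITER = 0x7E  # assumed XBee-style start-of-frame byte


def build_api_frame(frame_data: bytes) -> bytes:
    """Wrap frame data as: delimiter, 16-bit length, data, checksum."""
    length = len(frame_data)
    if length > 0xFFFF:
        raise ValueError("frame data too long for a 16-bit length field")
    # Checksum: 0xFF minus the low byte of the sum of the frame data bytes.
    checksum = 0xFF - (sum(frame_data) & 0xFF)
    header = bytes([START_DELIMITER, length >> 8, length & 0xFF])
    return header + frame_data + bytes([checksum])


def verify_api_frame(frame: bytes) -> bool:
    """Check delimiter, declared length, and checksum of a complete frame."""
    if len(frame) < 4 or frame[0] != START_DELIMITER:
        return False
    length = (frame[1] << 8) | frame[2]
    if len(frame) != length + 4:
        return False
    # Frame data plus checksum must sum to 0xFF in the low byte.
    return (sum(frame[3:]) & 0xFF) == 0xFF
```

A receiver would typically discard any frame for which `verify_api_frame` returns `False` before reading fields such as the RSSI.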
Prototype Wireless Sensor
Nagios: an open-source distributed discovery, monitoring, and control system
Nagios: remote plugin execution
Nagios state machine
Host/service state, state type
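The distinction between a state and its state type can be sketched as a small state machine. This is a simplified model of the soft/hard behaviour Nagios documents, not the slide's actual diagram: a non-OK result stays SOFT until it has been seen `max_check_attempts` times in a row, then becomes HARD. The class name `ServiceState` and the default of three attempts are illustrative assumptions, and soft recoveries are not modelled.

```python
from dataclasses import dataclass

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


@dataclass
class ServiceState:
    """Simplified soft/hard state tracking for one service (illustrative)."""
    max_check_attempts: int = 3
    state: int = OK
    state_type: str = "HARD"
    attempt: int = 1

    def update(self, result: int) -> None:
        if result == OK:
            # Simplification: any OK result resets to a hard OK state.
            self.state, self.state_type, self.attempt = OK, "HARD", 1
            return
        if self.state == OK:
            # First non-OK result after an OK state starts the retry count.
            self.attempt = 1
        elif self.state_type == "SOFT":
            # Still retrying: count this failed check.
            self.attempt += 1
        self.state = result
        if self.attempt >= self.max_check_attempts:
            self.state_type = "HARD"
        else:
            self.state_type = "SOFT"
```

In this model only HARD states would trigger notifications, which keeps a single failed check from alerting an administrator.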
Challenge: Determining host status
A typical Data Center
System Architecture
Auto-configuration
Data Collection
Server integration
• Access to a variety of sensors:
– Temperature
– Airflow
– Power consumption
– Rack information
• Current technology: baseboard management controller (BMC)
• Intelligent Platform Management Interface (IPMI)
Server Localization: Trilateration
• RSSI values: -40dB (strong) … -90dB (weak).
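A minimal sketch of how RSSI readings could feed trilateration, under stated assumptions: RSSI is converted to distance with a log-distance path-loss model, and a 2D position is solved from three anchors at known coordinates. The slides do not give the authors' propagation model, so the reference RSSI of -40 at 1 m and the path-loss exponent of 2.0 are hypothetical parameters, as are the function names.

```python
import math


def rssi_to_distance(rssi, rssi_at_1m=-40.0, path_loss_exponent=2.0):
    """Log-distance path-loss model; both parameters are assumed values."""
    return 10 ** ((rssi_at_1m - rssi) / (10 * path_loss_exponent))


def trilaterate(anchors, distances):
    """Solve for (x, y) from three anchor positions and their distances.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Example: anchors at three corners, true position (1, 1).
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
distances = [math.hypot(1 - ax, 1 - ay) for ax, ay in anchors]
position = trilaterate(anchors, distances)
```

With noisy RSSI the three circles rarely meet at one point, so a practical deployment would likely average many readings or use a least-squares fit over more than three anchors.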
Event correlation
Evaluation
• Office environment
• Data Center environment
• Use of management interface
Office environment
• Server S movement over a 2 meter distance.
• We compare the means of the RSSI time series before and after the movement using an unpaired Student's t-test.
• The mean of the time series for the moved server shows a statistically significant shift.
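The comparison described above can be sketched with the pooled-variance form of the unpaired Student's t-test. The RSSI samples below are made-up illustrative values, not measurements from the evaluation, and the function name `unpaired_t` is hypothetical.

```python
import math
import statistics


def unpaired_t(a, b):
    """Student's unpaired t statistic (equal variances assumed).

    Returns the t statistic and the degrees of freedom. The two samples
    must not both be constant, or the pooled variance is zero.
    """
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    dof = na + nb - 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / dof
    std_err = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / std_err
    return t, dof


# Hypothetical RSSI samples before and after moving a server.
rssi_before = [-60, -61, -59, -60, -62, -58]
rssi_after = [-66, -67, -65, -68, -66, -64]
t_stat, dof = unpaired_t(rssi_before, rssi_after)
# |t_stat| is then compared with the critical value for `dof` degrees of
# freedom (2.228 for dof = 10 at the two-sided 5% level).
```

A management server could run this test on a sliding window of RSSI readings and raise a location-change warning when the statistic exceeds the critical value.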
Data Center
• Metallic enclosures and electromagnetic interference introduce noise.
• Management server continuously evaluates the RSSI of messages received from all coordinators.
Server movement accuracy
• Coordinator movement over a 1.5 meter distance.
• We compare the means of the RSSI time series before and after the movement using an unpaired Student's t-test.
• The mean of the time series for the moved coordinator shows a statistically significant shift.
• Known techniques can increase accuracy using LQI (signal filtering).
Data Center Topology. Groups of servers sharing a coordinator are shown in dashed boxes. Slave Zigbees are omitted.
Use of management interface
Server state is UNREACHABLE over the network, but the wireless sensor reports the server as UP (network partition)
Wireless sensor reports location change
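One way to read the network-partition case above is as a correlation of two independent views of the same server: the state observed over the wired network and the state reported through the wireless sensor. The sketch below encodes that reading. The function name, the state strings, and the diagnosis labels are hypothetical and are not taken from the authors' implementation.

```python
from typing import Optional


def correlate(network_state: str, sensor_state: Optional[str]) -> str:
    """Combine the wired-network view with the wireless-sensor view.

    network_state: "UP", "DOWN", or "UNREACHABLE" as seen over the network.
    sensor_state: "UP" or "DOWN" as reported by the sensor, or None when
    the sensor itself cannot be reached.
    """
    if network_state == "UP":
        return "server reachable"
    if sensor_state is None:
        # Neither channel can confirm the server's status.
        return "unknown (sensor disconnected)"
    if sensor_state == "UP":
        # The server is alive but the wired path to it is broken.
        return "network partition"
    return "server failure"
```

For example, `correlate("UNREACHABLE", "UP")` yields `"network partition"`, which tells an administrator to inspect the network rather than the server.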
Conclusions
• Extended Nagios to take advantage of an auto-configuring WSN.
Easy to deploy.
Low capital costs.
Helps administrators by:
o Collecting sensor data and monitoring status.
o Alerting them in case of location changes.
o Identifying types of failure.
o Providing sophisticated correlation of DC states.
• In line with trends in server management technology.
Security Considerations
• 128-bit symmetric key encryption (AES)
• Hardware support by Zigbee on top of IEEE 802.15.4
• Coordinator performs key management (trust center)
Nagios event correlation
Implementation - Extending Nagios
• WSN plugin for localization.
Status code: explanation and status message
OK: The plugin was able to check the service, and it appeared to be functioning properly. Status message: “Signal-Fine Distance + distance (m)”
Warning: The plugin was able to check the service, but it appeared to violate a warning threshold or not to be working properly. Status message: “Signal-Low Distance + distance (m)” or “Sensor Changed Position + distance (m)”
Critical: The plugin detected that the service was either not running or violating a critical threshold. Status message: “Sensor Disconnected!”
Unknown: Invalid command-line arguments were supplied to the plugin, or a low-level failure internal to the plugin (such as being unable to fork or to open a TCP socket) prevented it from performing the specified operation. Status message: “Unknown State!”
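The status codes above map onto the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below shows how a localization check could choose a code and message. It is not the authors' plugin: the function name `check_sensor`, its parameters, the -80 low-signal threshold, and the rendering of "+ distance (m)" as an appended number are all assumptions.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard plugin exit codes


def check_sensor(connected, rssi, distance_m, moved, low_signal=-80.0):
    """Return (exit_code, message) for one wireless sensor.

    connected: whether the sensor is currently reachable.
    rssi, distance_m: latest reading and estimated distance, or None.
    moved: whether a location change was detected.
    low_signal: hypothetical warning threshold for RSSI.
    """
    if not connected:
        return CRITICAL, "Sensor Disconnected!"
    if rssi is None or distance_m is None:
        return UNKNOWN, "Unknown State!"
    if moved:
        return WARNING, f"Sensor Changed Position {distance_m:.1f} (m)"
    if rssi < low_signal:
        return WARNING, f"Signal-Low Distance {distance_m:.1f} (m)"
    return OK, f"Signal-Fine Distance {distance_m:.1f} (m)"


# A real plugin would print the message on one line and exit with the
# code, for example: print(message); sys.exit(code)
code, message = check_sensor(True, -55.0, 2.0, moved=False)
```

Nagios reads the exit code to set the service state and shows the printed line as the status message.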