Analyzing the Energy Efficiency of a Database Server
Hanskamal Patel
SE 521
Article
• Analyzing the Energy Efficiency of a Database Server
– Dimitris Tsirogiannis (University of Toronto)
– Stavros Harizopoulos (HP Labs)
– Mehul A. Shah (HP Labs)
Introduction
• Database system performance is measured in tasks per second or queries per second.
• Similarly, energy efficiency is measured in completed tasks per unit of energy, for example queries per Joule.
• Approaches to improving performance are either hardware/platform oriented or workload-management oriented.
• This work explores ways to improve the energy efficiency of a single-machine database server.
Test Machine Configuration
Component                                        Min (W)   Max (W)
Two Intel Xeon E5430 quad-core 2.66 GHz CPUs     48        160
Four 4 GB FB-DIMMs (RAM)                         40        40
Three 300 GB Seagate Savvio 10K.3 2.5" (HDD)     14        24
Four 64 GB Intel X25-E 2.5" (SSD)                0.2       10
System board components                          54        54
Power Breakdown
• About half of the peak power is drawn by the idle system
– Two CPUs
– Fixed RAM power
– Board components
– Minimal SSD and HDD power
• The left side of the chart shows active power consumption
– The CPU is the dominant component
– SSD and HDD draw similar power
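The "about half" claim can be checked against the Test Machine Configuration table. The following is a minimal sketch that sums the Min and Max columns; the dictionary keys and the function name are illustrative, not taken from the article.

```python
# Min (idle) and max (peak) power per component, in watts, copied from the
# Test Machine Configuration table. Key names are illustrative.
COMPONENTS = {
    "cpu": (48.0, 160.0),
    "ram": (40.0, 40.0),
    "hdd": (14.0, 24.0),
    "ssd": (0.2, 10.0),
    "board": (54.0, 54.0),
}


def power_totals(components):
    """Return (idle_watts, peak_watts) summed over all components."""
    idle = sum(lo for lo, _ in components.values())
    peak = sum(hi for _, hi in components.values())
    return idle, peak


idle_w, peak_w = power_totals(COMPONENTS)
idle_fraction = idle_w / peak_w
print(f"idle {idle_w:.1f} W, peak {peak_w:.1f} W, idle/peak = {idle_fraction:.0%}")
# prints: idle 156.2 W, peak 288.0 W, idle/peak = 54%
```

With the table's values, the idle system accounts for roughly 54% of peak power, in line with the slide.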
CPU Usage vs. Power
What affects energy efficiency?
• EE = Work/Energy = Performance/Power
• Several options affect power use and potentially affect energy efficiency
– CPU cycles spent fetching data from disk
– Scans, record access, compression, sorting, and joining
• Energy efficiency can be improved, but possibly at the cost of performance
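The identity EE = Work/Energy = Performance/Power follows from dividing both work and energy by the run time. A small sketch with made-up numbers (the query counts, run times, and power draws below are hypothetical, not measurements from the article) shows the two forms agreeing:

```python
import math


def ee_from_totals(work, runtime_s, avg_power_w):
    """Energy efficiency as work per Joule: Work / Energy."""
    energy_j = avg_power_w * runtime_s
    return work / energy_j


def ee_from_rates(throughput_per_s, avg_power_w):
    """Energy efficiency as Performance / Power."""
    return throughput_per_s / avg_power_w


# Hypothetical configuration A: 1000 queries in 50 s at an average of 200 W.
ee_a = ee_from_totals(1000, 50, 200)
assert math.isclose(ee_a, ee_from_rates(1000 / 50, 200))

# Hypothetical configuration B: lower power (150 W) but slower (80 s).
ee_b = ee_from_totals(1000, 80, 150)

print(f"A: {ee_a:.4f} queries/J, B: {ee_b:.4f} queries/J")
```

In this invented example the lower-power configuration B is less efficient, because the longer run time outweighs the power saving. Whether that holds on real hardware depends on the measured values.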
Energy efficiency vs. Performance
• Experimented with five different overhead kernels
– A parallel, cache-conscious hash join, sorting, AlphaSort, and parallel merging
• A high-performance storage engine that supports column-oriented and row-oriented database scans
• PostgreSQL and the System-X DBMS
Performance vs. Energy
Assembling data-management architectures
• Scale-up
– Shared memory and shared disk
– Choose the balance of components and power down unneeded resources
• Scale-out
– Shared nothing
– Single-node configurations connected by a scalable network
– Choose energy-efficient components for one node and performance-optimized components for another
Power Profiles of Hardware Components
• RAM
– RAM accounts for 20% of the power consumption, and its draw stays constant throughout
– The only way to vary memory power usage is to physically remove modules from the board
Power Profiles of Hardware Components
• Disks
– Both HDD and SSD are in the configuration
– Disks support active and idle states that consume different amounts of power
– 15% in the active state
• Test Configuration
– RAID-0 configuration for both HDD and SSD
– Reading a 100 GB file at a block size of 128 KB
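A sequential read in fixed-size blocks, as in the test configuration above, could be driven by something like the following sketch. It is an assumption about the general shape of such a test, not the authors' benchmark; the function name is illustrative, and a real measurement would also need to bypass or flush the operating system's page cache and sample power externally.

```python
import time

BLOCK_SIZE = 128 * 1024  # 128 KB, the block size named in the test configuration


def sequential_read(path, block_size=BLOCK_SIZE):
    """Read a file front to back; return (bytes_read, blocks_read, seconds)."""
    total = 0
    blocks = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object signals end of file
                break
            total += len(chunk)
            blocks += 1
    return total, blocks, time.perf_counter() - start
```

Dividing `bytes_read` by `seconds` gives throughput, which can then be paired with a measured power draw to estimate bytes read per Joule.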
Power Consumption of Disks
Power Profiles of Hardware Components
• CPU
– The two CPUs are responsible for 85% of the power increase in the system while active
– Interested in understanding how CPU power is affected by database operations, and the efficacy of hardware and software power management
• Developed a set of micro-benchmarks that perform three classes of database operations: hashing, sorting, and scans.
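The 85% figure appears consistent with the Min and Max columns of the Test Machine Configuration table, if "power increase" is read as the difference between peak and idle power. A minimal sketch under that assumption (names are illustrative):

```python
# (min_watts, max_watts) per component, copied from the
# Test Machine Configuration table. Key names are illustrative.
POWER_W = {
    "cpu": (48.0, 160.0),
    "ram": (40.0, 40.0),
    "hdd": (14.0, 24.0),
    "ssd": (0.2, 10.0),
    "board": (54.0, 54.0),
}


def dynamic_share(power):
    """Each component's share of the total idle-to-peak power increase."""
    ranges = {name: hi - lo for name, (lo, hi) in power.items()}
    total = sum(ranges.values())
    return {name: r / total for name, r in ranges.items()}


shares = dynamic_share(POWER_W)
disk_share = shares["hdd"] + shares["ssd"]
print(f"CPU {shares['cpu']:.0%}, disks {disk_share:.0%}")
# prints: CPU 85%, disks 15%
```

RAM and the system board contribute nothing to the increase because their minimum and maximum draw are equal in the table.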
Micro-benchmarks
• Custom Join Kernel
– Hash join algorithm for computing the join of two relations in parallel
• Sort Kernel
– Two in-memory parallel sorting algorithms
• Scan Kernel
– Scan uncompressed rows in memory
– Scan compressed columns on disk
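To make the join kernel concrete, here is a minimal single-threaded hash join sketch with a build phase and a probe phase. It is a textbook illustration only: the authors' kernel is parallel and cache-conscious, which this version is not, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key=0, probe_key=0):
    """Equi-join two lists of tuples on the given column positions."""
    # Build phase: hash the (ideally smaller) relation on its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: look up each probe row and emit concatenated matches.
    result = []
    for row in probe_rows:
        for match in table.get(row[probe_key], ()):
            result.append(match + row)
    return result


departments = [(1, "storage"), (2, "query")]
employees = [(2, "x"), (3, "y"), (2, "z")]
print(hash_join(departments, employees))
# prints: [(2, 'query', 2, 'x'), (2, 'query', 2, 'z')]
```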
Analyzing Power Consumption
Memory bus utilization
Hashjoin Operator
Sort Operator
Scan Operator
Energy vs. Performance
• Parameters that have the greatest impact on energy
– Algorithm/plan selection
– Intra-operator parallelism
– Inter-query parallelism
Algorithm/Plan selection
• Access Methods
• Join Algorithms
• Complex Queries and Join Ordering
Intra-operator and Inter-query Parallelism
• Intra-operator parallelism
– Parallel hash join
– Parallel sorts
• Inter-query parallelism
– Executing multiple queries at the same time
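Intra-operator parallelism for sorting can be sketched as "sort chunks in parallel, then merge the sorted runs". The example below is an illustration under assumptions, not the article's sort kernel: it uses Python threads, which will not give a real speed-up for CPU-bound sorting in CPython because of the global interpreter lock, and the function name and `workers` parameter are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=4):
    """Sort by splitting into runs, sorting runs concurrently, then merging."""
    if not values:
        return []
    chunk = -(-len(values) // workers)  # ceiling division
    runs = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, runs))
    # k-way merge of the already sorted runs
    return list(heapq.merge(*sorted_runs))


print(parallel_sort([5, 3, 1, 4, 2], workers=2))
# prints: [1, 2, 3, 4, 5]
```

The degree of parallelism (`workers` here) is the kind of knob the slides identify as affecting both performance and energy use.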
Implications for Database Computing
• One size fits all
– Collection of nodes, where each node is optimized for a specific task
– CPUs with high parallelism, low frequency, small caches, and a simple design
– Solid-state drives
• Shared nothing, everything, or in-between
– Shared nothing and shared disk
• Controlling peak power
Conclusion
• CPU power usage by different operators can vary by up to 60%
• The best-performing system was the most energy efficient
• Future investigations:
– Improving resources across unutilized nodes to save power
– Alternative energy-efficient hardware for lower fixed power cost
Questions?