An introduction to database performance forecasting and capacity analysis for newcomers.
Forecasting Database Performance
Du Shenglin
June 6th 2011
What Does Capacity Do?
• The manager asks: “Can our database survive next year, or the new promotion program?”
• Performance Tuning != Capacity planning
• A DBA is not quite the same as a Capacity Analyst
What Does Capacity Do?
• How much headroom do we still have for further growth, and how many days can we hold out without adding or upgrading hardware?
• What is the cost or impact to the site of adding or changing application code?
• What kind of platform, database and OS should we use for newly introduced applications?
• How do we survive sudden performance deviations?
Agenda
• Resource
• Model and Theory
• Response Time Analysis
• Steps to do Capacity Analysis
• Case Study
Resource
• Site level: machines, licenses, databases, storage, manpower…
• System level: CPU, IO, memory, disk, network, kernel settings
• Database level: latches, enqueues, locks, physical IO, logical IO…
Modeling - Making the Complex Simple
• The world is much too complex for us to understand.
• Mathematical models
– Queuing theory
– Linear modeling
– Regression analysis
– Utilization
– Baseline
• A model is not perfect and is never 100% precise.
Linear Modeling
Linear Regression – Scalability
The Response Time Curve
Response Time=Service Time + Queue Time
Queuing Theory
• Requests remain in the queue until their turn to be serviced
• Commonly a FIFO or priority queue
• Queue length
• Wait times and wait events
• CPU queue and IO queue
Response Time Drill Down
• CPU: Queue, Usr+Sys
• Memory: Queue, Access
• Disk: Queue, Transfer
• Network: Queue, Transfer
Response Time = Service Time + Queue Time
Rt = St + Qt
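A minimal sketch of how Rt = St + Qt produces the response time curve. The slides do not state a queue model, so this assumes a single-server (M/M/1) queue, where Qt = St * U / (1 - U). The 10 ms service time is a made-up illustration value.

```python
def response_time(service_time: float, utilization: float) -> float:
    """Rt = St + Qt, assuming an M/M/1 queue: Qt = St * U / (1 - U)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    queue_time = service_time * utilization / (1.0 - utilization)
    return service_time + queue_time


st_ms = 10.0  # hypothetical service time in milliseconds
for u in (0.10, 0.50, 0.80, 0.90, 0.95):
    print(f"U={u:.0%}  Rt={response_time(st_ms, u):.1f} ms")
```

Under this assumption the queue time stays small at low utilization and grows sharply as utilization approaches 100%, which is the "knee" of the curve.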
Utilization and Headroom
Headroom is the available usable resource
- Total capacity minus peak utilization and margin
- Applies to CPU, RAM, network, disk and OS
- Can be very complex to determine; it depends
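The headroom definition above can be sketched as a one-line calculation. The function name and the numbers (percent of total CPU) are illustrative assumptions, not figures from the slides.

```python
def headroom(total_capacity: float, peak_utilization: float, margin: float) -> float:
    """Headroom = total capacity - peak utilization - safety margin.

    All three values must use the same unit (here: percent of CPU).
    A negative result means the resource is already over its safe limit.
    """
    return total_capacity - peak_utilization - margin


# Hypothetical host: 100% total CPU, 62% at peak, 20% kept as margin.
print(headroom(100.0, 62.0, 20.0))
```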
CPU Capacity Measurements
• CPU utilization is defined as busy time divided by elapsed time for each CPU
• CPU time = CPU Queue + CPU usr+sys
• Processes wait on a run queue, causing high load averages, then run on a CPU in user and system mode.
• More CPUs reduce queue wait. Faster CPUs reduce usr+sys time.
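CPU Capacity Measurements
• U = λ * St / M, where λ is the arrival rate, St the service time and M the number of CPUs
• CPU Utilization = Arrival rate (executions per second) * cpu_time_exec(us) / POWER(10,6) / number_of_CPU
• CPU Utilization = buffer_gets (per second) * buffer_gets_time_per_exec(us) / POWER(10,6) / number_of_CPU
• This form can be used in many cases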
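A small sketch of the utilization formula, dividing by the number of CPUs as the expanded forms on this slide do. The function name and the workload numbers are hypothetical.

```python
def cpu_utilization(arrival_rate_per_sec: float,
                    cpu_time_per_exec_us: float,
                    number_of_cpus: int) -> float:
    """U = arrival rate * service time / number of CPUs (result is 0..1)."""
    service_time_sec = cpu_time_per_exec_us / 1_000_000  # microseconds to seconds
    return arrival_rate_per_sec * service_time_sec / number_of_cpus


# Hypothetical workload: 2000 executions/s, 1500 us of CPU each, 8 CPUs.
u = cpu_utilization(2000.0, 1500.0, 8)
print(f"CPU utilization: {u:.1%}")
```

The same function covers the buffer-gets form: pass buffer gets per second as the arrival rate and CPU microseconds per buffer get as the service time.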
The Response Time Curve - Multiple CPU
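A hedged sketch of why the curve flattens with more CPUs. The slides do not name a model; this assumes an M/M/m queue and uses the Erlang C formula for the probability of queuing. Function names and the sample values are illustrative.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving request must queue (Erlang C, M/M/m)."""
    rho = offered_load / servers
    if not 0.0 <= rho < 1.0:
        raise ValueError("per-server utilization must be in [0, 1)")
    top = offered_load ** servers / factorial(servers) / (1.0 - rho)
    bottom = sum(offered_load ** k / factorial(k) for k in range(servers)) + top
    return top / bottom


def response_time_multi_cpu(service_time: float, utilization: float, cpus: int) -> float:
    """Rt = St + Qt, with Qt = P(queue) * St / (cpus * (1 - U))."""
    offered_load = utilization * cpus
    p_queue = erlang_c(cpus, offered_load)
    queue_time = p_queue * service_time / (cpus * (1.0 - utilization))
    return service_time + queue_time


# Same per-CPU utilization, different CPU counts (hypothetical St = 10 ms).
for m in (1, 2, 4, 8):
    print(m, round(response_time_multi_cpu(10.0, 0.8, m), 2))
```

At the same utilization, the queue time shrinks as CPUs are added, so the knee of the curve moves to the right.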
IO Response Time Profile
• RAM: 60 ns
• HDD: 5-10 ms
• SSD: 100-500 us
• IO service time includes 3 components:
- Access time: the time it takes to move the heads to the desired track.
- Rotation time: the time it takes to locate the desired sector on the track.
- Transfer time: the time it takes to read/write the data.
• Access time constitutes 70% of the service time
IO Average Wait Time
• db file sequential read: less than 15 ms
• log file sync: less than 4 ms
[Figure: db file sequential read average wait time (ms); x-axis from 2006-11-27 01 to 2006-12-17 13, y-axis 0 to 50 ms]
Trend Capacity Measurements
• SQL executions increase -> traffic growth
• Buffer gets per execution fluctuate -> normal buffer contention
• Buffer gets per execution increase gradually -> SQL efficiency change
• CPU increases only -> system overhead increase, latch spinning, etc.
• Find the common patterns across daily, weekly and yearly executions
[Figures: Daily Executions; WoW (week-over-week) Executions; YOY (year-over-year) Executions; axis ticks 0.00 to 7.00 and 2004 to 2007]
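The growth rules from the Trend Capacity Measurements slide can be sketched as a simple classifier. This is an illustration only: the function name, the relative-growth inputs and the 10% threshold are assumptions, and the "fluctuates" rule is left out because it needs a time series rather than a single growth figure.

```python
def classify_trend(exec_growth: float,
                   gets_per_exec_growth: float,
                   cpu_growth: float,
                   threshold: float = 0.10) -> list:
    """Map relative growth figures (0.25 means +25%) to the slide's causes."""
    findings = []
    if exec_growth > threshold:
        findings.append("traffic growth")
    if gets_per_exec_growth > threshold:
        findings.append("SQL efficiency change")
    # CPU rising while executions and buffer gets per execution stay flat.
    if cpu_growth > threshold and not findings:
        findings.append("system overhead increase")
    return findings or ["no significant change"]


print(classify_trend(0.25, 0.02, 0.20))
print(classify_trend(0.01, 0.00, 0.30))
```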
Data Collection
• What kind of data to collect
• When to collect the data
• Where to put the data
• How often to collect the data
• How long to keep the data
• How to interpret and present the data
A picture is worth a thousand words.
Scripting and automation are necessary.
Capacity Monitoring at the Database Level
• Peak executions
• Sessions
• Shared pool usage
• LIO/exec
• PIO/exec
• CPU_time/LIO
• Redo size
• Free memory
• Commits
• Disk space usage
• …
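A sketch of how the per-execution ratios in this list could be derived from two snapshots of cumulative counters. The dictionary keys and the sample values are hypothetical; they are not taken from a specific database view.

```python
def per_exec_metrics(prev: dict, curr: dict) -> dict:
    """Derive LIO/exec, PIO/exec and CPU_time/LIO from two counter snapshots."""
    execs = curr["executions"] - prev["executions"]
    lio = curr["buffer_gets"] - prev["buffer_gets"]
    pio = curr["disk_reads"] - prev["disk_reads"]
    cpu_us = curr["cpu_time_us"] - prev["cpu_time_us"]
    return {
        "lio_per_exec": lio / execs if execs else 0.0,
        "pio_per_exec": pio / execs if execs else 0.0,
        "cpu_us_per_lio": cpu_us / lio if lio else 0.0,
    }


# Made-up cumulative counters from two consecutive snapshots.
before = {"executions": 100, "buffer_gets": 1_000_000, "disk_reads": 200_000, "cpu_time_us": 5_000_000}
after = {"executions": 150, "buffer_gets": 1_510_000, "disk_reads": 316_000, "cpu_time_us": 7_550_000}
print(per_exec_metrics(before, after))
```

Working on deltas rather than raw counters keeps the ratios tied to the interval being analyzed.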
Risk Mitigation Strategies
A Capacity Analyst is not only a DBA
• Tuning/fixing the issue - DBA or SA’s task?
• Balancing existing workload
• Upgrade and buy more CPU capacity
• Splitting and sharding
Steps to take for Capacity Analysis
• 1. Determine the question
• 2. Gather workload data
- What, how and how often
• 3. Characterize the workload data
- Map and interpret the data
• 4. Develop and use an appropriate model
- Present your data, graph
• 5. Validate the forecast
• 6. Forecast
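As an illustration of the model and forecast steps, here is a minimal linear-regression sketch that estimates how many days remain before peak utilization reaches a limit. The least-squares fit is standard; the function names, the sample data and the 80% limit are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until(limit, slope, intercept, today):
    """Days from `today` until the fitted line reaches `limit` (None if flat or falling)."""
    if slope <= 0:
        return None
    return (limit - intercept) / slope - today


# Hypothetical daily peak CPU utilization (%), one sample per day.
days = [0, 1, 2, 3, 4]
peak_cpu = [50.0, 51.0, 52.0, 53.0, 54.0]
slope, intercept = fit_line(days, peak_cpu)
print(days_until(80.0, slope, intercept, today=days[-1]))
```

A linear fit is only reasonable while growth is roughly linear; the forecast should be validated against later samples, as step 5 suggests.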
Case Study #1 – Delete Performance
Questions:
1. How much data can we delete every day?
2. Will the delete catch up during off-peak time?
3. How many threads can we use for the delete?
4. What is the main cost of the delete job?
Case Study #1 – Delete Performance
Delete performance is determined by IO response time and PIO_Per_row.
SNAP_TIME         EXEC_PER_SEC  LIO Per Exec  PIO Per Exec  Rows Per Exec
----------------  ------------  ------------  ------------  -------------
2011/02/10 15:49           .04      10194.44       2323.02           1000
2011/02/10 16:04           .05      10200.82          2322           1000
2011/02/10 16:19           .06      10198.03        1967.9          999.9
2011/02/10 16:34           .06      10201.81       1985.98           1000
2011/02/10 16:49           .06      10194.11       2088.38          999.9
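With an IO response time of 6 ms and 2323/1000 PIO per row:
rows per second = 1 / (6 ms * 2323/1000) = 1000000 / 6 / 2323 ≈ 71 rows per second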
The real case:
deletion started at: 2011-02-10 15:38:45
rows to delete: 1232177
rows deleted: 1232171
deletion ended at: 2011-02-10 20:47:51
SELECT 1232171/((TO_DATE('2011-02-10 20:47:51','YYYY-MM-DD HH24:MI:SS')-TO_DATE('2011-02-10 15:38:45','YYYY-MM-DD HH24:MI:SS'))*24*60*60) AS rows_per_sec FROM dual;
66.4386391
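The forecast and the measured rate can be compared with a short script. This is a sketch: it uses the figures from the slides (2323 PIO per 1000 rows, the start and end timestamps, rows deleted) and reads the "6m" in the forecast as a 6 ms IO response time, which is an assumption. The function names are illustrative.

```python
from datetime import datetime


def forecast_rows_per_sec(io_response_ms: float, pio_per_exec: float, rows_per_exec: float) -> float:
    """Rows/s for one thread = 1 / (IO response time * PIO per row)."""
    pio_per_row = pio_per_exec / rows_per_exec
    return 1000.0 / (io_response_ms * pio_per_row)


def actual_rows_per_sec(rows_deleted: int, started: str, ended: str) -> float:
    """Measured rows/s between two 'YYYY-MM-DD HH:MM:SS' timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S"
    elapsed = (datetime.strptime(ended, fmt) - datetime.strptime(started, fmt)).total_seconds()
    return rows_deleted / elapsed


forecast = forecast_rows_per_sec(6.0, 2323.0, 1000.0)  # 6 ms is an assumed IO response time
actual = actual_rows_per_sec(1232171, "2011-02-10 15:38:45", "2011-02-10 20:47:51")
print(f"forecast {forecast:.1f} rows/s, actual {actual:.1f} rows/s")
```

The measured rate lands within roughly 10% of the forecast, which is the kind of check the "validate the forecast" step calls for.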
Case Study #2 – Using MySQL
What capacity analysis should we do to evaluate MySQL?
1) MySQL version
2) Machine
3) OS
4) Local disk/SSD
5) Kernel configuration
6) MySQL parameters
mysql\MySQL InnoDB setup best practice.doc
Answers to What Capacity Needs to Do
• Measure the capacity of the site correctly and accurately.
• Predict the growth of the site and identify future performance problems.
• Define what balance means and find a strategy to maintain a dynamic balance.
• Analyze the impact of system-level changes.
• Identify dangerous performance deviations.