1 Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Who Ate My DB Time? Advanced Techniques For Database Performance Analysis
Prabhaker Gongloor (GP) & Kurt Engeleiter
Product Management, Database Manageability, Oracle Corporation
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
• What is DB Time?
• Performance Tuning with DB Time
  – Online Analysis
    • Current Features
    • New Features (Enterprise Manager 12)
  – Offline Analysis
• Customer Experiences: Boeing Inc.
DB Time is Pervasive, Familiar, Evolutionary
• DB Time is the fundamental measure of Oracle performance
  – Arrived at progressively but designed-in from 10g
• You are likely using DB Time methods already
  – AWR and ASH reports
  – Enterprise Manager
• DB Time is performance analysis science matured
  – Wait event analysis, YAPP and tracing are all its ancestors
Why Do We Care About Time?
• Human time is critical to the enterprise
• Systems performance affects business goals– Human time + technology resource time
• “Time is money”
• Performance improvement means doing things faster
Performance is always and only about time
DB Time Definition
• Total time in database calls by foreground sessions
• Includes
  – CPU time
  – IO time
  – non-idle wait time
• DB Time is not called response time on purpose
• Common currency for Oracle performance analysis
[Figure labels: DB Time = CPU, I/O, Wait]
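The definition above can be sketched as a simple sum. This is only an illustration: the `SessionTime` record, its field names, and all the numbers are hypothetical and are not Oracle structures. It shows that only foreground sessions contribute and that idle wait time is left out.

```python
from dataclasses import dataclass


@dataclass
class SessionTime:
    """Hypothetical per-session timings (seconds) for one interval."""
    foreground: bool
    cpu: float
    io: float
    other_non_idle_wait: float
    idle_wait: float  # e.g. waiting on the client; never counted as DB Time


def db_time(sessions):
    """Sum CPU, I/O and other non-idle wait time over foreground sessions."""
    return sum(
        s.cpu + s.io + s.other_non_idle_wait
        for s in sessions
        if s.foreground
    )


sessions = [
    SessionTime(True, cpu=30.0, io=12.0, other_non_idle_wait=3.0, idle_wait=500.0),
    SessionTime(True, cpu=5.0, io=0.0, other_non_idle_wait=10.0, idle_wait=40.0),
    # Background session: excluded from DB Time in this sketch.
    SessionTime(False, cpu=20.0, io=8.0, other_non_idle_wait=0.0, idle_wait=0.0),
]

print(db_time(sessions))  # 60.0
```

The large idle waits (500 s and 40 s) do not affect the result, which is the sense in which DB Time differs from elapsed or response time.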
DB Time: Quiz
Question: On a four CPU system, for one hour elapsed time, what is the maximum amount of CPU time that can be accumulated?
Answer: Four hours of CPU time maximum.
DB Time: Quiz
Question: On a four CPU system, for one hour elapsed time, what is the maximum amount of DB Time that can be accumulated?
Hint: How many sessions can wait in a queue?
Answer: Number of ‘sessions’ * elapsed time
The maximum number of sessions that are on CPU or in a non-idle wait is the init.ora parameter ‘sessions’. If ‘sessions’ is 300 then:
300 * 1 hour = 300 hours of DB Time
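The two quiz answers reduce to two multiplications. A minimal sketch of that arithmetic follows; the function names are made up for illustration.

```python
def max_cpu_hours(cpu_count, elapsed_hours):
    """CPU time is bounded by the number of CPUs."""
    return cpu_count * elapsed_hours


def max_db_time_hours(sessions_param, elapsed_hours):
    """DB Time is bounded by how many sessions can be on CPU or waiting."""
    return sessions_param * elapsed_hours


print(max_cpu_hours(4, 1))        # 4
print(max_db_time_hours(300, 1))  # 300
```

The gap between the two bounds (4 hours versus 300 hours) is time that sessions can spend queued in non-idle waits rather than running on a CPU.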
Agenda
• What is DB Time?
• Performance Tuning with DB Time
  – Online Analysis
    • Current Features
    • New Features (Enterprise Manager 12)
  – Offline Analysis
• Customer Experiences
DB Time on the Performance Page
• Area under curve is equal to DB Time
• DB Time displayed as CPU, I/O or non-idle wait class
• Wait class breakdown indicates solution space
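To make the "area under curve" point concrete, here is a rough sketch. It assumes the chart's vertical axis is the number of active sessions and that we have a few sampled heights. The sample times and values are invented, and the trapezoid rule is just one simple way to approximate the area.

```python
def area_under_curve(times_s, active_sessions):
    """Trapezoidal area under an active-sessions curve, in session-seconds."""
    if len(times_s) != len(active_sessions):
        raise ValueError("need one chart height per timestamp")
    area = 0.0
    for i in range(1, len(times_s)):
        width = times_s[i] - times_s[i - 1]
        area += width * (active_sessions[i] + active_sessions[i - 1]) / 2
    return area


times_s = [0, 60, 120, 180]    # hypothetical sample times (seconds)
active = [2.0, 4.0, 4.0, 2.0]  # hypothetical chart heights (active sessions)

db_time_s = area_under_curve(times_s, active)
print(db_time_s)  # 600.0 seconds of DB Time over a 180-second window
```

Dividing the result by the elapsed time (600 / 180) gives the average number of active sessions for the window, which is why the two views carry the same information.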
DB Time on the Top Activity Page
• Plots DB Time from ASH data
• Shows top DB Time consumers by SQL and Session
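A sketch of how sampled data can rank DB Time consumers. It assumes each ASH row represents roughly one second of activity for one active session; the rows, the `sqlA`/`sqlB` identifiers, and the tuple layout are invented for illustration and are not the real ASH schema.

```python
from collections import Counter

SAMPLE_INTERVAL_S = 1  # assumption: one sample per active session per second

# Hypothetical sample rows: (session_id, sql_id, activity)
samples = [
    (101, "sqlA", "CPU"),
    (101, "sqlA", "User I/O"),
    (102, "sqlB", "CPU"),
    (101, "sqlA", "User I/O"),
    (103, "sqlA", "Concurrency"),
    (102, "sqlB", "CPU"),
]


def top_consumers(rows, key_index):
    """Estimate DB Time (seconds) per key by counting samples."""
    counts = Counter(row[key_index] for row in rows)
    return [(key, n * SAMPLE_INTERVAL_S) for key, n in counts.most_common()]


print(top_consumers(samples, 1))  # [('sqlA', 4), ('sqlB', 2)]
print(top_consumers(samples, 0))  # [(101, 3), (102, 2), (103, 1)]
```

The same counting works for any other column in the samples, such as the activity class.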
DB Time in AWR Report
• If you tune by Wait Events you are tuning by DB Time
• Top 5 Timed Events in the AWR report shows CPU time and non-idle waits for foregrounds
Automatic Database Diagnostic Monitor
• Boost Administrator Productivity
• Eliminate Performance Bottlenecks
• Lower IT Management Costs
• Deliver Higher Quality of Service
[Diagram labels: Automatic Workload Repository Snapshots; Self-Diagnostic Engine; Performance & Management Advisors; High Load SQL; RAC Issues; IO / CPU Issues]
• DB Time data collected in Automatic Workload Repository snapshots
• Immediate DB Time analysis done by Automatic Database Diagnostic Monitor
• Resolve performance issues faster with drill-down root-cause analysis
• Classification tree based on Oracle performance tuning expertise
• Performance expert; now a RAC specialist too in Oracle Database 11g
Agenda
• What is DB Time?
• Performance Tuning with DB Time
  – Online Analysis
    • Current Features
    • New Features (Enterprise Manager 12)
  – Offline Analysis
• Customer Experiences
Compare Period ADDM
Why Compare Period ADDM
• Existing ‘Compare Period Report’ compares across two AWR time ranges
• Displays a large number of metrics
• Organized by topic – interpreting each section requires a distinct and significant base of knowledge (a lot to learn)
• Challenging to separate the insignificant from the critical (no common denominator)
• Focused on the WHAT and the HOW, omits the WHY
Compare Period ADDM
[Diagram labels: AWR Snapshot Period 1; AWR Snapshot Period 2; Compare Period ADDM; Analysis Report; SQL Commonality; Regressed SQL; I/O Bound; Undersized SGA]
• Analyzes performance across two AWR time ranges
• Executes full ADDM analysis and makes actionable recommendations along with impact
• Identifies workload changes, configuration changes, regressed SQL, resource limits
• Correlates changes to root causes – encapsulates years of expert Oracle experience
Compare Period ADDM: Method
Identify what changed
• Configuration changes, workload changes
  – 30% smaller Buffer cache
  – 10% new SQL
Quantify performance differences
• Uses DB Time as basis for measuring performance
  – Top SQL increased 45%
  – Read I/O up 55%
Identify root cause
• Correlate performance differences with changes
  – Buffer cache reduction caused read I/O increase
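The three steps of the method can be sketched with the slide's own example figures. Everything else here is an assumption made for illustration: the dictionary layout, the absolute numbers chosen to yield the 45%, 55% and 30% changes, and the single hand-written correlation rule. It is not how Compare Period ADDM is implemented.

```python
def pct_change(before, after):
    """Percentage change from before to after, rounded to one decimal."""
    return round((after - before) / before * 100.0, 1)


# Hypothetical DB Time by component (seconds) and configuration per period.
period1 = {
    "db_time": {"top_sql": 400.0, "read_io": 200.0},
    "config": {"buffer_cache_mb": 1000},
}
period2 = {
    "db_time": {"top_sql": 580.0, "read_io": 310.0},
    "config": {"buffer_cache_mb": 700},
}


def diff(section):
    """Percentage change of every entry in one section between the periods."""
    return {
        key: pct_change(period1[section][key], period2[section][key])
        for key in period1[section]
    }


# Step 1: identify what changed.
config_changes = diff("config")
# Step 2: quantify performance differences, using DB Time as the basis.
db_time_changes = diff("db_time")
# Step 3: correlate differences with changes (one illustrative rule only).
findings = []
if config_changes["buffer_cache_mb"] < 0 and db_time_changes["read_io"] > 0:
    findings.append("Buffer cache reduction may explain the read I/O increase")

print(config_changes)   # {'buffer_cache_mb': -30.0}
print(db_time_changes)  # {'top_sql': 45.0, 'read_io': 55.0}
print(findings)
```

Because both periods are expressed in DB Time, the changes are directly comparable across components.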
Correlation Rules Examples
Symptom and possible causes:
• Paging
  – Physical memory size
  – SGA/PGA Target
• Hard/Soft Parse Time
  – Cursor sharing
  – Shared Pool size
  – Session cached cursors
• SQL Regressions
  – Optimizer parameters
  – SQL parameters
• Temp I/O
  – PGA Aggregate Target
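One way to picture these correlation rules is as a lookup from symptom to candidate causes, filtered by what actually changed between the two periods. The labels are the slide's own, as read from its symptom/cause table; the `RULES` dictionary, the function, and the example set of changes are assumptions for illustration.

```python
# Symptom -> causes, using the slide's labels (not real parameter names).
RULES = {
    "Paging": ["Physical memory size", "SGA/PGA Target"],
    "Hard/Soft Parse Time": [
        "Cursor sharing",
        "Shared Pool size",
        "Session cached cursors",
    ],
    "SQL Regressions": ["Optimizer parameters", "SQL parameters"],
    "Temp I/O": ["PGA Aggregate Target"],
}


def candidate_causes(symptom, changed):
    """Return the rule's causes that are among the observed changes."""
    return [cause for cause in RULES.get(symptom, []) if cause in changed]


# Hypothetical set of settings that differ between the two periods.
changed = {"Shared Pool size", "PGA Aggregate Target"}

print(candidate_causes("Hard/Soft Parse Time", changed))  # ['Shared Pool size']
print(candidate_causes("Temp I/O", changed))              # ['PGA Aggregate Target']
print(candidate_causes("Paging", changed))                # []
```

A symptom with no matching change returns an empty list, meaning the rules offer no configuration explanation for it.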
Compare Period ADDM: Report
• SQL commonality offers a sanity test – are the workloads comparable?
• Not impacted by SQLs with low DB time
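The slides do not give the formula for SQL commonality. The sketch below assumes one plausible definition, the share of total DB Time spent in SQL that appears in both periods, purely to show why a DB Time weighted measure is barely affected by low-cost statements. The SQL identifiers and timings are invented.

```python
def sql_commonality(period1, period2):
    """Percent of combined DB Time spent in SQL present in both periods.

    Assumed definition for illustration; each argument maps sql_id to
    DB Time in seconds.
    """
    total = sum(period1.values()) + sum(period2.values())
    if total == 0:
        return 0.0
    common = period1.keys() & period2.keys()
    shared = sum(period1[s] + period2[s] for s in common)
    return round(shared / total * 100.0, 1)


p1 = {"sqlA": 500.0, "sqlB": 300.0, "tiny1": 1.0}
p2 = {"sqlA": 550.0, "sqlB": 250.0, "tiny2": 1.0}

print(sql_commonality(p1, p2))  # 99.9
```

Counting statements instead would report only two of four distinct SQL in common, even though the two workloads are almost identical in where they spend DB Time.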
Compare Period ADDM Examples
Real-Time ADDM
Real-Time ADDM - Motivation
• Extremely slow or hung databases
  – Cannot connect to or diagnose unresponsive databases
  – Current remedy: bounce the database
    • All diagnostic information is lost
    • Problem can occur again since root causes not identified
  – Need quick, lightweight analysis mechanism
    • No I/O or global resources such as enqueues or latches
• Real-time analysis
  – System under performance stress
  – AWR snapshot may exacerbate problem
  – Normal ADDM time interval may be too long to diagnose short spikes
Real-Time ADDM – Architecture
[Diagram labels: Enterprise Manager; EM Agent; JDBC Connection (real-time analysis); Diagnostic Connection (unresponsive DB); Database; ADDM Analysis: Hangs, Latches, Deadlocks]
• Uses a pre-established diagnostic connection for unresponsive systems
• Initiates a standard JDBC connection for real-time analysis
• Diagnostic connection collects data without holding latches or running SQL
• First intelligent advisor to diagnose problems in real-time as they occur, no matter how sick the system is
Top Issues Identified by Real-Time ADDM
Resource Constraints
• CPU Bound Hosts
• Memory Paging
• I/O Bound
• Interconnect Bound
Memory Issues
• Library Cache Issues
• Memory Allocation Issues
• Excessive PGA growth
Hangs
• Instance Shutting Down
• Top Blocker Analysis
• Memory Pool Flushing
• ASM Unresponsive
Resource Limits Reached
• Sessions
• Processes
Deadlocks
Real-Time ADDM – Data Collected
JDBC Collection
• Database and Instance Metadata
• Instance Metrics
• ASH Data
• Top SQL from ASH
• Top Sessions from ASH
• Database Initialization Parameters
• Host Hardware Information
• Instance and Host Memory Usage
• Host Metrics
Diagnostic Collection
• Hang analysis data
• Session activity by wait class
• I/O metrics
• Host metrics
Real-Time ADDM
• Real-Time ADDM screen shows database activity over the last hour
• Analysis done for most recent five minutes
• Click ‘Start’ button to initiate analysis
Real-Time ADDM
• EM communicates with RDBMS in a tight loop
• Analysis runs in real-time and updates as each UI refresh brings more data
• Progress tab shows present and upcoming collections, gives an insight into the analysis yet to be performed
• Completes quickly, but no need to wait for the final results – finding set is cumulative
• Data supporting the analysis is presented alongside the findings
Real-Time ADDM Examples
ASH Analytics
[Screenshot callouts: Flexible Time Picker; Flexible Activity Chart; Flexible Top Chart]
Agenda
• What is DB Time?
• Performance Tuning with DB Time
  – Online Analysis
    • Current Features
    • New Features (Enterprise Manager 12)
  – Offline Analysis
• Customer Experiences
Offline Analysis – Real Application Testing
• Can be used to reproduce, analyze and resolve production problems in test
• Provides comprehensive offline testing and DB Time analysis
• Allows you to thoroughly investigate a large set of performance options before deploying to production
• Enables root cause analysis of production problems in test
Real Application Testing – Two Components
End-to-end testing with real workloads
[Diagram labels: Capture Workload; Create Test System; Deploy Replay Clients; Replay Workload]
• SQL Performance Analyzer
  – SQL unit testing for response time
  – Identify and tune regressed SQL
  – Integrated into SQL Tuning Advisor and SQL Plan Baseline
• Database Replay
  – Load, performance testing for throughput
  – Remediate application concurrency problems
  – Identify operations using the most DB Time
Real Application Testing – Database Replay
• Database load and performance testing with real production workloads
  – Production workload characteristics such as timing, transaction dependency, think time, etc., fully maintained
• Test and measure transaction throughput improvements
• Identify application scalability and concurrency problems
• Remediate issues pre-production for risk-free migration
[Diagram labels: Production; Clients; Capture; Process; Replay Driver; Test; Replay; Analysis & Reporting; Storage]
Offline Analysis – Database Replay
• The most common use of Database Replay has been as a testing tool for assessing impact of system changes
• It can also be used for offline troubleshooting in situations where AWR/ASH data alone may not suffice
• It enables faster root cause analysis through comprehensive testing and DB Time analysis
Offline Analysis – Database Replay Offline Examples
• Reproducing hang situations in test - e.g., when a batch job or certain pattern of events is known to cause a hang
• Reproducing significant problems observed in production on test - e.g., RAC node eviction, cache brown out, fail-over, ORA-600, ORA-7445 issues
• Performing destructive testing with real workload that may not be possible in production - e.g., hot swapping disks, memory boards, etc.
• Enabling tracing with production workload in test, where it may not be practical in production because of the high performance overhead of tracing. Tracing can sometimes be necessary to provide support diagnostics - e.g., setting diagnostic events
Offline Analysis – ORA-600 Example
• An ORA-600 was occurring during a production batch process
• Rather than troubleshoot on the production system, Database Replay was used to capture and replay the batch process in test
• The ORA-600 reproduced in test
Offline Analysis – ORA-600 Example
• EM Support Workbench Guided Workflow used for the ORA-600 alert
• SQL Repair Advisor was run using the Guided Resolution Workflow
• SQL Patch was applied as recommended by SQL Repair Advisor
Offline Analysis – ORA-600 Example
• Batch job replayed with SQL Patch implemented
• ORA-600 did not reproduce – patch verified for production
Agenda
• What is DB Time?
• Performance Tuning with DB Time
  – Online Analysis
    • Current Features
    • New Features (Enterprise Manager 12)
  – Offline Analysis
• Customer Experiences: Boeing Inc.
Database Replay
A Powerful Oracle Performance and Load Test Tool
First-hand experience by Boeing
Andre Scharkowski
Engineering, Operations & Technology | Information Technology
Architecture & Information Management
Introduction
• About myself
  – Oracle Database Administration
  – Performance & Scalability Testing
• About the Boeing Company
  – 200+ DBAs
  – Oracle, SQL Server, DB2, Teradata
• About my work group - DCAC/MRM
  – 16 Oracle DBAs
  – Applications for the design, configuration & manufacturing of airplanes
  – 30 production and about 400 non-production databases
Copyright © 2008 Boeing. All rights reserved.
Introduction (cont.)
• Database sizes & load
  – From 100s MB to 3TB
  – 20K sessions, 15K active
• Architecture and equipment
  – Oracle 11.2.0.2 on AIX 6.1, IBM pSeries (Power 6)
  – RAC and HACMP / BCV
• Oracle features in use
  – RAC, DataGuard, EBR (Edition-Based Redefinition)
  – DB-links, Materialized Views
  – Triggers, PL/SQL, Synonyms, Complex Views & Grants
  – EM-grid, OCM, rman
  – Partitioning, some Parallel Query
Why is Database Replay such a big deal?
• Tool is able to reproduce true production load in all its variety and complexity
• DBA has ability to test database changes without having to involve other, non-DBA groups
• Test tool of choice for Oracle patches, Oracle upgrades, parameter changes, architecture and hardware changes
• Tool has the potential of saving large amounts of resources normally spent on test script development
• Tool can be used to generate a background database load that can supplement load generated by traditional test tools (save licenses)
• Tool can reduce the size and complexity of test environments -- no need for application servers, web servers, client machines
Deployment status of Database Replay at Boeing
• Performed about 50 production captures and over 100 replays
• Replays helped validate 10.2.0.4 → 11.2.0.1 upgrade
• Using the tool to validate new EMC V-Max tiered storage
• Plan on making Database Replay part of regular test cycle
  – issue: few outages → many types of changes happen at once
• Plan on working with Oracle to improve the product
  – timeout feature
  – explore Oracle’s Application Testing Suite (ATS)
  – make use of Workload Analyzer
[Screenshot: capture in progress]
Active Session View in OEM during Replay
Start of replay
Contention
Replay Finished
Conclusion
• Exciting performance and load test tool
• true production load at very low overhead
• rapid testing possible
• simple setup
• resource savings (HW, scripting)
Acknowledgement
Many Thanks To:
Ashish Agrawal (Oracle)
Prabhaker Gongloor (Oracle)
Yuri Grinshteyn (Oracle)
Oracle Support
The DCAC/MRM DBA team at Boeing
Q&A