• Managed Services • Cloud Services • Consulting Services • Licensing
Extreme Data Warehouse Performance with Oracle Exadata
Kasey Parker, Enterprise Architect
[email protected]
Who is Centroid?
QUICK FACTS
§ Centroid is a leading provider of Oracle Technology, Applications and Infrastructure/Hosting solutions
§ Established in 1997
§ Office locations: Troy, MI (HQ); San Francisco, CA; Los Angeles, CA; Dallas, TX
§ 200+ Consultants
§ Oracle Platinum Partner
  • Selected to Oracle’s Top 25 Strategic Partner Program
  • Top 5 Oracle Partner for Hardware/Storage
§ 100% Oracle “Red Stack” Focused
§ “Clients for life” approach to customer relationships
§ Oracle Exadata Center of Excellence established in 2011
  • Centroid Authored - Oracle Exadata Recipes (Published Feb-2013)
Agenda
§ Exadata Overview
§ Why Exadata?
§ Exadata’s Secret Sauce
§ Getting the Most out of Exadata DW
§ Avoiding the 3X Club
§ Other Data Warehouse Best Practices
EXADATA OVERVIEW
Exadata Architecture
Database hardware and software platform “in a box”
Scale-Out Database Servers
• 8x 2-socket, or 2x 8-socket Xeon database servers
• Oracle Database, ASM, RAC; Linux or Solaris
• Standard Ethernet to data center
Scale-Out Intelligent Storage Servers
• 2-socket storage servers, Exadata Storage Software
• Up to 672 terabytes disk per rack
• 56 PCI Flash memory cards per rack
InfiniBand Network
• Unified internal connectivity (40 Gb/sec)
Exadata Configuration Options
Start small and grow as needed – upgraded onsite
Eighth Rack, Quarter Rack, Half Rack, Full Rack
Exadata Hardware Summary
                                            X4-2 Full        X4-2 Half        X4-2 Quarter    X4-2 Eighth
=========================================== ================ ================ =============== ===============
Database Servers                            8                4                2               2
Database Grid Cores                         192              96               48              24
Database Grid Memory (GB)                   2048 (max 4096)  1024 (max 2048)  512 (max 1024)  512 (max 1024)
InfiniBand Switches                         2                2                2               2
Ethernet Switch                             1                1                1               1
Exadata Storage Servers                     14               7                3               3
Storage Grid CPU Cores                      168              84               36              18
Raw Flash Capacity                          44.8 TB          22.4 TB          9.6 TB          4.8 TB
Raw Storage Capacity, High Perf             200 TB           100 TB           43.2 TB         21.6 TB
Raw Storage Capacity, High Cap              672 TB           336 TB           144 TB          72 TB
Usable Mirrored Capacity, High Perf         90 TB            45 TB            19 TB           9 TB
Usable Mirrored Capacity, High Cap          300 TB           150 TB           63 TB           30 TB
Usable Triple-Mirrored Capacity, High Perf  60 TB            30 TB            13 TB           6.3 TB
Usable Triple-Mirrored Capacity, High Cap   200 TB           100 TB           43 TB           21.5 TB
Exadata Hardware
Exadata X4-2 SQL IO Performance
                                                  X4-2 Full Rack  X4-2 Half Rack  X4-2 Quarter  X4-2 Eighth
================================================= =============== =============== ============= ===========
Flash Cache SQL Bandwidth (1,3), High Cap Disk    100 GB/s        50 GB/s         21.5 GB/s     10.7 GB/s
Flash Cache SQL Bandwidth (1,3), High Perf Disk   100 GB/s        50 GB/s         21.5 GB/s     10.7 GB/s
Flash SQL IOPS (2,3), 8K Reads                    2,660,000       1,330,000       570,000       285,000
Flash SQL IOPS (2,3), 8K Writes                   1,960,000       980,000         420,000       210,000
Disk SQL Bandwidth (1,3), High Cap Disk           20 GB/s         10 GB/s         4.5 GB/s      2.25 GB/s
Disk SQL Bandwidth (1,3), High Perf Disk          24 GB/s         12 GB/s         5.2 GB/s      2.6 GB/s
Disk SQL IOPS, High Cap Disk                      32,000          16,000          7,000         3,500
Disk SQL IOPS, High Perf Disk                     50,000          25,000          10,800        5,400
Data Load Rate (4)                                20 TB/hr        10 TB/hr        5 TB/hr       2.5 TB/hr
1 - Bandwidth is peak physical scan bandwidth achieved running SQL, assuming no compression. Effective data bandwidth will be much higher when compression is factored in.
2 - IOPS are based on read IO requests of size 8K running SQL, typically with sub-millisecond latencies. Note that the IO size greatly affects flash IOPS. Others quote IOPS based on 2K, 4K or smaller IOs that are not relevant for databases, and measure IOs using low-level tools instead of SQL.
3 - Actual performance varies by application.
4 - Load rates are typically limited by database server CPU, not IO. Rates vary based on load method, indexes, data types, compression, and partitioning.
WHY EXADATA?
Why Exadata?
Exadata is designed to eliminate the most common bottleneck for large databases:
Timely transfer of large data sets from the storage subsystem to the database server
Why Exadata?
Solving the IO Bottleneck
Solution 1: Enlarge the pipe
• Physical disks, on all cells, work in parallel to serve IO requests
• Large InfiniBand pipe (40 Gb/sec)
Why Exadata?
Can’t we do that with other high performance storage solutions?
YES… There is nothing magical about Exadata hardware, and it’s still the same Oracle Database
Why Exadata?
Solving the IO Bottleneck
Solution 2: Reduce the IO operations
• Done using Exadata’s Secret Sauce: Smart Storage, Smart Flash Cache and Hybrid Columnar Compression
• 10X reduction in data sent to database servers is common
Exadata Innovations
• Some are automatic, with limited configuration ability
  – Storage Indexes
  – Smart Flash Cache
• Some may require some effort
  – Smart Scans
  – Hybrid Columnar Compression (HCC)
  – IORM (Resource Manager)
Storage Indexes
• Exadata Storage Indexes maintain summary information about table data in memory
  – Store MIN and MAX values of columns
  – Typically one index entry for every MB of disk
• Eliminates disk I/Os if MIN and MAX can never match the “where” clause of a query
• Completely automatic and transparent
[Figure: a table with columns A, B, C, D split into two storage regions. In the first region column B holds 1, 3, 5 (index entry: Min B = 1, Max B = 5); in the second region column B holds 5, 8, 3 (index entry: Min B = 3, Max B = 8).]
Select * from Table where B<2 - Only first set of rows can match
Transparent I/O Elimination with No Overhead
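The pruning idea on this slide can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Region` class and the function name are hypothetical, and real storage indexes are maintained inside the storage cells, not in application code. It uses the two sets of column B values from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    """One storage region; min/max of column B stand in for its index entry."""
    rows: List[int]  # values of column B stored in this region

    @property
    def min_b(self) -> int:
        return min(self.rows)

    @property
    def max_b(self) -> int:
        return max(self.rows)


def regions_to_read_for_less_than(regions: List[Region], bound: int) -> List[int]:
    """Return positions of regions that could hold rows with B < bound.

    A region is skipped when even its smallest B value fails the predicate.
    """
    return [i for i, region in enumerate(regions) if region.min_b < bound]


# The two regions from the slide: (Min 1, Max 5) and (Min 3, Max 8).
regions = [Region([1, 3, 5]), Region([5, 8, 3])]
to_read = regions_to_read_for_less_than(regions, 2)
print(to_read)
```

For the predicate B < 2 only the first region is read; the second is eliminated without any disk I/O because its minimum is 3.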
Smart Flash Cache
• Caches Read and Write I/Os in PCI flash
• Transparently accelerates read and write intensive workloads
  – Up to 2.66 million 8K read IOPS from SQL
  – Up to 1.96 million 8K write IOPS from SQL
• Persistent write cache speeds database recovery
• Exadata Flash Cache is much more effective than flash tiering architectures used by others
  – Caches current hot data, not yesterday’s
  – Caches data in granules 8x to 16x smaller than tiering
    • Greatly improves the effectiveness of flash
• Other Flash Features can be configured if needed, e.g. Cache compression, Cache pinning, Flash Disks (for Temp)
Avoid the 3X Club
Some Exadata optimizations may require a little effort – but they’re worth it.
Data Warehouse workloads should improve >7X on Exadata
Avoid the 3X Club
• Tune for Smart Scans
• Wisely use Parallelism
• Compress with HCC where appropriate
• Invoke Resource Management (IORM)
• Still follow Data Warehouse Best Practices
Avoid the 3X Club – an Example
EDW for Large Organization in Salt Lake valley
• Moved to Exadata beginning September 2012
• Configured/Tuned Exadata optimizations for October 2012
[Chart: Average Response Time]
Smart Scan Processing
Who are my customers in Salt Lake City?
[Diagram: Oracle DB Grid and Exadata Storage Grid]
• Select name, customer#... Where city=‘SALT LAKE CITY’
• Smart Scan identifies rows / columns in the 1 TB tables that match the SQL (1000 rows)
• IO is executed and 20 MB returned from storage to PGA
• 1000 rows returned to client
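To put the slide's numbers in perspective, the short calculation below compares the 1 TB scanned in storage with the 20 MB shipped to the database server. It is a back-of-the-envelope sketch: the function name is hypothetical, and binary units (1 TB = 1024^4 bytes) are an assumption, since the slide does not say which convention it uses.

```python
def smart_scan_savings(scanned_bytes: int, returned_bytes: int) -> float:
    """Percentage of scanned bytes that never cross the interconnect."""
    if scanned_bytes <= 0:
        raise ValueError("scanned_bytes must be positive")
    return 100.0 * (scanned_bytes - returned_bytes) / scanned_bytes


TB = 1024 ** 4  # assumption: binary units
MB = 1024 ** 2

savings = smart_scan_savings(1 * TB, 20 * MB)
reduction_factor = (1 * TB) / (20 * MB)
print(f"{savings:.4f}% saved, {reduction_factor:.0f}x less data shipped")
```

Under these assumptions the storage cells filter out well over 99.99% of the scanned bytes, which is why the interconnect stops being the bottleneck for this kind of query.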
Smart Scan Comparison
[Diagram: Storage Servers feeding Database Servers. Standard Operations return 8K blocks to the SGA; Smart Scans return rows and columns to the PGA.]
Smart Scan Requirements
• Full table scan or index fast full scan
  – No IOTs, Clustered Tables or LOBs
• Direct path reads
  – Direct path reads happen for
    • Serial queries of “large” tables (11gR2)
      – Function of Buffer Cache Size, threshold and object size
        » _small_table_threshold
    • Parallel queries
    • Queries when _serial_direct_read = TRUE
Smart Scans – How do you know?
Execution Plan
• TABLE ACCESS STORAGE FULL
• Storage() predicate
• Only indicates that a Smart Scan is eligible to be performed; it does not mean one actually happens
Smart Scans – How do you know?
• Statistic views (V$MYSTAT, V$SESSTAT)
  – cell physical IO bytes eligible for predicate offloading
  – cell physical IO interconnect bytes
  – cell physical IO interconnect bytes returned by smart scan
• V$SQL views (IO_ columns)
  – IO_CELL_OFFLOAD_RETURNED_BYTES
  – IO_CELL_OFFLOAD_ELIGIBLE_BYTES
• Wait events
  – cell smart table scan
  – cell smart index scan
Smart Scans – How do you know?
An Easier Way… SQL Monitor
  – Accessed through DBMS_SQLTUNE or OEM
Smart Scans – Why don’t they happen?
• Index scan used instead
• Buffer cache too large
  – Many table blocks in buffer cache
• Chained rows
  – Tables with more than 255 columns
• Certain functions (see v$sqlfn_metadata)
• Table “too small” (_small_table_threshold)
• Read consistency
• Delayed block cleanout
Smart Scans – How to get them?
• Accurate, Up-to-date Statistics
  – Are ETL jobs gathering stats appropriately?
  – Use auto sample size
  – Exadata System stats
    • This is how the optimizer becomes Exadata aware
    • exec dbms_stats.gather_system_stats('EXADATA');
• Right Sized SGA
  – Most Data warehouses shouldn’t need more than 16GB
• Avoid row by row processing
• Appropriate use of Indexes
• Wise use of Parallelism
To Index or Not to Index
So if Smart Scans are so great, do we even need indexes anymore?
YES!... You still need indexes for queries that read a single row or a few rows out of many
Also keep many FK indexes – especially if used for Star Transformations
To Index or Not to Index
• Many indexes will be obsolete and should be removed to help drive smart scans
• Test by:
  – Making indexes invisible and testing queries
  – Comparing ETL without indexes
Parallelism on Exadata
• Parallelism executes the same on or off Exadata
• PX works much better on Exadata and can be a big performance boost
  – Pushes Direct Path Reads to enable smart scans
  – Exadata architecture enables parallelism through storage cell CPUs and disks all working together
    • Load split across DB and Cell CPUs
    • Allows lower DOP on Exadata to achieve optimal performance
• Easy to overwhelm a system with Parallelism
  – But on Exadata, it can be controlled effectively
Parallelism Guidelines
• Control parallel load
  – Parallel init parameters
  – Parallel Statement Queuing
  – DBRM resource plans
    • Set parallel degree limits and max % targets
• Set parallel degree on large tables
  – ALTER TABLE [TABLE NAME] PARALLEL 12;
• Use parallelism for direct path loads in ETL
  – CTAS, IAS or Merge with Append Hint, Bulk Load API
  – ALTER SESSION ENABLE PARALLEL DML;
Key Parallel Init Parameters
• PARALLEL_MAX_SERVERS
  – Max # of instance parallel workers
  – Recommend leaving at default (CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10)
• PARALLEL_MIN_SERVERS
  – Min # of instance parallel workers (default 0)
  – Helps control overhead of creating and destroying workers
  – Recommend setting to the high daily average of workers
See Oracle Support Note 1274318.1 for Exadata best practices
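As a worked example of the default quoted above, the sketch below evaluates CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10 for one database server. The inputs are assumptions, not measured values: 24 cores per server follows from the hardware summary (192 cores across 8 servers in a full rack), PARALLEL_THREADS_PER_CPU = 2 is chosen purely for illustration, and the CPU_COUNT an instance actually reports may differ (for example with hyper-threading enabled). The function name is hypothetical.

```python
def default_parallel_max_servers(cpu_count: int, threads_per_cpu: int) -> int:
    """Formula as quoted on the slide: CPU_COUNT * PARALLEL_THREADS_PER_CPU * 10."""
    return cpu_count * threads_per_cpu * 10


# Assumed inputs for a single X4-2 database server (see lead-in).
example = default_parallel_max_servers(cpu_count=24, threads_per_cpu=2)
print(example)
```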
Parallel Init Parameters
AUTO DOP
• Enabled by parallel_degree_policy
  – Manual (Default), Limited, Auto
• Each statement is automatically evaluated as a candidate for parallelism, whether or not it contains parallel hints or its objects have a DOP set
• Controlled by parallel_min_time_threshold
  – 10 seconds by default
  – Statements expected to run longer are candidates for automatic parallelization
• Use with Caution!
Parallel Statement Queuing
• Limits concurrent parallel processes until enough slaves are available
• Protects against overwhelming the server with parallel processes
• Delivers a more consistent performance profile
• Can be enabled without Auto DOP by setting _parallel_statement_queuing = TRUE
• Control when queuing starts by using PARALLEL_SERVER_TARGET
• Statements queued in FIFO method
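The queuing behaviour described above can be illustrated with a small simulation. This is a toy model, not Oracle's implementation: the class and statement names are hypothetical, `server_target` merely plays the role of PARALLEL_SERVER_TARGET, and each statement is assumed to need a fixed number of parallel servers.

```python
from collections import deque
from typing import Deque, List, Tuple


class StatementQueue:
    """Toy FIFO model of parallel statement queuing."""

    def __init__(self, server_target: int) -> None:
        self.server_target = server_target            # stands in for PARALLEL_SERVER_TARGET
        self.active: List[Tuple[str, int]] = []       # (statement, servers in use)
        self.waiting: Deque[Tuple[str, int]] = deque()

    def _in_use(self) -> int:
        return sum(servers for _, servers in self.active)

    def submit(self, name: str, servers: int) -> None:
        """Run the statement now only if it fits and nothing is queued ahead of it."""
        if not self.waiting and self._in_use() + servers <= self.server_target:
            self.active.append((name, servers))
        else:
            self.waiting.append((name, servers))

    def finish(self, name: str) -> None:
        """Release a statement's servers, then start queued work in FIFO order."""
        self.active = [(n, s) for n, s in self.active if n != name]
        while self.waiting and self._in_use() + self.waiting[0][1] <= self.server_target:
            self.active.append(self.waiting.popleft())


q = StatementQueue(server_target=32)
q.submit("load_fact", 16)
q.submit("report_a", 16)
q.submit("report_b", 8)   # 32 servers already busy, so it is queued
q.submit("report_c", 4)   # queued behind report_b (FIFO)
q.finish("load_fact")     # frees 16 servers; both queued statements now fit
running = [n for n, _ in q.active]
queued = [n for n, _ in q.waiting]
print(running, queued)
```

The point of the model is the trade-off the slide describes: statements wait instead of running with too few servers or oversubscribing the machine, which gives a more predictable response-time profile.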
Parallel Statement Monitoring
• OEM / Grid Control
  – SQL Monitoring specifically
• GV$PX_PROCESS
  – One record per Parallel Worker
• GV$SQL_MONITOR
  – Also shows queued parallel statements
See Oracle Support Note 135043.1 for more monitoring queries
Hybrid Columnar Compression
• Data is organized and compressed by column in compression units (CU)
• Speed Optimized Query Compression for Data Warehousing
  – 5X to 10X compression typical
  – Runs faster because of Exadata offload
• Space Optimized Archival Compression for infrequently accessed data
  – 10X to 50X compression typical
• Benefits Multiply: Faster and Simpler Backup, DR, Caching, Reorg, Clone
Hybrid Columnar Compression
VENDOR_ID  VEND_NAME   STATE VNDR_RATING VENDOR_TYPE
========== =========== ===== =========== ===========
100        ACME ONE    MI    100         DIRECT
101        ACME ONE    CA    90          DIRECT
102        NORTON      IA    95          INDIRECT
103        WINGDINGS   MI    96          INDIRECT
104        WINGDINGS   GA    96          INDIRECT
Uncompressed (block header, then whole rows stored one after another, then free space):
100ACME ONEMI100DIRECT|101ACME ONECA90DIRECT|102NORTONIA95INDIRECT|103WINGDINGSMI96INDIRECT|104WINGDINGSGA96INDIRECT
Hybrid Columnar Compression (Logical Compression Unit: CU header, then the values of each column stored together):
VENDOR_ID | VEND_NAME | VNDR_RATING | STATE | VENDOR_TYPE | COL6 | COL7 | COL8 | COL9 | COL10
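Why grouping values by column helps compression can be shown with the vendor rows from this slide. The sketch below pivots the rows into columns and run-length encodes each column. Run-length encoding is a stand-in chosen for illustration only; the slides do not say which algorithms HCC uses, and the helper names are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Tuple

COLUMNS = ["VENDOR_ID", "VEND_NAME", "STATE", "VNDR_RATING", "VENDOR_TYPE"]
ROWS = [
    (100, "ACME ONE", "MI", 100, "DIRECT"),
    (101, "ACME ONE", "CA", 90, "DIRECT"),
    (102, "NORTON", "IA", 95, "INDIRECT"),
    (103, "WINGDINGS", "MI", 96, "INDIRECT"),
    (104, "WINGDINGS", "GA", 96, "INDIRECT"),
]


def to_columns(rows: List[tuple]) -> Dict[str, list]:
    """Pivot row tuples into one list of values per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def run_length_encode(values: list) -> List[Tuple[object, int]]:
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(ROWS)
encoded = {name: run_length_encode(vals) for name, vals in columns.items()}
for name, runs in encoded.items():
    print(name, runs)
```

Stored row by row, neighbouring values come from different columns and rarely repeat. Stored column by column, VEND_NAME and VENDOR_TYPE collapse into a handful of runs, which is the effect a compression unit is designed to exploit.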
Hybrid Columnar Compression
Performance Benefits
• If queries select a single column or a subset of columns, Oracle will only need to read from blocks on which the columns exist
  – This is different than other types of compression and uncompressed tables
• Not only is space saved, but also IO
• Saving IO means better performance!
HCC – Why Not?
• HCC requires direct path loads
  – Conventional inserts use OLTP compression
• Deletes against HCC tables lock the entire CU
• When updating HCC tables:
  – The updated row is migrated (i.e., deleted + re-inserted into a new block, leaving a pointer behind)
  – New row is OLTP-compressed
  – Locks impact the entire CU, not just the row
• DML on HCC tables is very expensive!
HCC Use Cases
• Use OLTP compression for DW tables by default, and then use HCC compression when
  – Data is direct path loaded (CTAS, Insert /*+ APPEND */)
  – Data is not updated
    • Or rarely updated and truncated and reloaded periodically
• Partition tables with different compression ratios
  – Updated Data = OLTP compression
  – Heavily Queried Data = Query / Archive Low compression
  – Cold / Archive Data = Archive High compression
• Use compression advisor to preview compress ratio
  – DBMS_COMPRESSION.GET_COMPRESSION_RATIO
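The rules of thumb above can be written down as a small decision function, for example to tag partitions during a design review. This is only a sketch of the slide's guidance: the function name, its boolean inputs, and the returned labels are hypothetical, and the labels are not Oracle syntax.

```python
def choose_compression(direct_path_loaded: bool, updated: bool, cold: bool) -> str:
    """Map the slide's rules of thumb to a compression tier label."""
    # HCC only pays off for direct-path-loaded data that is not updated.
    if updated or not direct_path_loaded:
        return "OLTP"
    # Cold / archive data gets the most aggressive tier.
    if cold:
        return "ARCHIVE HIGH"
    # Heavily queried, stable data.
    return "QUERY / ARCHIVE LOW"


current_partition = choose_compression(direct_path_loaded=True, updated=True, cold=False)
last_quarter = choose_compression(direct_path_loaded=True, updated=False, cold=False)
seven_years_old = choose_compression(direct_path_loaded=True, updated=False, cold=True)
print(current_partition, last_quarter, seven_years_old, sep=" | ")
```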
IORM
• IO Resource Management (IORM) governs and meters IO from different workloads in the Exadata Storage Servers
• A common challenge with shared storage infrastructure is that of competing IO workloads
  – Batch vs. OLTP
  – Warehouse vs. OLTP
  – Production vs. Test and Development
• Competing priorities can be mitigated by over-provisioning storage, but this becomes expensive
• Exadata addresses this challenge with IORM
IORM and DBRM
• Oracle DBRM allows managing CPU and other internal DB resources, e.g. parallelism, among competing workloads in a single database
  – DBRM is not Exadata specific
• With Exadata IORM integration, IO resources are also controlled by DBRM
• A DBRM resource plan is also called an “intra-database resource plan”
IORM Plans
Approaches for managing resource allocations
• Intra-database resource plans manage multiple workloads in a single database
  – If only one database on the Exadata machine, only an intra-database resource plan is needed
• Inter-database resource plans manage resources among multiple databases on Exadata
  – Specifies allocations to databases, not consumer groups
  – Category plans allow resource control across databases by the type of workload
  – An IORM plan is the combination of an inter-database plan and a category plan
IORM and DBRM
DBRM Example
Database DBM
  – OM OLTP consumer group
  – Other OLTP consumer group
  – Reporting consumer group
Database XBM
  – Online query consumer group
  – Batch query consumer group
IORM and DBRM
Category Plan Example
Database DBM
  – OM OLTP consumer group (Interactive category)
  – Other OLTP consumer group (Interactive category)
  – Reporting consumer group (Batch category)
Database XBM
  – Online query consumer group (Interactive category)
  – Batch query consumer group (Batch category)
IORM Example
All User IO = 100%
Category Plan: 70% Interactive, 30% Batch
Interdatabase Plan (applied within each category): 60% DBM, 40% XBM
Intradatabase Plan: 30%, 70% (XBM consumer groups); 20%, 30%, 50% (DBM consumer groups)
IORM Allocation
DBM: OM OLTP 26.25%
DBM: OTHER OLTP 15.75%
XBM: ONLINE QUERY 28.00%
XBM: BATCH QUERY 12.00%
DBM: REPORTING 18.00%
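The allocation figures in this example can be reproduced by multiplying the three plan levels together. The sketch below is an arithmetic model only, not IORM's scheduler. Two things are inferred rather than stated in the slide text: the DBM intradatabase shares (50% OM OLTP, 30% Other OLTP, 20% Reporting) are the only assignment consistent with the published results, and the XBM shares are assumed to be 70% online and 30% batch, which does not change the outcome because each XBM group is alone in its category.

```python
CATEGORY_PLAN = {"INTERACTIVE": 70, "BATCH": 30}   # percent of all user IO
INTERDATABASE_PLAN = {"DBM": 60, "XBM": 40}        # percent per database

# (database, consumer group) -> (category, intradatabase percent)
CONSUMER_GROUPS = {
    ("DBM", "OM OLTP"): ("INTERACTIVE", 50),       # inferred from the results
    ("DBM", "OTHER OLTP"): ("INTERACTIVE", 30),    # inferred from the results
    ("DBM", "REPORTING"): ("BATCH", 20),
    ("XBM", "ONLINE QUERY"): ("INTERACTIVE", 70),  # assumed; cancels out below
    ("XBM", "BATCH QUERY"): ("BATCH", 30),         # assumed; cancels out below
}


def iorm_allocations() -> dict:
    """Category share x database share x the group's share among its peers."""
    result = {}
    for (db, group), (category, pct) in CONSUMER_GROUPS.items():
        # Peers are consumer groups in the same database and the same category.
        peer_total = sum(
            p for (d, _), (c, p) in CONSUMER_GROUPS.items()
            if d == db and c == category
        )
        share = CATEGORY_PLAN[category] * INTERDATABASE_PLAN[db] * pct / peer_total / 100
        result[(db, group)] = share
    return result


allocations = iorm_allocations()
for key, value in allocations.items():
    print(key, f"{value:.2f}%")
```

For instance, OM OLTP receives 70% (Interactive) of 60% (DBM) of 50/80 (its share against Other OLTP), which gives 26.25%.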
IORM Rules
• IORM is only “engaged” when needed
• Leftover disk allocation is made available to other workloads in relation to the configured resource plans
  – max limits can be set
• Background IO is prioritized relative to user IO
  – Redo and control file writes always take precedence
  – DBWR writes are scheduled at the same priority as user IO
• If no intra-database plan is set, all non-background IO requests are grouped into the default OTHER_GROUPS consumer group
IORM Plan Syntax
IORM plans are created using CELLCLI / DCLI
IORM Monitoring
• IORM Metrics using CELLCLI / DCLI
• Metric IORM script
  – See Oracle Support Note: “Tool for Gathering I/O Resource Manager Metrics: metric_iorm.pl [ID 1337265.1]”
• OEM (Grid Control) Exadata plugin
Metric Name                       Meaning
================================= =====================================================================================
DB_IO_RQ_SM, DB_IO_RQ_LG          Total number of IO requests issued by the database since any resource plan was set
DB_IO_RQ_SM_SEC, DB_IO_RQ_LG_SEC  IO requests per second issued by the database in the last minute
DB_IO_WT_SM, DB_IO_WT_LG          Total number of seconds that IO requests issued by the database waited to be scheduled
IORM
Unless you have only one database with a single type of workload on Exadata, you should use IORM
In other words… Everyone using Exadata should use IORM!
IORM Benefits
EDW for Large Organization in Salt Lake valley
[Chart: 3.5 days before and after enabling IORM/DBRM plans]
Follow DW Best Practices
Oracle data warehousing on Exadata is still data warehousing on Oracle
(With a few incredible innovations :)
So… Data Warehouse Best Practices still apply!
Follow DW Best Practices
Key Best Practices
• Dimensional Model (Star Schema)
• Well-written SQL
• Table Partitioning (particularly fact tables)
  – Partition by load frequency, sub partition by join hash
  – Partition Exchange loading
• Parallel, Direct-Path (possibly nolog) Data Loading
  – Including Constraint and Index management
• Query Rewrite
• Materialized Views and OLAP cubes
• Star Transformation Joins
Getting the Most Out of Your Exadata DW
[Diagram: DW Best Practices, Parallelism, Hybrid Columnar Compression, Smart Scans (Storage Offloading)]
Questions?