Building a Terabyte Data Warehouse, Using Linux and RAC
George Lumpkin
Director Product Management
Oracle Corporation
Session id: 40177
Do More with Less
• More performance
• More scalability
• More users
• Less capital cost
• Less administration cost
RAC for Scalability, Availability, and Flexibility
Linux and RAC for DW Scalability
• Linux ‘starter’ cluster: two nodes, one shared database
• As the business grows, so does your environment: three nodes, one database
• … and again: four nodes, one database
Linux and RAC for DW Availability
• When one node fails, the load is rebalanced and 3/4 of the cluster continues the work
Linux and RAC for DW Flexibility
• The cluster can share the entire workload across all nodes …
  [Figure: four nodes, each running both Query and ETL, on one Data Warehouse DB]
• … or partition the workload
  [Figure: Query and ETL jobs assigned to specific nodes of the cluster, on one Data Warehouse DB]
• Workload management and provisioning made easy
  – Christmas: “Data Season” for retail [Figure: additional ETL jobs on the cluster]
  – January: “Analysis Season” [Figure: additional Query jobs on the cluster]
RAC and Parallel Execution
• Very large queries utilize all resources on the cluster
  [Figure: one large query spanning every node]
• Many large-scale DWs have many concurrent jobs
  – Multiple “small-to-medium” size queries
  – Degree of parallelism < CPUs per node
• With Oracle, each such query automatically runs on a single node, eliminating traffic over the interconnect
  [Figure: queries Q1 to Q12 spread across the nodes, each query on one node]
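The placement idea on this slide can be illustrated with a toy model. This is only a sketch, not Oracle's actual allocation logic: the function `nodes_spanned` and its parameters are hypothetical names, and the model simply assumes a query stays on one node whenever its degree of parallelism fits within that node's CPUs.

```python
import math

def nodes_spanned(dop, cpus_per_node, node_count):
    """Toy model: number of nodes a query with degree of parallelism `dop` occupies.

    Assumption (not Oracle's real algorithm): parallel slaves fill one node
    before spilling onto the next, capped at the size of the cluster.
    """
    return min(node_count, math.ceil(dop / cpus_per_node))

def uses_interconnect(dop, cpus_per_node, node_count):
    """A query generates inter-node traffic only if it spans more than one node."""
    return nodes_spanned(dop, cpus_per_node, node_count) > 1

# Small-to-medium query on a 4-node cluster with 4 CPUs per node
print(nodes_spanned(2, 4, 4), uses_interconnect(2, 4, 4))
# Very large query that wants every CPU in the cluster
print(nodes_spanned(16, 4, 4), uses_interconnect(16, 4, 4))
```

In this model, a workload made up of many queries with a low degree of parallelism never touches the interconnect, which is the situation the slide describes.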
Recipe for a RAC Linux DW
Processors, I/O, Interconnect
Recipe for a RAC Linux DW: Processors
• Data warehouse workload determines the total number of CPUs
  – Same sizing considerations as a non-clustered DW
• How many processors per node?
  – Enough CPUs so that a single node can handle most database operations
  – Often, 4 CPUs is a good balance
Recipe for a RAC Linux DW: I/O
• I/O is typically the primary determinant of data warehouse performance
  – Storage configurations for a data warehouse should always be chosen based on I/O bandwidth, not storage capacity
• Rule of thumb: at least 100 MB/sec of I/O bandwidth per gigahertz of processing power
• Every component of the I/O system should provide enough bandwidth: disks, I/O channels, I/O adapters
• CPU power and I/O bandwidth should be balanced within a server
  – Example: each node has 4 x 2 GHz processors, so each node can utilize at least 800 MB/sec
  – Each node should have enough slots to accommodate the necessary I/O throughput
  – If one host bus adapter (HBA) drives 150 MB/sec, then 6 HBAs should accommodate the needed I/O bandwidth
  – Note that at least one slot is required for the interconnect
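The sizing arithmetic in the example above can be written out as a short calculation. This is a sketch under the slide's own rule of thumb (100 MB/sec per GHz) and its example figures; the function names are illustrative, not part of any Oracle tool.

```python
import math

MB_PER_SEC_PER_GHZ = 100  # rule of thumb from the slides

def required_io_bandwidth(cpus_per_node, ghz_per_cpu, mb_per_ghz=MB_PER_SEC_PER_GHZ):
    """I/O bandwidth (MB/sec) that one node should be able to drive."""
    return cpus_per_node * ghz_per_cpu * mb_per_ghz

def hbas_needed(node_bandwidth_mb, hba_mb_per_sec):
    """Smallest whole number of HBAs that covers the node's I/O bandwidth."""
    return math.ceil(node_bandwidth_mb / hba_mb_per_sec)

def slots_needed(hba_count, interconnect_slots=1):
    """HBA slots plus at least one slot reserved for the interconnect."""
    return hba_count + interconnect_slots

node_bw = required_io_bandwidth(4, 2.0)   # 4 x 2 GHz processors
hbas = hbas_needed(node_bw, 150)          # one HBA drives 150 MB/sec
print(node_bw, hbas, slots_needed(hbas))
```

With the slide's numbers this gives 800 MB/sec per node, 6 HBAs, and 7 slots once the interconnect is counted.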
Recipe for a RAC Linux DW: Interconnect
• Gigabit Ethernet is generally sufficient for data warehouse workloads
  – Oracle minimizes interconnect traffic for multi-user workloads
• Workloads requiring inter-node parallel query will utilize more interconnect bandwidth
  – 10 Gb Ethernet, Fibre Channel, InfiniBand
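A rough comparison helps show why inter-node parallel query can call for a faster interconnect. The sketch below converts raw Ethernet line rates to MB/sec (decimal units, ignoring protocol overhead) and compares them with the 800 MB/sec per-node I/O figure from the sizing example; it assumes, as a worst case, that a query might need to ship data between nodes at close to that rate.

```python
def link_mb_per_sec(gigabits_per_sec):
    """Raw line rate in MB/sec (decimal units, protocol overhead ignored)."""
    return gigabits_per_sec * 1000 / 8

NODE_IO_MB_PER_SEC = 800  # per-node I/O rate from the 4 x 2 GHz example

for name, gbps in [("1 Gb Ethernet", 1), ("10 Gb Ethernet", 10)]:
    rate = link_mb_per_sec(gbps)
    ratio = rate / NODE_IO_MB_PER_SEC
    print(f"{name}: {rate:.0f} MB/sec raw, {ratio:.2f}x one node's I/O rate")
```

A single gigabit link carries well under one node's I/O rate, which is adequate when queries stay on one node but can become a constraint when they do not.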
‘Typical’ Cluster Configuration
• 4 nodes, each with 4 x 2 GHz CPUs and 5 PCI slots
• 1 Gigabit Ethernet
• Two 16-port switches
• 16 storage arrays, each with 10-20 disks
Oracle Linux/RAC DW Customers
• Euronext
  – Database size: 1.5 TB
  – Hardware: 2 x HP DL580 (4 CPUs)
  – Storage: HP MSA 1000
  – Interconnect: 1 Gb Ethernet
  – OS: Red Hat
• AOK Berlin
  – Database size: 780 GB
  – Hardware: 2 x HP DL580 (4 CPUs)
  – Storage: EMC Symmetrix
  – Interconnect: 2 x 1 Gb Ethernet
  – OS: SuSE
• Vanderbilt University
  – Database size: 50 GB
  – Hardware: 3 x HP DL580 (4 CPUs)
  – Storage: EMC Symmetrix
  – Interconnect: 1 Gb Ethernet
  – OS: Red Hat
• National Bank AG
  – Database size: 75 GB
  – Hardware: 3 x IBM Express5800 (2 CPUs)
  – Interconnect: 100 Mb Ethernet
  – OS: SuSE
• Ellis Island Foundation
  – Database size: 60 GB
  – Hardware: 2 x HP DL580 (4 CPUs)
  – Storage: NetApp
  – Interconnect: 1 Gb Ethernet
  – OS: Red Hat
Linux-RAC and the Grid
An increasingly common customer theme these days is “provisioning”
Customers want more value out of their hardware expenditures – they want to take advantage of unused capacity
Oracle’s architecture is unique in being able to truly support flexible provisioning of processing power across multiple databases
Oracle will be widely deployed in large commercial computing “grids” in the future
Evolution of Business Intelligence with Oracle
[Figure: two workloads on Real Application Clusters]
• ETL processing, Query & Reporting, Data Mining and Scoring, Cube Creation and OLAP Analysis
• Order Entry, Shipments, Procurement, Inventory, …
Resource Provisioning
• December: Order Processing Heavy – Analytics Light
• January: Order Processing Light – Heavy Analytics
Oracle RAC Brings Flexible Processing Power to Databases on the Grid
Next Steps … Data Warehousing DB Sessions
Monday and Tuesday
• 11:00 AM, #40153, Room 304: Oracle Warehouse Builder: New Oracle Database 10g Release
• 3:30 PM, #40176, Room 303: Security and the Data Warehouse
• 4:00 PM, #40166, Room 130: Oracle Database 10g SQL Model Clause

• 8:30 AM, #40125, Room 130: Oracle Database 10g: A Spatial VLDB Case Study
• 3:30 PM, #40177, Room 303: Building a Terabyte Data Warehouse, Using Linux and RAC
• 5:00 PM, #40043, Room 104: Data Pump in Oracle Database 10g: Foundation for Ultrahigh-Speed Data Movement
For More Info On Oracle BI/DW Go To http://otn.oracle.com/products/bi/db/dbbi.html
Thursday
• 8:30 AM, #40179, Room 304: Oracle Database 10g Data Warehouse Backup and Recovery
• 11:00 AM, #36782, Room 304: Experiences with Real-Time Data Warehousing using Oracle 10g
• 1:00 PM, #40150, Room 102: Turbocharge your Database, Using the Oracle Database 10g SQLAccess Advisor
Business Intelligence and Data Warehousing Demos, All Four Days in the Oracle Demo Campground
• Oracle Database 10g
• Oracle OLAP
• Oracle Data Mining
• Oracle Warehouse Builder
• Oracle Application Server 10g
Reminder – please complete the OracleWorld online session survey
Thank you.