MicroTerabyte: Leveraging InfiniBand to Build a Powerful, Scalable Oracle Database and Application Platform
Brian Dougherty, Chief Architect, CMA




Background

• Exploding data volumes are presenting new challenges to small and medium-sized organizations.

• These organizations need a new generation of technology that delivers powerful analytical capability with reduced cost and complexity.

• CMA, in partnership with Dell, QLogic, Oracle, and EMC, has developed a sophisticated solution to address this growing need.


MicroTerabyte Platform: What Is It?

• A pre-configured, integrated, fault-tolerant, high-performance runtime environment based on commodity hardware and software

• Leverages the power of the Oracle database running on commodity Linux servers, and a unified InfiniBand fabric over QLogic multi-protocol fabric directors

• Provides a scalable, reliable, lower cost platform for Business Intelligence, Custom Software, or Commercial Software


MicroTerabyte Platform: Features

• High Performance
• Scalable
• Fault Tolerant
• Commodity Hardware
• Lower Cost
• Smaller Footprint
• Reduced Power Consumption


MicroTerabyte Platform: Attributes

• Red Hat Linux Operating System
• Commodity Hardware (Dell / EMC CX / QLogic)
• 42U Rack Footprint with Fault Tolerant 42U rack
• Clustered Oracle Database
• Clustered Storage Provisioning Layer
• Unified Storage and Interconnect Fabric


MicroTerabyte Single Rack Configuration

Sample Hardware Components (per rack)
• (2) Dell 1950 Database Servers
  – 32GB RAM Each
  – 16 Processor Cores
• (1) Dell 1950 ETL Server
  – 16GB RAM
  – 8 Processor Cores
• (1) Dell 1950 Business Intelligence Server
  – 16GB RAM
  – 8 Processor Cores
• (1) QLogic 9020 InfiniBand Fabric Director
  – (2) FVIC Modules
• (2) EMC CX3-40 Storage Arrays
  – 2TB – 5TB Storage
• (1) Dell PowerConnect 48 Port GigE Switch
• (1) Belkin KVM
• (1) Dell Console

Physical Data Guard


MicroTerabyte Mid-Range Configuration

Hardware components (two racks)
• (4) Dell 1950 Database Servers
  – 32GB RAM Each
  – 32 Processor Cores
• (1) Dell 1950 ETL Server
  – 32GB RAM
  – 8 Processor Cores
• (1) Dell 1950 Business Intelligence Server
  – 32GB RAM
  – 8 Processor Cores
• (1) QLogic 9040 InfiniBand Fabric Director
  – (2) FVIC Modules
• (2) EMC CX3-40 Storage Arrays
  – 8TB – 15TB Storage
• (2) EMC SATA 1TB Drives (for backup)
• (1) Dell PowerConnect 48 Port GigE Switch
• (1) Belkin KVM
• (1) Dell Console


MicroTerabyte Large Scale Configuration

Hardware components (three racks)
• (8) Dell 1950 Database Servers
  – 32GB RAM Each
  – 64 Processor Cores
• (2) Dell 1950 ETL Servers
  – 32GB RAM Each
  – 8 Processor Cores
• (2) Dell 1950 Business Intelligence Servers
  – 32GB RAM Each
  – 8 Processor Cores
• (1) QLogic 9040 InfiniBand Fabric Director
  – (4) FVIC Modules
• (4) EMC CX3-40 Storage Arrays
  – 24TB – 32TB Storage
• (2) EMC SATA 1TB Drives (for backup)
• (1) Dell PowerConnect 48 Port GigE Switch
• (1) Belkin KVM
• (1) Dell Console
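
The three configurations scale the database tier by adding identical servers. A minimal Python sketch of that arithmetic follows. It assumes the "Processor Cores" figure on each slide is the total across all database servers (consistent with the 8 cores per Dell 1950 quoted on the ORION slide); the dictionary layout and function name are illustrative only.

```python
# Database-tier figures copied from the three configuration slides.
# Assumption: "db_cores" is the total across all database servers in the tier.
CONFIGS = {
    "single rack": {"db_servers": 2, "db_cores": 16, "ram_gb_each": 32},
    "mid-range": {"db_servers": 4, "db_cores": 32, "ram_gb_each": 32},
    "large scale": {"db_servers": 8, "db_cores": 64, "ram_gb_each": 32},
}


def db_tier_summary(cfg):
    """Return (cores per database server, total database-tier RAM in GB)."""
    cores_per_server = cfg["db_cores"] // cfg["db_servers"]
    total_ram_gb = cfg["db_servers"] * cfg["ram_gb_each"]
    return cores_per_server, total_ram_gb


for name, cfg in CONFIGS.items():
    cores, ram_gb = db_tier_summary(cfg)
    print(f"{name}: {cores} cores per DB server, {ram_gb} GB total DB RAM")
```

Under that assumption every configuration uses the same 8-core building block, and only the node count (2, 4, 8) changes.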


MicroTerabyte Architecture: Mid-Level Diagram

• Compute Nodes: Dell 1950 PowerEdge Servers / RHEL v5
• Unified Fabric Layer: QLogic Multi-Protocol Fabric Director
• Storage Nodes: EMC CX3 / CX4, Flare OS / Navisphere


MicroTerabyte Architecture: Detailed Diagram

MicroTerabyte Platform layers (top to bottom):

• Applications
  – ERP Application Package
  – ISV Third Party Software
  – CMA BI Suite
  – Business Intelligence
• MT Clustered Database / Storage Provisioning
  – Oracle EE 11g w/ RAC and Partitioning
  – Oracle ASM 11g
  – Oracle ASMlib
  – Oracle Clusterware 11g
• MT O/S Binding
  – Red Hat Linux O/S
  – QLogic OFED InfiniBand Drivers
• MT Core
  – Public Networking: Dell PowerConnect 6248
  – Server Compute: Dell 1950/2950 + QLogic 7104 HCA
  – Unified Fabric: QLogic 9020/9040 Multi-Protocol Director
  – Storage Infrastructure: EMC CX3/CX4
• Oracle Grid Control 11g


MicroTerabyte Platform: Oracle Software

• Physical Data Guard
• Oracle RMAN
• Oracle Database: Partitioning, Parallel Query, Oracle VPD, Oracle RAC
• Oracle ASM: ASMlib
• Oracle Clusterware: CRS, CSS, EVM, Cache Fusion


MicroTerabyte Compute Nodes: Dell 1950/2950

• MicroTerabyte solution consists of 2/4/8 RAC Database nodes
• 2 ETL and BI Nodes
• Each node is a Dell 1950/2950 consisting of:
  – Processor: One/Two quad-core Intel Xeon X5355 @ 2.66GHz
  – Memory: 16-64 GB
  – Hard drives: 218GB-3TB Internal Storage
  – RAID Controller: PERC 5/i
  – 1 DDR InfiniBand HCA
  – Network interface cards: Dual gigabit NICs (100baseTx-FD)
  – Power supply: 670W, optional hot-plug redundant power (1+1)
  – Operating system: RedHat Enterprise Linux v5


MicroTerabyte Storage Nodes: EMC CX4 Model 480

Front-End Host Connectivity

• Two storage processors per CX4

• Each processor has:

• Four 4Gb Fibre Channel optical ports

• FCP SCSI-3 protocol

Back-End Disk Connectivity

• Each processor has 4Gb Fibre Channel arbitrated loops.

• Multiple RAID groups may be distributed across redundant loops

• Supports a maximum of 480 disk drives

System Memory

• Each processor has 8GB of Memory

Power Consumption (Processor Chassis)

• 355 VA (290W max)

Power Consumption (Disk Expansion Chassis)

• 440 VA (425W max)


MicroTerabyte Unified Fabric Layer: What Is It?

• QLogic 9020 & 9040 Multi-Protocol Fabric Directors
  – 9020 with two (2) FVIC IB-FC Virtual I/O Controllers
  – 9040 with up to four (4) FVIC IB-FC Virtual I/O Controllers
  – Each FVIC provides 10 DDR (20Gb) IB ports & 8 4Gb FC ports
    • supports up to 128 Virtual HBA ports per module
    • automatic sensing 1/2/4 Gb/s
    • load balancing
    • automatic port and module fail-over
    • LUN mapping and masking features
• QLogic 7104 Host Channel Adapters
  – Dual Port, DDR
  – IPoIB, RDS, SRP
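
The per-module port counts above determine how much Fibre Channel capacity the fabric can present to the storage arrays. The following Python sketch totals them for a given number of FVIC modules. The constants come from the slide; the function name is illustrative, and the figures are nominal link (signaling) rates rather than measured payload throughput.

```python
# Per-FVIC figures from the slide. These are nominal link rates, not payload rates.
IB_PORTS_PER_FVIC = 10
IB_GBPS_PER_PORT = 20   # DDR InfiniBand
FC_PORTS_PER_FVIC = 8
FC_GBPS_PER_PORT = 4
VHBA_PORTS_PER_FVIC = 128


def fabric_totals(fvic_modules):
    """Aggregate nominal port counts and link rates for N FVIC modules."""
    return {
        "ib_gbps": fvic_modules * IB_PORTS_PER_FVIC * IB_GBPS_PER_PORT,
        "fc_ports": fvic_modules * FC_PORTS_PER_FVIC,
        "fc_gbps": fvic_modules * FC_PORTS_PER_FVIC * FC_GBPS_PER_PORT,
        "virtual_hba_ports": fvic_modules * VHBA_PORTS_PER_FVIC,
    }


for modules in (2, 4):
    print(modules, "FVIC modules:", fabric_totals(modules))
```

With two modules the director exposes 16 FC ports at 4Gb, which lines up with the 16 front-end Fibre Channel ports of the two CX3-40 arrays used in the benchmark configuration.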


MicroTerabyte Unified Fabric Layer: General Benefits

• Managing one fabric
• Reduced footprint
• Compact implementation
• Fewer host components needed to support I/O and interconnect
• Increased bandwidth and reduced latency
• Reduced host resources (1 HCA vs. several HBAs)
• Path failover through SRP protocol
• Well positioned to take advantage of advances in InfiniBand technology


MicroTerabyte Unified Fabric Layer: Oracle RAC Benefits

• Scalable platform to support Oracle RAC
• More predictable response times
• Capability to drive more Oracle I/O through fewer compute nodes
• Ability to exploit storage capability at a lower cost
• Reduced Oracle messaging latency via RDS/IB


MicroTerabyte ORION Benchmarks

CONFIGURATION

A mid-size MicroTerabyte configuration, including:

Servers: Four (4) Dell 1950 Intel quad-core servers. Each 1950 includes 8 cores, 16GB memory, and 1 dual-channel HCA. Each server runs Red Hat Linux 5 update 1 (2.6.18-53 kernel).

Storage: Two (2) EMC CX3-40 storage arrays. Each storage array includes eight 4Gbps front-end Fibre Channel connections, for a total of sixteen 4Gbps FE adapters. Approx. 7.5TB usable storage configured in 4+1 RAID sets.

Unified Interconnect and Storage IB fabric: One (1) QLogic 9040 Multi-Protocol Director with 2 FVIC modules

HARDWARE COST

Approximate total cost (market): $500,000

TEST METHOD AND RESULTS

Testing simulator: Oracle ORION (Oracle I/O Numbers Calibration Tool) for Linux

>> Random I/O Test Results:

Test Type: random I/O

I/O size: 8K

Rate observation 1: 9,600 sustained IOPS @ 1.87 ms per I/O avg node latency

Rate observation 2: 30,566 sustained IOPS @ 2.96 ms per I/O avg node latency

Rate observation 3: 42,615 sustained IOPS @ 2.96 ms per I/O avg node latency

>> Sequential I/O Test Results:

Test Type: sequential I/O

I/O size: 1MB

Rate observation 1: 677MB/sec sustained sequential I/O throughput @ 26.48 ms per 1MB I/O avg node latency

Rate observation 2: 2.098GB/sec sustained sequential I/O throughput @ 91.48 ms per 1MB I/O avg node latency
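
One way to read the random I/O observations is through Little's Law (average requests in flight = throughput x latency), which gives a rough sense of the I/O concurrency behind each rate. The Python sketch below applies it to the numbers on the slide. It assumes the quoted "avg node latency" can be paired with the aggregate IOPS figure, which the slide does not state, so the results are estimates only; the function names are illustrative.

```python
# Random I/O observations from the ORION slide: (sustained IOPS, avg latency in ms).
RANDOM_IO = [(9600, 1.87), (30566, 2.96), (42615, 2.96)]


def outstanding_ios(iops, latency_ms):
    """Little's Law: average I/Os in flight = arrival rate x time in system."""
    return iops * (latency_ms / 1000.0)


def small_io_mb_per_s(iops, io_kb=8):
    """Data rate implied by an IOPS figure at a fixed I/O size (1 MB = 1024 KB)."""
    return iops * io_kb / 1024.0


for iops, latency_ms in RANDOM_IO:
    print(
        f"{iops} IOPS @ {latency_ms} ms: "
        f"~{outstanding_ios(iops, latency_ms):.0f} I/Os in flight, "
        f"~{small_io_mb_per_s(iops):.0f} MB/s"
    )
```

The same latency (2.96 ms) at both 30,566 and 42,615 IOPS suggests the higher rate was reached by raising concurrency rather than by slowing individual I/Os.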


MicroTerabyte Oracle Benchmarks

Hardware

Servers: (4) Dell 1950
– Dual Socket Quad Core
– 16 GB RAM per server
– Single dual port IB HCA
Unified I/O Fabric: QLogic 9040
– (2) FVIC Modules
Storage: (2) EMC CX3 Model 40
– (16) Fibre Channel ports

Software

O/S: Red Hat Linux v 5.1
– Kernel 2.6.18-51
– UDEV
Unified I/O Fabric: QLogic IB drivers v 4.2.0.0.39
– IB/RDS
– IB/SRP
– IPoIB
Database: Oracle RAC 10gR2
– Oracle Clusterware 10gR2
– Oracle ASM
– Oracle RMAN
– Oracle DataGuard
Storage: Flare O/S
– EMC Navisphere
Backup: EMC Networker

Database operations: Index Creation Global Hash, Table Scan, Direct Path Insert, Materialized View Creation, Index Creation Local Bitmap

Source Table Size (terabytes unless noted) | Target Segment Size (terabytes unless noted) | Degree of Parallelism | Row Count (in billions unless noted) | Elapsed Time (HH:MM:SS.MS)
0.485 | N/A | 256 | 17 | 00:03:48.49
0.48 | N/A | 448 | 17.7 | 00:03:42.42
1.2 * | N/A | 352 | 44 | 00:09:51.93
2.3 | N/A | 256 | 35 | 00:18:15.33
1.2 * | 1.41 | 128 | 31 | 01:40:03.04
0.579 * | 1.02 | 128 | 9 | 00:51:05.57
1.2 * | 1.96 | 128 | 35 | 02:40:03.04
2.2 | 0.512 * | 64 | 22 | 03:35:01.01
2.2 | 1.2 | 64 | 17 | 01:51:02.79
1.2 | 92 GB | 50 | 44 | 00:32:17.19
1.2 | 0.133 | 50 | 44 | 00:23:35.45
1.96 | 65GB | 50 | 32 | 00:23:08.48
1.96 | 86GB | 50 | 32 | 00:24:29.30
0.3 | 105GB | 128 | 4.5 | 01:37:18.34
0.579 | 168GB | 128 | 9 | 03:57:23.54
21GB | 3.4GB | 128 | 100 million | 00:00:32.55
21GB | 3.6GB | 128 | 100 million | 00:00:38.50

Database Operation | Source Table Size (terabytes) | Target Segment Size | Degree of Parallelism | Row Count (in billions) | Elapsed Time (HH:MM:SS.MS)
DBMS Stats Global Collection | 0.56 | N/A | N/A | 9 | 00:03:30.67
DBMS Stats Global and Local Collection | 0.56 | N/A | N/A | 9 | 00:04:36.97
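
The elapsed times can be turned into an effective scan rate for comparison with the ORION sequential figure. The Python sketch below does this for the four rows that have no target segment, on the assumption that those rows are the table scans (the 2.3 TB row matches the 18.25 minute full scan quoted in the benchmark comparison). It also assumes 1 TB = 1000 GB and that the fractional part of the elapsed time is decimal seconds; the function names are illustrative.

```python
def elapsed_seconds(hms):
    """Convert 'HH:MM:SS.ss' to seconds (fraction treated as decimal seconds)."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def scan_rate_gb_per_s(size_tb, elapsed, gb_per_tb=1000):
    """Effective scan rate in GB/s; gb_per_tb is an assumption (decimal TB)."""
    return size_tb * gb_per_tb / elapsed_seconds(elapsed)


# (source table size in TB, elapsed time) for the rows with no target segment.
ASSUMED_TABLE_SCANS = [
    (0.485, "00:03:48.49"),
    (0.48, "00:03:42.42"),
    (1.2, "00:09:51.93"),
    (2.3, "00:18:15.33"),
]

for size_tb, elapsed in ASSUMED_TABLE_SCANS:
    rate = scan_rate_gb_per_s(size_tb, elapsed)
    print(f"{size_tb} TB in {elapsed}: ~{rate:.2f} GB/s")
```

Under these assumptions all four rows come out at roughly 2.0 to 2.2 GB/s, close to the 2.098GB/sec sustained sequential throughput ORION reported for the same hardware.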


MicroTerabyte Benchmark Comparisons

Systems compared:
• MicroTerabyte: CMA MicroTerabyte Platform, Mid-Size Configuration (Dell 1950 / QLogic 9040 / EMC CX Storage)
• Human Services: Award Winning Human Services Enterprise Data Warehouse (SUN E25k and SUN 9990 Storage Array)
• Medicaid: Award Winning Medicaid Enterprise Data Warehouse (IBM P690 and EMC DMX3 Storage Array)
• Fraud Investigation: Award Winning Fraud Investigation Enterprise Data Warehouse (HP Superdome and XP Storage Array)

> Comparative Real-World Database Operations

Metric | MicroTerabyte | Human Services | Medicaid | Fraud Investigation
Full Scan Time of Largest Fact Table | 18.25 minutes | 7 minutes | 45 minutes | 21.75 minutes
Scan Time Normalized / Extrapolated to 2.3 TB | 18.25 minutes | 32.2 minutes | 32.4 minutes | 41.5 minutes
Approximate Direct Path Insert Time | 100 minutes | 105 minutes | 215 minutes | N/A
Segment Size | 1.4 TB | 500 GB | 2.3 TB | N/A
Rows Inserted | 31 Billion | 2.8 Billion | 3.5 Billion | N/A
Local Bitmap Index Create Time | 23 minutes | 13.5 minutes | 55 minutes | 43 minutes
Row Count | 35 Billion | 2.8 Billion | 4.2 Billion | 1.1 Billion
Table Size (Moderate Cardinality) | 2.3 TB | 500 GB | 3.2 TB | 1.2 TB

> Environment Configuration Details

Item | MicroTerabyte | Human Services | Medicaid | Fraud Investigation
Processors | 32 cores (8 quad cores) | 48 | 32 | 32
Memory | 64 GB | 128 GB | 256 GB | 128 GB
Server HBAs | 4 dual ported HCAs (1 HCA per server) | 16 @ 4Gbps | 20 @ 4Gbps | 16 @ 4Gbps
Storage Array Ports | 16 @ 4Gbps (2 EMC CX3 Model 40s) | 16 @ 4Gbps | 20 @ 4Gbps | 16 @ 4Gbps
Operating System | Red Hat Linux 5.1 | Solaris 9 | AIX 5.3 | HP-UX 11i
Volume Manager, File System, Multi-Pathing | Oracle ASM; QLogic SRP 4.2 | Vxvm and VXFS with ODM; Veritas DMP | AIX Volume Manager with JFS2; EMC Powerpath | Vxvm and VXFS with ODM; Veritas DMP
Storage Topology | SRP over InfiniBand via QLogic FC Gateway; 4+1 RAID 5 LUNs - 73 GB drives | Fibre Channel Brocade based SAN; 4D+4D Parity Groups - 73 GB 15k Drives | Direct Attached Fibre Channel; Metavolumes with 146 GB 15k drives | Direct Attached Fibre Channel; 4D+4D Parity Groups - 73 GB 15k Drives
Database Topology | Oracle EE 64-bit 10.2.0.3; RAC / InfiniBand with RDS | Oracle EE 64-bit 10.2.0.3; non-RAC local IPC | Oracle EE 64-bit 11.1.0.6; non-RAC local IPC | Oracle EE 64-bit 10.2.0.3; non-RAC local IPC
Database Usable Storage | 8 TB | 8 TB | 21 TB | 16 TB
Largest Fact Table Row Count | 35 Billion | 2.8 Billion | 4.2 Billion | 1.1 Billion
Largest Fact Table Size | 2.3 TB | 500 GB | 3.2 TB | 1.2 TB
Approximate HW Cost | $500K | $2M+ | $2M+ | $2M+
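
The "Scan Time Normalized / Extrapolated to 2.3 TB" row appears to scale each system's measured full scan time linearly by table size. The Python sketch below shows that calculation under the assumption of simple linear scaling with decimal units (500 GB = 0.5 TB); the slide does not state the exact method, and the function name is illustrative.

```python
def normalize_scan_minutes(scan_minutes, table_tb, target_tb=2.3):
    """Linearly scale a measured full-scan time to a target table size."""
    return scan_minutes * target_tb / table_tb


# (system, measured full scan in minutes, largest fact table size in TB)
MEASURED = [
    ("MicroTerabyte", 18.25, 2.3),
    ("Human Services", 7.0, 0.5),
    ("Medicaid", 45.0, 3.2),
    ("Fraud Investigation", 21.75, 1.2),
]

for system, minutes, size_tb in MEASURED:
    scaled = normalize_scan_minutes(minutes, size_tb)
    print(f"{system}: ~{scaled:.1f} minutes at 2.3 TB")
```

Linear scaling reproduces the 18.25 and 32.2 minute figures and lands within a few tenths of a minute of the other two published values, so the slide may have used slightly different table sizes or rounding for those systems.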


Summary

• As demonstrated in CMA’s MicroTerabyte platform, InfiniBand can provide an extremely capable transport mechanism for unifying interconnect and I/O traffic
• Increases bandwidth and reduces latency
• Reduces Oracle messaging latency via RDS/IB
• Provides more predictable response times
• Reduces host resource requirements (I/O processing workload off-loaded to HCA card)
• Consistent with Oracle’s strategic technology direction


For More Information

Brian Dougherty
Chief Architect, CMA
[email protected]