© 2012 IBM Corporation
IBM PureData Systems
Meeting data challenges with simplicity, speed & lower cost
Artur Wroński
IBM Software Group
© Copyright IBM Corporation 2012. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.
IBM, the IBM logo, ibm.com, Information Management, IMS, CICS, DB2, WebSphere, PureSystems, pureScale, PureExperience, PureApplication, PureData and z/OS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
Disclaimer
The PureSystems family delivers greater simplicity, speed & lower cost
Expert Integrated System
General Purpose Components
Custom Built System
Today’s problem: Time and effort are spent tuning general purpose components
The PureSystems solution: Simplifying the entire IT project lifecycle
Reduced Time, Cost and Risk
Design/Deploy Manage/Maintain
Design Deploy Manage Maintain
Optimized for data services:
• Transactional
• Analytics
Expert integrated:
• Data platform
• Infrastructure
• Unified platform management
• Built-in expertise
IBM PureData System: Optimized exclusively for data services
Workload optimized performance
Data load ready in hours
Integrated management
Single point of support
Automated maintenance in hours, not days
Data Platform
Delivering Data Services
Different types of workloads require different data services
Transaction Processing (for example, E-commerce)
• Shared access to all data
• Many transactions with narrow data scope accessing the same database
• Random reads & random updates
• Data service: Scalable Transactional Database
Reporting and Analytics (for example, Customer Analysis)
• Partitioned data access
• Analytics with broad data scope, split into many parts across data partitions to run in parallel
• Sequential reads & sequential data loads
• Data service: Analytics Data Warehouse
Operational Analytics (for example, Real Time Fraud Detection)
• Partitioned data access
• Analytics split into many parts and narrow scope operations, all running in parallel
• Random and sequential reads & data loads + continuous ingest
• Data service: Operational Data Warehouse
For apps like E-commerce: database cluster services optimized for transactional throughput and scalability
For apps like Customer Analysis: data warehouse services optimized for high-speed, peta-scale analytics and simplicity
For apps like Real-time Fraud Detection: operational data warehouse services optimized to balance high performance analytics and real-time operational throughput
Different data workloads have different characteristics
System for Transactions
System for Analytics
System for Operational Analytics
powered by Netezza technology
Database Architectures: Single, Shared Storage, Shared Nothing
[Diagram: three DB2 architectures placed on a scale from "Optimized for OLTP" to "Optimized for Analytics"]
• Core DB2: one DB2 instance handling transactions against one database with one log. Ideal for OLTP and data marts.
• DB2 pureScale Data Sharing: several DB2 members (Tran 1, Tran 2, Tran 3) with shared data access behind a single database view. Ideal for active/active OLTP/ERP scale out.
• DB2 InfoSphere Warehouse (aka DPF): a single database view in which SQL 1 is split into SQL 1’, SQL 1’’ and SQL 1’’’, each running on its own DB2 partition (Part 1, Part 2, Part 3) with its own log. Ideal for data warehousing with MPP scale out for near linear scalability and query processing.
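The shared-nothing picture on this slide, where SQL 1 is split into SQL 1’, SQL 1’’ and SQL 1’’’ against Part 1 to Part 3, can be illustrated with a small scatter/gather sketch. This is only an illustration of the general idea, not DB2 code: the hash function, the row layout and all function names are assumptions made for the example, and the partial queries run one after another here rather than in parallel.

```python
from collections import defaultdict
from zlib import crc32

NUM_PARTITIONS = 3  # Part 1, Part 2, Part 3 on the slide


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a distribution key to a partition with a stable hash (assumed scheme)."""
    return crc32(key.encode("utf-8")) % num_partitions


def distribute(rows, num_partitions: int = NUM_PARTITIONS):
    """Place each (customer, amount) row on exactly one partition."""
    partitions = [[] for _ in range(num_partitions)]
    for customer, amount in rows:
        partitions[partition_for(customer, num_partitions)].append((customer, amount))
    return partitions


def partial_totals(partition):
    """The per-partition piece of the query (the role of SQL 1', SQL 1'', ...)."""
    totals = defaultdict(float)
    for customer, amount in partition:
        totals[customer] += amount
    return dict(totals)


def total_by_customer(partitions):
    """Coordinator: run the partial query on every partition, then merge the results."""
    merged = defaultdict(float)
    for partition in partitions:  # sequential here; an MPP system runs these in parallel
        for customer, subtotal in partial_totals(partition).items():
            merged[customer] += subtotal
    return dict(merged)


rows = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5), ("carol", 7.0)]
partitions = distribute(rows)
result = total_by_customer(partitions)
```

Because every row for a given customer hashes to the same partition, each partition can compute its subtotals without seeing the others, which is the property that lets this style of architecture scale out.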
IBM PureData System for Transactions highlights
Optimized exclusively for transactional data workloads
Delivering data services for transactions
System for Transactions
Speed
• Industry leading DB2 performance
• Database node recovery in seconds [1]
Simplicity
• Database deployment in minutes, not hours [1]
• Capable of running multiple database software versions
• Handles more than 100 databases on 1 system [2]
• No planned system downtime for firmware / OS upgrades [1]
Scalability
• Scaling up to 30x [3]
• Designed to expand from small to medium & medium to large configuration with no planned system downtime required
Smart
• Supports Oracle Database apps with minimal change; supports DB2 applications unchanged
• Clients have experienced cases of 10x storage space savings via Adaptive Compression [4]
Footnotes:
1. Based on IBM internal tests and system design for normal operation under expected typical workload. Individual results may vary.
2. Based on one large configuration.
3. Based on the designed minimum and maximum processor and memory resources required for a single database.
4. Based on client testing in the DB2 10 Early Access Program.
Optimized solution stack
Servers
Storage
Networking
Management
Data
Management
Deployment
Databases
• Factory integrated and optimized
  • Server, storage, networking and software
• High data availability
  • Automatic failure detection and online recovery
• Solid State exploitation
  • Automatic management of hot, warm and cold data for faster performance
• Optimized database patterns
  • Database patterns pre-tuned and pre-configured for performance
• Integrated fixes
  • Zero down time for system maintenance
Reduce your system integration costs
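The slide mentions automatic management of hot, warm and cold data without saying how data is classified. As a rough illustration only, the sketch below sorts storage extents into the three temperatures by access frequency. The thresholds, the extent names and the idea of counting accesses per day are all assumptions for the example and do not describe the product's actual policy.

```python
from typing import Dict, List

# Hypothetical thresholds (accesses per day); not taken from the presentation.
HOT_THRESHOLD = 1000
WARM_THRESHOLD = 10


def classify(accesses_per_day: int) -> str:
    """Label a piece of data hot, warm or cold by how often it is accessed."""
    if accesses_per_day >= HOT_THRESHOLD:
        return "hot"
    if accesses_per_day >= WARM_THRESHOLD:
        return "warm"
    return "cold"


def group_by_temperature(access_counts: Dict[str, int]) -> Dict[str, List[str]]:
    """Group extent names by temperature so a placement step could act on each group."""
    tiers: Dict[str, List[str]] = {"hot": [], "warm": [], "cold": []}
    for extent, count in sorted(access_counts.items()):
        tiers[classify(count)].append(extent)
    return tiers


# Illustrative extent names and access counts.
access_counts = {"orders_2012": 5000, "orders_2011": 40, "orders_2005": 0}
tiers = group_by_temperature(access_counts)
```

A real system would recompute such statistics continuously and move data between solid state and disk accordingly; the sketch only shows the classification step.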
Uninterrupted access to data with consistent performance
6-node database cluster instance
PureData System for Transactions - built-in
In minutes:
1. Just specify cluster name, description and topology pattern
Traditional systems - build it yourself
Over several days/weeks:
1. Define High Availability topology
2. Configure HW/SW/Network
3. Set up storage pools
4. Install multiple operating systems
5. Install database instances
6. Set up primary and secondary management systems
7. Set up database members
8. Set up backup processes
9. Test, tune, reconfigure...
Simplified deployment with high availability
[Diagram: a topology pattern is deployed as a cluster of virtual appliances (two application servers and one HTTP server, each with an operating system and virtual appliance metadata) and a database.]
Consolidate 100s of database servers to a single system for optimal resource efficiency and easier administration
Optimize storage resources for up to 10x storage space savings
Innovate faster by deploying new databases in minutes
Accelerate deployment of new database services in cloud environments using patterns of expertise
Captures your expertise in patterns for consistent, reliable deployment of critical databases as a service
Simplified, pattern based database deployment
Simplified Application Development
• Higher scalability provided by adding more nodes with no application changes required
• Dev/Test/Staging/Production database repeatability through database patterns
• Integrated data movement tools to speed creation of test databases
• Built-in Oracle compatibility mode for minimal to no application changes
• Higher utilization and lower costs provided through shared resource management
[Diagram: Applications on top of four Database Management Software instances]
PureData for Transactions
Three standard configurations to choose from
Configuration                      Small (¼ Rack)   Medium (½ Rack)   Large (Full Rack)
Chassis                            1                1                 2
Blacktip ITEs (16 cores per ITE)   6                12                24
Cores                              96               192               384
Memory                             1.5 TB           3.1 TB            6.1 TB
V7000                              1                2                 4
V7000 Exp                          1                2                 4
User Capacity                      18.6 TB          37.2 TB           74.4 TB
Raw SSD Storage (400 GB drives)    4.8 TB           9.6 TB            19.2 TB
Raw HDD Storage (900 GB drives)    32.4 TB          64.0 TB           128.0 TB
Upgrade path: Small to Medium, Medium to Large
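The core counts on this slide follow from the ITE counts and the stated 16 cores per ITE. A minimal sketch of that arithmetic, using only the figures quoted above (the dictionary layout and function name are illustrative):

```python
CORES_PER_ITE = 16  # stated on the slide

# ITE and core counts as quoted for each configuration.
CONFIGURATIONS = {
    "Small": {"ites": 6, "cores": 96},
    "Medium": {"ites": 12, "cores": 192},
    "Large": {"ites": 24, "cores": 384},
}


def expected_cores(ites: int, cores_per_ite: int = CORES_PER_ITE) -> int:
    """Total cores implied by the number of ITEs."""
    return ites * cores_per_ite


# Collect any configuration whose quoted core count disagrees with the arithmetic.
mismatches = {
    name: (spec["cores"], expected_cores(spec["ites"]))
    for name, spec in CONFIGURATIONS.items()
    if spec["cores"] != expected_cores(spec["ites"])
}
```

With the quoted figures, `mismatches` is empty: each step up doubles both the ITE count and the core count.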
Simplified database administration
Self-optimizing: Best data placement and access automatically selected based on usage statistics for optimal performance
Self-healing: Failed database nodes are isolated and recovered automatically
Self-balancing: Data access requests automatically load balanced for optimal performance
Self-monitoring: Based on thresholds and alerts, system will monitor and automatically make changes as needed to improve performance
Self-tuning: Memory management dynamically balances resources
Simplified and integrated system management
• Single console to manage all resources and work running on the system
• Role-based security and tasks
  • management
  • monitoring
  • maintenance
• Easy integration with broader enterprise monitoring tools and processes
• Consistent IBM PureSystems console
Simplified maintenance with pre-integrated fixes
• All hardware firmware and OS software patches integrated and tested together at the factory
• Can apply hardware and OS maintenance with zero downtime
• Single line of support
• Integrated stack support
Reduce risk and eliminate manual errors when applying maintenance
IBM PureData System for Operational Analytics
Optimized exclusively for operational analytic data workloads
Delivering data services for operational analytics
System for Operational Analytics
Speed
• Designed for 1000+ concurrent operational queries
• Continuous ingest of operational data
• MPP analytics (Massively Parallel Processing)
Simplicity
• Fast time-to-value
• Automatic workload management
• Integrated backup on the system
• Integrated management and support
Scalability
• Multiple sizes with data capacity up to a Petabyte
Smart
• In-database analytics for leading applications
• Supports DB2 applications unchanged and Oracle Database apps with minimal change
• Clients have experienced cases of 10x storage space savings via Adaptive Compression
Optimized for a mix of interactive and analytic queries
• Preset and configured for top performance, throughput, and efficient resource utilization
• Continuous ingest of operational data
• Balanced throughput and performance through dynamic self-tuning
• Policy-based automatic workload management
• Adaptive compression that delivers storage savings and improved query performance
IBM PureData System for Operational Analytics
Real-time operational analytics with continuous data ingest
[Diagram: data sources feed the data warehouse (DW) through ETL (Extract, Transform, Load), with up-to-the-second, faster loads.]
The Time-Value Curve: How does business value change through time?*
[Chart: value against time, marking Business event, Data ready for analytics, Analytics delivered, and Decision / Action taken. Labels: Decision latency, Faster decision / action, Value gained.]
• Complete, built-in support for continuous feed of data into the warehouse
• Parallel processing, and support for multiple connections
• Minimum impact on query performance
Analytics are delivered to decision makers faster
* Above diagram adapted from TDWI Best Practices Report - Operational Data Warehousing by Philip Russom, 4Q 2010
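To make the idea of a continuous feed over multiple connections concrete, here is a minimal sketch in which several feeder threads push events into a shared queue while a loader thread commits them in small batches. It is not the product's ingest mechanism: the queue, the batch size, the in-memory "warehouse" list and all names are assumptions chosen for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel telling the loader that all feeders have finished


def feed(events, channel):
    """One connection pushing its events into the shared channel."""
    for event in events:
        channel.put(event)


def load_batches(channel, warehouse, batch_size):
    """Loader: commit events in small batches while feeders are still running."""
    batch = []
    while True:
        item = channel.get()
        if item is _DONE:
            break
        batch.append(item)
        if len(batch) == batch_size:
            warehouse.append(batch)
            batch = []
    if batch:  # commit the final partial batch
        warehouse.append(batch)


def run_ingest(sources, batch_size=2):
    """Start one feeder per source and a single loader; return the committed batches."""
    channel = queue.Queue()
    warehouse = []
    loader = threading.Thread(target=load_batches, args=(channel, warehouse, batch_size))
    loader.start()
    feeders = [threading.Thread(target=feed, args=(events, channel)) for events in sources]
    for feeder in feeders:
        feeder.start()
    for feeder in feeders:
        feeder.join()
    channel.put(_DONE)  # placed only after every feeder has finished
    loader.join()
    return warehouse


sources = [["a1", "a2", "a3"], ["b1", "b2"]]
warehouse = run_ingest(sources)
```

The order in which events from different sources interleave is not fixed, but every event is committed exactly once and data becomes queryable batch by batch instead of after one large load.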
PureData System for Operational Analytics hardware and capacity sizes
Extra Small: 64.8 TB*
Small: 151.2 TB*
Medium: 237.6 TB*
Large: 324 TB*
Scalable to PB+*
*Unformatted raw disk capacity
• IBM POWER7 P740 & P730 16 Core servers @ 3.55GHz
• Blade Network Technologies 10G and 1G Ethernet switches
• Brocade SAN switches (SAN48B-5)
• IBM Storwize® V7000 with 900GB drives
• Ultra SSD I/O Drawers, each with six 387GB SSD
System for Analytics
Delivering data services for analytics
IBM PureData System for Analytics
Optimized exclusively for analytic data workloads
Speed
• 10-100x faster than traditional custom systems*
• Patented MPP hardware acceleration (Massively Parallel Processing)
Simplicity
• Data load ready in hours
• No database indexes
• No tuning
• No storage administration
Scalability
• Peta-scale data capacity
Smart
• Designed to run complex analytics in minutes, not hours
• Richest set of in-database analytics
* Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Inside the IBM PureData System for Analytics
Optimized Hardware + Software
• Hardware accelerated AMPP
• Purpose-built for high performance analytics
• Requires no tuning
Snippet Blades™
• Hardware-based query acceleration with FPGAs
• Blistering fast results
• Complex analytics executed as the data streams from disk
Disk Enclosures
• User data, mirror, swap partitions
• High speed data streaming
SMP Hosts
• SQL Compiler
• Query Plan
• Optimize
• Admin
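The phrase "executed as the data streams from disk" describes processing rows while they are being read instead of loading everything first. The sketch below is a software analogy only, using Python generators to filter rows and drop unneeded columns before aggregation. It says nothing about how the FPGAs are actually programmed; the table contents, column names and function names are assumptions for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def read_rows(table: Iterable[Row]) -> Iterator[Row]:
    """Stand-in for rows streaming off disk one at a time."""
    yield from table


def restrict(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Drop rows that do not match the predicate as they stream past."""
    return (row for row in rows if predicate(row))


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each row."""
    return ({column: row[column] for column in columns} for row in rows)


# Illustrative data; not from the presentation.
table = [
    {"id": 1, "region": "EU", "amount": 120, "note": "x"},
    {"id": 2, "region": "US", "amount": 80, "note": "y"},
    {"id": 3, "region": "EU", "amount": 30, "note": "z"},
]

stream = project(restrict(read_rows(table), lambda r: r["region"] == "EU"), ["id", "amount"])
eu_total = sum(row["amount"] for row in stream)
```

Nothing is materialized between the stages: each row is read, tested, trimmed and added to the total in a single pass, so later stages only see the data they need.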
Snippet-Blade™ (S-Blade) Components
[Diagram labels: IBM BladeCenter Server; Netezza DB Accelerator; Intel Quad-Core; Dual-Core FPGA; DRAM; SAS Expander Module (two)]
BI Reporting and Ad-Hoc Analysis
• What happened?
• When and where?
• How much?
Predictive Analytics
• What will happen?
• What will the impact be?
Optimization
• What is the best choice?
PureData System for Analytics Takes Analytics Beyond Reporting
Pre-Built In-Database Analytics
Transformations
• Data Profiling / Descriptive Statistics+
• General Diagnostics
• Statistics+
• Sampling
• Data prep
Mathematical
• Basic Math*
• Permutation and Combination*
• Greatest Common Divisor and Least Common Multiple*
• Conversion of Values*
• Exponential and Logarithm*
• Gamma and Beta Functions
• Matrix Algebra+
• Area Under Curve*
• Interpolation Methods*
Time Series
• Autoregressive+
• Forecasting*
Predictive
• Linear Regression+
• Logistic Regression+
• Classification
• Bayesian
• Sampling
• Model Testing
Geospatial
• Geospatial Data Type
• Geometric Functions
• Geometric Analysis
Statistics
• Descriptive Statistics+
• Distance Measures*
• Hypothesis Testing*
• Chi-Square & Contingency Tables*
• Univariate & Multivariate Distributions+
• Monte Carlo Simulation*
Data Mining
• Association Rules+
• Clustering+
• Feature Extraction+
• Discriminant Analysis*
* Fuzzy Logix DB Lytix capabilities
+ Netezza Analytics and Fuzzy Logix DB Lytix capabilities
What’s New?
Improved Concurrency, Performance, I/O efficiency and Manageability
Better Performance
Improved Management & Efficiency
• More than half a dozen performance improvements in:
  • Optimizer efficiency
  • Memory management
  • Communications protocols
  • Workload management
• Faster, Better, and completely transparent to the end-user
• 20x greater throughput and concurrency for tactical queries than previous generations of IBM Netezza appliances [1]
• Up to 200 queries/second micro analytic workloads
• Directed Data Processing increases throughput for tactical queries
Improved Resiliency and Fault Tolerance
• Blade level resilience for continuous high performance
• Enhanced automatic system software resilience for enterprise level requirements
1. Based on IBM internal performance benchmarking of previous version to current version.
IBM PureData System is unique
Different models pre-optimized exclusively for different data workloads, saving clients time, effort and cost to tune on their own
System for Transactions
• Pattern based database deployment in minutes, not hours [1]
• Handles more than 100 databases on 1 system [2]
System for Analytics (powered by Netezza technology)
• 10-100x faster than traditional custom systems [4]
• 20x greater concurrency and throughput for tactical queries than previous Netezza technology [5]
System for Operational Analytics
• Continuous ingest of operational data
• Designed to handle 1000+ concurrent operational queries [3]
• Up to 10x storage savings with adaptive compression [6]
Footnotes:
1. Based on IBM internal tests and system design for normal operation under expected typical workload. Individual results may vary.
2. Based on one large configuration.
3. Based on IBM internal tests of prior generation system, and on system design for normal operations under expected typical workload. Individual results may vary.
4. Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
5. Based on IBM internal performance benchmarking.
6. Based on client testing in the DB2 10 Early Access Program.