© 2016 IBM Corporation
IBM PureData System for Analytics
Bringing speed and simplicity for big outcomes
The IBM analytic architecture
Big Data Lakes or Swamps?
As we bring data together, are we creating a data swamp?
No one is sure of the origin or purity of data.
No one can find the data they need.
No one knows what data is present and if it is being adequately protected.
How do we build trust in big data?
Need trust both to share and to consume data.
Need understanding of quality, origin and ownership of data.
Need classification of data to govern and protect it.
Need timely, reliable data feeds and results.
All built on secure and reliable infrastructure.
The Data Lake subsystems
Data Lake (System of Insight)
Information Management and Governance Fabric
Catalogue
Self-Service Access
Enterprise IT Data Exchange
Self-Service Access
Analytics Teams
Governance, Risk and Compliance Team
Information Curator
Line of Business Teams
Data Lake Operations
Enterprise IT
Other Data Lakes
Systems of Engagement
Data Lake Repositories
Systems of Automation
Systems of Record
New Sources
Data Sources
Transactional
Social
Application
User Generated
Journal
Video and Audio
Machine / Sensor
Documents
Third Party
The EDW has evolved into the Logical Data Warehouse
Optimizing access and reducing costs
Internal Insight
Reporting
Enterprise
Content
Discovery
Exploration
Decision Management
Predictive Analytics
Visualization
External-Facing Applications
Web or Mobile
Systems of Engagement
Information Governance
Real-time Analytics
NoSQL Doc Store
Data Warehouse
Deep Analytics, Modeling
Transactional Systems
Landing, Exploration, Archive
Reporting, Analytics
Logical Data Warehouse
ETL, MDM, Data Governance
Metadata and Governance Zone
Warehousing Zone
Enterprise Warehouse
Data Marts
Ingestion and Real-time Analytic Zone
Streams
Connectors
BI & Reporting
Predictive Analytics
Analytics and Reporting Zone
Visualization & Discovery
Landing and Analytics Sandbox Zone
Hive/HBase
Col Stores
Documents in variety of formats
MapReduce
Hadoop
Next Generation of Enterprise Data Warehouse – With Data Zones
Netezza
IBM BigInsights
IBM Streams
Transforming the user experience….
Dedicated device
Optimized for purpose
Complete solution
Fast installation
Very easy operation
Standard interfaces
Low cost
Appliances Make It Simple
Purpose-built data warehouse / analytics engine
Integrated database, server and storage
Standard interfaces
Low total cost of ownership
Speed: 10-100x faster than traditional systems*
Simplicity: Minimal administration and tuning
Scalability: Petabyte+ scale user data capacity
Smart: High-performance advanced analytics
Secure: Automatic data encryption
IBM PureData System for Analytics
Delivers appliance simplicity for data
warehousing and analytics
* Based on IBM customers’ reported results. “Traditional custom systems” refers to systems
that are not professionally pre-built, pre-tested, and optimized. Individual results may vary.
Server
Storage
Database
Analytics
IBM PureData System for Analytics N3001 Family
Specification     N3001-001    N3001-002     N3001-005     N3001-010  N3001-020  N3001-040  N3001-080
Racks             n/a, 2 x 2U  1 (1/4 full)  1 (1/2 full)  1          2          4          8
Active S-Blades   n/a          2             4             7          14         28         56
CPU cores         40           40            80            140        280        560        1,120
User data (TB)*   16           32            96            192        384        768        1,536
Load (TB/Hour)    0.375        2.0           4.5           6          9.5        10         10
* Assuming 4x compression
Single rack systems Multiple rack systems
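The table figures lend themselves to simple capacity arithmetic. The sketch below is illustrative only: it copies the user data and load rate values from the table, and it assumes (this is not stated on the slide) that the load rate applies to uncompressed user data and is sustained for the whole load. The function names are hypothetical.

```python
# Figures copied from the N3001 family table above.
# user_tb assumes 4x compression, as the table footnote states.
N3001_FAMILY = {
    "N3001-001": {"user_tb": 16, "load_tb_per_hour": 0.375},
    "N3001-002": {"user_tb": 32, "load_tb_per_hour": 2.0},
    "N3001-005": {"user_tb": 96, "load_tb_per_hour": 4.5},
    "N3001-010": {"user_tb": 192, "load_tb_per_hour": 6.0},
    "N3001-020": {"user_tb": 384, "load_tb_per_hour": 9.5},
    "N3001-040": {"user_tb": 768, "load_tb_per_hour": 10.0},
    "N3001-080": {"user_tb": 1536, "load_tb_per_hour": 10.0},
}

COMPRESSION_RATIO = 4  # from the "* Assuming 4x compression" footnote


def stored_tb(model: str) -> float:
    """Space occupied on disk once user data is compressed at 4x."""
    return N3001_FAMILY[model]["user_tb"] / COMPRESSION_RATIO


def hours_to_fill(model: str) -> float:
    """Hours to load the full user data capacity.

    Assumption: the quoted load rate is sustained and measured
    against uncompressed user data.
    """
    spec = N3001_FAMILY[model]
    return spec["user_tb"] / spec["load_tb_per_hour"]


if __name__ == "__main__":
    for model in N3001_FAMILY:
        print(f"{model}: {stored_tb(model):.0f} TB stored, "
              f"{hours_to_fill(model):.1f} h to fill")
```

Under these assumptions, the larger models take longer to fill because the load rate levels off at 10 TB/hour while capacity keeps doubling.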
Simple
Same ease of use as all PureData System for Analytics appliances
Load and go with minimal tuning and administration
Fast
10-100x faster than traditional custom systems1
Smart
Rich set of in database analytic functions
Protection of all data from unauthorized access
Includes starter kits for Big Data and Business Intelligence
Agile
Easily incorporated into the data center with simplified installation into an existing rack
Affordable
Purchase or lease
IBM PureData System for Analytics - N3001-001
Bringing speed and simplicity to midsize organizations for big outcomes
Netezza appliance simplicity at an affordable entry price
1 Based on IBM customers’ reported results. “Traditional custom systems” refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Appliance Features
Production ready
Rack mountable appliance
Installed in a standard, customer provided rack
Entire integrated appliance tested and packaged at the factory
Full function Netezza Platform Software (NPS) with IBM Netezza Analytics
Self Encrypting Drives; Up to 16TB1 of user data
Ease of Use
Same ease of use and features as larger appliances
- Load and go with minimal tuning and administration
Installation by IBM or an IBM Partner certified to install the N3001-001
Availability & Support
Highly available, Full redundancy
− All redundant hardware, 4 disk spares, hot swap power supply
Remote access for support; Call Home enabled
1 Assuming 4X compression
PureData System for Analytics N3001-001
Solution Highlights
Spend Less Time Managing and More Time Innovating
No dbspace/tablespace sizing and configuration
No redo/physical/logical log sizing and configuration
No page/block sizing and configuration for tables
No extent sizing and configuration for tables
No Temp space allocation and monitoring
No RAID level decisions for dbspaces
No logical volume creation for files
No integration of OS kernel recommendations
No maintenance of OS recommended patch levels
No JAD sessions to configure host/network/storage
Data Experts, not Database Experts
Easy Administration Portal
No software installation
No indexes and tuning
No storage administration
Simplicity and Ease of Administration
Functionality Stack – Out of the box
3rd Party Connectivity
ODBC / JDBC / OLE-DB
Eclipse Plug-in
Performance Monitoring GUI
Administration GUI
3rd Party B&R Interface
Workload Management
Load & Unload
Backup & Restore
Spatial Analytics
In-Database Analytics Framework and Analytics Libraries
SQL Extensions Package – Data Encryption / Decryption
User Defined Functions / Aggregates / Stored Procedures
SQL ANSI-92 Compliant plus ANSI-99 Analytic Extensions
Automatic Data Compression / Decompression
Data Warehouse Appliance Platform (Parallel Hardware – CPUs/FPGAs/Disks/Network, Software – DBMS)
Access/Object Security Model LDAP / Kerberos
Row Level Data Security
Monitoring / Auditing
Event Management / Alerting
Systems Management Tools / 3rd Party Systems Management Integration
IBM PureData System for Analytics - Analytics Ecosystem
IBM PureData System for Analytics
Massively Parallel Platform
Netezza In-Database Analytics
Transformations
Geospatial
Predictive
Statistics
Data Mining
Other Tools In-Database Analytics
SAS
R
Fuzzy Logix
Zementis
IBM SPSS
BI Tools
Visualization Tools
Software Development Kit
User-Defined Extensions (UDF, UDA, UDTF, UDAP)
Language Support (Map/Reduce, Java, Python, Lua, Perl, C, C++, Fortran, PMML)
Custom Stored Procedures (NZPLSQL)
IBM BigInsights
IBM Streams
Big Data and Business Intelligence Ready
Unlocking Data’s True Potential
Real-time Analytics: InfoSphere Streams Developer Edition, 2 users, non-production licenses
Business Intelligence: Cognos software, 5 Analytics User licenses, plus 1 Analytics Administrator license
Hadoop Data Services: InfoSphere BigInsights software licenses to manage ~100 TB of Hadoop data
Exceptional value provided
Included with the PureData System for Analytics N3001
Data Integration & Transformation: InfoSphere DataStage, 280 PVUs, 2 concurrent Designer Client licenses and InfoSphere Data Click
Data Warehouse Appliance: IBM Fluid Query included with NPS appliance software
The query layer of Cognos Business Intelligence
Generates SQL specifically optimized for each version of Netezza to exploit its analytical functions as much as possible
Blends Netezza data with other popular sources of business data
Powerful, efficient data summarization
Security-aware in-memory caching avoids redundant queries
Dynamic query mode employs a 64-bit extensible Java query engine
Compatible query mode for easy upgrades from Cognos 8
Dynamic Query
Compatible Query
Dynamic Cubes
Executing DMR Report in Dynamic Query Mode
A dimensional report results in an MDX query against the execution engine
If the dimension and measure data are in the cache, the query is computed directly without accessing the database
If the data is not in the cache, the necessary data is gathered with a relational SQL query
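The cache-first behaviour described above can be illustrated with a small sketch. This is not Cognos code: the class, its method names, and the idea of keying the cache on a (security context, query) pair are assumptions made for illustration, chosen to reflect the "security-aware" caching the slides mention. The `run_sql` callable stands in for the relational SQL query sent to the database.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple
CacheKey = Tuple[str, str]  # (security context, query text)


class SecurityAwareQueryCache:
    """Hypothetical cache-first lookup with a SQL fallback."""

    def __init__(self, run_sql: Callable[[str], List[Row]]) -> None:
        self._run_sql = run_sql          # stand-in for the database call
        self._cache: Dict[CacheKey, List[Row]] = {}
        self.database_calls = 0          # counts cache misses

    def fetch(self, security_context: str, query: str) -> List[Row]:
        # Results are cached per security context, so one user's
        # cached rows are never served to a different context.
        key = (security_context, query)
        if key in self._cache:
            return self._cache[key]      # answered without the database
        self.database_calls += 1
        rows = self._run_sql(query)      # cache miss: go to the database
        self._cache[key] = rows
        return rows


if __name__ == "__main__":
    cache = SecurityAwareQueryCache(lambda q: [("total_sales", 42)])
    cache.fetch("analyst", "SELECT SUM(sales) FROM orders")
    cache.fetch("analyst", "SELECT SUM(sales) FROM orders")  # cache hit
    cache.fetch("auditor", "SELECT SUM(sales) FROM orders")  # new context
    print(cache.database_calls)
```

Keying on the security context trades some cache reuse for isolation: identical queries from different contexts each reach the database once.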
Dynamic Query Mode is optimized for PDA
Offers a high-performing OLAP Over Relational experience via hybrid SQL/MDX techniques
Avoids redundant queries through security-aware metadata, data, and query plan cache management
Provides built-in query visualization tool
Leverages 64-bit architecture
Uses JDBC connection to Netezza
Advanced sorting behavior that aligns DMR queries with other OLAP data sources
Hadoop queries
• Query compressed data from Big SQL
Data movement
• Import databases to Hadoop
• Append tables on Hadoop
Database queries
Unifying PureData System for Analytics with Hadoop, Spark & RDBMS
IBM Fluid Query – Extends Your Data Warehouse
IBM Fluid Query Helps Deliver Advanced Business Analytics
Simpler, improved user self service capabilities
Easier, faster consumption of data
Better, more transparent access to required data sources
IBM Fluid Query – What it Does
RDBMS Data
Hadoop Data
Fluid Query
Extends the reach of your data warehouse into traditional databases
› Expand your data reach to other relational databases like:
– DB2, dashDB, Oracle and many other operational databases
– Multi-generational / multi-model IBM Netezza appliances supported
Enables transparent data access and integration across the enterprise
› Leverage existing, fit-to-purpose data stores without adding complexity
› Leverage new data sources without application changes
Unifies Hadoop with IBM Netezza
› Access / Utilize / Leverage Spark, Cloudera, Hortonworks, BigInsights data
› Query BigInsights Hadoop data with Big SQL or from PureData System for Analytics
IBM Fluid Query – Powering IBM’s Logical Data Warehouse
Within both IBM Netezza and BigInsights, DB2 and BLU
SQL access to data across any system from Hadoop, including relational data via IBM Big SQL.
Run Hadoop queries from your EDW and move data to and from Hadoop via IBM Fluid Query on Netezza
Other sources
Bulk data
Movement
Asking Questions,
Getting Answers.
› Intelligently route queries.
› Simplify and unify information access for
consumers.
› Access data for analytics and business insight.
Operational Analytics
IBM Fluid Query Use Cases
Discovery & Exploration
‒ Land data in Hadoop for discovery, exploration & “day 0” archive
‒ Queries can access data across IBM Netezza, Hadoop and other database sources in your LDW
‒ Spark / IBM machine learning partnership enables pattern recognition applications
Build bridges to RDBMS islands
Combine data from different enterprise divisions currently trapped in separate database implementations
Access structured data from familiar sources like Oracle, DB2, IBM Netezza and dashDB
Data Warehouse Capacity Relief and Disaster Recovery
Offload colder data from IBM Netezza to Hadoop to relieve resources on the data warehouse
Copy data to Hadoop as a disaster recovery solution (immutable backup or compressed read)
Back up your database to Hadoop in an immutable format
Queryable Archive
Query historical data on Hadoop with Big SQL or from IBM Netezza
Combine Hadoop data in IBM Big SQL, Hive, Impala or Spark SQL with other data sources
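The capacity relief and queryable archive use cases imply a routing decision: which store holds the rows a query needs. The sketch below illustrates that decision only. It does not use any Fluid Query API; the function name, the store labels, and the single date cutoff are hypothetical, and it assumes rows dated before the cutoff have been offloaded to Hadoop while rows on or after it remain in the warehouse.

```python
from datetime import date
from typing import List


def route_query(start: date, end: date, archive_cutoff: date) -> List[str]:
    """Return the stores a date-range query must touch.

    Assumption: rows dated before archive_cutoff live in the Hadoop
    archive; rows on or after it live in the data warehouse.
    """
    if start > end:
        raise ValueError("start must not be after end")
    targets: List[str] = []
    if start < archive_cutoff:
        targets.append("hadoop_archive")   # colder, offloaded data
    if end >= archive_cutoff:
        targets.append("warehouse")        # recent, hot data
    return targets


if __name__ == "__main__":
    cutoff = date(2016, 1, 1)
    # Purely historical range: archive only.
    print(route_query(date(2015, 1, 1), date(2015, 6, 30), cutoff))
    # Range spanning the cutoff: both stores are needed.
    print(route_query(date(2015, 11, 1), date(2016, 2, 1), cutoff))
```

A query that spans the cutoff touches both stores, which is the case where combining Hadoop data with warehouse data in one result matters.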
Cross platform query & data movement
Between IBM Netezza and Hadoop
Unifying IBM Netezza with Hadoop
Hadoop Queries
Data Movement
IBM Fluid Query – Extending to Hadoop
Question
Answer
Cross-platform query from PureData System for Analytics to dashDB, DB2, Oracle and other PureData System for Analytics appliances
Unifying PureData System for Analytics with structured databases
SQL Queries
IBM Fluid Query extends your data warehouse to RDBMS* sources
Question
Answer
*Relational Database Management System
More Accurate Decision-Making
Typical customer benefits include:
Real-time sales and inventory insights
Speed that transforms the business
Game-changing ways of working
Increased profitability, new revenue streams and reduced costs
Barnes & Noble: “Suppliers can log in on a daily basis and see sales and stock ratios. It shows them what’s selling and how, and the categories they’re strong or weak in.” - Tom Williams, Web Director, Barnes & Noble
Catalina Marketing: “With Netezza’s in-database technology, we can now individualize offers to millions of customers, resulting in coupon redemption rates that are unheard of in the industry.” - Eric Williams, CIO and EVP, Catalina Marketing
Virgin Media: Enabled credit services team to save €14 million in bad debt write-off and customer churn. Produced 67% more effective marketing campaigns, ROI < 3 months.
Nielsen: “…when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business…” - Greg Goff, SVP Application Development, Nielsen
Accelerated Time to Value
• Integrated appliance – deployed in days, as opposed to months with traditional systems
• Low implementation services cost
• Minimal disruption to business
Carter’s Inc: re-platformed from Oracle to PDA in less than six weeks
Central England Co-Operative: all data loading complete within one week of installation
eHarmony: “They shipped us a box, we put it into our data center and plugged into our
network. Within 24 hours we were up and running. I'm not exaggerating, it was that easy”
- Joseph Essas, Vice President of Technology, eHarmony
International Technology Group Report: “Cost/Benefit Case for IBM PureData System
for Analytics”, June 2014:
“76% of IBM PureData System for Analytics users reported overall deployment times of
three weeks or less”
“The fastest reported IBM PureData System for Analytics deployment provided reporting
data to 500+ users within four days”.
Low Total Cost of Ownership
• Less than 1 FTE to administer
• Minimal training requirements – leverages existing skills within SIG
• One product – simple and predictable future support costs
iBasis, a KPN Company: “Our data warehouse team consists of one to two employees that
we need once every three months, to do small changes for release verifications”.
– Mark Saponar, CIO, iBasis
International Technology Group Report: “Cost/Benefit Case for IBM PureData System
for Analytics”, June 2014:
“Among 21 IBM PureData System for Analytics users, 18 employed less than one FTE for
database as well as system storage administration. One organization cited a single FTE
supporting multiple systems. Two cited two FTEs supporting more than 20 and more than
30 systems respectively.
Among organizations with less than one FTE, 12 (67 percent) estimated that the actual
number was less than 0.5. Administration overhead was said to represent a fraction of one
person’s time once a week…two hours a week…a couple of hours a week…a few hours
a month…less than an hour a day (to administer five systems)…maybe six hours
every three months…20 hours a year.”
A True Solution That Drives …
Decreased Time to Value: Much easier and faster deployment
“They shipped us a box, we put it into our data center and plugged into our network. Within 24 hours we were up and running. I'm not exaggerating, it was that easy.”
- Joseph Essas, Vice President of Technology, eHarmony
Increased Analytical Power: Speed that transforms the business
“…when something took 24 hours I could only do so much with it, but when something takes 10 seconds, I may be able to completely rethink the business…”
- SVP Application Development, Nielsen
Game Changing Ways of Working
“With Netezza’s in-database technology, we can now individualize offers to millions of customers, resulting in coupon redemption rates that are unheard of in the industry.”
- Eric Williams, CIO and EVP, Catalina Marketing
Lower Cost of Ownership
“Our data warehouse team consists of one to two employees that we need once every three months, to do small changes for release verifications.”
- Mark Saponar, CIO, iBasis, a KPN Affiliate
Real Time Sales and Inventory Insights
“Suppliers can log in on a daily basis and see sales and stock ratios. It shows them what’s selling and how, and the categories they’re strong or weak in.”
- Tom Williams, Director, Web Services, Barnes & Noble
Increased Profitability, New Revenue Streams and Reduced Business Costs
“The PureData System, powered by Netezza technology, provided huge technical advantages & big business advantages. We can now insure devices on behalf of a bank in the UK, which we couldn’t have done before.”
- Paul Scullion, Head of Business Intelligence, Carphone Warehouse
Thank you