© Cloudera, Inc. All rights reserved.
How Hadoop Changes the Analytics Paradigm
Kunal Taneja, System Engineering Manager A/NZ
Why is Big Data Happening Now?
Instrumentation: Everything that can be measured will be measured.
Consumerization: Employees and customers expect more personal interactions, but not at the cost of their privacy.
Experimentation: The most innovative companies embrace experimentation and agility.
Example: Instrumentation in Banking
Regulatory timeline, 2012 to 2019:
ICB Ring-fencing
ICB Loss Absorbency
Leverage Ratio – Basel III
NSFR – Basel III
MiFID II
T2S
LCR – Basel III
ICB / Competition
Audit Policy
Cross Border Debt Recovery
Financial Transaction Tax
Market Abuse Directive (MAD II)
PRIP
Accounting Directive Review
AIFM Directive
EU Transparency Directive
EU Reg on Credit Rating Agencies
CRDV
Internal Governance Guidelines
FATCA
PD
EMIR
Swaps Push-Out – Dodd-Frank
Securities Law Directive (SLD)
Volcker Rule – Dodd-Frank
Short Selling
Close Out Netting
Crisis Management
Recovery & Resolution
Effective dates yet to be confirmed
Data Sources
Data Systems
Data Access
Business Analytics
Custom Applications
Existing Data
Databases
Operational Applications
New Data
Traditional Architectures Under Pressure
Limited Data
• Not efficient to keep existing data, let alone handle new data sources.
• Time consuming to transform data for analysis in existing systems.
Limited Insights
• Power users struggle with data.
• Many users have no data.
Compliance and Privacy
• More data, more users, and more tools create complexity.
• Need to balance business agility with security and governance.
Traditional Architectures Under Pressure
Source Systems
Enterprise Data Warehouse
BI Abstraction & Reporting Layer
Data Acquisition Layer
• Extraction & Staging
•Cleansing
ATOMIC Layer
•Normalisation & Storage
Performance & Access
•Transformation & Calculation
• Performance & Access
Dashboard & Reports Ad-hoc Analysis Mobile
ETL
Data Modelling
Example - Limited Data: “iPhone users more likely to buy?”
(Diagram: the same data warehouse architecture as on the previous slide.)
Example - Limited Data
Example - Limited Data (slow transforms): “What is my VaR?”
(Diagram: the same data warehouse architecture, annotated with the stage labels CDC, ETL, ETL and the durations 8 Hours, 12 Hours, 2 Hours.)
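The durations on this slide give a rough sense of how stale the data is by the time a question like “What is my VaR?” can be answered. A minimal sketch of the arithmetic, assuming the three stages run strictly one after another (the slide text does not confirm this) and assuming a 24-hour daily batch cycle, which is a hypothetical figure used only for illustration:

```python
# Durations shown on the slide, in hours. Which stage label (CDC, ETL, ETL)
# belongs to which figure is not recoverable from the text, so only the
# total is computed here.
stage_hours = [8, 12, 2]

# Assumption: the stages run sequentially, so their latencies add up.
end_to_end_hours = sum(stage_hours)

# Assumption: a 24-hour daily batch cycle (hypothetical, not from the slide).
slack_hours = 24 - end_to_end_hours

print(f"End-to-end latency: {end_to_end_hours} hours")
print(f"Slack left in a daily cycle: {slack_hours} hours")
```

Under those assumptions, almost the entire daily window is spent moving and transforming data before any analysis can start.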
Example - Limited Insights
©2014 Cloudera, Inc. All rights reserved.
Expanding Data Requires A New Approach
What we do: Copy Data to Applications
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data
What we should do: Bring Applications to Data
Information-centric businesses use all data:
• Multi-structured, internal and external data of all types
The Old Way: Bringing Data to Applications
1. Can’t Retain Valuable Data
• Leaving data behind
• Risk and compliance
• High cost of storage
2. Can’t Meet ETL SLAs
• Up-front modeling
• Transforms slow
• Transforms lose data
3. Can’t Ask New Questions
• Existing systems strained
• No agility
• “BI backlog”
4. Can’t Get a 360 View
• Many special-purpose systems
• Moving data around
• No complete views
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
The New Way: Bringing Applications to Data
SERVERS MARTS EDWS DOCUMENTS STORAGE SEARCH ARCHIVE
ERP, CRM, RDBMS, MACHINES FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS EXTERNAL DATA SOURCES
1. Active Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage
2. Scalable Transformations
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
3. Agile Exploration
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
4. Consolidated Architecture
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
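The “schema on read” point under Agile Exploration can be illustrated with a small sketch. This is plain Python over made-up records, not the Hive or Impala API; the field names, the sample events, and the helper `read_with_schema` are all hypothetical. The idea shown is that raw data is stored once, unmodelled, and each question applies its own schema at read time.

```python
import json

# Raw events kept exactly as they arrived (full fidelity). No schema was
# imposed at write time. These records are invented for illustration.
RAW_EVENTS = [
    '{"user": "a1", "device": "iPhone", "amount": "19.99"}',
    '{"user": "b2", "device": "Android"}',
    '{"user": "c3", "device": "iPhone", "amount": "5.00", "campaign": "spring"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema while reading: select fields, cast types, and
    fill fields missing from a record with None."""
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Question 1: total spend per device. The reader chooses the schema.
spend_schema = {"device": str, "amount": float}
spend_by_device = {}
for row in read_with_schema(RAW_EVENTS, spend_schema):
    amount = row["amount"] or 0.0
    spend_by_device[row["device"]] = spend_by_device.get(row["device"], 0.0) + amount

# Question 2, asked later: which campaigns appear in the data?
# The same raw data is reused with a different schema; nothing is reloaded.
campaign_schema = {"user": str, "campaign": str}
campaigns = [
    row["campaign"]
    for row in read_with_schema(RAW_EVENTS, campaign_schema)
    if row["campaign"] is not None
]
```

Because the `campaign` field was never dropped by an up-front model, the second question can be answered without going back to the source systems.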
One Platform, Many Workloads
Process
• Ingest: Sqoop, Flume
• Transform: MapReduce, Hive, Pig, Spark
Discover
• Analytic Database: Impala
• Search: Solr
Model
• Machine Learning: SAS, R, Spark, Mahout
Serve
• NoSQL Database: HBase
• Streaming: Spark Streaming
Unlimited Storage: HDFS, HBase
Security and Administration: YARN, Cloudera Manager, Cloudera Navigator
A new kind of data platform
• One place for unlimited data
• Unified, multi-framework data access
Only with Cloudera:
• Leading performance
• Enterprise system and data management
• Fundamentally secure
• Open source, open standards
SAS on Cloudera Enterprise
• Tight Integration with SAS suite of access engines, big data, and in-memory analytics solutions
• Certified Impala connector to SAS VA delivers the fastest interactive SQL on Hadoop
• Comprehensive data security and governance enable SAS users to innovate with confidence
Joint Customer Successes
With SAS® Visual Analytics, business executives at Telecom Italia can compare the performance between all operators for a key indicator – such as accessibility or percentage of dropped calls – on a single screen for a quick overview of pertinent strengths and weaknesses.
Epsilon built a next-generation marketing application, leveraging Cloudera and taking advantage of SAS® capabilities by our data science/analytics team, that provides its clients with a 360-degree view of their customer.
AMERAN provides 360-degree views into energy usage patterns and similar household comparisons to help consumers save energy.
Optimize Discover Empower
We Are Hiring in NZ
https://jobs.jobvite.com/cloudera/