IBM Future of Power 04.09.13
© 2013 IBM Corporation
Søren Ravn ([email protected])
Big Data Architect
IBM Software Group, Information Management
September 4th, 2013
Big Data: a Paradigm Shift
August 2013
IBM Future of Power Event 2013
What is Big Data?
Where is it coming from?
Where is it going?
What can I do with it?
A Google search on “What is Big Data” returns 2.9 million hits
What is Big Data?
A definition:
Big Data are datasets that grow so large and/or varied that they become awkward to work with using traditional information management technologies
What is Big Data or Big Data Analytics?
TDWI: Big data analytics is the application of advanced analytic techniques to very large, diverse data sets that often include varied data types and streaming data.
Wikipedia: Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization.
Forbes: Big data is new and “ginormous” and scary – very, very scary….
U.S. federal Big Data commission report: Big Data is a phenomenon defined by the rapid acceleration in the expanding volume of high velocity, complex, and diverse types of data. Big Data is often defined along three dimensions: volume, velocity, and variety.
McKinsey Global Institute: Big data: The next frontier for innovation, competition, and productivity
U.S. federal Big Data commission report
The Big Data Commission will provide guidance to the White House and Congress on the use of big data to improve government efficiency, services and capabilities, and drive innovation and the economy
• The Commission was formed in May, 2012
• Steve Mills co-chair
• Brought together experts from Government, Academia and Industry
• The report seeks to demystify big data, and focus on the business and mission value it will deliver
• Intent is to provide clear recommendations and a roadmap for getting started
Find it here: http://ibmdatamag.com/2012/11/demystifying-big-data/
“Data is the New Oil”
“We have for the first time an economy based on a key resource [Information] that is not only renewable, but self-generating. Running out of it is not a problem, but drowning in it is.”
– John Naisbitt
Harvesting any resource requires Mining, Refining and Delivering
Big Data is the next Natural Resource
Integration & Analytics (DW, MDM,…)
The unseen information
Governance
Operational systems
2+ billion people on the Web by end 2011
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
100s of millions of GPS-enabled devices sold annually
76 million smart meters in 2009… 200M by 2014
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
Where is big data coming from?
Big data is a hot topic because technology makes it possible to analyze ALL available data
Cost effectively manage and analyze all available data,
in its native form – unstructured, structured, streaming
Sources: ERP, CRM, RFID, Website, Network Switches, Social Media, Billing
The characteristics of big data
Collectively Analyzing the broadening Variety
Responding to the increasing Velocity
Cost efficiently processing the growing Volume
Establishing the Veracity of big data sources
30 billion RFID sensors and counting
1 in 3 business leaders don’t trust the information they use to make decisions
50x growth from 2010 to 2020, to 35 ZB
80% of the world’s data is unstructured
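The volume figures on this slide can be turned into a rough annual growth rate. The sketch below assumes that "50x" is the growth factor between 2010 and 2020 and that 35 ZB is the 2020 volume; the slide does not state this explicitly, and the function names are illustrative only.

```python
# Rough arithmetic on the slide's volume figures.
# Assumptions (not stated on the slide): "50x" is the growth factor
# from 2010 to 2020, and 35 ZB is the volume reached in 2020.

def implied_baseline(final_volume_zb, growth_factor):
    """Volume at the start of the period implied by the end volume and factor."""
    return final_volume_zb / growth_factor


def implied_annual_growth(growth_factor, years):
    """Compound annual growth rate that yields growth_factor over the period."""
    return growth_factor ** (1.0 / years) - 1.0


if __name__ == "__main__":
    baseline_2010 = implied_baseline(35.0, 50.0)
    rate = implied_annual_growth(50.0, 2020 - 2010)
    print(f"Implied 2010 volume: {baseline_2010:.1f} ZB")
    print(f"Implied compound annual growth: {rate:.1%}")
```

Under those assumptions the data volume starts at about 0.7 ZB in 2010 and grows by roughly 48% per year.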
Extending and Integrating Big Data requires a Holistic Approach
Traditional Approach: structured, analytical, logical
Structured, Repeatable, Linear
Data Warehouse
Traditional Sources: Internal App Data, Transaction Data, Mainframe Data, OLTP System Data, ERP Data
New Approach: creative, holistic thought, intuition
Unstructured, Exploratory, Dynamic
Hadoop and Streams
New Sources: Multimedia, Web Logs, Social Data, Sensor data: images, RFID, Text Data: emails
New Architecture to Leverage All Data and Analytics
Data in Motion, Data at Rest, Data in Many Forms
Information Ingestion and Operational Information: Stream Processing, Data Integration, Master Data
Streams
Real-time Analytics: Video/Audio, Network/Sensor, Entity Analytics, Predictive
Landing Area, Analytics Zone and Archive: Raw Data, Structured Data, Text Analytics, Data Mining, Entity Analytics, Machine Learning
Exploration, Integrated Warehouse, and Mart Zones: Discovery, Deep Reflection, Operational, Predictive
Decision Management
BI and Predictive Analytics
Navigation and Discovery
Intelligence Analysis
Information Governance, Security and Business Continuity
Big Data Use Cases
Big Data Exploration: Find, visualize, understand all big data to improve decision making
Enhanced 360º View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
Operations Analysis: Analyze a variety of machine data for improved business results
Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency
Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real-time
Vestas optimizes capital investments based on 2.5 Petabytes of information
Need
• Model the weather to optimize placement of turbines, maximizing power generation and life expectancy
Benefits
• Reduce time required to identify placement of turbine from weeks to hours
• Reduces IT footprint and costs, and decreases energy consumption by 40% while increasing computational power
• Incorporate 2.5 PB of structured and semi-structured information flows. Data volume expected to grow to 6 PB
How do you correlate information across different data sets, e.g., social media and trusted enterprise data?
How do you decide the next best action when dealing with customers?
How do you monitor and visualize data in real time and generate alerts?
Is your customer data distributed among many different applications and sources? How do you deliver it in usable form to the employees who need it?
Enhanced 360º View of the Customer
Requirements
• Create a connected picture of the customer
• Mine all existing and new sources of information
• Analyze social media to uncover sentiment about products
• Add value by optimizing every client interaction
Industry Examples
• Smart meter analysis
• Telco data location monetization
• Retail marketing optimization
• Travel and Transport customer analytics and loyalty marketing
• Financial Services Next Best Action and customer retention
• Automotive warranty claims
Optimize every customer interaction by knowing everything about them
Enhanced 360º View of the Customer: In Practice
360º View of Party Identity
SOURCE SYSTEMS
CRM
Name: J Robertson
Address: 35 West 15th
Address: Pittsburgh, PA 15213
ERP
Name: Janet Robertson
Address: 35 West 15th St.
Address: Pittsburgh, PA 15213
Legacy
Name: Jan Robertson
Address: 36 West 15th St.
Address: Pittsburgh, PA 15213
InfoSphere MDM
First: Janet
Last: Robertson
Address: 35 West 15th St
City: Pittsburgh
State/Zip: PA / 15213
Gender: F
Age: 48
DOB: 1/4/64
Unified View of Party’s Information
InfoSphere Data Explorer
BigInsights, Streams, Warehouse
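To make the idea of a unified party view concrete, here is a minimal sketch that merges the three source records from the slide into one record. It is not how InfoSphere MDM works. The match rule (same surname and zip code, compatible first names) and the survivorship rule (longest name, majority-vote street) are hypothetical simplifications, and all function and field names are made up for illustration.

```python
from collections import Counter

# The three source-system records shown on the slide.
SOURCE_RECORDS = [
    {"system": "CRM", "name": "J Robertson", "street": "35 West 15th",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "ERP", "name": "Janet Robertson", "street": "35 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "Legacy", "name": "Jan Robertson", "street": "36 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
]


def normalize_street(street):
    """Lower-case, drop periods and a trailing 'st' so variants compare equal."""
    tokens = street.lower().replace(".", "").split()
    if tokens and tokens[-1] == "st":
        tokens.pop()
    return " ".join(tokens)


def likely_same_party(a, b):
    """Hypothetical match rule: same surname and zip, compatible first names."""
    first_a, last_a = a["name"].lower().split()
    first_b, last_b = b["name"].lower().split()
    compatible = first_a.startswith(first_b) or first_b.startswith(first_a)
    return last_a == last_b and a["zip"] == b["zip"] and compatible


def unified_view(records):
    """Hypothetical survivorship: keep the longest name, majority-vote the street."""
    first, last = max((r["name"] for r in records), key=len).split()
    votes = Counter(normalize_street(r["street"]) for r in records)
    street = votes.most_common(1)[0][0]
    return {"first": first, "last": last, "street": street,
            "city": records[0]["city"], "state": records[0]["state"],
            "zip": records[0]["zip"]}


if __name__ == "__main__":
    anchor = SOURCE_RECORDS[1]  # use the ERP record as the comparison anchor
    matched = [r for r in SOURCE_RECORDS if likely_same_party(anchor, r)]
    print(unified_view(matched))
```

With the slide's data, all three records match and the merged record keeps "Janet Robertson" and the street that two of the three systems agree on. The Legacy system's "36 West" is outvoted.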
Security/Intelligence Extension
Security/Intelligence Extension enhances traditional security solutions by analyzing all types and sources of under-leveraged data
Enhanced Intelligence & Surveillance Insight
Analyze data-in-motion & at rest to:
• Find associations
• Uncover patterns and facts
• Maintain currency of information
Real-time Cyber Attack Prediction & Mitigation
Analyze network traffic to:
• Discover new threats early
• Detect known complex threats
• Take action in real-time
Crime prediction & protection
Analyze Telco & social data to:
• Gather criminal evidence
• Prevent criminal activities
• Proactively apprehend criminals
Handling Machine Data Brings Unique Challenges
Data Sources and Integration
• Complex formats, no standards
• Extremely large data volumes
• Mix of enterprise and machine data
• Streaming data as well as data at rest
Analytics
• Large scale indexing
• Correlation across different data sets
• Advanced analytics for different data types
Visualizations/Actions/Outputs
• New visualizations for streaming and massive data sets
• Real-time dashboards
• Geospatial mash-up
Outcome
- Gain deep insights into operations, customer experience, transactions and behavior
- Proactive planning to increase operational efficiency
- Troubleshoot problems and investigate security incidents
- Monitor end-to-end infrastructure to avoid service degradation or outages
Data Warehouse Augmentation: Value & Diagram
Three augmentation patterns:
1. Pre-Processing Hub
2. Query-able Archive
3. Ad hoc & Exploratory Analysis
Diagram labels: Information Integration; Data Warehouse; Streams (real-time processing); BigInsights (landing zone for all data); BigInsights combined with unstructured & new kinds of data
Data Warehouse Augmentation: Next Generation Enterprise Data Warehouse Architecture
Consumers: Predictive Analytics; BI & Reporting; Visualization & Discovery; Custom Applications
Zones: Operational Warehouse Zone; Analytics Warehouse Zone; Hadoop Zone (Preprocessing, Queryable Archive, Ad Hoc Analysis); Real time Analytics Zone
Hadoop Analytics & Visualization
Information Integration and Governance: Integration, Master Data, Governance
Data types: Structured, Semi Structured, Unstructured
IBM Big Data Platform - Move the Analytics Closer to the Data
IBM Big Analytics
IBM Big Data Platform
Systems Management
Application Development
Visualization & Discovery
Accelerators
Information Integration & Governance
Hadoop System
Stream Computing
Data Warehouse
New analytic applications drive the requirements for a big data platform
• Integrate and manage the full variety, velocity and volume of data
• Apply advanced analytics to information in its native form
• Visualize all available data for ad-hoc analysis
• Development environment for building new analytic applications
• Workload optimization and scheduling
• Security and Governance
IBM Big Analytics
Assemble & Distill
Consume & Deliver
Explore & Experiment
Report & Act
Predict & Analyze
Applied Analytics
Next wave of analytics harnesses the value of the new mix of information
• Visualize and explore the variety, velocity and volume of big data
• Apply advanced analytics to uncover patterns previously hidden
• Blend traditional structured information with data previously unavailable
• Optimize access and delivery to take insight to action
• Extend existing capabilities to address specific analytic applications
Hadoop System
Stream Computing
Data Warehouse
Operational Sources
Big Data
PureData System for Hadoop
Bringing Big Data to the enterprise
• Simplify the delivery of unstructured data to the enterprise
• Integrate Hadoop with the data warehouse
• Leverage Hadoop for data archive
• Provide best in class security
• Provide data exploration across structured and unstructured data
• Accelerate insight with machine data
• Accelerate insight with social data
Beyond today’s big data appliances
Pre-defined PowerLinux Hadoop/BigInsights Configurations
Big Data considerations....
• How to find out that the datasets exist?
• How to get permission to access and use them?
• Privacy, confidentiality, security?
• How to combine disparate datasets and sources?
• How to normalize and integrate?
• How to reconcile standards and metadata considerations?
• Underlying data structures?
• Interoperability?
• How to get the people who collect these disparate data types to communicate with one another? (And with the computer people?)
• If they understand one another better, will combining diverse data be easier and more useful?
• How to get people who don’t understand data structures and architecture to understand them well enough to make analysis and modeling more possible and successful?
How do you address these challenges?
These experiences reveal a great irony -- that while the impact of Big Data will be transformational, the path to effectively harnessing it is not. The journey is evolutionary versus revolutionary, incremental and iterative
– Demystifying Big Data, TechAmerica Report, October 2012
Is your organization characterized by one or more of the following traits?
1. Executive Management wants a big data plan
2. Executive Management wants it to be realistic and drive value as it is being implemented
3. Wants a partner to rely on for guidance & expertise to lower risk
4. Big Data must be leveraged with the existing infrastructure
5. Concerned about the complexity & risk of Big Data acquisition
Patterns of organizational behavior are consistent across four stages of big data adoption
Big data adoption
When segmented into four groups based on current levels of big data activity, respondents showed significant consistency in organizational behaviors
Total respondents n = 1061
Totals do not equal 100% due to rounding
Importance of Hadoop & Big Data
• “We believe that more than half of the world’s data will be stored in Apache Hadoop within five years” – Hortonworks
Gartner on Hadoop: Don’t Delay
- Big data analytics and the Apache Hadoop open source project are rapidly emerging as the preferred solution to address business and technology trends that are disrupting traditional data management and processing. Enterprises can gain a competitive advantage by being early adopters of big data analytics.
- Enterprises should consider adopting a packaged Hadoop distribution . . . to reduce the technical risk and increase speed of implementation of the Hadoop initiative.
- Enterprises should not delay implementation just because of the technical nature of big data analytics.... Early adopters will gain competitive advantage and invaluable experience, which will sustain the advantage as the technology matures and gains wider acceptance.
- Adopt big data analytics and . . . Hadoop . . . to meet the challenges of the changing business and technology landscape.
So... don’t get lost in the sea of data
THINK