This presentation, including any supporting materials, is owned by Gartner, Inc. and/or its affiliates and is for the sole use of the intended Gartner audience or other authorized recipients. This presentation may contain information that is confidential, proprietary or otherwise legally protected, and it may not be further copied, distributed or publicly displayed without the express written permission of Gartner, Inc. or its affiliates. © 2013 Gartner, Inc. and/or its affiliates. All rights reserved.
Sid Deshpande
Big Data Trends
Hype and Confusion About What Big Data Really Is
"Data that exceeds the processing capacity of conventional database systems."
"The tools, processes and procedures allowing an organization to create, manipulate, and manage very large datasets and storage facilities."
"The voluminous amount of unstructured and semistructured data a company creates: data that would take too much time and cost too much money to load into a relational database for analysis."
"Big data is not going away and it's only going to get bigger."
"Data big enough to raise practical rather than merely theoretical concerns about the effectiveness of anonymization."
"Data with velocity, volume and/or variety growing faster than Moore's law."
Definitions
"Big data" is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Cloud computing is a style of computing in which scalable and elastic IT-enabled capabilities are provided as a service to consumers using Internet technologies.
Technology and the Real World
“We are stuck with technology when what we really want is just stuff that works”
Photo Credit: www.douglasadams.com
Douglas Adams: ‘Salmon of Doubt’ - 2002
Why Big Data and Cloud Computing Together
Common Approaches Required
1. Both initiatives need a business focus to ensure success
2. End users are hard pressed to separate vendor hype from real possibilities
3. Can potentially transform the datacenter environment
4. Have the potential to make businesses more agile, flexible and competitive
5. Early convergence of vendor solutions in both domains
The Nexus of Forces defines it all
The convergence and mutual reinforcement of social, mobile, cloud and information patterns that drive new business scenarios
New Business Scenarios
- Information is the context for social, mobile and cloud
- The nexus of these forces produces and demands new data types and new kinds of information processing
- Information governance is paramount
Key Issues
1. How can you avoid the hype and identify real benefit associated with Big Data and Cloud?
2. How do big data and cloud technologies integrate with incumbent technology platforms?
3. How will Big Data and Cloud Computing evolve in the next 3-5 years?
What does the Hype Cycle say?
Source: Hype Cycle for Emerging Technologies, 2012 (G00233931)
Big Data Intersection With the Cloud
• Storage in the cloud:
- Common infrastructures
- Gateway appliances
• Hadoop in the cloud:
- Hosted MapReduce as a Web service (example: Amazon Elastic MapReduce, on EC2 and S3)
- Hosted NoSQL (example: MongoHQ for developers who use MongoDB)
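For readers unfamiliar with the MapReduce model that hosted services such as Amazon Elastic MapReduce run, the following is a minimal single-process sketch in plain Python. It only illustrates the map, shuffle and reduce phases on an in-memory list; it is not the Hadoop or Elastic MapReduce API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence (illustrative only)."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big cloud"]))
```

In a real cluster the map and reduce steps would run in parallel on many nodes and the shuffle would move data across the network; here everything runs in one process to keep the idea visible.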
• Will the cloud be the answer?
- Scalability = millions of users, billions of files
- Elastic performance, capacity
- Multitenant, location-independent
• Low-cost option for experimentation
• Limiting factors: Bandwidth and transfer speeds
• Data location may be the driving factor if volume is very large and update rate is high
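The bandwidth limitation can be made concrete with a back-of-the-envelope estimate. The sketch below is illustrative only: the data size, link speed and efficiency factor are assumed values rather than figures from this presentation, and the helper names are hypothetical.

```python
def transfer_seconds(data_bytes, link_bits_per_second, efficiency=1.0):
    """Estimate the time needed to move data over a network link.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    link speed that is actually achieved after protocol overhead.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return data_bytes * 8 / (link_bits_per_second * efficiency)


def transfer_days(data_bytes, link_bits_per_second, efficiency=1.0):
    """Same estimate, expressed in days."""
    return transfer_seconds(data_bytes, link_bits_per_second, efficiency) / 86400


if __name__ == "__main__":
    # Assumed example: 100 TB (decimal, 10**12 bytes each) over a 1 Gbit/s link.
    days = transfer_days(100e12, 1e9)
    print(f"{days:.1f} days at full line rate")
```

Under these assumptions the upload takes a little over nine days even at full line rate, which suggests why data location can dominate the decision when volumes are very large.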
Big Data Opportunities in Vertical Industries
Banking and Securities
Use cases: CEP integration, fraud detection, dark data, data cleansing, visualizations, predictive modeling
Infrastructure trends: Heavy users of Hadoop, highly customized environments, high-end incumbent infrastructure
Communications and Media
Use cases: Location-based analytics, video analytics, content recognition, subscriber analytics, online advertising
Infrastructure trends: Mega-scale datacenters with diverse incumbent technology sets, high proclivity toward cloud-based solutions, ability to deploy greenfield solutions
Government
Use cases: Cybercrime and terrorism prevention, inter- and intra-governmental policy compliance, healthcare
Infrastructure trends: High interest in community clouds, strong mandate for secure, resilient solutions, good opportunity for high-performance hardware designs
Manufacturing
Use cases: Integration of unused OT data with IT systems, correlating engineering data with sales and demand-side data
Infrastructure trends: High volume of dark data, relative technology immaturity, siloed implementation of hardware, good potential for "in-memory" solutions, provider-driven hardware purchase profiles
Gartner Data Magnitude Index (DMI)
• The DMI model is a near-term (less than 24 months) framework that gauges the independent and composite index of the "3Vs" (volume, velocity, variety) for an organization to determine what class of technical solution is justified for consideration.
• The DMI is based on status quo conditions, so the onus lies on Technology and Service Providers (TSPs) to educate customers and enable them to increase their DMI, thereby helping them to be more competitive.
Source: Douglas Laney, Gartner, October 2012
Buying Centers for Big Data Infrastructure
Analysis Criteria | Enterprises | Providers | HPC
Incumbent Infrastructure | Legacy | Greenfield | Mixed
Ease of Funding | Medium | High | High
Cloud Proclivity | Low | High | Medium
Open Source Proclivity | Medium | High | Medium
Level of Customization | Low | High | Medium
Primary Concern | Business Outcomes, ROI Measurement | Economies of Scale and Service Delivery | Next Gen. Infrastructure, Specialist Tools
Bottom line: One size does NOT fit all
Big Data – business problems addressed
n = 449
What are the 'Big Data' business problems you are now addressing – or will likely address soon? (Multiple responses allowed.)
Popular initiatives from opposite ends of the spectrum.
Source: Gartner Research Circle 2012
Big Data - concerns
n = 473
Other than infrastructure growing pains, what is your biggest concern or challenge with 'Big Data'?
Source: Gartner Research Circle 2012
The elephant in the room
Is all Hadoop Big Data? Absolutely not!
Is Big Data only Hadoop? Absolutely not!
Can Hadoop drive Big Data outcomes? Absolutely!
Why is Hadoop so popular?
Other frameworks exist, but Hadoop has:
• Been proven in hyperscale datacenters
• A large number of committers, which means rapid evolution of projects
• Packaged, stable distributions with paid support
How will it evolve?
Integration with enterprise technology:
• DW with Hadoop embedded
• SQL front ending
• Hadoop on virtualized servers?
• Connectors and plug-ins with BI and storage platforms
The Myth of "Commodity" Hardware Versus Commercial Offerings
• Typical discussion revolves around "commodity hardware" for Hadoop.
• Vendors of enterprise-class hardware, such as SGI, have in fact been responsible for some of the largest clusters.
• EMC Greenplum, Oracle and Teradata-Aster have announced appliance form factors for big data customers.
• As Hadoop and other MapReduce processing move into the mainstream, the presupposition of high failure rates must change.
Big Data Solutions Landscape
Segments: Big Data Hardware Vendors, Big Data Software Vendors, Big Data Services
Categories shown: NoSQL, Appliances, Server, Storage, Networking, Hadoop Value Adds, Data Integration/Federation, BI, Distributed Processing, OT/Analytics/Visualization, Big Data in the Cloud
The Cloud Services Landscape: Evolution Continues Up the Layers
Providers are developing offerings across multiple segments, making market segments increasingly interconnected.
Service layers shown: System Infrastructure Services, App. Infrastructure Services, Application Services, Information Services, Business Proc. Serv.
Delivery models shown: IaaS, PaaS, SaaS, BPaaS
Cross-cutting elements shown: Mgmt. and Security, Cloud Enablement, Cloud Brokerage, Cloud Service Broker (CSB)*
Other label shown: Film Forecaster
CIO Questions on Big Data Infrastructure
Compute/Storage
• Commodity hardware or high-density servers?
• Scale-out NAS or traditional storage?
• Are we effectively increasing complexity?
Networking
• What network design points do I need to consider?
• How do network security considerations change?
• Does big data justify an upgrade to 10GbE or higher?
Appliances
• Don't they lead to increased lock-in?
• Backup to disk or tape?
• RTO considerations for backup?
Cloud Infrastructure
• Isn't cloud IaaS perfect for big data?
• Hadoop in the cloud: PoC or production?
• Data movement concerns?
Data Warehousing and BI
• What parts of our EDW can be migrated to Hadoop for batch processing?
• What is a ‘Logical Data Warehouse’?
Virtualization
• How can I use Hadoop in a virtualized server environment?
• What is ‘data virtualization’ and how can it help?
Security and Privacy
• What are the privacy and security implications of leveraging Big Data?
• Can Big Data help identify areas of business/IT risk?
Information Management
• How do information management policies need to evolve to keep pace with Big Data?
• What do we do with Dark Data?
Is Big Data too Big to Back Up? Maybe!
• Some big data may not need to persist.
• Big data is likely to be too large to back up through conventional methods:
- Backup to cloud option.
• Backup alternatives:
- Storage snapshot and replication.
- Special-purpose file systems that incorporate tiers of disk and tape.
New Technology Approaches Required
• Infrastructure technologies
- Additional space, cooling, power
• Servers — increasing reliability, redundancy, support
• Storage — scalability, performance
• Need headroom in memory, cores and storage
• Data management technologies
• Analysis techniques
- Can't meet big data analytics requirements with existing technology.
IT operations must integrate technologies for processing, management and analytics with storage/repository solutions for compression, deduplication and retention.
Delivering scalable analytics using distributed file systems such as the Hadoop Distributed File System (HDFS) must be combined with storage designs that contain massive growth.
Public Cloud Services* Growing Strongly; But Still Less Than 3% of Overall IT by 2016
Public Cloud Services spending (US$ billions):
2010: $43 B
2011: $50 B
2012: $58 B
2013: $70 B
2014: $84 B
2015: $100 B
2016: $117 B
Chart callouts: 4%, 19%
* Excluding Cloud Advertising
Source: Gartner, IT Spending Forecast, 2Q12 Update & Public Cloud Services Forecast 3Q12 Update, Sept 2012 (G00238928)
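As a rough cross-check on the growth shown in the chart, the compound annual growth rate (CAGR) implied by the endpoint figures, $43 billion in 2010 and $117 billion in 2016, can be worked out as below. This is a minimal sketch; the cagr helper is hypothetical and the calculation is not taken from the Gartner forecast itself, which may use a different base year.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Endpoint figures from the chart, in US$ billions.
    rate = cagr(43.0, 117.0, 2016 - 2010)
    print(f"Implied CAGR 2010-2016: {rate:.1%}")
```

The result is roughly 18% per year over the six-year span.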
Big Data Driven Spending and Market Structure
Source: Big Data Drives Rapid Changes in Infrastructure and $232 Billion in IT Spending Through 2016, Gartner Document ID: G00245237
Organizations are already replacing early implementations of big data solutions, and this rapid cycling will continue through 2020.
Big Data is a Composite Market
Total IT Spending Driven by Big Data Functional Demands
A rapid expansion in the volume, variety and velocity of data means that CIOs are under increasing pressure to explore technologies that help them deliver value to the business while minimizing the traditional trappings of enterprise storage purchases (such as vendor lock-in, licensing constraints and quick technology obsolescence).
Cloud Computing and Big Data are transforming the role of Enterprise IT
By 2016, 80% of big data projects will use architectures that account for less than 20% of total storage spending today.
By 2014, IT organizations in 30% of Global 1000 companies will broker (aggregate, integrate and customize) two or more cloud services for internal and external users, up from 5% in 2012.
Internal IT departments will begin to behave like external service providers and will play the role of a CSB in order to control the delivery and consumption of cloud services in their environments.
Organizations will look to both external service providers and internal IT to build and implement the key CSB functions of integration, aggregation and customization.
Big Data needs different skills, not all of which are in abundance
• Not just at the entry level, even CIOs need to constantly upgrade their skill sets
• Data Scientists with a business focus will become critical to success
• The relative immaturity of the technologies will drive demand for services, creating some 2.4 million job openings in the IT services sector globally through 2014
Gartner Predicts: By 2015, big data demand will reach 4.4 million jobs globally, but only one-third of those jobs will be filled.
Recommendations for IT and Business Leaders
Chief Information Officers (CIOs) and IT Leaders
Consider pace of technology evolution before writing the cheque
Focus on the business problems that big data and cloud computing can solve before selecting and evaluating technology platforms
Avoid vendor hype: View Big Data investments as incremental feature add-ons rather than a complete transformation of your data center
Invest in training senior IT leadership on technology platform innovations
Choose big data platforms based on long term viability of the vendor
Chief Executive Officers and Business Leaders
Big Data is not the problem; it is the solution to many business challenges
Lend support to IT leaders by funding departmental proof of concept projects that showcase business value
When procuring multitenant services, conduct or ask for extensive security and privacy audits of your provider's environment
View information as a currency to drive business competitiveness
Recommendations for Technology and Service Providers
Enable CxOs to create a business case and justifiable ROI models for Big Data and Cloud Computing initiatives
Datacenter portfolio vendors: Integrate not just go-to-market strategies but also internal product engineering strategies between different groups
Align cloud computing and big data product and go-to-market strategies to enable a smooth progression from proof of concept to production
Focus on platform integrations with multiple technology stacks within the same segment to ensure a wider array of choices for the end user
Vendors offering multitenant hosted Big Data platforms: Articulate data governance and privacy implications clearly to enterprise datacenter leaders