The Big Picture: Understanding the Many Roles of Hadoop
Exploratory Webcast | January 28, 2015
SPONSORED BY
Guests
Robin Bloor, Chief Analyst, The Bloor Group, @robinbloor
Eric Kavanagh, CEO, The Bloor Group, @eric_kavanagh
Making Sense of Hadoop (#Hadoop)
Exploratory Webcast: January 28, 2015
Roundtable Webcast: March 18, 2015
Findings Webcast: May 27, 2015
In Three Segments
Part 1 - The Forces of Disruption
Part 2 - Hadoop: Then, Now & Later
Part 3 - Focus Areas
The Generic Dimensions of IT
• All IT involves 4 components (only)
    • Users
    • Software
    • Data
    • Hardware
• Change any one of these and the other three components have to adjust
• Aggregate these and you get a process
• Time will impose change anyway
• We can also consider a larger field, since this applies to all systems, not just IT systems
Figure: Four Fundamental (IT) Factors. Labels: Hardware, Users, Software, Data; Business Information, Business Process, Human Activity, All Information; Staff, Facility, People, Civilization; TIME.
The Hexagon of Business Change
• Speed
    • Speed of action
    • Speed of business process
• Cost
    • Cost of acquisition
    • Cost of ownership
• Time
    • Time to deploy
    • Time to employ
• Business Value
    • By competitiveness
    • By cost reduction
• Effort
    • Effort to develop
    • Effort to deploy
• Fit
    • Compatible
    • Incompatible
Plus, capacity to change
Figure: Hexagon of Change Factors. Axes: SPEED (speed of process, speed of action), FIT (compatible, incompatible), TIME TAKEN (time to deploy, time to employ), VALUE (cost reduction, competitiveness), COST (acquisition cost, TCO), EFFORT (effort to deploy, effort to develop). Plus capacity.
The Technology Layers
• The buying impulse descends through the stack
• The impact of technology change rises up the stack
• This ensures the eventual “legacification” of all technology
Figure: The Technology Layers. The buying impulse goes down; technology change rises up.
Technology Layer Perspectives
• This simple model has a number of uses
• For example, we can use it to depict the “aaS options”
• More importantly, we can use it to track disruption
• More on this later
The aaS Possibilities
Disruption in The Technology Layers
• Disruption (as innovation) can happen in any layer
• Where it occurs, it will impact all layers above it
• And it may also impact the layers below it (but less quickly)
• There is no such thing as future-proof, but some technologies definitely live longer
Tech Revolutions
Mainframe Computer (Batch architecture)
On-line Interaction (Centralized architecture)
PC (Client-server)
Internet (Multi-tier architecture)
Mobile (Service-oriented architecture)
Internet of Things (Event-driven architecture)
Note that all of these disruptive changes were driven by hardware innovation
Hardware Layer Disruption
• SSD is now on the Moore’s Law curve
• Spinning disk has almost popped its clogs
• Memory grows and can be networked
• CPUs still evolving: CPU and GPU merged
Hierarchical Memory
• On-chip cache speed vs. RAM
    • L1 (32 KB) = 100x
    • L2 (256 KB) = 30x
    • L3 (8-20 MB) = 8.6x
• RAM vs. SSD
    • RAM = 300x
• SSD vs. disk
    • SSD = 10x
Note: Vector instructions and data compression
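The ratios on this slide can be chained to compare any two tiers of the hierarchy. Below is a minimal sketch in Python, assuming the slide's figures are rough multipliers (each cache level relative to RAM, RAM relative to SSD, SSD relative to spinning disk). The names `CACHE_VS_RAM`, `RAM_VS_SSD`, `SSD_VS_DISK` and `speedup_vs_disk` are illustrative and do not come from the webcast.

```python
# Speed ratios as quoted on the slide (treated here as rough multipliers).
# Each cache figure is relative to RAM; RAM is relative to SSD;
# SSD is relative to spinning disk.
CACHE_VS_RAM = {"L1": 100.0, "L2": 30.0, "L3": 8.6}
RAM_VS_SSD = 300.0
SSD_VS_DISK = 10.0


def speedup_vs_disk(tier: str) -> float:
    """Return how many times faster `tier` is than spinning disk."""
    ssd = SSD_VS_DISK          # SSD compared with disk
    ram = RAM_VS_SSD * ssd     # RAM compared with disk
    if tier == "Disk":
        return 1.0
    if tier == "SSD":
        return ssd
    if tier == "RAM":
        return ram
    if tier in CACHE_VS_RAM:
        return CACHE_VS_RAM[tier] * ram
    raise ValueError(f"unknown tier: {tier}")


if __name__ == "__main__":
    for tier in ("L1", "L2", "L3", "RAM", "SSD", "Disk"):
        print(f"{tier:>4}: about {speedup_vs_disk(tier):,.0f}x faster than disk")
```

On these figures, RAM comes out roughly 3,000 times faster than disk and L1 cache roughly 300,000 times faster, which is the gap that in-memory processing exploits.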
In-Memory Disruption
• Memory will become the primary store for data (this impacts data flows)
• Almost all applications are poorly built for this
• Memory is an accelerator, as is CPU cache – this is becoming a factor
• HP’s Memristor waits in the wings
Hadoop: Deceptive Impression
Because Hadoop was built to run on 1000s of servers, there’s an impression that Hadoop needs such huge clusters/grids.
In reality the opposite is now happening: the number of servers is diminishing, which means that Moore’s Law still operates.
Putting a SoC in IT
• It’s possible that the CPU-Memory split will vanish, possibly soon
• This requires the emergence of the commodity SoC
• There are already SoCs that run Linux
• Grids of SoCs would replace grids of servers
Parallelism: The Imp Is Out of the Bottle
• Multicore chips enabled parallelism
• It has changed the whole performance equation
• It enabled Big Data
• Big Data is really Big Processing
Some Architectural Principles
• The new atom of data is the event
• SUSO: scale up before scale out
• Take the processing to the data, if you can
• Hadoop is a component, not a solution
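"Take the processing to the data" can be illustrated with a toy aggregation. This is a sketch under stated assumptions: the three in-memory lists stand in for data blocks held on separate nodes, and `PARTITIONS`, `local_summary` and `global_mean` are hypothetical names. Each "node" reduces its own block to a small partial result, so only a handful of numbers would cross the network instead of every record.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-ins for data blocks stored on three separate nodes.
PARTITIONS: List[List[float]] = [
    [3.0, 5.0, 8.0],
    [1.0, 4.0],
    [9.0, 2.0, 6.0, 7.0],
]


def local_summary(block: Iterable[float]) -> Tuple[float, int]:
    """Runs where the data lives: reduce one block to (sum, count)."""
    total, count = 0.0, 0
    for value in block:
        total += value
        count += 1
    return total, count


def global_mean(summaries: List[Tuple[float, int]]) -> float:
    """Runs on the coordinator: combine the small per-node summaries."""
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    if count == 0:
        raise ValueError("no data")
    return total / count


if __name__ == "__main__":
    # Only one (sum, count) pair per node is shipped, not the raw values.
    summaries = [local_summary(block) for block in PARTITIONS]
    print(summaries)
    print(global_mean(summaries))
```

The mean is chosen because it decomposes cleanly into per-block sums and counts; statistics that do not decompose this way (an exact median, for example) are harder to push down to the data.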
The Hadoop Ecosystem
• Apache Projects: HBase, HCatalog, Pig, Hive, Flume, Storm, Sqoop, Nutch, Avro, Oozie, ZooKeeper, etc.
• New commercial products: Actian, RedPoint, Attunity, Voltage Security, etc.
• Languages and dev environments
Hadoop Usage
• Data archive
• Data staging & ETL
• Data preparation
• Analytics sandbox
• Analytics platform
• Database environment
The State of Play
Graphic from Allied Market Research
There are some straws in the wind here: it is being used everywhere
It’s a HUGE market. Also, VC investment is massive.
Hadoop in the Technology Layers
• Hadoop starts as a scale-out file system with a one-dimensional development environment
• It evolves with the addition of YARN to begin to occupy the OS & Sys Mgt layer
• Analytics applications become synonymous with Hadoop
• Hadoop is migrating through the stack
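The "one-dimensional development environment" most plausibly refers to MapReduce, Hadoop's original programming model; that reading is an assumption, since the slide does not name it. The sketch below runs the map, shuffle and reduce phases in a single Python process. It is not Hadoop's API, and the function names are illustrative. It shows the one shape that every early Hadoop job had to fit.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Big Data", "is really Big Processing"]))
```

In a real cluster the map calls run on the nodes that hold the input blocks and the shuffle moves data between nodes; here everything runs in one process purely to show the structure.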
Hadoop as an OS
• The Trail of OSes
    • OS/360 -> OS/370 -> z/OS
    • VMS
    • Unix -> Solaris
    • MS-DOS -> Windows
    • Linux
    • OS X -> iOS
• OSes evolve in two ways
    • Own development
    • Third-party add-ons
• They create application ecosystems
• In time they make previous OSes obsolete
• This is what Hadoop is in the process of doing
Hadoop in the Basic Map
• Hardware & Cloud
• Software
• Data realities
• Usage
Hadoop in the Hexagon
• Speed
    • Speed of action
    • Speed of business process
• Cost
    • Cost of acquisition
    • Cost of ownership
• Time
    • Time to deploy
    • Time to employ
• Business Value
    • By competitiveness
    • By cost reduction
• Effort
    • Effort to develop
    • Effort to deploy
• Fit
    • Compatible
    • Incompatible
Plus, capacity to change
Figure: Hexagon of Change Factors (SPEED, FIT, TIME TAKEN, VALUE, COST, EFFORT, plus capacity).
Hadoop in the Technology Layers
Figure: Hadoop in the Technology Layers.
• Speed
    • Speed of action
    • Speed of business process
• Cost
    • Cost of acquisition
    • Cost of ownership
• Time
    • Time to deploy
    • Time to employ
• Business Value
    • By competitiveness
    • By cost reduction
• Effort
    • Effort to develop
    • Effort to deploy
• Fit
    • Compatible
    • Incompatible
In Three Segments
Part 1 - The Forces of Disruption
Part 2 - Hadoop: Then, Now & Later
Part 3 - Focus Areas