Big Data technologies are surfacing in data centers to solve problems legacy systems were not built to handle. Hadoop is one of those technologies. Successful Hadoop implementations share common characteristics and also address the requirements needed to function as a proper tenant within the data center. This session addresses those common characteristics as well as the integration requirements that need to be addressed for Hadoop within the data center of the future.
© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Hadoop in the datacenter Donald Livengood/ June 2013
My background
Name: Donald Livengood
E-mail: [email protected]
Title: Distinguished Technologist, TS Consulting
Years at HP: 28
IT industry experience:
• Big Data
• Client Infrastructure, Mobility, VDI
• Unified Communications & Collaboration
• Virtualization & Private Cloud
• Electronic Messaging & Directory Services
Professional information: Certified Infrastructure Architect
Current responsibilities: Responsible for the creation of services and delivery readiness for Big Data Infrastructure worldwide
Agenda
Big Data
Why Hadoop exists
Designing Hadoop
Integrating Hadoop into the datacenter
Big Data
What is “Big Data”?
Gartner: “Big Data” is a popular term generally used to acknowledge the exponential growth, availability, and use of information in the data-rich landscape of the emerging information economy era.
HP definition: “Big Data is a class of data challenges, due to increasing volume, velocity, variety, and complexity, that are beyond the capabilities of traditional software, architecture, and processes to effectively manage and utilize.”
McKinsey Report: “Big Data” refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
What does Big Data mean for enterprise IT? A combination of IT capabilities to deal with the volume, velocity, and variety of data.
Note: Slide for internal use only
Big Data variables forcing Big Data technology adoption
Variety Any type of data
Volume Ability to handle very large amounts of Data
Velocity Process all data quickly
Voracity End-user appetite for Big Data consumption
Data in many forms: structured, unstructured, text, multimedia. Much of the relevant information is locked in unstructured data.
Data consumption: ingestion and real-time processing of data; velocity as it relates to consuming large amounts of data.
Data quantity: scale from terabytes to petabytes to zettabytes; volumes that traditional data-management technologies cannot handle in time for consumption.
Data creation & transport: streaming data, with milliseconds to seconds to respond; velocity as it relates to ingesting, cleaning, and deriving meaning from data.
Traditional information technologies are not adequate for Big Data
[Diagram: traditional systems built on SQL, consistency, and availability vs. Big Data requirements]
Finding useful information requires powerful analytics and massive processing:
• Volume: scale-out, partitioned architecture
• Velocity: real-time data processing (vertical DB, in-memory DB)
• Variety: handle structured and unstructured data
• Voracity: allow for multiple ingress points (query, search)
Agenda
Big Data
Why Hadoop exists
Designing Hadoop
Integrating Hadoop into the datacenter
Why Hadoop exists
Storage Platform
Today's architecture
Classic ETL Processing
Business transactions and interactions
CRM – ERM – SCM FMS – HRM
$ € ¥ Transaction Data
Analytical, Dashboards, Reports, Visualization
Enterprise Data Warehouse
Business intelligence & analytics
Storage-only platform (SAN/NAS)
Gap in today's architecture
Social media data
Forum
Blog
Feeds
Web
Clicks
Multi-media
Audio
Video
Images
Document management
Content Management
File Sharing
File Hosting
Collaboration
Search
Message data
IM and VOIP
Messaging System
Sensors data
GPS
Sensors devices
RFID
Other events
Classic ETL Processing
Business transactions and interactions
CRM – ERM – SCM FMS – HRM
$ € ¥ Transaction Data
Analytical, Dashboards, Reports, Visualization
Enterprise Data Warehouse
Business intelligence & analytics
Moving data to compute doesn’t scale
Can’t explore original data
Archiving = death: cheap storage, expensive restore
Data dropped due to ETL
Can’t handle data types
Schema change takes time
Hadoop
Gap in today's architecture
Social media data
Forum
Blog
Feeds
Web
Clicks
Multi-media
Audio
Video
Images
Document management
Content Management
File Sharing
File Hosting
Collaboration
Search
Message data
IM and VOIP
Messaging System
Sensors data
GPS
Sensors devices
RFID
Other events
Classic ETL Processing
Business transactions and interactions
CRM – ERM – SCM FMS – HRM
$ € ¥ Transaction Data
Analytical, Dashboards, Reports, Visualization
Enterprise Data Warehouse
Business intelligence & analytics
Move some data to legacy system
All data available
Keep data in Hadoop: cheap storage, can tier, always available
Use MapReduce
What is Hadoop?
Hadoop consists of two core components
• The Hadoop Distributed File System (HDFS)
• MapReduce
- Computation Framework (engine)
- Resource Manager & Scheduler
- Other engines are/will be introduced (Impala)
A set of machines running HDFS and MapReduce is known as a Hadoop Cluster
• Individual machines are known as nodes
• A cluster can have as few as one node or as many as several thousand
• More nodes = better performance!
There are many other projects based around core Hadoop
The ‘Hadoop Ecosystem’ includes many projects
e.g., Pig, Hive, HBase, Flume, Oozie, Sqoop, etc.
A flexible and scalable architecture for large scale processing and computation across a distributed network of computers
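The HDFS-plus-MapReduce split described above is easiest to see in the canonical word-count job. Below is a minimal sketch in plain Python, with no Hadoop libraries; the function names are illustrative, and the three phases only mimic what the framework does in parallel across a cluster:

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    # Map: emit a (word, 1) pair for every word in an input line.
    for word in record.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework does
    # between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce: sum the counts for one key.
    return (key, sum(values))

lines = ["Hadoop stores data in HDFS",
         "MapReduce processes data in Hadoop"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
results = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(results["hadoop"], results["data"])  # 2 2
```

On a real cluster the map and reduce functions run in parallel on the DataNodes, with HDFS providing the input splits and storing the output.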
Nodes & roles
Master Nodes: Name Node
• Oversees data storage in HDFS
– Maps an HDFS file name to a set of blocks; maps blocks to DataNodes
Job Tracker
• Coordinates parallel processing using MapReduce
Slave Nodes: DataNode (slave to Name node)
• Block server
– Stores blocks as separate files on local filesystem
• Communicates to NameNode re: existing blocks
TaskTracker (slave to Job Tracker)
• Starts and monitors Map tasks
• Heartbeat and status to Job Tracker
Edge Node
- Not part of Hadoop architecture
- Usually not part of cluster (but could be)
- 1 or more used for ingress/egress to/from cluster
- Provides authenticated users with access to private subnet (cluster)
- Configured for transient storage & high bandwidth to core network
Hadoop HDFS and MapReduce
[Diagram: six HDFS blocks, each replicated on two of six servers]
Server 1: Block 1, Block 2
Server 2: Block 1, Block 3
Server 3: Block 5, Block 6
Server 4: Block 2, Block 3
Server 5: Block 4, Block 5
Server 6: Block 4, Block 6
HDFS MapReduce
Mapping Process
Shuffle Data
Reduce Process
Outputs Stored locally to HDFS
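The block layout shown above can be approximated in a few lines. This sketch uses a naive round-robin placement (real HDFS uses a rack-aware placement policy, and a replication factor of 3 by default; the names and replication factor of 2 here just match the diagram) to show the key property: every block lives on more than one node.

```python
import itertools

def place_blocks(num_blocks, servers, replication=2):
    # Naive round-robin stand-in for HDFS block placement:
    # each block is assigned `replication` distinct DataNodes.
    placement = {}
    ring = itertools.cycle(servers)
    for block in range(1, num_blocks + 1):
        placement[block] = [next(ring) for _ in range(replication)]
    return placement

servers = [f"server{i}" for i in range(1, 7)]
layout = place_blocks(6, servers)
# Every block lives on two distinct nodes, so losing any one
# server loses no data.
assert all(len(set(nodes)) == 2 for nodes in layout.values())
```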
Agenda
Big Data
Why Hadoop exists
Designing Hadoop
Integrating Hadoop into the datacenter
Designing Hadoop
HP Reference Architectures provide a firm baseline for a balanced cluster
Hadoop Sizing: Workload Matters
Examples of I/O-bound workloads:
• Indexing
• Searching
• Grouping
• Decoding/decompressing
• Data importing and exporting
Examples of CPU-bound workloads:
• Machine learning
• Complex text mining
• Natural language processing
• Feature extraction
[Chart: server choice by CPU (low to high) and disk count (fewer to more): Computation Optimized / Low Power Consumption, Balanced, Balanced / More Power per Node, Storage Optimized]
Caveat: You must know your workload
Workload-based Configuration Approach (1 of 2)
Base guidelines
NameNode: RAID 1; 32GB per 1M files, 4+ disks (usually 64GB in a balanced configuration)
DataNode: 1GB RAM per core, 1 disk per core
- 4 1TB or 2TB hard disks in a JBOD (Just a Bunch Of Disks) configuration
- 2 quad core CPUs, running at least 2-2.5GHz
- 16-24GB of RAM (24-32GB if you’re considering HBase)
Network:
- 1Gb Ethernet for nodes, 10Gb for edge nodes and network switch uplinks
- Use 10Gb if “free”: can drive cost very high for adapters and switches
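The base guidelines above lend themselves to a quick back-of-the-envelope sizing helper. This is a sketch built only from the guideline numbers on this slide (the function names are illustrative, and real NameNode heap needs also depend on block counts and replication):

```python
def namenode_ram_gb(millions_of_files):
    # Guideline: ~32GB per 1M files; a balanced build usually
    # starts at 64GB regardless.
    return max(64, 32 * millions_of_files)

def datanode_profile(cores):
    # Guideline: 1GB of RAM and 1 disk per core.
    return {"ram_gb": cores, "disks": cores}

print(namenode_ram_gb(3))   # 96
print(datanode_profile(8))  # {'ram_gb': 8, 'disks': 8}
```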
Caveat: You must know your workload
Workload-based Configuration Approach (2 of 2)
Light Processing Configuration: 1GB per core
- (1U/machine): Two quad core CPUs, 8GB memory, and 4 disks
- CPU-intensive: Use 2GB per core versus 1GB
Balanced Compute Configuration: 2 to 3GB per core
- (1U/machine): Two quad core CPUs, 16 to 24GB memory, and 4 disks
Storage Heavy Configuration: 2 to 3GB per core, big storage & power
- (2U/machine): Two quad core CPUs, 16 to 24GB memory, and 12 disk drives
- Power consumption ~200W in idle state and can go as high as ~350W when active
Compute Intensive Configuration: large memory, moderate storage
- (2U/machine): Two quad core CPUs, 48-72GB memory, and 8 disks
- Used when a combination of large in-memory models and heavy reference data caching is required.
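The four profiles above map cleanly to a lookup table. A sketch follows; the dictionary keys and field names are illustrative, while the specs are the slide's numbers (two quad-core CPUs = 8 cores, RAM as a min/max range):

```python
# Per-machine specs for the four workload profiles above.
CONFIGS = {
    "light":    {"u": 1, "cores": 8, "ram_gb": (8, 8),   "disks": 4},
    "balanced": {"u": 1, "cores": 8, "ram_gb": (16, 24), "disks": 4},
    "storage":  {"u": 2, "cores": 8, "ram_gb": (16, 24), "disks": 12},
    "compute":  {"u": 2, "cores": 8, "ram_gb": (48, 72), "disks": 8},
}

def pick_config(workload):
    # Fail loudly on a profile we don't know about.
    try:
        return CONFIGS[workload]
    except KeyError:
        raise ValueError(f"unknown workload profile: {workload}")

assert pick_config("storage")["disks"] == 12
```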
Which platform to choose?
DC rack capacity is limited
DC cooling & power are issues
High density commodity servers
Need to balance
• Core: disk ratio (1:1) – threads help
• CPU cost – power budget and price
• Disk capacity – more is better
Average size ~ 20 servers
Plan for change!
Optimizing Rack Capacity
[Chart: Hadoop data node disk:core ratios per 42U rack, plotting hard drives/rack against cores/rack]
• DL360p SFF (12 core): 1.20
• DL380p/e (16 core): 1.33
• SL4540 (16 core): 1.07
• DL380p/e (12 core): 1.00
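The disk:core ratio plotted above is simply (disks per rack) divided by (cores per rack). A sketch with purely illustrative server counts, not vendor specs:

```python
def rack_profile(servers_per_rack, cores_per_server, disks_per_server):
    # Totals for one rack plus the disk:core ratio the chart plots.
    cores = servers_per_rack * cores_per_server
    disks = servers_per_rack * disks_per_server
    return cores, disks, round(disks / cores, 2)

# Hypothetical 1U node, 40 per rack, 12 cores and 8 disks each:
cores, disks, ratio = rack_profile(40, 12, 8)
print(cores, disks, ratio)  # 480 320 0.67
```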
Cost Distribution – SL/DL Server rack
Item: % of cost
• Memory: 40%
• Disk: 36%
• Chassis: 8%
• Network: 7%
• CPU: 6%
• Software: 2%
• Rack: 1%
Network
Load balanced, redundant, wire-speed
Separate management network
Chassis/CPU
DL/SL series of commodity servers
Single quad-core, mid-range Xeon
Disks
Full complement of LFF Terabyte disks
Memory
24-32 GB of ECC memory
Disk and Memory are the largest cost contributors
Hadoop Physical Architecture Typically organized as racks of commodity servers with DAS storage
“Commodity” Server Hardware
1GbE Rack Switches
ECC Memory
Storage using SATA disk
Only master servers require RAID disk
Out-of-band management via iLO
[Diagram: three 42U racks, each with HPN top-of-rack (ToR) switches. The first rack holds a Management Node, the Hadoop Master, and Hadoop Slaves; the other racks hold Hadoop Slaves. Servers connect at 1Gb plus iLO; ToR switches uplink at 10Gb to a Cluster Switch.]
Hadoop Solution Packaging Consistent with Reference Architectures & Hadoop Appliance
[Diagram: a Head Rack Enclosure with HPN ToR switches, a Management Node, the Hadoop Master, and Hadoop Slaves, plus 42U expansion racks of Hadoop Slaves behind their own HPN ToR switches. Packaging is base plus expansion.]
Designing Hadoop: Network
Network considerations
Hadoop is a high-performance computing platform - Hadoop drives performance and availability through IP communications
Guidelines - Cluster must have dedicated switching – no shared switches or VLANs:
• Network traffic characteristics of Hadoop demand this
- All servers should use 1Gb/10Gb Ethernet to the Top of Rack (ToR) switches
- All ToR switches should have multiple 10 GbE connections to the core switches, for both bandwidth and redundancy
• Integrated Lights-Out (iLO) management may be supported from a separate 1GbE/100Mbps network
Use server bonded NICs and redundant ToR switches - Cost is higher but worth it in multi-rack clusters
- Improved bandwidth
- Avoids replication costs on failure of a ToR switch
- Connect ToR to aggregation switches to join racks
- More complex but significant benefits
• HP can provide assistance
Network considerations
Routing
• Cluster should not route any in-cluster traffic out of the cluster
− Misconfigured routers can allow this
Network & port stress
• Hadoop can stress all ports, across all servers and ToR switches for extended periods
− Use switches suited for Hadoop, not just “favored” switch types
− Network traffic characteristics of Hadoop demand this
DNS
• Hadoop makes many DNS & reverse DNS lookups
− Even for nodes within the cluster
• Use and maintain local /etc/hosts file for in-cluster lookups
• MapReduce jobs making excessive calls to remote servers can generate large amounts of external traffic
• Consider placing a caching DNS server on every worker node to mitigate the problem
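One way to keep in-cluster lookups local, per the /etc/hosts guidance above, is to generate the hosts entries from the node inventory. A sketch (the IPs, hostnames, and domain are illustrative):

```python
def hosts_entries(nodes, domain="cluster.local"):
    # One /etc/hosts line per node: IP, FQDN, then short name,
    # so in-cluster name and reverse lookups never leave the cluster.
    return "\n".join(f"{ip}\t{name}.{domain}\t{name}" for ip, name in nodes)

nodes = [("10.0.1.11", "datanode01"), ("10.0.1.12", "datanode02")]
print(hosts_entries(nodes))
```

The generated block would be appended to /etc/hosts on every node by whatever configuration tooling manages the cluster.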
Integration into the corporate network
“We’ve profiled our Hadoop applications so we know what type of infrastructure we need”
Said no-one. Ever.*
*Credit: HP Hadoop engineer
Balanced design approach
Whole Cluster as an Appliance
Well defined Ingress/Egress Interfaces
Cluster Deployment & Management
Integration into DC Monitoring
Infrastructure-isolated cluster network
• Simplifies cluster network
• Separates cluster traffic load
• High speed connections for ingress/egress
DMZ Edge Nodes
Appliance Cluster
Access via controlled interfaces to minimize disruption, improve security, and reduce risk to DC and processes
Data import
Data export
Monitoring
Management
Enterprise-ready Big Data platform
Pre-integrated, pre-tested, pre-engineered We’ve done all the hard work for you
Full-rack, half-rack, expansion rack options
Out of the box Not in months, but hours or days
Super fast Loading, sorting, and analysis
Easy scaling Expansion racks available
Via CMU: 800 nodes in 30 minutes
HP AppSystem for Apache Hadoop
Deploy
…it’s like using as-is open-source technology: you have a lot of work to do!
Without AppSystem
Without the HP AppSystems ~ 8+ weeks
Research components
Develop complex Design
Order collection of parts
Assemble parts
Install, upgrade firmware & software
Test & adjust design
Find your mistakes somewhere in here and start over
With HP AppSystem for Apache Hadoop ~ 4 weeks
Choose your AppSystem
Order the AppSystem
Installation Deploy Success!
Agenda
Big Data
Why Hadoop exists
Designing Hadoop
Integrating Hadoop into the datacenter
Integration into the datacenter
Big Data Transformation
Big Data information refinery
Insight Processing
Protection & Compliance Management
Infrastructure Integration
Enterprise Data Warehouse
Analytical, Dashboards Reports, Visualization
Business intelligence
Business transactions and interactions
Web, Mobile
CRM – ERM – SCM FMS – HRM Value
Creation
share, refine & development
Message data
Document management
Social media data
Multi-media
Sensors data
Big Data Functional Architecture: a refinery approach
Technology Integration
Security
Operations
Big Data Converged, Automated, Energy efficient infrastructure
Activity logging Intrusion Prevention
Switch virtualization Virtual application network
SSL/VPN Networks Storage replication
Server scale-out management
Collection Computation Consumption
Protection
Big Data Management
Compliance
Big Data Storage
Big Data Processing On Line / Batch Analysis
Internal / External Data
Structured / Unstructured
Real Time
Backup and Recovery
Governance
Privacy and Security
Destruction Archival Retention
Functionalities
Destruction
Backup and Recovery
Governance
Privacy and Security
Protection Compliance
Retention
Archival
Protection qualities
Confidentiality, integrity, availability
De-duplication Replication
Data quality metrics
Compliance qualities
Rapid search (Legal)
Tiering
Big Data Classification
Classification | Retention period | RTO | RPO | Forensic window
Vital/Critical | 7 years | 30 minutes | <10 minutes | 6 months
Sensitive | 5 years | 1 day | <1 hour | 3 months
Non-critical | 6 months | 1 week | <48 hours | 1 month
(RTO = Recovery Time Objective; RPO = Recovery Point Objective)
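A classification table like this is straightforward to encode so that backup and archival tooling can look policies up programmatically. A sketch using the slide's values (the dictionary shape and function name are illustrative):

```python
# Policy values from the classification table above.
POLICY = {
    "vital/critical": {"retention": "7 years",  "rto": "30 minutes",
                       "rpo": "<10 minutes", "forensic_window": "6 months"},
    "sensitive":      {"retention": "5 years",  "rto": "1 day",
                       "rpo": "<1 hour",     "forensic_window": "3 months"},
    "non-critical":   {"retention": "6 months", "rto": "1 week",
                       "rpo": "<48 hours",   "forensic_window": "1 month"},
}

def policy_for(classification):
    # Case-insensitive lookup of a classification tier.
    return POLICY[classification.strip().lower()]

print(policy_for("Sensitive")["rto"])  # 1 day
```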
Confidentiality, Integrity, and Security of Big Data
Big Data Security
Unique Big Data Threats
Data Privacy Preservation
CSIRT Program Changes
Security Controls
Security Technology
eDiscovery
Security integration
Confidentiality
Identity Access
Identity Access
Perimeter Security
Confidentiality
Perimeter Security
Refinery Outbound Presentation
Refinery Inbound Pipeline
Big Data Security examples
HP related technologies
HP ProtectTool
• Authentication Services
• Multi-Factor Authentication
• Role Based Access (RBAC)
HP TippingPoint IPS
• In-line protection
• Real-time threat protection
Qualities
Role based
Speed
Reliable
Flexible authentication
Perimeter Security
Speed
Reliable
Managed
Real-time
Identity Access
Backup and recovery
Backup and Recovery Policy
target time
Deduplication
eDiscovery
Vaulting
Replication
Media transfer performance
Storing reliability
Qualities
HP ESL Tape Backup
StoreOnce
• Backs up at up to 100TB/hr with Catalyst
• Restores up to 40TB/hr
• Couplet redundant
• Tape vaulting
HP Related Technologies
Governance
Variety
Velocity Voracity
Volume
Validity
Accuracy assurance Consistency assurance Accessibility assurance Big Data Governance
Privacy & Security
HP Related Technologies
ArcSight Security Intelligence
• Threat Detection
• Security Analysis
• Different data sources log data management
• Legal and Compliance
Qualities
Confidentiality
Connectors
Compliance
Traceability
Speed
Autonomy Security Performance Suite
• Data Protector
• Live Vaulting
• eDiscovery
• Compliance Archiving
• Records Mgmt
Protection
Vaulting
Fast Restore
Encryption
Fast Discover
Security Controls
Data Privacy Preservation eDiscovery
Privacy & Security Policy
Protection
Purging Shredding Wiping Degaussing
Different types of media
Protection Policy
Archival
HP Related Technologies
3PAR StoreServ
StoreAll
• Console based tiering
• Express Query and Autonomy IDOL integration
• Mesh-Active Architecture
• Thin technologies
• Peer Motion
• Virtual Lock
• Adaptive Optimization
Qualities
Automated policy-based tiering
Rapid search
Extreme Data Reduction
Scalable Storage
Archival Policy
Retention
Sarbanes-Oxley
HIPAA
PCI DSS
Safe Harbor
Data Privacy Act
GLBA
HP Related Technologies
3PAR StoreServ
StoreAll
• Console based tiering
• Snapshots and data validation
• WORM features
• File and Object Storage
• 16PB namespace
Qualities
High scalability
Automated policy-based tiering
Data Protection
WORM capability
Open standards interface
Retention Policy
Enabling Big Data with Networking & Storage example
HP Related Technologies
HP FlexNetwork Architecture
• FlexCampus
• FlexFabric
• FlexManagement
HP Converged Storage
• HP 3Par StoreServ
• HP StoreAll
• WAN Optimization
HP Infrastructure Tools
• Insight CMU
• IMC
• StoreVirtual software
Qualities
Simplicity
Speed
Scalability
Identity-based access
Storage
Geographic Snapshot and Cloning Capabilities
Thin Provisioning
Seamlessly handle fast moving data
Network IT Operations
Manageability of:
Connections
Storage Scale-out
Server Scale-out
End-to-End Service Level
Big Data Refinery service-level target vs. current, differing service levels by source:
• Business transactions and interactions: Very High
• Business intelligence & analytics: High
• Message data: High
• Document management: Medium
• Multi-media: Very Low
• Sensors data: Medium
• Social media data: Low
The Big Data information refinery (insight processing, infrastructure integration, management) must consolidate these into a single service level.
Summary slide
Do you know the technology you will use and the workload? • If not, check out HP AppSystems and Reference Architectures for Hadoop (and other Big Data technologies)
Skills • Do you have experience with high-performance Linux clusters or Hadoop clusters?
Space & power • Can your data center handle the space, power, and cooling now & in the future?
Network & storage capacity • Can they handle data movement, staging, post-processing, and export/import?
• What load (export/import) can existing BI/Analytics systems handle?
Monitoring & Support framework • How will the Hadoop ecosystem integrate with existing frameworks?
IT architectural requirements & standards
For more information
Attend these sessions
• RT3462, Big Data Analytics 360
• RT3463, Big Data & the internet of things
• TB2590, What’s new in HP Vertica
• BB3378, Any data, any size
• TK2789, Keynote: Make information matter
Visit these demos
• HP AppSystem for Apache Hadoop
• IT Big Data Transformation Experience
After the event
• Contact your sales rep!
• Visit www.hp.com/go/bigdata
• Visit www.hp.com/go/hadoop
Your feedback is important to us. Please take a few minutes to complete the session survey.
Learn more about this topic
Use HP Autonomy’s Augmented Reality (AR) to access more content
1. Launch the HP Autonomy AR app*
2. View this slide through the app
3. Unlock additional information!
*Available on the App Store and Google Play
Thank you