
Big Data Analytics - an infrastructure and data management perspective

BDCA; Kick Off User Group Cross Meetup, March 3rd, 2015

Jürgen Türk, CSE NetApp

© 2014 NetApp, Inc. All rights reserved. NetApp Proprietary – Limited Use Only

Agenda

1. Who is NetApp?

2. NetApp approach to Big Data

3. Analytics Solutions – Reference Architectures

4. Case Studies

5. Wrap Up - Next Steps

NetApp is a Technology Leader

14% R&D

M&A only for innovation

[Timeline: 1992, 1998, 2004, 2014]

NetApp Product Strategy: Market-leading innovations that are

Shared and Dedicated Storage Solutions

Flash Accelerated

Cloud Integrated &


NetApp and Big Data

The 3V Paradigm

• Variety
– Multiple data sources
– Multiple data formats

• Velocity
– High speed processing
– Fast changing requirements

• Volume
– Huge amounts of data
– Process and persist

Why NetApp? Practical solutions that solve today’s problems

Get Control

NetApp helps you turn your exploding data from threat to opportunity. Manage your data effectively and affordably.

Break Through

Break through the limits. With NetApp, you can take on even the most massive and complex data projects.

Gain Insight

Turn insight to action. NetApp helps you get to clarity and insight faster and more reliably.

Experience Managing Data at Scale

NetApp’s Largest Customers

[Chart: number of customers (100, 50, 10, 4) by managed capacity (100 PB, 50 PB, 20 PB, 10 PB)]

Experience Managing Data at Scale

• Best of breed storage for Big Data Applications

• Built on open standards with best-in-class partnerships

• Validated with ecosystem leaders

• Complete server, network and storage “Racks”

• Delivered via trusted high-value partners

Open

Best-of-Breed

Choice

Value Proposition: Some problems require an Enterprise Class Hadoop Solution

Enterprise Class Hadoop

Packaged ready-to-deploy modular Hadoop cluster

• The data has intrinsic value ($$$)
• Usable capacity must expand faster than compute
• Higher storage performance
• Real human consequences if the system fails (threats, treatments, financial losses)
• System has to allow for asymmetric growth

White Box Hadoop

Values associated with early adopters of Hadoop

• Social media space
• Contributors to Apache
• Strong bias to JBOD
• Skeptical of ALL vendors

Enterprise Class Hadoop

Packaged ready-to-deploy modular compute / memory intensive Hadoop cluster

• Compute intensive applications
• Tick data analysis
• Extremely tight Service Level expectations
• Severe financial consequences if the analytic run is late

Enterprise Class Hadoop

Bounded compute algorithm / memory intensive Hadoop cluster

• Compute intensive applications
• Additional CPUs do not improve run time
• Extremely tight Service Level expectations
• Severe financial consequences if the analytic run is late
• Need for deeper storage per datanode

[Chart axes: Compute Power vs. Storage Capacity]

Challenges with Hadoop Enterprise

Operations

• Requires three copies of data, larger footprint, and more storage
• Limited flexibility; tying storage and servers together affects scalability
• Low cluster efficiency, higher network congestion

Availability

• A disk drive failure reduces performance dramatically
• Slow recovery from disk drive failure
• Expensive process to replace failed disks online
• Most common Hadoop support issue is disk drive failure

Implementation

• Need to keep up with fast-paced patches and projects of the open source platform
• Need to decide on a distribution of Hadoop
• Skills are not common
• Integration with existing IT infrastructure can be difficult
• Tuning expertise needed to make Hadoop perform optimally
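The "three copies of data" point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the only overhead is block replication (HDFS replicates each block three times by default); it ignores temporary space, compression, and file system overhead. The function name and the sample capacities are illustrative and not taken from the slides.

```python
def raw_capacity_tb(usable_tb: float, replication_factor: int = 3) -> float:
    """Raw disk capacity (TB) needed to store `usable_tb` of data when
    every block is kept `replication_factor` times.

    Assumption: replication is the only overhead considered.
    """
    if usable_tb < 0:
        raise ValueError("usable_tb must be non-negative")
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return usable_tb * replication_factor


if __name__ == "__main__":
    # Hypothetical data set sizes, chosen only for illustration.
    for usable in (100, 500, 1000):
        raw = raw_capacity_tb(usable)
        print(f"{usable} TB of data -> {raw:.0f} TB of raw disk with 3 copies")
```

Lowering `replication_factor` in this sketch shows how the footprint shrinks when data protection is handled by the storage layer instead of by extra copies.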

FlexPod Converged Infrastructure Family

FlexPod® Express (MSB/Branch Office)

For smaller, less-dynamic requirements and VAR velocity

• Cisco UCS C-Series
• Nexus® 3K
• FAS2xx0
• Two fixed pod sizes
• Cisco UCS Director, VMware®, and Microsoft®

FlexPod Data Center (Enterprise/Service Provider)

Massively scalable shared virtual data center infrastructure

• Cisco UCS C-Series/B-Series, Nexus® 5k
• FAS Storage
• Flexible pod sizes
• FlexPod validated management and ecosystem

FlexPod Select (Dedicated)

Big data analytics, scientific, HPC

• Cisco UCS C-Series
• Nexus, Catalyst®, MDS
• E-Series, FAS
• Reference architecture and/or designs
• Application-based management

Distinct Architectures

[Diagram: two stacks of Storage Pool, Network Pool, and Compute Pool shared by several apps; one stack of Storage, Network / Direct, and Compute Nodes serving a single app]

Faster deployment and implementation

Small management effort – one hotline for everything

Seamless growth on demand

Modular reference architecture – “Building Blocks” tuned to work together

FlexPod Select = Especially optimized for Big Data Workloads

More operational efficiency with less effort

Maximum Flexibility: The Unified Architecture makes sure that a FlexPod can be integrated into an existing IT infrastructure

Big Data Analytics Platform for Data Centers

Scalable and highly available architecture

Quick and risk-free implementation

Optimized and standardized operation

24x7 hotline for the entire infrastructure

All components are perfectly tuned to each other

Plug & Play for Industrie 4.0 Solutions

NFSv3 Connector for Hadoop

* HDFS can be swapped out, or NFS can run side-by-side with HDFS.


HDFS gets complemented with NFS*

[Diagram: two Hadoop stacks side by side. Layers: user jobs, compute layer (MapReduce), resource layer (Yarn), storage layer (File System). The first stack stores data in HDFS; the second in NFS / HDFS.]


Certified Storage for HANA TDI + Hadoop


FAS Product Family

7-mode and cDOT

NAS - shared file system

10Gb Ethernet and NFS

Single node and multi-node

SAN - block device

FC and XFS

E-Series Product Family

Single node and multi-node

Example: FlexPod Select with Cloudera

* NetApp 50% Storage Guarantee http://www.netapp.com/us/solutions/infrastructure/virtualization/guarantee.html

• Converged big data platform from NetApp and Cisco for Hadoop

• Enterprise-class Hadoop: Innovative storage, servers, networking validated with leading Hadoop distributions

• Faster time to value: Prevalidated configuration accelerates deployment

• High availability: Less downtime, higher serviceability to meet tight SLAs around data applications and processes

• Flexible scaling: Independently scale servers and storage; modular design for scaling as data needs grow

Components:

Cisco UCS® C-Series Rack Mount Servers

NetApp® FAS Storage Systems

NetApp E-Series Storage Array

Cisco UCS Manager

Cisco UCS Fabric Interconnect

Use Case Example: NetApp AutoSupport

• Correlate disk latency (hot) with disk type
– 24 billion records
– 4 weeks to run query
– Hadoop implementation 10.5 hours

• Bug detection through pattern matching
– 240 billion records – Too large to run
– Hadoop implementation 18 hours

Phone-home data representing information about the status of NetApp storage controllers
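To put the first use case in perspective, the run times can be turned into a speedup factor and a rough throughput. This is a back-of-the-envelope sketch, assuming "4 weeks" means four full weeks of wall-clock time (7 x 24 hours each); the function names are illustrative only.

```python
HOURS_PER_WEEK = 7 * 24  # assumption: weeks counted as continuous wall-clock time


def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the new run is compared with the old one."""
    return before_hours / after_hours


def records_per_second(records: float, hours: float) -> float:
    """Average processing rate over the whole run."""
    return records / (hours * 3600)


if __name__ == "__main__":
    # Disk latency correlation: 24 billion records, 4 weeks vs. 10.5 hours.
    legacy_hours = 4 * HOURS_PER_WEEK  # 672 hours
    print(f"Speedup: {speedup(legacy_hours, 10.5):.0f}x")  # prints "Speedup: 64x"
    print(f"Rate: {records_per_second(24e9, 10.5):,.0f} records/s")

    # Bug detection: 240 billion records in 18 hours. There is no baseline,
    # because the query was too large to run before.
    print(f"Rate: {records_per_second(240e9, 18):,.0f} records/s")
```

No speedup can be stated for the second case, since the slide gives no earlier run time to compare against.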

Hortonworks

[Architecture diagram. Components shown:]

SAP LVM (Landscape Virtualization Management)

SAP HANA Studio

Smart Data Access

E-Series 5600

10Gb Ethernet and NFS

FlexPod Select with Hadoop

UCS C-Series Server

FAS8040 HA Pair with cDOT

10Gb Ethernet and NFS

FlexPod SAP HANA Database Nodes

UCS Blade Server

FlexClone Copies

SnapCreator HANA Plugin

SAP Lumira

Mobile Device

Call to action – get started

• Identification of use case

• Connect to analytics expert + connect IT and LOB

• Workshop

• Proof of Concept

• Business case

• Readiness check

• Run

• Go productive

Thank You
