Big Data Analytics - an infrastructure and data management perspective


  • Big Data Analytics - an infrastructure and data management perspective

    BDCA Kick-Off User Group Cross Meetup, March 3rd, 2015

    Jürgen Türk, CSE, NetApp

    © 2014 NetApp, Inc. All rights reserved. NetApp Proprietary - Limited Use Only

  • Agenda

    1. Who is NetApp?

    2. NetApp approach to Big Data

    3. Analytics Solutions Reference Architectures

    4. Case Studies

    5. Wrap Up - Next Steps

  • NetApp is a Technology Leader

    14% R&D

    M&A only for innovation

    [Timeline graphic: 1992, 1998, 2004, 2014]

  • NetApp Product Strategy: Market-leading innovations that are

    Shared and Dedicated Storage Solutions

    Flash Accelerated

    Cloud Integrated &

  • NetApp and Big Data

    The 3V Paradigm

    Variety: Multiple data sources, multiple data formats

    Velocity: High-speed processing, fast-changing requirements

    Volume: Huge amounts of data to process and persist

  • Why NetApp? Practical solutions that solve today's problems

    Get Control: NetApp helps you turn your exploding data from threat to opportunity. Manage your data effectively and affordably.

    Break Through: Break through the limits. With NetApp, you can take on even the most massive and complex data projects.

    Gain Insight: Turn insight to action. NetApp helps you get to clarity and insight faster and more reliably.

  • Experience Managing Data at Scale

    NetApp's Largest Customers

    [Chart: customer counts (100, 50, 10, and 4 customers) plotted against capacity tiers (100 PB, 50 PB, 20 PB, 10 PB)]

  • Experience Managing Data at Scale

    Best-of-breed storage for Big Data applications

    Built on open standards with best-in-class partnerships

    Validated with ecosystem leaders

    Complete server, network, and storage racks

    Delivered via trusted high-value partners

    Open

    Best-of-Breed

    Choice

  • Value Proposition: Some problems require an Enterprise-Class Hadoop Solution

    Enterprise-Class Hadoop: Packaged, ready-to-deploy modular Hadoop cluster

    The data has intrinsic value ($$$)

    Usable capacity must expand faster than compute

    Higher storage performance

    Real human consequences if the system fails (threats, treatments, financial losses)

    System has to allow for asymmetric growth

    White-Box Hadoop: Values associated with early adopters of Hadoop

    Social media space

    Contributors to Apache

    Strong bias to JBOD

    Skeptical of ALL vendors

    Enterprise-Class Hadoop: Packaged, ready-to-deploy modular compute-/memory-intensive Hadoop cluster

    Compute-intensive applications

    Tick data analysis

    Extremely tight service-level expectations

    Severe financial consequences if the analytic run is late

    Enterprise-Class Hadoop: Bounded compute algorithm / memory-intensive Hadoop cluster

    Compute-intensive applications

    Additional CPUs do not improve run time

    Extremely tight service-level expectations

    Severe financial consequences if the analytic run is late

    Need for deeper storage per datanode

    [Chart axes: Compute Power, Storage Capacity]

  • Challenges with Hadoop Enterprise

    Operations

    Implementation

    Requires three copies of data, larger footprint, and more storage

    Limited flexibility; storage and servers tied together affects scalability

    Low cluster efficiency, higher network congestion

    A disk drive failure reduces performance dramatically

    Slow recovery from disk drive failure

    Expensive process to replace failed disks online

    Most common Hadoop support issue is disk drive failure

    Availability

    Need to keep up with fast-paced patches and projects of the open source platform

    Need to decide on a distribution of Hadoop

    Skills are not common

    Integration with existing IT infrastructure can be difficult

    Tuning expertise needed to make Hadoop perform optimally
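
    The storage cost of keeping three copies of data can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the "three copies" on this slide correspond to a replication factor of 3; the function names and the 300 TB raw capacity are hypothetical and chosen only for illustration.

```python
def usable_capacity_tb(raw_tb: float, replication_factor: int = 3) -> float:
    """Usable capacity when every block is stored replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb / replication_factor


def raw_capacity_tb(usable_tb: float, replication_factor: int = 3) -> float:
    """Raw capacity that must be provisioned to hold usable_tb of data."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return usable_tb * replication_factor


if __name__ == "__main__":
    raw = 300.0  # hypothetical raw cluster capacity in TB
    print(f"{raw} TB raw holds {usable_capacity_tb(raw)} TB of data at 3 copies")
    print(f"100.0 TB of data needs {raw_capacity_tb(100.0)} TB raw at 3 copies")
```

    Under these assumptions, two thirds of the raw disk space holds redundant copies, which is the "larger footprint" the slide refers to.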

  • FlexPod Converged Infrastructure Family

    FlexPod Express (MSB/Branch Office): For smaller, less-dynamic requirements and VAR velocity

    Cisco UCS C-Series; Nexus 3K; FAS2xx0; two fixed pod sizes; Cisco UCS Director, VMware, and Microsoft

    FlexPod Data Center (Enterprise/Service Provider): Massively scalable shared virtual data center infrastructure

    Cisco UCS C-Series/B-Series; Nexus 5k; FAS storage; flexible pod sizes; FlexPod validated management and ecosystem

    FlexPod Select (Dedicated): Big data analytics, scientific, HPC

    Cisco UCS C-Series; Nexus, Catalyst, MDS; E-Series, FAS; reference architecture and/or designs; application-based management

    [Diagram: distinct architectures. Two variants run multiple apps on shared compute, network, and storage pools; one variant runs an app on compute nodes with network or direct-attached storage.]

  • Faster deployment and implementation

    Less management effort

    One hotline for everything

    Seamless growth on demand

    Modular reference architecture

    Building blocks tuned to work together

    FlexPod Select = specially optimized for Big Data workloads

    More operational efficiency with less effort

    Maximum flexibility: The unified architecture ensures that a FlexPod can be integrated into an existing IT infrastructure

    Big Data analytics platform for data centers

    Scalable and highly available architecture

    Quick and risk-free implementation

    Optimized and standardized operation

    24x7 hotline for the entire infrastructure

    All components are perfectly tuned

    Plug & play for Industrie 4.0 solutions

  • NFSv3 Connector for Hadoop

    HDFS gets complemented with NFS*

    * HDFS can be swapped out, or NFS can run side by side with HDFS.

    [Diagram: two Hadoop stacks. Standard stack: user jobs; compute layer (MapReduce); resource layer (YARN); storage layer (file system: HDFS). Stack with connector: MapReduce; YARN; file system: NFS / HDFS.]

  • Faster to procure

    Faster to implement

    Less management effort

    One hotline for everything

    Grows with your requirements

    Modular reference architecture

    Building blocks always fit together optimally

    FlexPod Select = specially optimized for Big Data workloads

    More operational reliability with less effort

    Maximum flexibility: The unified architecture ensures that the FlexPod can also be integrated into existing IT environments.

    Data-center-compliant Big Data analytics platform

    Scalable and highly available architecture

    Fast, risk-free implementation

    Optimized and standardized operation

    24x7 hotline for the entire infrastructure

    All components are perfectly matched to one another

    Plug & play for Industrie 4.0 solutions

  • Certified Storage for HANA TDI + Hadoop

    FAS Product Family

    7-Mode and cDOT

    NAS - shared file system

    10Gb Ethernet and NFS

    Single node and multi-node

    SAN - block device

    FC and XFS

    E-Series Product Family

    Single node and multi-node

  • Example: FlexPod Select with Cloudera

    * NetApp 50% Storage Guarantee http://www.netapp.com/us/solutions/infrastructure/virtualization/guarantee.html

    Converged big data platform from NetApp and Cisco for Hadoop

    Enterprise-class Hadoop: Innovative storage, servers, networking validated with leading Hadoop distributions

    Faster time to value: Prevalidated configuration accelerates deployment

    High availability: Less downtime, higher serviceability to meet tight SLAs around data applications and processes

    Flexible scaling: Independently scale servers and storage; modular design for scaling as data needs grow

    Cisco UCS C-Series Rack Mount Servers

    NetApp FAS Storage Systems

    NetApp E-Series Storage Array

    Cisco UCS Manager

    Cisco UCS Fabric Interconnect

  • Use Case Example: NetApp AutoSupport

    Correlate disk latency (hot) with disk type: 24 billion records; 4 weeks to run the query; Hadoop implementation 10.5 hours

    Bug detection through pattern matching: 240 billion records; too large to run; Hadoop implementation 18 hours

    Phone-home data representing information about the status of NetApp storage controllers
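
    The figures on this slide imply a speedup and a record throughput that can be worked out directly. The following is a minimal sketch, assuming "4 weeks" means four full weeks of continuous run time and that the quoted hours are wall-clock hours; the function names are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24
SECONDS_PER_HOUR = 3600


def speedup(baseline_hours: float, new_hours: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_hours / new_hours


def records_per_second(records: float, hours: float) -> float:
    """Average number of records processed per second over the whole run."""
    return records / (hours * SECONDS_PER_HOUR)


if __name__ == "__main__":
    # Disk latency correlation: 24 billion records, 4 weeks vs. 10.5 hours
    baseline_hours = 4 * HOURS_PER_WEEK
    print(f"Speedup: {speedup(baseline_hours, 10.5):.0f}x")
    print(f"Latency query: {records_per_second(24e9, 10.5):,.0f} records/s")

    # Bug detection: 240 billion records in 18 hours (no baseline to compare)
    print(f"Pattern matching: {records_per_second(240e9, 18):,.0f} records/s")
```

    Under these assumptions the latency correlation runs about 64 times faster on Hadoop. No speedup can be computed for the bug detection job, because the slide gives no baseline run time for it.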

  • Hortonworks

    SAP LVM (Landscape Virtualization Management)

    SAP HANA Studio

    Smart Data Access

    E-Series 5600

    10Gb Ethernet and NFS

    FlexPod Select with Hadoop

    UCS C-Series Server

    FAS8040 HA Pair with cDOT

    10Gb Ethernet and NFS

    FlexPod SAP HANA Database Nodes

    UCS Blade Server

    FlexClone Copies

    Snap Creator HANA Plugin

    SAP Lumira

    Mobile Device

  • Call to action: get started

    Identification of use case

    Connect to analytics expert

    +

    Connect IT and LOB
