Modernizing Business Intelligence and Analytics
Justin Erickson, Senior Director, Product Management
© Cloudera, Inc. All rights reserved.

Source: cdn.govexec.com/media/ctd_analytic_db_final.pdf



Page 2: Agenda

• What benefits can I achieve from modernizing my analytic DB?
• When and how do I migrate from current systems?
• How does it work in the cloud?

Page 3: Key Applications

• EDW Optimization: Use your EDW more efficiently by offloading workloads to Hadoop
• Data Preparation: Fast, flexible ETL over large data volumes, so data is always ready for your business
• Self-Service BI & Exploration: Fastest time-to-insights with a modern analytic database designed with Hadoop’s flexibility and agility

Page 4: Cloudera’s Analytic Database

• Navigator Optimizer: Identify, offload, & optimize workloads to Hadoop
• Hue: Intelligent SQL editor
• Navigator: Audit, lineage, encryption, key management, & policy lifecycles
• BI Partners: Integration with the leading BI tools
• Impala: Interactive query engine for BI & SQL analytics
• Hive-on-Spark: Large-scale ETL & batch processing engine
• Kudu: Data storage for fast & changing data
• Multi-Storage, Multi-Environment

Page 5: Key Benefits
An analytic database designed for Hadoop

• High-Performance BI and SQL Analytics
• Flexibility for Data and Use Case Variety
• Cost-effective Scale for Today and Tomorrow
• Go Beyond SQL with an Open Architecture

Page 6: Analytic DB Anatomy
Built for self-service and hybrid cloud

Page 7: Anatomy of an Analytic Database
Cloudera: Decoupled by Design

Monolithic Analytic Database (single coupled system):
• Query Engine
• Storage Engine
• Catalog

Modern Analytic Database (decoupled components):
• Query Engine (Impala)
• Catalog (HMS)
• Storage (Kudu)
• Storage (S3)
• Storage (HDFS)
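The decoupling on this slide can be illustrated with a small sketch. It is not Impala, HMS, or any Cloudera API: the class names `Storage`, `InMemoryStorage`, `Catalog`, `TableEntry`, and `QueryEngine` are hypothetical, and the in-memory backend merely stands in for HDFS, S3, or Kudu. The point it tries to show is that a query engine which resolves tables through a separate catalog can be pointed at any storage backend, and that several engines can share the same catalog and data.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Protocol


class Storage(Protocol):
    """Hypothetical interface: anything that can return rows for a location."""

    def scan(self, location: str) -> Iterable[dict]:
        ...


class InMemoryStorage:
    """Toy backend standing in for HDFS, S3, or Kudu in this sketch."""

    def __init__(self, data: Dict[str, List[dict]]) -> None:
        self._data = data

    def scan(self, location: str) -> Iterable[dict]:
        return iter(self._data[location])


@dataclass(frozen=True)
class TableEntry:
    storage_name: str  # which backend holds the table
    location: str      # where the table lives inside that backend


class Catalog:
    """Toy metadata catalog: maps table names to storage locations."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableEntry] = {}

    def register(self, table: str, storage_name: str, location: str) -> None:
        self._tables[table] = TableEntry(storage_name, location)

    def lookup(self, table: str) -> TableEntry:
        return self._tables[table]


class QueryEngine:
    """Toy engine: owns no data, only references to a catalog and backends."""

    def __init__(self, catalog: Catalog, storages: Dict[str, Storage]) -> None:
        self._catalog = catalog
        self._storages = storages

    def count_rows(self, table: str) -> int:
        entry = self._catalog.lookup(table)
        backend = self._storages[entry.storage_name]
        return sum(1 for _ in backend.scan(entry.location))
```

Because the engine holds only references, two `QueryEngine` instances built from the same `Catalog` and storage objects see the same tables, which is the property the later cloud slides rely on for isolated clusters over shared data.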

Page 8: Pain Points
Traditional Monolithic Analytic Databases

Limited to SQL only
• Maintain data copies for non-SQL

Rigid Data Model
• Tightly coupled storage and compute

Static Sizing
• Major maintenance to add capacity/nodes

Poorly Designed for Cloud
• No elasticity or integration with object storage

Diagram labels: COMPUTE, STORE

Page 9: Benefits of Cloudera’s Modern Approach
Cloud-Native & On-Premise

Go Beyond SQL
• Open Architecture: Open formats and open storage
• Shared data across SQL and non-SQL workloads

Data Flexibility
• Faster, more agile data acquisition
• Data portability: Open formats and open storage

Cost-Effective Scalability
• Elastic scale on-prem or in the cloud
• Cloud-native pay-per-use and transience
• Proven at big data scale

Hybrid
• Runs across multi-cloud & on-prem
• Multi-storage over S3, HDFS, Kudu, Isilon, DSSD, etc.

Diagram label: Shared Data

Page 10: EDW Optimization
Expand the Value of Your Data Warehousing Landscape

Page 11: Motivations for Optimizing the EDW

• Cost containment for existing workloads
• Limited budget for expansion
• Unable to take on new workloads
• Unable to keep up with changing business needs
• Difficulty handling both fixed-SLA reports and self-service exploration
• Growing importance of self-service BI, advanced analytics, and cloud

Page 12: Existing EDW Landscape

Diagram components:
• Data Sources
• ETL/Staging
• EDW
• Archive
• Data Marts
• Canned Reports
• Dashboards/Analytic Applications
• Non-SQL Workloads
• Self-Service BI/Ad Hoc

Page 13: Optimizing the EDW with Cloudera

• Cost-Effective Scale
    • Say yes to more without the risk
• Go Beyond SQL
    • Exploration, advanced analytics, and more all in one platform
• Modernize the Data Warehouse Landscape
    • Maximize the EDW while enabling iterative, self-service access/BI
    • Well-suited for on-prem, cloud, and hybrid deployments

Customer examples:
• 90% less per TB vs RDBMS and 75% less vs Netezza
• Augmented its Oracle EDW with multi-tenant Cloudera system with their BI tool configured to allow users to pull reports from both
• Media Research Firm: Saved tens of millions by offloading DBMS to Cloudera in the cloud

Page 14: Modern Data Warehouse Environment

Diagram components:
• Data Sources
• EDW
• Modern Data Platform
    • Analytic Database
    • Operational Database
    • Data Science & Engineering
    • Shared Data Layer
• Fixed Reports
• Flexible Reporting
• Dashboards/Analytic Applications
• Non-SQL Workloads
• Self-Service BI/Ad Hoc

Page 15: Navigator Optimizer
Built to help you through the optimization process

Phases:
• Evaluate: Evaluate the need for offload
• Plan: Impact analysis, prioritized plan
• Offload: Offload each workload
• Optimize: Optimize performance

Diagram labels: Workload Visibility, Identify Use Cases, Objectives, Validate ROI, Cost, Initial POC, Impact Analysis, Prioritized Plan, Estimate Effort, Risk Analysis, Offload Actions, Schema Design, Test & Validate, Fine Tuning Data Model on Hadoop, Optimize Queries for Performance

Page 16: Workload Visibility
Get insights into what’s happening today

Evaluate Queries
• Top queries
• Query duplication
• Query complexity
• Common access patterns

Evaluate Data Access
• Top tables, top columns
• Usage-based ER diagram
• All tables/columns in use

Evaluate POC
• Identify initial workload piece for PoC
• Get partitioning key suggestions

Phase: Evaluate
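To make "top queries" and "query duplication" concrete, here is a minimal sketch of one common way to find duplicated queries in a SQL log: strip literals and whitespace differences so that structurally identical statements share one fingerprint, then count fingerprints. This is an illustration under stated assumptions, not how Navigator Optimizer works internally; the function names `fingerprint` and `top_query_shapes` and the regex-based normalization are hypothetical and far cruder than a real SQL parser.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Crude patterns (assumption: single-quoted strings, plain numeric literals).
_STRING = re.compile(r"'(?:[^']|'')*'")
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")
_SPACE = re.compile(r"\s+")


def fingerprint(sql: str) -> str:
    """Reduce a query to its shape by masking literals and normalizing case."""
    text = _STRING.sub("?", sql)        # mask string literals first
    text = _NUMBER.sub("?", text)       # then mask numeric literals
    text = _SPACE.sub(" ", text).strip().rstrip(";").strip()
    return text.lower()


def top_query_shapes(queries: Iterable[str], top_n: int = 10) -> List[Tuple[str, int]]:
    """Return the most frequent query shapes with their occurrence counts."""
    return Counter(fingerprint(q) for q in queries).most_common(top_n)
```

Shapes with a count above one are candidates for the duplication analysis mentioned on the slide; the most frequent shapes approximate the "top queries" list.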

Page 17: Impact Analysis & Prioritized Plan
Understand what it takes to offload

Impact Analysis
• Focus efforts by identifying duplication
• Workload risk assessment based on complexity and best practices
• Understand query compatibility

Prioritized Plan
• Estimate effort
• Identify easiest pieces to start for fast success
• Prioritize workloads for offload

Phase: Plan
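The idea of starting with the easiest pieces can be sketched as a simple ranking. Everything here is an assumption made for illustration: the `Workload` fields, the 0.0 to 1.0 risk scale, and the scoring formula `effort_days * (1 + risk)` are hypothetical and do not come from the slides or from any Cloudera tool. The sketch only shows that once effort and risk are estimated, a prioritized plan is an ordering over workloads.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Workload:
    name: str
    effort_days: float  # estimated migration effort (hypothetical unit)
    risk: float         # 0.0 (low) to 1.0 (high), e.g. from complexity analysis


def offload_score(workload: Workload) -> float:
    """Lower is easier: effort inflated by risk (an assumed heuristic)."""
    return workload.effort_days * (1.0 + workload.risk)


def prioritize(workloads: Iterable[Workload]) -> List[Workload]:
    """Order workloads so the easiest, lowest-risk pieces come first."""
    return sorted(workloads, key=offload_score)
```

A real plan would likely also weigh business value and query compatibility; those inputs could be folded into the score in the same way.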

Page 18: Predictable Offload
Remove the guesswork

Understand offload requirements
• Determine most common workload patterns
• Develop data-/usage-driven offload strategy

Actionable recommendations
• Complexity assessment for riskier areas
• Focus efforts by identifying duplication
• Design recommendations for best results

Phase: Offload

Page 19: Optimizing within Hadoop
Maintain peak performance

Understand usage and keep up with data needs
• Understand most common usage patterns
• Identify optimization opportunities
• Proactively adjust data models

Performance optimizations
• Best practice guidance for Hive and Impala
• Query performance optimization
• Increase platform adoption

Phase: Optimize

Page 20: Built for hybrid cloud

Page 21: What’s Driving Analytics to the Cloud?
Big data deployments in cloud are accelerating:

● Executive Mandate: Minimize on-prem data center footprint
● Increased Agility: End-user self-service
● Elasticity: Optimize infrastructure usage
● Lower Overall TCO

Page 22: Most Organizations Are or Will be Hybrid Cloud

• 76% will embrace hybrid cloud (Gartner [1])
• 82% will have a multi-cloud strategy (RightScale [2])
• 50% will “repatriate” at least one public cloud workload back to private cloud or on-prem for cost reasons (451 Research [3])
• 50% of Cloudera’s cloud customers run a hybrid environment

Why is this a critical strategy?
• Portability & Cost
• Functionality
• Data Gravity

[1] Gartner, Market Trends: Cloud Adoption Trends Favor Public Cloud With a Hybrid Twist, 2015
[2] RightScale, 2016 State of the Cloud Report
[3] 451 Research: AWS Lambda: new and exciting, old and rehashed, more vendor lock-in (or all the above)?, November 22, 2016

Page 23: Cost-Efficiencies & Flexibility in the Cloud
Primary Analytic Database Patterns

ETL: Reduce Operating Costs
Only pay for what you need, when you need it
▪ Transient clusters
▪ Object storage centric
▪ Cloud-native deployment

BI/Analytics: New Insights, New Revenue
Explore and analyze all data, wherever it lives
▪ Long-running clusters
▪ Object storage or local storage
▪ Lift-and-shift deployment

Page 24: Additive Benefits in the Cloud
Extending core performance, flexibility, scalability, and open architecture benefits

Add Use Cases, Analytics, and Data On-Demand
• Avoid the IT backlog with instant access to all data
• On-demand clusters query directly on shared object storage

Predictable Results Whenever You Want
• Consistent query performance, even during peak times
• Multi-tenancy via isolated clusters on shared data

Just-in-Time Resources
• Real-time capacity for your needs, as they change
• Elastically grow/shrink your cluster via decoupled architecture

Contention-Free ETL
• ETL anytime without impacting other workloads or risking SLAs
• Separate ETL clusters as-needed on shared data

Page 25: BI/Analytics in the Cloud
Three Architecture Options to Optimize Price/Performance

Transient BI (infrequent usage): transient clusters on object storage
Spin up clusters when needed
● On-demand instances
● Usage-based pricing
● Grow/shrink
● Cluster per tenant or user

Persistent BI (regular usage): persistent clusters on object storage (default choice)
Persistent clusters for BI anytime
● Reserved instances
● Node-based pricing
● Grow/shrink
● Cluster per tenant group

Persistent BI with Local Storage (fastest): persistent cluster on HDFS and/or Kudu
Max speed for more regular workloads
● Reserved instances
● Node-based pricing
● Less frequent grow/shrink
● Shared cluster for shared local data

Page 26: Persistent BI on Object Storage
Best for elasticity (and speed vs transient)

● This is usually the best choice
● Best when workloads are:
    o Flexible and changing
    o Frequent during most working days
    o Not scheduled for fixed hours
● Benefits include:
    o Predictable results readily available
    o Full multi-tenant isolation
    o Common data in shared object storage
    o Grow/shrink for TCO efficiency
● Tradeoffs:
    o Per node perf of object storage (use more, cheaper nodes)

Persistent BI (regular usage), default choice
Persistent clusters for ready availability
● Reserved instances
● Node-based pricing
● Grow/shrink
● Cluster per tenant group

Diagram labels: Persistent Clusters, Shared HMS DB, Object Storage

Page 27: Persistent BI with Locally-Attached Storage
Best performance for consistent workloads

● Best when workloads are:
    o Regular and consistent
    o Consistently querying common data
    o Tight SLAs for performance
    o Fast changing data (that needs Kudu)
    o Running without object storage (e.g., Azure, GCE)
● Benefits include:
    o Faster performance per node on local data
    o Ability to query object storage for rest of data
● Tradeoffs:
    o Less elastic than object-store-based clusters
    o Less isolation for multi-tenant workloads using same HDFS data
    o Cost if there are off-peak hours

Persistent BI with HDFS (fastest)
Max speed for more regular workloads
● Reserved instances
● Node-based pricing
● Less frequent grow/shrink
● Shared cluster for shared HDFS data

Diagram labels: Persistent Cluster, Local HMS DB, HDFS and/or Kudu, Object Storage

Page 28: Transient BI on Object Storage
Best TCO for infrequent usage

● Best when workloads are:
    o Infrequent or scheduled
● Benefits include:
    o Lowest TCO with clusters only when needed
    o Full multi-tenant isolation
    o Common data in shared object storage
● Tradeoffs:
    o Delay to spin-up clusters when needed
    o Capability of BI users to spin up clusters
    o Per node perf of object storage (use more, cheaper nodes)

Transient BI (infrequent usage)
Spin up clusters when needed.
● On-demand instances
● Usage-based pricing
● Grow/shrink
● Cluster per tenant or user

Diagram labels: Cloudera Director, Transient Clusters, Shared HMS DB, Object Storage
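The "best when" criteria from the three architecture slides can be read as a rough decision rule. The sketch below encodes one possible reading: the `WorkloadProfile` fields, the `Architecture` enum, and especially the order in which the checks are applied are assumptions made for illustration, since the slides list criteria but do not rank them. Persistent BI on object storage is the fallback because the slides call it the default choice.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    TRANSIENT_OBJECT = "Transient BI on object storage"
    PERSISTENT_OBJECT = "Persistent BI on object storage"
    PERSISTENT_LOCAL = "Persistent BI with locally attached storage"


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical summary of a BI workload, using the slides' criteria."""
    infrequent_or_scheduled: bool = False
    tight_slas: bool = False
    fast_changing_data: bool = False      # data that needs Kudu
    object_storage_available: bool = True


def choose_architecture(profile: WorkloadProfile) -> Architecture:
    # Assumed precedence: hard requirements for local storage win first.
    if (not profile.object_storage_available
            or profile.fast_changing_data
            or profile.tight_slas):
        return Architecture.PERSISTENT_LOCAL
    # Infrequent or scheduled usage favors spinning clusters up on demand.
    if profile.infrequent_or_scheduled:
        return Architecture.TRANSIENT_OBJECT
    # Otherwise fall back to what the slides call the default choice.
    return Architecture.PERSISTENT_OBJECT
```

For example, `choose_architecture(WorkloadProfile())` returns `Architecture.PERSISTENT_OBJECT`, while setting `infrequent_or_scheduled=True` alone selects the transient option.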

Page 29: Thank You

Justin Erickson