Spark Usage in Enterprise Business Operations Ken Tsai VP, Data Management & Platform-as-Services SAP @kentsaiSAP 2.17.16: Spark Summit, NYC

Page 1: Spark Usage in Enterprise Business Operations

Spark Usage in Enterprise Business Operations

Ken Tsai, VP, Data Management & Platform-as-Services, SAP (@kentsaiSAP)

2.17.16: Spark Summit, NYC

Page 2: Spark Usage in Enterprise Business Operations

©  2016 SAP SE or an SAP affiliate company. All rights reserved. Spark Summit New York, 2.17.16

© 2016 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark information and notices. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary. These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty. In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. 
Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.

Page 3: Spark Usage in Enterprise Business Operations

SAP – Our Quick Snapshot in the Enterprise Computing World

74% of the world’s transaction revenue touches an SAP system.

SAP’s product focus:

Enterprise Applications

Business Networks

Platforms – 15 yrs on IMC

SAP customers represent 87% of Forbes Global 2,000 companies.

SAP touches $16 trillion of world consumer purchases.

Page 4: Spark Usage in Enterprise Business Operations

SAP HANA – An In-Memory Platform to Enable New Business Scenarios Previously Not Feasible

Core data structure (BKPF, BSEG) remains unchanged: no indices, no aggregates, no redundancies.

•  Soft financial close anytime

•  Real-time revenue and cost analysis

•  Real-time liquidity forecasts

•  Real-time alerts and blocks on suspicious transactions

Page 5: Spark Usage in Enterprise Business Operations

Distributed Big Data Is Everywhere

How to better use it in core enterprise business applications?

~79% of data reservoirs/lakes are still disconnected from core business operations

How do I embed big data signal into my business applications and enterprise analytics?

•  53%: Difficulty integrating with CRM and/or other systems

•  49%: Unable to apply or integrate external data quickly enough to inform real-time decision making

•  59%: Only a few analysts with specialized training can analyze big data

Harvard Business Review Analytic Services, Global Survey of 251 Respondents, Sept. 2015

Page 6: Spark Usage in Enterprise Business Operations

Introducing SAP HANA Vora

An in-memory query engine that extends the Apache Spark execution framework to enrich the interactive analytics experiences on massively distributed computing clusters

•  OLAP processing

•  In-memory computing for high performance

•  Connecting to enterprise systems

•  Unified system management

[Diagram: SAP HANA (ERP data) and Vora (big data), linked by parallelized queries]

Page 7: Spark Usage in Enterprise Business Operations

Key Open Source Contribution to Apache Spark Ecosystem

Spark to HANA Push-downs & Data Hierarchies

scala> val hierarchy = sqlContext.sql(s"""
  SELECT LVL, COUNT(*), ROUND(AVG(P_RETAILPRICE), 2)
  FROM (
    SELECT LEVEL(node) AS LVL, P_RETAILPRICE
    FROM HIERARCHY(
      USING PART_HIERARCHY AS c
      JOIN PARENT p ON c.P_PARENT = p.P_PARTKEY
      SEARCH BY P_PARTKEY ASC
      START WHERE P_PARTKEY = 1
      SET node) AS H0
  ) T1
  GROUP BY LVL""".stripMargin).collect().foreach(println)

[Figure: part hierarchy tree labelled with retail prices 901, 903, 904, 911, 912, 913]

+-----+-----+------------------+
|LEVEL|COUNT|AVG(P_RETAILPRICE)|
+-----+-----+------------------+
|    0|    1|               901|
|    1|    2|             903.5|
|    2|    3|               912|
+-----+-----+------------------+
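To illustrate what the hierarchy query computes, the following plain-Scala sketch (no Spark or Vora required) derives each node's level from a parent-child table, then counts nodes and averages the retail price per level. The prices are the ones shown on the slide; the part keys, the parent links, and the names `Part`, `levels`, and `levelStats` are assumptions made for this example and are not part of any Vora API.

```scala
import scala.annotation.tailrec

object HierarchyLevelsSketch {
  // One row of a parent-child table; parent is None for the root.
  final case class Part(key: Int, parent: Option[Int], retailPrice: Double)

  // Prices are taken from the slide. Keys and parent links are ASSUMED.
  val parts: Seq[Part] = Seq(
    Part(1, None, 901.0),
    Part(2, Some(1), 903.0),
    Part(3, Some(1), 904.0),
    Part(4, Some(2), 913.0),
    Part(5, Some(2), 912.0),
    Part(6, Some(3), 911.0)
  )

  // Depth of every node reachable from startKey, found breadth-first.
  def levels(rows: Seq[Part], startKey: Int): Map[Int, Int] = {
    val children: Map[Int, Seq[Int]] =
      rows
        .collect { case Part(k, Some(p), _) => p -> k }
        .groupBy(_._1)
        .map { case (p, pairs) => p -> pairs.map(_._2) }

    @tailrec
    def walk(frontier: Seq[Int], depth: Int, acc: Map[Int, Int]): Map[Int, Int] =
      if (frontier.isEmpty) acc
      else {
        val seen = acc ++ frontier.map(_ -> depth)
        // Skip nodes already visited so malformed (cyclic) input still terminates.
        val next = frontier
          .flatMap(k => children.getOrElse(k, Seq.empty))
          .distinct
          .filterNot(seen.contains)
        walk(next, depth + 1, seen)
      }

    walk(Seq(startKey), 0, Map.empty)
  }

  // (level, node count, average price rounded to 2 decimals), ordered by level.
  def levelStats(rows: Seq[Part], startKey: Int): Seq[(Int, Int, Double)] = {
    val depthOf = levels(rows, startKey)
    rows
      .flatMap(r => depthOf.get(r.key).map(d => d -> r.retailPrice))
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .map { case (lvl, pairs) =>
        val prices = pairs.map(_._2)
        val avg = BigDecimal(prices.sum / prices.size)
          .setScale(2, BigDecimal.RoundingMode.HALF_UP)
          .toDouble
        (lvl, prices.size, avg)
      }
  }

  def main(args: Array[String]): Unit =
    levelStats(parts, startKey = 1).foreach(println)
}
```

With the assumed tree, `levelStats` yields one tuple per level whose counts and averages agree with the three rows of the result shown above.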

val options = Map("dbschema" -> config.user, "host" -> config.host, "instance" -> config.instance)

// HANA Live Customer Basic Data virtual data model (VDM)
val custConf = options + ("path" -> "sap.hba.ecc/CustomerBasicData")
val cust = sqlContext.read.format("com.sap.spark.hana").options(custConf).load()
cust.registerTempTable("customer")

// HANA Live Sales Order Header VDM
val sohConf = options + ("path" -> "sap.hba.ecc/SalesOrderHeader")
val soh = sqlContext.read.format("com.sap.spark.hana").options(sohConf).load()
// Register under the table name that the query below refers to
soh.registerTempTable("salesOrder")

// Top 5 countries by sales order volume
val salesOrder = sqlContext.sql("""
  SELECT Country, COUNT(*) AS Frequency
  FROM salesOrder AS s
  LEFT OUTER JOIN customer AS c ON s.soldToParty = c.Customer
  GROUP BY Country
  ORDER BY Frequency DESC
  LIMIT 5""")
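The semantics of the "top 5 countries" query can be checked on small in-memory data without a HANA connection. The sketch below uses plain Scala collections rather than Spark, so it only mirrors the logic (left outer join, group by country, order by frequency, keep the first n). The row shapes `Customer` and `SalesOrder`, the sample rows, and the tie-breaking by country name are assumptions for illustration; it also assumes each customer key appears once.

```scala
object TopCountriesSketch {
  // Hypothetical, trimmed-down rows; the real HANA Live views have many more columns.
  final case class Customer(customer: String, country: String)
  final case class SalesOrder(orderId: String, soldToParty: String)

  // LEFT OUTER JOIN + GROUP BY country + ORDER BY frequency DESC + LIMIT n.
  // Orders with no matching customer are kept and grouped under None,
  // which corresponds to a NULL country in the SQL result.
  def topCountries(
      orders: Seq[SalesOrder],
      customers: Seq[Customer],
      n: Int
  ): Seq[(Option[String], Int)] = {
    val countryOf: Map[String, String] =
      customers.map(c => c.customer -> c.country).toMap

    orders
      .map(o => countryOf.get(o.soldToParty))
      .groupBy(identity)
      .map { case (country, hits) => country -> hits.size }
      .toSeq
      // Ties are broken by country name only to make the output deterministic.
      .sortBy { case (country, freq) => (-freq, country.getOrElse("")) }
      .take(n)
  }

  // Made-up sample data.
  val customers: Seq[Customer] = Seq(Customer("C1", "US"), Customer("C2", "DE"))
  val orders: Seq[SalesOrder] = Seq(
    SalesOrder("O1", "C1"),
    SalesOrder("O2", "C1"),
    SalesOrder("O3", "C1"),
    SalesOrder("O4", "C2"),
    SalesOrder("O5", "C2"),
    SalesOrder("O6", "C9") // no matching customer
  )

  def main(args: Array[String]): Unit =
    topCountries(orders, customers, n = 5).foreach(println)
}
```

The unmatched order shows why the join type matters: with an inner join the `C9` order would disappear from the counts instead of appearing under a missing country.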

Page 8: Spark Usage in Enterprise Business Operations

Airline Use Case – Optimize MRO scheduling with Sensor Data

Challenges

•  $10,000 loss for every hour spent on maintenance, repair, and overhaul (MRO)

•  Predictive MRO generates terabytes of sensor data per flight

Solution

•  SAP HANA Vora rapidly processes sensor data in HDFS and combines it with flight schedule and staffing data in SAP HANA to prioritize maintenance jobs and accelerate MRO

Why SAP HANA Vora

•  Optimize MRO operations with interactive, on-demand drill down by airport, flight route, etc.

Page 9: Spark Usage in Enterprise Business Operations

Utility Use Case – CenterPoint Energy

Challenge

•  Smart meters generate TBs of data/month

•  Regulatory requirement to retain data for 10 years

•  Current storage solution full by end-2016

•  Need to leverage HDFS as an additional tier for storage

Solution

•  SAP HANA for the most recent sensor signals and operational data, Dynamic Tiering for 1-2 year old data, HDFS for historical sensor data

•  SAP HANA Vora accesses and queries data across all tiers

Why SAP HANA Vora

•  SAP HANA Vora provides an enterprise analytics and OLAP-like experience across the data warehouse and HDFS.

Page 10: Spark Usage in Enterprise Business Operations

Utility Use Case – How It Works

CenterPoint Energy

“Our benchmark tests proved that SAP HANA paired with SAP HANA Vora are the right solutions for us. We expect immediate cost benefits and to see competitive differentiation in the future.”

Gary Hayes, CIO & SVP at CenterPoint Energy

SAP HANA: most recent sensor data

Dynamic Tiering: 1-2 year old data

HDFS: historical sensor data

Parallelized queries: query data within and across tiers
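The age-based placement described for this use case can be sketched as a small routing function. The slide gives only the rough bands (most recent data in SAP HANA, 1-2 year old data in Dynamic Tiering, historical data in HDFS); the exact cut-offs used below (under one full year, under two full years, older) and the names `Tier` and `tierFor` are assumptions for illustration, not CenterPoint Energy's or SAP's actual configuration.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

object StorageTierSketch {
  sealed trait Tier
  case object Hana extends Tier           // most recent sensor data
  case object DynamicTiering extends Tier // roughly 1-2 year old data
  case object Hdfs extends Tier           // historical sensor data

  // ASSUMED cut-offs: fewer than 1 full year -> HANA,
  // fewer than 2 full years -> Dynamic Tiering, otherwise HDFS.
  def tierFor(readingDate: LocalDate, today: LocalDate): Tier = {
    val ageInFullYears = ChronoUnit.YEARS.between(readingDate, today)
    if (ageInFullYears < 1) Hana
    else if (ageInFullYears < 2) DynamicTiering
    else Hdfs
  }

  def main(args: Array[String]): Unit = {
    val today = LocalDate.of(2016, 2, 17)
    val samples = Seq(
      LocalDate.of(2016, 1, 5),
      LocalDate.of(2014, 12, 1),
      LocalDate.of(2009, 6, 30)
    )
    samples.foreach(d => println(s"$d -> ${tierFor(d, today)}"))
  }
}
```

A query layer that spans all tiers, as the slide describes, would not need this routing at read time; the function only shows where a reading of a given age would be expected to live.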

Page 11: Spark Usage in Enterprise Business Operations

Financial Services Use Case – Extend Fraud Pattern Detection

Challenges

•  100+ million business transactions daily, 25% growth YoY

•  Limited access to archived data

•  Difficult to detect patterns in historical transactions

Solution

•  Current transactions in SAP HANA, historical transactions in HDFS clusters

•  Real-time detection of abnormalities

Why SAP HANA Vora

•  Real-time, aggregated insights from current and historical transactions

Page 12: Spark Usage in Enterprise Business Operations

2016 and the Road Ahead

Customers in North America, APJ, and EMEA

Dev edition available on AWS

TODAY

General Availability

Vora Modeler to build and query OLAP-style cubes on data

COMING SOON

Planning (HR, Financial)

Extend engine support for time series

Transaction management

Analytics on archived ERP data in Hadoop

FUTURE

Page 13: Spark Usage in Enterprise Business Operations

Contribute to Spark Ecosystem, Embrace Best of Community Innovation

Contribution to Open Source:

Hierarchy capabilities

Connection to ERP: predicate pushdown to HANA

On-the-market solution

SAP HANA Vora

Page 14: Spark Usage in Enterprise Business Operations

Thank you!

Ken Tsai, @kentsaiSAP

Enter to Win a GoPro HERO4

Session at SAP Booth 102

Learn More @ hana.sap.com/vora

Try Dev Edition bit.ly/1K1qLyo

We’re Hiring: https://spark-summit.org/east-2016/jobs/