Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: Today's ETL Does it...

Powering the Connected Data Platform With ETL Onboarding

@Scott_Gnau, CTO, Hortonworks

@TenduYogurtcu, Big Data GM, Syncsort

Global Leader in Big Iron to Big Data Solutions

Syncsort Confidential and Proprietary - do not copy or distribute

• Provider of enterprise software and leader in Big Iron to Big Data solutions in more than 85 countries around the world

• Global presence in 87% of enterprise Fortune 500 companies

• High performance & scalable software harnessing valuable data assets to power business and operational analytics, while dramatically reducing the cost of mainframe and legacy systems

• Unique focus on customer value through cost-effective solutions and unparalleled support; trusted leader for nearly 50 years

WOODCLIFF LAKE, NJ

JAPAN

SINGAPORE

Global customer base of leaders and emerging businesses across all major industries

Strategic partnerships in Big Iron and Big Data ecosystems

Meet Today’s Presenters

Scott Gnau, CTO, Hortonworks

Tendu Yogurtcu, PhD, GM, Big Data, Syncsort

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Open and Connected Data Platforms

DATA AT REST

DATA IN MOTION

ACTIONABLE INTELLIGENCE

The Future of the Enterprise is About All Data

Modern Data Applications

Modern Data Applications

Modern Data Architecture

• ALL Data: Data-at-Rest & Data-in-Motion

• Cloud & Data Center

• Powered by Open Source

Big Data Analytics & IoT

Next Generation Data Use-Cases:

• Predictive Retail

• Factory Automation

• Connected Cars

• Predictive Analytics

• Artificial Intelligence

The Shift to the Modern Data Architecture

[Diagram labels: System-centric, User-centric; Mainframe, Client/Server, Web & SaaS; IDMS, Relational Database; Data at Rest, Data in Motion; Actionable Intelligence; Modern Data Applications]

Connected Data Platforms Enable Enterprise Transformations

[Diagram labels: Cloud, Data Center; Data in Motion, Data at Rest; Edge Data, Edge Analytics, Stream Analytics, Machine Learning, Deep Historical Analysis]

Data is the new Raw Material for Commerce

Easy Onboarding of New Data from New Sources

Access to Data from Legacy Systems and Apps

Successful Modern Data Apps

New Business and Revenue models

All Data

Data – Raw Material for Advanced Analytics

Syncsort Makes ALL Data Accessible & Usable – Ready for Analytics

Our Strategy: Simplify Big Data Integration

• Deploy on premise or in the cloud

• Choose among multiple execution frameworks – Hadoop, Spark, Linux, Unix, Windows

• Integrate streaming and batch data with a single data pipeline for innovative applications, like IoT

• Future-proof applications to avoid re-writing jobs in order to take advantage of innovations in new execution frameworks

• Access and integrate ALL enterprise data sources – including mainframe – for advanced analytics
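One way to picture the "single data pipeline" idea from the list above is a transformation written once against a generic iterable, so the same logic can run over a finite batch or an unbounded stream. The following is only a minimal Python sketch under that assumption; it is not DMX-h code, and `clean_readings`, `temp_c`, and `temp_f` are hypothetical names chosen for illustration.

```python
from typing import Any, Dict, Iterable, Iterator


def clean_readings(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Drop records with no temperature and add a Fahrenheit field.

    Works on any iterable: a list read from a file (batch) or a
    generator that yields records as they arrive (streaming).
    """
    for record in records:
        if record.get("temp_c") is None:
            continue  # skip incomplete sensor readings
        yield {**record, "temp_f": record["temp_c"] * 9 / 5 + 32}


def sensor_stream() -> Iterator[Dict[str, Any]]:
    """Hypothetical stand-in for a streaming source such as an IoT feed."""
    yield {"id": 3, "temp_c": 0}
    yield {"id": 4, "temp_c": None}


if __name__ == "__main__":
    batch = [{"id": 1, "temp_c": 100}, {"id": 2, "temp_c": None}]
    print(list(clean_readings(batch)))       # batch execution
    for row in clean_readings(sensor_stream()):  # streaming execution
        print(row)
```

The design choice illustrated here is that the transformation does not know where its input comes from, which is what lets the execution side change without rewriting the job logic.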

Three Commitments Underpin Our Big Data Integration Strategy

1. Ongoing Contributions to the Open Source Community

• JIRA: MAPREDUCE-2454, MAPREDUCE-4807, MAPREDUCE-4049, MAPREDUCE-5455, HIVE-8347, SQOOP-1272, PARQUET-134, Spark-packages and more!

2. Leverage Syncsort Technology Innovations & Mainframe Heritage

• Light footprint

• Self-tuning engine

• Single install. No 3rd party dependencies

• World-class data processing, mainframe expertise

3. Strong Partnerships with Strategic Big Data & Hadoop Players

ETL Onboarding with Syncsort

Insurance: Easy Access to ALL Data for Better Analytics

• Challenge: Needed hard-to-access operational data for advanced analytics

• Solution:
  • Quickly load ~1000 database tables into HDP with the click of a button
  • Access & integrate complex Mainframe VSAM files, data from DB2/z, Oracle & SQL Server
  • Track changes & keep data up to date

• Benefits:
  • Insight: Better and faster analytics
  • Agility: Reclaim development time; single tool to ingest, detect changes and populate the data lake
  • Compliance: Build audit trails, keep EDW current
  • Productivity: No need for deep understanding of Hadoop
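The "track changes" step can be illustrated by comparing two snapshots of a table keyed by primary key. This is a minimal sketch of snapshot-based change detection, assuming each snapshot fits in memory as a dictionary; it does not show how DMX-h detects changes, and `detect_changes`, `ChangeSet`, and the sample policy fields are hypothetical.

```python
from typing import Any, Dict, List, NamedTuple

Row = Dict[str, Any]


class ChangeSet(NamedTuple):
    inserted: List[str]  # keys present only in the current snapshot
    updated: List[str]   # keys present in both, with different row content
    deleted: List[str]   # keys present only in the previous snapshot


def detect_changes(previous: Dict[str, Row], current: Dict[str, Row]) -> ChangeSet:
    """Compare two snapshots keyed by primary key and classify each key."""
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return ChangeSet(inserted, updated, deleted)


if __name__ == "__main__":
    yesterday = {"p1": {"status": "active"}, "p2": {"status": "active"}}
    today = {"p1": {"status": "lapsed"}, "p3": {"status": "active"}}
    print(detect_changes(yesterday, today))
```

Recording each `ChangeSet` with a timestamp is one simple way to build the kind of audit trail mentioned under Compliance.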

Leading Media Company: Accelerate New Business Initiatives

• Challenge: Build scalable platform to support new business initiatives & scale for double-digit data growth, while reducing escalating EDW & ELT Costs

• Solution:
  • Shift data storage & processing out of the EDW into Hadoop
  • Migrate 500+ SQL ELT workloads to DMX-h on HDP

• Benefits:
  • Agility: Scalable architecture to deploy new business initiatives – analyze more set top box data, blend website user activity data, etc.
  • Cost: Millions of dollars in savings from EDW, including SQL tuning & maintenance costs
  • Productivity: ETL developers can stop coding & tuning, and get up & running on Hadoop quickly

Hotel Chain: Ease of Use, Timely & Up-to-Date Reporting

• Challenge: More timely collection & reporting on room availability, event bookings, inventory and other hotel data from 4,000+ properties globally

• Solution:
  • Near real-time reporting
  • DMX-h consumes property updates from Kafka every 10s
  • DMX-h processes data on HDP, loading to TD every 30 min
  • Deployed on Google Cloud Platform

• Benefits:
  • Time to Value: DMX-h ease of use drastically cut development time
  • Agility: Reports updated every 30 minutes vs every 24 hours
  • Productivity: Leveraging ETL team for Hadoop (Spark), visual understanding of data pipeline
  • Insight: Up-to-date data = better business decisions = happier customers
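The two cadences in this use case (updates consumed every 10 seconds, a load every 30 minutes) amount to a micro-batching pattern: accumulate frequent small updates, then release them as one batch per load interval. The sketch below illustrates that pattern only. It uses no Kafka client and is not DMX-h code; `MicroBatcher`, `property_id`, and `rooms_available` are hypothetical names, and keeping only the latest update per property is an assumption made for the example.

```python
from typing import Any, Dict, List, Optional

POLL_INTERVAL_S = 10       # from the slide: updates consumed every 10 s
LOAD_INTERVAL_S = 30 * 60  # from the slide: load every 30 min

Update = Dict[str, Any]


class MicroBatcher:
    """Accumulates updates and releases one batch per load interval."""

    def __init__(self, load_interval_s: int = LOAD_INTERVAL_S, start_s: float = 0.0) -> None:
        self.load_interval_s = load_interval_s
        self.last_load_s = start_s
        self.pending: Dict[str, Update] = {}

    def add(self, updates: List[Update]) -> None:
        # Assumption: only the most recent update per property matters,
        # so a later update overwrites an earlier one for the same key.
        for update in updates:
            self.pending[update["property_id"]] = update

    def maybe_flush(self, now_s: float) -> Optional[List[Update]]:
        """Return the pending batch if the load interval has elapsed, else None."""
        if now_s - self.last_load_s < self.load_interval_s:
            return None
        batch = list(self.pending.values())
        self.pending.clear()
        self.last_load_s = now_s
        return batch


if __name__ == "__main__":
    batcher = MicroBatcher()
    # Simulated clock: one poll every POLL_INTERVAL_S for 30 minutes.
    for now in range(POLL_INTERVAL_S, LOAD_INTERVAL_S + 1, POLL_INTERVAL_S):
        batcher.add([{"property_id": "h1", "rooms_available": now % 7}])
        batch = batcher.maybe_flush(now)
        if batch is not None:
            print(f"t={now}s: loading {len(batch)} row(s)")
```

With these intervals, each load covers 180 polls, which is why collapsing updates per property before loading keeps the batch small.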

Syncsort DMX-h: Benefits to Business

• Faster Time to Value:
  • Faster & better insights with readily-accessible data

• Compliance:
  • Secure data access, ability to build audit trails

• Increased Productivity:
  • Reclaim development time by automating, optimizing and future-proofing development
  • Across platforms, on premise and in the cloud

• Cost:
  • Lower archival costs
  • Reduced development time
  • Reduced Total Cost of Ownership, higher ROI

See For Yourself!

Take a 30-day Free Trial at www.syncsort.com/try
