Building the Modern Data Hub: Beyond the Traditional Enterprise Data Warehouse


www.datavail.com

The New World of Data

90% of the world’s information was created in the last two years. 80% of all enterprise data is unstructured, which means it’s not the neat and tidy data that for decades has been held in relational databases, which in turn plug nicely into “business intelligence” tools, enterprise data warehouses and other traditional data analytics systems.

Today’s data needs different tools. And it requires a different sort of data scientist.


The EDW Analytic Conundrum

Modern DataHub

● Flexible - add new data easily

● Fresh - up to date data, near real time

● Any query no matter how complex

● Rapid deployment - days to weeks

Traditional EDW

● ETL based - Brittle, hard to add new data sources

● Stale - data can be out of date

● Limited - queries limited by what data is available

● Slow - months to deploy or update


The Traditional Data Warehouse

Extract, Load & Transform Processes

Star Schema

Data Warehouse (EDW)

Data Visualization

Significant Investment in Planning, Development, Monitoring & Maintenance


What’s the ROI?

How long is this going to take?

Are we sure these are the right reports?

How quickly can we make changes?


Today’s Traditional EDW Problems: Extraction, Transformation & Data Loading

• Highly transformative, structured ETL processes are a costly investment on many levels: development, monitoring, tuning, operational maintenance, and remediation

• Target schema structures require planning based on end goals, but those goals are often not well defined

• Today’s data is often both structured and unstructured

• Traditional EDWs are a long-term investment, and the ROI is often hard to measure

• Perishable insights that require fast turnaround (Super Bowl, Mother's Day, Thanksgiving, etc.) are difficult to capture in traditional EDWs


Today’s Traditional EDW Problems: Visualization & Reporting

• Traditional analytic reporting is predicated on structured schemas (star, snowflake, relational, etc.)

  • If these are not planned well, performance problems can result

  • Rigid structures can lead to missing metrics and missed reporting opportunities

• Any reworking of the final analytics that requires new metrics or data elements often means going back to the ETL to add the missing elements

• Producing insights and reports on new trends can be time-consuming when predicated on pre-planned data structures

• Missed opportunities on perishable insights (Super Bowl, Mother's Day, Thanksgiving, etc.)


A Proposed Modern Approach

[Diagram labels: MongoDB; JSON Data Warehouse (No Predetermined Schema); Unstructured Data; OLTP; Other Data Sources; ETL / ELT; Staging; Star Schema; EDW; Data Mart; Cubes; Reporting]

Immediate Access to Data for Analytic Insights, Fast ROI & Planning


NoSQL as Source for Visualization

[Diagram labels: Structured Data (RDBMS; Cloud: AWS, Azure, etc.); JSON; MongoDB; Spark; NoSQL; Hadoop; Hadoop HDFS (JSON, CSV, XML Data Lake, No Predetermined Schema); BI Connector; BI Tools* (Tableau, Power BI, Spotfire); Reporting]


Hadoop Data Lakes & Data Hubs

• Hadoop is NOT a database; it’s a filesystem

  • Impala, Cassandra, or just JSON, XML, CSV files

• SlamData connects to Hadoop using Spark (both written in Scala)

• Much simpler to implement than first-generation data hubs/lakes

[Diagram labels: Historical Data (three sources); Hadoop HDFS (JSON, CSV, XML Data Lake, No Predetermined Schema)]
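As a rough illustration of what "no predetermined schema" means in practice, the Python sketch below reads a CSV file and a JSON file side by side and applies structure only at query time. The file names, contents, and helper functions are hypothetical; this is not how SlamData or Hadoop is implemented.

```python
import csv
import io
import json

# Hypothetical file contents, as they might sit side by side in a data lake.
FILES = {
    "sales_2015.csv": "region,amount\nWest,100\nEast,40\n",
    "sales_2016.json": '[{"region": "West", "amount": 75, "channel": "web"}]',
}

def load_records(name, text):
    """Parse one file into a list of dicts, choosing the parser by extension."""
    if name.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(text)))
    if name.endswith(".json"):
        return json.loads(text)
    raise ValueError(f"unsupported format: {name}")

def total_by_region(files):
    """Sum 'amount' per region across all files, whatever their format."""
    totals = {}
    for name, text in files.items():
        for rec in load_records(name, text):
            region = rec["region"]
            # CSV values arrive as strings, so coerce at read time.
            totals[region] = totals.get(region, 0.0) + float(rec["amount"])
    return totals

print(total_by_region(FILES))
```

The extra "channel" field in the JSON file is simply ignored by this query; no load step had to be changed to accommodate it.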


What is SlamData?

NOT

• SlamData is not a database

• SlamData is not a monitoring tool

• SlamData is not an ETL tool

• SlamData is not NoSQL

• SlamData is not a replacement for SQL Server, Oracle, DB2, MySQL, Informix, etc.

• SlamData is not expensive

IS

• SlamData is an analytics engine

• SlamData uses SQL² for queries

• SlamData natively connects to MongoDB and Hadoop (eventually SQL, Oracle, MySQL, flat files, and more)

• SlamData solves the problem of directly querying JSON, CSV, etc.

• SlamData bridges a huge gap in traditional data warehouse needs


Examples of SlamData in Action


Interactive reports

• Live interactive reports. Embed them as real-time visuals in your own Analytics Dashboard or share them as quick insights.


Complex queries over nested data
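As a rough illustration of what a query over nested data involves, the Python sketch below flattens a nested array into rows and then aggregates them. The documents, field names, and functions are hypothetical; this is not SlamData's API or its query syntax.

```python
from collections import defaultdict

# Hypothetical order documents, shaped like JSON stored in a document database.
orders = [
    {"customer": {"name": "Ann", "region": "West"},
     "items": [{"sku": "A1", "qty": 2, "price": 10.0},
               {"sku": "B2", "qty": 1, "price": 5.0}]},
    {"customer": {"name": "Bob", "region": "East"},
     "items": [{"sku": "A1", "qty": 1, "price": 10.0}]},
    {"customer": {"name": "Cy", "region": "West"},
     "items": []},
]

def flatten_items(docs):
    """Yield one flat row per nested line item (an 'unwind' of the items array)."""
    for doc in docs:
        for item in doc.get("items", []):
            yield {
                "region": doc["customer"]["region"],
                "sku": item["sku"],
                "revenue": item["qty"] * item["price"],
            }

def revenue_by_region(docs):
    """Group the flattened rows by region and sum revenue."""
    totals = defaultdict(float)
    for row in flatten_items(docs):
        totals[row["region"]] += row["revenue"]
    return dict(totals)

print(revenue_by_region(orders))
```

A query engine built for nested data performs the flattening step itself, so the analyst writes only the grouping and aggregation.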


Chart out Machine Data

• Machine data visualizations are quick and easy. Embed them as real-time visuals in your own Analytics Dashboard or share them as quick insights.


The Value of SlamData


When Could This Solution Make Sense?

1. You are using MongoDB and getting reporting out is a struggle

2. You’re planning a traditional data warehouse project, the 6-12 month time frame is daunting, and you need better report planning to determine ROI

3. You are using a product like Splunk to capture machine data and it’s become too expensive

4. You have Hadoop or are planning to implement Hadoop as a DataLake or DataHub


Why this approach? Simple: save time and money

• Scoping the EDW is simpler

  • Imagine eliminating the overhead of planning the data structure before you know the end analytic needs

• ETL development is less complex

  • If the task is defined as just capturing and storing the data, it becomes much simpler

• Implement solutions in days to weeks, not weeks to months

• Save $$$$: less costly storage options, no ETL software, less maintenance, and lower cost to implement
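To make the "just capture and store" idea concrete, here is a minimal Python sketch that appends raw records to a JSON Lines file with no target schema and applies structure only when a question is asked. The store location, record fields, and function names are assumptions made for illustration, not part of any product described here.

```python
import json
import tempfile
from pathlib import Path

def capture(record, store):
    """Append one raw record as a JSON line; no target schema is enforced."""
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_all(store):
    """Read every stored record back as a dict (structure is applied at query time)."""
    with store.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def count_by(records, field, default="unknown"):
    """Count records by a field, tolerating records that lack it."""
    counts = {}
    for rec in records:
        key = rec.get(field, default)
        counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical store in a temporary directory.
store = Path(tempfile.mkdtemp()) / "events.jsonl"
capture({"event": "login", "user": "ann"}, store)
# A new source adds a field later; nothing upstream has to change.
capture({"event": "purchase", "user": "bob", "channel": "mobile"}, store)
capture({"event": "login", "user": "cy", "channel": "web"}, store)

records = read_all(store)
print(count_by(records, "event"))
print(count_by(records, "channel"))
```

The trade-off is that validation and type handling move from load time to query time, which is why the query helper has to tolerate missing fields.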


Case Studies

Global technology company

Needs:

• Consolidated security and log analytics

• Ability to run complex ad-hoc queries without limitations

• Share and publish results easily

Solution:

• MongoDB to capture logs live

• SlamData for ad-hoc queries and visualizations

Large government agency

Needs:

• Consolidate data from 5+ data sources in various formats

• Answer ad-hoc questions in minutes to hours, not days to weeks

• Data is perishable; slow, brittle ETL or data mapping was not a good option

Solution:

• Consolidate data into a MongoDB data hub

• Use SlamData to build rapid reports that can be shared and published


So What’s the Next Step?

• Let us show you: give us your toughest data analytics problem

• Deliver a POC in two weeks or less

• SlamData is the missing piece of the data lake/data hub

• Fast time to value, less cost

• Leverage current SQL skills, lower the learning curve

• Build powerful reports and dashboards in minutes, on live data