Big Data, Analytics and 4th Generation Data Warehousing. Martyn Jones, Big Data Spain 2015


Page 1: Big data, Analytics and 4th Generation Data Warehousing

Big Data, Analytics and 4th Generation Data Warehousing

Martyn Jones

Big Data Spain 2015

Page 2: Big data, Analytics and 4th Generation Data Warehousing

agenda

∙ Imperatives.
∙ Data value chains.
∙ Resources.
∙ 4th Generation Data Warehousing.
∙ Analytics Data Store / Big Data.
∙ Information Supply Framework.

Friday 16th 12:30 - 13:15 #BDS15

Room 25 – Technical


Page 3: Big data, Analytics and 4th Generation Data Warehousing

Quote, Unquote

"It is not the consciousness of men that determines their being, but, on the contrary, their social being that determines their consciousness."

Karl Marx

Page 4: Big data, Analytics and 4th Generation Data Warehousing

business background

Page 5: Big data, Analytics and 4th Generation Data Warehousing

Media presence

Twitter @GoodStratTweet

http://www.goodstrat.com http://www.linkedin.com/grp/home?gid=8338976

http://www.itworld.com/blog/it-circus/

Page 6: Big data, Analytics and 4th Generation Data Warehousing

Quote, Unquote

“Do Big Data initiatives require a business case? If so, have you ever seen one?” –Joseph Cotter, UK

“Big data - reinventing the wheel every day with a new and slightly different value for Pi.” – Karl Snowsill, Australia

“The Big Data Contrarians - A place where you can find a way to cut through BIG Bull…” – Sanjay Pandey, Canada

“If you had all the answers in the world... what would your question be?” – Yves de Hondt, Belgium

“Big Data in bite size sessions - walk this way!!” – Steve Scholes, MBA, UK

“The only sane spot in the Big Data asylum.” – Dominic Vincent Ligot, Philippines

“Enforcing strict limits on koolaid consumption” – Gary Anderson, USA

Page 7: Big data, Analytics and 4th Generation Data Warehousing

the ages of data

B.C. / Life of Brian / A.D.

Page 8: Big data, Analytics and 4th Generation Data Warehousing

Change
Insight
Potentially useful
Simplicity
Abundant
Volume, Velocity, Variety

Page 9: Big data, Analytics and 4th Generation Data Warehousing

framework

Obtain, Integrate, Analyse, Present

DATA

Page 10: Big data, Analytics and 4th Generation Data Warehousing

the road to Big Data success…

Strategic
Tactical
Operational Analytics
Architected
Managed
Integration
Data

Page 11: Big data, Analytics and 4th Generation Data Warehousing

scope

BIZ, DATA, DW, BIG DATA, STATS, PRES

Page 12: Big data, Analytics and 4th Generation Data Warehousing

Business Imperatives

A good place to start

Page 13: Big data, Analytics and 4th Generation Data Warehousing

what’s important to business?

BE NOTICED

CASH FLOW

Page 14: Big data, Analytics and 4th Generation Data Warehousing

what else is important to business?

Market share
Differentiation
Ability to execute
Liquidity
Profitability
Time and place utility
React to competitive threats
Enhance service scope
Improving customer service
Respond to price pressure
Segmentation of n
Addressing short-term attention spans
Ability to respond to irrationality
Be noticed
Cash flow
Risk
Legislation
No press / Bad press
Customer centricity
Front office empowerment
Excellence
Channel excellence
Operational excellence
Product excellence
Cultures
IT business value
Base protection
Expansion
Diversification
Consolidation

Page 15: Big data, Analytics and 4th Generation Data Warehousing

Augmented Competitive Forces

Competition from within the industry: rivalry with existing competitors
Suppliers: bargaining power
Buyers: bargaining power
Replacements: threat of replacement product or service
Potential entrants: threat of new entrants
Pressure groups, Media, Government: power to change the game, exposure

Sources: Michael Porter; Martyn R Jones and others

Page 16: Big data, Analytics and 4th Generation Data Warehousing

McKinsey 7S Framework

Culture

Page 17: Big data, Analytics and 4th Generation Data Warehousing

differentiated capabilities

Page 18: Big data, Analytics and 4th Generation Data Warehousing

operating models

Customer segments

Channels

Products

Services

Organisational design

Processes

Data & information

Physical assets

Development

Deployment

Organisational design

Performance management

Information technology

Business model
Operating model
People model

Customers, Systems, People, Processes, Organisation

Page 19: Big data, Analytics and 4th Generation Data Warehousing

objectives

1. Information awareness corresponding to areas of operation and spheres of control

2. Comprehensive data and information supply framework

3. Continually seek to maintain and then improve data’s contribution to business

Page 20: Big data, Analytics and 4th Generation Data Warehousing

Business data everywhere

Where, when, what, who, why... how?

Page 21: Big data, Analytics and 4th Generation Data Warehousing

Data

Internal, External, Shared
Past, Present, Future

Page 22: Big data, Analytics and 4th Generation Data Warehousing

Data

Operational, Big Data, Dark Data
Online, Archived, Unmanaged

Page 23: Big data, Analytics and 4th Generation Data Warehousing

Data

Archives
Social Media
Documents
Machine Log
Media
Sensor
Business Applications
Data Storage
Public Web

Page 24: Big data, Analytics and 4th Generation Data Warehousing

Activities, Abstractions and Relations

Page 25: Big data, Analytics and 4th Generation Data Warehousing

Facets of Data / Facets of Big Data

Velocity, Volume, Variety, Adequacy, Ambiguity, Small, Precision, Availability, Accuracy, Relevance, Persistence, Reliability, Value, Obtuseness, Smart, Complexity, Utility, Descriptiveness, Big

Page 26: Big data, Analytics and 4th Generation Data Warehousing
Page 27: Big data, Analytics and 4th Generation Data Warehousing

BIG DATA
Internet of Things
CLOUD
Statistics
Data Warehousing
Presentation
Data Supply Framework

Page 28: Big data, Analytics and 4th Generation Data Warehousing

The Data Warehouse

25 years... of sometimes getting it right

Page 29: Big data, Analytics and 4th Generation Data Warehousing

Enterprise Data Warehousing – AS IS

Subject oriented
Integrated
Time variant
Non-volatile
Strategic decision making

Page 30: Big data, Analytics and 4th Generation Data Warehousing

Operational Systems Data Warehouse

Operational systems: Purchasing, HR, Credit, Order Processing, Marketing, Sales, Logistics, Billing

Data warehouse: Arrangements, Products, Party, Time, Geography, Transactions

Subject oriented

Page 31: Big data, Analytics and 4th Generation Data Warehousing

Operational Systems Data Warehouse

Operational systems:
Euro Account Customer: Customer: Village Bank GmbH; Country code: D
Mutual Fund Customer: Customer: Village Bankers; Region: Westphalia
NTIP Customer: Customer: Village Bank International; Country: Germany

Data warehouse:
Account:
Number  Customer  Type
230956  441353    Euro
010555  441353    MF
291284  441353    NTIP

Party: Number: 100441353; Name: Village Bank GmbH; Country: Germany

Integrated
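
The integration idea on this slide can be sketched in a few lines of code. This is only an illustration, not the author's method: three operational records that describe the same customer are resolved to one warehouse party number. The field names and the `CUSTOMER_TO_PARTY` cross-reference are assumptions made up for the example; the slide supplies only the customer names, the account types and the party number.

```python
# Hypothetical source records, one per operational system (names taken from the slide).
source_records = [
    {"system": "Euro", "customer": "Village Bank GmbH", "country_code": "D"},
    {"system": "MF", "customer": "Village Bankers", "region": "Westphalia"},
    {"system": "NTIP", "customer": "Village Bank International", "country": "Germany"},
]

# Assumed cross-reference (for example, maintained by data stewards):
# source customer name -> warehouse party number.
CUSTOMER_TO_PARTY = {
    "Village Bank GmbH": 100441353,
    "Village Bankers": 100441353,
    "Village Bank International": 100441353,
}

# The single conformed party record held in the warehouse.
party = {"number": 100441353, "name": "Village Bank GmbH", "country": "Germany"}


def integrate(records):
    """Return (system, party number) pairs, one per source record."""
    return [(r["system"], CUSTOMER_TO_PARTY[r["customer"]]) for r in records]


accounts = integrate(source_records)
```

After integration, all three accounts point at the same party, which is what lets the warehouse report on the customer as one entity.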

Page 32: Big data, Analytics and 4th Generation Data Warehousing

Operational Systems Data Warehouse

[Chart: Trading Activity: MartyBank, vertical axis 0 to 100]

Trading Activity Snapshots:

Date        Security   Amount
2006.09.01  MartyBank  79.000.000
2006.09.02  MartyBank  92.000.000
2006.09.03  MartyBank  44.000.000
2006.09.04  MartyBank  39.000.000
2006.09.05  MartyBank  80.000.000

Time variant
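
A minimal sketch of what time variance buys, assuming the snapshot rows above are held as simple tuples: because every dated snapshot is kept, the warehouse can answer "what was the value as of a given day". The function name `amount_as_of` is hypothetical.

```python
from datetime import date

# Snapshot rows from the slide: (date, security, amount).
snapshots = [
    (date(2006, 9, 1), "MartyBank", 79_000_000),
    (date(2006, 9, 2), "MartyBank", 92_000_000),
    (date(2006, 9, 3), "MartyBank", 44_000_000),
    (date(2006, 9, 4), "MartyBank", 39_000_000),
    (date(2006, 9, 5), "MartyBank", 80_000_000),
]


def amount_as_of(rows, security, as_of):
    """Return the most recent amount for `security` on or before `as_of`, or None."""
    candidates = [(d, amt) for d, sec, amt in rows if sec == security and d <= as_of]
    if not candidates:
        return None
    # Tuples sort by date first, so max() picks the latest snapshot.
    return max(candidates)[1]
```

An operational system would typically hold only the current figure; the dated rows are what make the historical question answerable.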

Page 33: Big data, Analytics and 4th Generation Data Warehousing

Operational Systems Data Warehouse

[Diagram: in the operational system, Order Processing creates, replaces, updates and deletes orders; in the data warehouse, Orders are written and then read.]

Non-volatile
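
Non-volatility can be illustrated with an append-only store. This is a sketch under stated assumptions, not a real warehouse API: the class `AppendOnlyStore` and its method names are invented for the example. A change in the operational system arrives as a new row rather than as an in-place update.

```python
class AppendOnlyStore:
    """Hypothetical warehouse table: rows can be written and read, never changed."""

    def __init__(self):
        self._rows = []

    def write(self, row):
        # Store a copy so later changes by the caller cannot alter history.
        self._rows.append(dict(row))

    def read(self):
        # Hand out copies so readers cannot alter stored rows either.
        return [dict(r) for r in self._rows]


store = AppendOnlyStore()
store.write({"order": 1, "status": "created"})
store.write({"order": 1, "status": "updated"})  # a new row, not an overwrite
```

Both versions of order 1 remain readable, which is the property the slide contrasts with create, replace, update and delete in order processing.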

Page 34: Big data, Analytics and 4th Generation Data Warehousing

Strategic decision support

Supporting strategy formulation,

choice and execution

Page 35: Big data, Analytics and 4th Generation Data Warehousing

Data Warehousing 2.0

[Diagram: Data Warehousing 2.0. Components shown: Data Sources (internal; structured and unstructured data); ETL (extract, transform, load); Staging (staged data); ODS; EDW (structured and unstructured); a second ETL stage; Data Marts (structured and unstructured); Report Repository; Reports & Extracts; Stats; Service (data selection and representation, data analytics, report set and extract creation, push/pull technology, visualisation, annotation); Users (internal, clients, other stakeholders); plus Metadata, Workflow/Process Control and CIW Management.]

Page 36: Big data, Analytics and 4th Generation Data Warehousing

The Data Warehouse

25 years... of sometimes getting it right… and wrong

Page 37: Big data, Analytics and 4th Generation Data Warehousing

Enterprise Data Warehousing – AS A BODGE

Get data
Wonder why it's not meeting expectations
Dump data
Query data
Visualise data

Page 38: Big data, Analytics and 4th Generation Data Warehousing

Enterprise Data Warehousing – AS A BODGE

DW BODGER TEAM: We built a data dog house using Oracle and IBM technology and we called it a data warehouse

HADOOP TEAM: We can do data warehousing too and it will be cheaper, faster and smarter

Page 39: Big data, Analytics and 4th Generation Data Warehousing

Data Supply Framework

A data architecture for data sourcing, transformation, integration, storage, search, analysis and presentation

Page 40: Big data, Analytics and 4th Generation Data Warehousing

Data Supply Framework

Operational applications
Data logistics
Operational Data Store
Data Warehouse
Business Intelligence

All digital data
All data processing, enrichment and information creation
All information consumers
All information and data consumers

Published by goodstrat.com. Martyn Richard Jones 2015 – martynjones.eu. Cambriano Energy 2015 - http://www.cambriano.es

Page 41: Big data, Analytics and 4th Generation Data Warehousing

Data Supply Framework

Internal digital data
External digital data
Data logistics
Operational applications
Operational Data Store
Data Warehouse
Data Marts
Analytics Data Store
Statistical Analysis
Scenarios
Business Intelligence

Legend: primary data flow; secondary data flow

Page 42: Big data, Analytics and 4th Generation Data Warehousing

Data Supply Framework

Regions: Data Sources, 4th Generation Data Warehousing, Core Statistics

Components: OLTP, Complex data, Event Data, Infrastructure Data, Event Appliance, Signal Appliance, Message Adapter, Message Queue, Staging, Staging & Reduction, ODS, ETL, T/ETL, TL, ET(A)L, EDW, DM (three data marts), ADS, Statistical analysis, Scenario 1, Scenario 2, Scenario 3, Write back

Page 43: Big data, Analytics and 4th Generation Data Warehousing

Core Data Sourcing

Comprehensive data acquisition and transformation

Page 44: Big data, Analytics and 4th Generation Data Warehousing

DW 3.0 Information Supply Framework

[Diagram: Data Sources (Complex data, Event Data, Infrastructure Data), Event Appliance, Signal Appliance, Message Adapter, Message Queue, Staging & Reduction, ET(A)L, ADS, Statistical analysis, Scenarios 1 to 3, Write back, Core Data Warehousing, Core Statistics]

Page 45: Big data, Analytics and 4th Generation Data Warehousing

Core Data Sourcing

• Most business data is highly structured

• Most business Big Data is web related

• There is a growing collection of tools for capturing, transforming and moving both

• The closer to the money that your data is, the higher its potential value

Page 46: Big data, Analytics and 4th Generation Data Warehousing


Page 47: Big data, Analytics and 4th Generation Data Warehousing

4th Generation Data Warehousing

Providing a solid foundation for strategic, tactical and operational decision making

Page 48: Big data, Analytics and 4th Generation Data Warehousing

Enterprise Data Warehousing – 4 GEN

Subject oriented
Integrated
Time variance & time perspectives
Constrained volatility
Classification schema
Rule based transformation
Strategic, tactical & operational support

Page 49: Big data, Analytics and 4th Generation Data Warehousing

4th Generation EDW

Interpretation

Prediction

Diagnosis

Design

Planning

Monitoring

Debugging

Repairing

Instruction

Control

Strategy
Tactics
Operations

Page 50: Big data, Analytics and 4th Generation Data Warehousing

Using, applying and measuring

[Diagram: Big Data, Predictive Analytics, Outcomes, E(A)TL, EDW 4.0]

Page 51: Big data, Analytics and 4th Generation Data Warehousing

Using, applying and measuring

Big Data
Predictive analytics
Select predictions
Define trackable actions
Apply outcomes and actions to EDW 4
Accumulate campaign Big Data
Descriptive analytics
Select findings
Combine with trackable actions
Apply outcomes and actions to EDW 4
Run campaign
Analyse campaign and performance of Big Data analytics

Page 52: Big data, Analytics and 4th Generation Data Warehousing

Forecasts and results – from all perspectives

[Chart: Cambriano Big Data Campaign 2015-2016, monthly from 01/15 to 06/16, vertical axis -400 to 500. Series: Forecast, Actual, Strategy, BD Costs, Benefit.]

Values, Relativity, Dimensions, Hierarchies, Structures, Past, Future

Page 53: Big data, Analytics and 4th Generation Data Warehousing

Using, applying and measuring

• Combining Big Data analytics with Data Warehousing 4.0

• Planning and managing initiatives

• Measuring, analysing and reporting the effectiveness of business initiatives

• Measuring, analysing and reporting the tangible contribution of the Big Data analytics process to the creation of business value

Page 54: Big data, Analytics and 4th Generation Data Warehousing

Big Data and Core Statistics

A multi-faceted data theatre for ad-hoc, speculative and immediate operational analytics

Page 55: Big data, Analytics and 4th Generation Data Warehousing

Data Supply Framework

Page 56: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Value Chains

DATA              INFORMATION              KNOWLEDGE
Requires context  Requires interpretation  Requires wisdom
Relevant          Correct                  Usable
Irrelevant        Incorrect                Useless
Meaningless       Misleading               Wrong
Value?            Value?                   Value?

Page 57: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Assets in MOSCOW

RISK, ASSET, SECURE, BAU, Assurance

MUST     SHOULD  COULD       WON’T
Highest  High    Medium/Low  Very low/None
Yes      Yes     Maybe       Maybe/No
Yes      Yes     Yes         Maybe/No
Yes      Yes     Yes         Maybe/No

Page 58: Big data, Analytics and 4th Generation Data Warehousing


Page 59: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework

External digital data
All digital data
Data logistics
Operational applications
OLTP Applications
Operational Data Store
Data Warehouse
Data Marts
Analytics Data Store
Statistical Analysis
Scenarios
Business Intelligence
‘What if’ analysis
MIS / Reporting
Visualisation
Publication

Legend: primary data flow; secondary data flow

Page 60: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework

[Same diagram, with the labels: Internal digital data, External digital data, All digital data, All information consumers]

Page 61: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework

Internal digital data
External digital data
Statistics
Data Science
Big Data
Small Data
Smart Data
This Data
That Data
This department
That department
The other department
Messing with data
Map Fatten
Map Reduce
Retrospect
Reports
Alerts
Visualisation
Analytics

Legend: primary data flow; secondary data flow

Page 62: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework


Page 63: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework

[Diagram regions: Data Sources, Core Data Warehousing, Core Statistics]

Page 64: Big data, Analytics and 4th Generation Data Warehousing

Page 65: Big data, Analytics and 4th Generation Data Warehousing

Data Sources – This element covers all the current sources, varieties and volumes of data available which may be used to support processes of 'challenge identification', 'option definition', decision making, including statistical analysis and scenario generation.

Page 66: Big data, Analytics and 4th Generation Data Warehousing

Core Data Warehousing – This is a suggested evolution path of the DW 2.0 model. It extends the Inmon paradigm to include not only unstructured and complex data but also the information and outcomes derived from statistical analysis performed outside of the 4th generation Data Warehousing landscape.

Page 67: Big data, Analytics and 4th Generation Data Warehousing

Core Statistics – This element covers the core body of statistical competence, especially but not only with regard to evolving data volumes, data velocity and speed, data quality and data variety.

Page 68: Big data, Analytics and 4th Generation Data Warehousing

INTO THE ZONE!

Page 69: Big data, Analytics and 4th Generation Data Warehousing


Complex Data – This is unstructured data, or data with a highly complex structure, contained in documents and other complex data artefacts, such as multimedia documents.

Page 70: Big data, Analytics and 4th Generation Data Warehousing

Event Data – This is an aspect of Enterprise Process Data, typically at a fine-grained level of abstraction. It includes business process logs, internet web activity logs and other similar sources of event data. The volumes generated by these sources tend to be higher than those of other data, and they are the volumes currently associated with the term Big Data, which covers the masses of information generated by tracking even the most minor piece of 'behavioural data' from, for example, someone casually surfing a web site.

Page 71: Big data, Analytics and 4th Generation Data Warehousing

Infrastructure Data – This aspect includes data that could well be described as signal data: continuous, high-velocity streams of potentially highly volatile data that might be processed through complex event correlation and analysis components.

Page 72: Big data, Analytics and 4th Generation Data Warehousing

Event Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of event data generation as physically possible.
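
As a rough sketch of selection and reduction close to the source, the function below filters raw events and collapses them to counts before anything is forwarded downstream. The event shape (`user`, `type`) and the function name `reduce_events` are assumptions for illustration only; the slide does not prescribe an implementation.

```python
from collections import Counter


def reduce_events(events, keep_types):
    """Keep only selected event types and collapse them to counts per (user, type)."""
    counts = Counter(
        (e["user"], e["type"]) for e in events if e["type"] in keep_types
    )
    return [
        {"user": user, "type": etype, "count": n}
        for (user, etype), n in sorted(counts.items())
    ]


# Hypothetical raw web events as they might arrive at the appliance.
raw = [
    {"user": "a", "type": "click"},
    {"user": "a", "type": "click"},
    {"user": "a", "type": "mouse_move"},
    {"user": "b", "type": "purchase"},
]
reduced = reduce_events(raw, keep_types={"click", "purchase"})
```

Four raw events become two summary rows, so less data has to travel to staging.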

Page 73: Big data, Analytics and 4th Generation Data Warehousing

Signal Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of continuous streaming data generation as physically possible.

Page 74: Big data, Analytics and 4th Generation Data Warehousing

Distributed Inter Process Communication – Different forms of messaging allow high volumes of data to be transmitted in near real time.

Page 75: Big data, Analytics and 4th Generation Data Warehousing

Staging and Reduction – Traditional data staging combined with in-line data reduction.
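
One possible reading of in-line reduction during staging is to drop duplicate rows and unneeded columns as data passes through, rather than in a later step. The sketch below assumes rows arrive as dictionaries; `stage_and_reduce`, `key` and `fields` are hypothetical names.

```python
def stage_and_reduce(rows, key, fields):
    """Yield each row once per `key`, keeping only `fields` (first occurrence wins)."""
    seen = set()
    for row in rows:
        k = row[key]
        if k in seen:
            continue  # duplicate delivery, reduce it away in-line
        seen.add(k)
        yield {f: row[f] for f in fields}


# Hypothetical incoming feed with a duplicate and a column the target does not need.
incoming = [
    {"id": 1, "amount": 10, "debug": "x"},
    {"id": 1, "amount": 10, "debug": "y"},
    {"id": 2, "amount": 25, "debug": "z"},
]
staged = list(stage_and_reduce(incoming, key="id", fields=("id", "amount")))
```

Because the function is a generator, rows are reduced as they stream through, without first landing the full feed.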

Page 76: Big data, Analytics and 4th Generation Data Warehousing

ET(A)L – Extending ETL to include data analytics components tightly integrated into parallel ETL job streams.
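
To make the ET(A)L idea concrete, here is a minimal sketch in which an analytics step sits between transform and load. It runs as a single sequential stream for brevity, whereas the slide speaks of parallel job streams. All function names and the sample data are assumptions.

```python
from statistics import mean


def extract():
    # Hypothetical source rows; amounts arrive as text.
    return [
        {"customer": "c1", "spend": "120"},
        {"customer": "c2", "spend": "80"},
        {"customer": "c3", "spend": "400"},
    ]


def transform(rows):
    # Conventional transform step: cast text to numbers.
    return [{**r, "spend": float(r["spend"])} for r in rows]


def analyse(rows):
    """The (A) step: tag each row as above or below average spend."""
    avg = mean(r["spend"] for r in rows)
    return [{**r, "above_average": r["spend"] > avg} for r in rows]


def load(rows, target):
    target.extend(rows)
    return target


warehouse = load(analyse(transform(extract())), target=[])
```

The loaded rows carry the analytic result with them, so downstream consumers do not have to recompute it.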

Page 77: Big data, Analytics and 4th Generation Data Warehousing

ADS – The Analytics Data Store:
1. Statistics oriented
2. Integrated by focus area
3. Variable volatility
4. Time variant

Page 78: Big data, Analytics and 4th Generation Data Warehousing

Statistical Analysis – Qualitative analysis. Diagnostic analysis, predictive analysis, speculative analysis, data mining, data exploration, modelling.

Page 79: Big data, Analytics and 4th Generation Data Warehousing

Scenarios and outcomes –
1. Snapshots of the outcomes of scenario analysis, the process of analysing possible future events by generating alternative possible outcomes.
2. Captured outcomes of statistical analysis.
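
A toy illustration of scenario analysis, assuming a simple compound-growth model that is not taken from the talk: each named scenario applies a different growth rate to the same starting value, and the resulting outcomes form a snapshot that could be stored. The function name, rates and base value are all invented for the example.

```python
def run_scenarios(base_value, growth_rates, periods):
    """Return one projected outcome per named scenario using compound growth."""
    return {
        name: round(base_value * (1 + rate) ** periods, 2)
        for name, rate in growth_rates.items()
    }


# Three alternative futures for the same starting point.
outcomes = run_scenarios(
    base_value=100.0,
    growth_rates={"Scenario 1": -0.10, "Scenario 2": 0.0, "Scenario 3": 0.10},
    periods=2,
)
```

The `outcomes` dictionary is the kind of snapshot the first point describes: alternative possible results, captured side by side.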

Page 80: Big data, Analytics and 4th Generation Data Warehousing

Write back – The ability to append data, update data and enrich data within the Analytics Data Store, and to provide scenario data to the Core Data Warehousing.
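
The write-back operations named above can be sketched with an in-memory stand-in. This is a hedged illustration only: `ads` and `core_dw_inbox` are hypothetical placeholders for the Analytics Data Store and for whatever hand-off mechanism feeds Core Data Warehousing.

```python
ads = {}            # hypothetical Analytics Data Store: key -> record
core_dw_inbox = []  # hypothetical hand-off area for Core Data Warehousing


def append(key, record):
    """Add a new record; refuse to silently replace an existing one."""
    if key in ads:
        raise KeyError(f"{key} already exists")
    ads[key] = dict(record)


def update(key, **changes):
    """Change existing attributes of a record."""
    ads[key].update(changes)


def enrich(key, **derived):
    """Add derived attributes without overwriting existing ones."""
    for name, value in derived.items():
        ads[key].setdefault(name, value)


def publish_scenario(key):
    """Provide a copy of a scenario record to Core Data Warehousing."""
    core_dw_inbox.append(dict(ads[key]))


append("scenario-1", {"forecast": 100})
update("scenario-1", forecast=110)
enrich("scenario-1", confidence="medium", forecast=999)  # forecast is kept as 110
publish_scenario("scenario-1")
```

Publishing a copy rather than a reference keeps later changes in the Analytics Data Store from altering what the warehouse has already received.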


Page 81: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 – Core Statistics: Analytics Data Store


Page 82: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 – Analytics Data Store

Distributed File System
Non-relational distributed file storage / NoSQL
DFS (including ‘refactoring’ of Unix primitives)
Unix File Store
POSIX compliant
Document DBMS
Graph DBMS
Key-Value DBMS
In-memory Column Oriented Relational DBMS
Relational DBMS (MPP/SMP/Hybrid)
Object DBMS
POSIX compliant Unix / Linux primitives
Relational DBMS

Page 83: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 – Analytics Data Store - Technologies


Page 84: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 – What’s important?

Data Warehouse
Business Intelligence
Operational Data Store
Analytics Data Store
Statistical Analysis
Dark Data
Big Data
Internet of Things
Knowledge Management
Structured Intellectual Capital
Cloud

Page 85: Big data, Analytics and 4th Generation Data Warehousing

Summary

A good place to end, for now

Page 86: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Data Supply Framework


Page 87: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Perspectives

Look back: From now, From then, From before, From the future
Look at now
Look at near +/-
Look forward: From now, From before, From the future
Multiple worlds and universes

Page 88: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Perspectives

What we got right

What we can do better

What we can retry at another time

What we can drop

Page 89: Big data, Analytics and 4th Generation Data Warehousing

DSF 4.0 Perspectives – Look Back

[Timeline: 2009 to 2020, with look-back viewpoints From before, From then, From now and From the future. Labels: Dimensions, Classification, Data.]

Page 90: Big data, Analytics and 4th Generation Data Warehousing

Summary

• Never open up too many data fronts at the same time

• Iterate and take baby steps

• Use agile where it makes sense

• Keep everything as close to the business as possible

• Involve the business – continuously

Page 91: Big data, Analytics and 4th Generation Data Warehousing

Summary

• Consider everything

• Question everything

• Never stop hypothesising

• Never stop testing

• For every initiative have a business imperative

• Make continuous engagement and involvement a goal

Page 92: Big data, Analytics and 4th Generation Data Warehousing

Many thanks

Big Data Spain 2015

Page 93: Big data, Analytics and 4th Generation Data Warehousing

Big Data, Analytics and 4th Generation Data Warehousing

Big Data Spain 2015