23
© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY TDWI Data Warehouse Automation Better, Faster, Cheaper ... You Can Have It All

TDWI Data Warehouse Automation Coursebook 010818download.1105media.com/tdwi/Remote-assets/Onsite/... · The Enterprise Business Intelligence initiative was designed to meet that vision,

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

TDWI Data Warehouse Automation Better, Faster, Cheaper ... You Can Have It All

TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY.

Previews of TDWI course books offer an opportunity

to see the quality of our material and help you to select

the courses that best fit your needs. The previews

cannot be printed.

TDWI strives to provide course books that are content-

rich and that serve as useful reference documents after

a class has ended.

This preview shows selected pages that are

representative of the entire course book; pages are not

consecutive. The page numbers shown at the bottom of

each page indicate their actual position in the course

book. All table-of-contents pages are included to

illustrate all of the topics covered by the course.

TDWI Data Warehouse Automation

ii © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

CO

UR

SE O

BJE

CTI

VES You Will Learn:

• Concepts, principles, and practices of data warehouse automation

• The current state of data warehouse automation technology

• Automation opportunities and benefits when building or managing a data warehouse

• How to get started with data warehouse automation

• Best practices and mistakes to avoid with data warehouse automation

TDWI takes pride in the educational soundness and technical accuracy of all of our courses. Please send us your comments—we’d like to hear from you. Address your feedback to:

[email protected]

Publication Date: October 2017

© Copyright 2017 by TDWI. All rights reserved. No part of this document may be reproduced in any form, or by any means, without written permission from TDWI. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

TDWI Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY iii

TAB

LE O

F C

ON

TEN

TS

Module 1 Data Warehouse Automation Concepts and Principles ... 1-1

Module 2 Building and Managing the Data Warehouse ................... 2-1

Module 3 Using Data Warehouse Automation .................................. 3-1

Module 4 Data Warehouse Automation in Action ............................. 4-1

Module 5 Getting Started with Data Warehouse Automation .......... 5-1

Appx A Bibliography and References ............................................. A-1

TDWI Data Warehouse Automation Data Warehouse Automation Concepts and Principles

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 1-1

Module 1 Data Warehouse Automation Concepts and Principles

Topic Page

Data Warehouse Automation Basics 1-2

Why Data Warehouse Automation? 1-10

The Foundation 1-16

Activities and Deliverables 1-22

The Technology Landscape 1-26

Data Warehouse Automation Concepts and Principles TDWI Data Warehouse Automation

1-10 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Why Data Warehouse Automation? Business Benefits

TDWI Data Warehouse Automation Data Warehouse Automation Concepts and Principles

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 1-11

Why Data Warehouse Automation? Business Benefits

QUALITY AND EFFECTIVENESS

Data warehouse automation delivers quality and effectiveness through the ability to build better solutions. Better solutions are those that best meet real business requirements, and it is especially difficult to get complete and correct requirements when limited to an early phase of a linear development process. With data warehouse automation the business can make changes much later in the development process and change can occur more frequently with less disruption, waste, and rework.

AGILE BUSINESS Ability to change fast and frequently extends beyond the warehouse

development process. Changes that occur in business requirements can be met with quick response. Responding to change in real time and without the delay of lengthy projects is the essence of business agility.

SPEED Speed is the critical factor that enables agility both for agile business and

for agile development. The ability to generate quickly and to regenerate equally fast when change occurs is a fundamental automation capability.

COST SAVINGS Ultimately building better, building faster, and changing quickly when

needed bring substantial cost savings to data warehouse development, operation, maintenance, and evolution.

Data Warehouse Automation Concepts and Principles TDWI Data Warehouse Automation

1-16 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

The Foundation Components of Data Warehousing

metadata

architectural standards data models

source-to-target mapping data transformation rules and logic

database load routines audits, balancing, and controls

event, error, and exception handling

data warehouses, data marts, and exploratory data

data delivery

virtu

aliz

atio

n

cons

olid

atio

n

transform view

source view

business view

transform

extract

load

data sources for data warehouse

TDWI Data Warehouse Automation Data Warehouse Automation Concepts and Principles

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 1-17

The Foundation Components of Data Warehousing

DATA FLOW – FROM DISPARATE DATA TO INTEGRATED INFORMATION

The facing page illustrates the core elements of data warehousing, beginning with disparate data sources at the bottom of the diagram and leading to integrated information resources at the top. The information resources are not necessarily the end of the line – they are simply the end of the data integration processes. Business value is created when they are used for reporting, business intelligence, decision making, analytics, etc. The center of the diagram shows the processing steps to get from data sources to integrated information using two methods – data consolidation through extract, transform, and load (ETL) processing and data virtualization through a series of abstract data views. Consolidation and virtualization are often combined to optimize the flow of data. Data warehouse automation can implement both approaches independently or in a mix-and-match form.

ALL OF THE PARTS Data warehousing encompasses many different techniques and produces

many components to enable the data-to-information flow. Among these are architectural standards, data models, mappings, data transformations, database load procedures, tests and controls, events and errors, and metadata. Data warehouse automation includes capabilities to create, connect, manage, and apply all of these components.

TDWI Data Warehouse Automation Building and Managing the Data Warehouse

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 2-1

Module 2 Building and Managing the Data Warehouse

Topic Page

The Data Warehousing Lifecycle 2-2

Building the Data Warehouse 2-22

Managing the Data Warehouse 2-30

Building and Managing the Data Warehouse TDWI Data Warehouse Automation

2-22 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Building the Data Warehouse Requirements-Driven Development

TDWI Data Warehouse Automation Building and Managing the Data Warehouse

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 2-23

Building the Data Warehouse Requirements-Driven Development

A TRADITIONAL APPROACH

Requirements-driven development is the traditional process that begins with requirements gathering and proceeds in a linear fashion through analysis, modeling and design, specification and coding, and deployment. This approach is a legacy of conventional software development processes that does not work well for data warehousing. It is characterized by: • Intensive and detailed planning • Long development timelines • Limited business participation early in the process • Substantial staffing and skills demand • Difficulty in requirements gathering and analysis • Much cycling back to previous steps to correct errors and oversights • Much waste and rework • Post-deployment discovery of incorrect and missed requirements

Building and Managing the Data Warehouse TDWI Data Warehouse Automation

2-30 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Managing the Data Warehouse Operations

TDWI Data Warehouse Automation Building and Managing the Data Warehouse

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 2-31

Managing the Data Warehouse Operation

AUTOMATION FOR OPERATIONS

Operations comprises a core set of activities in managing a data warehouse, with attention to: • Sequencing • Dependencies • Scheduling • Execution • Verification • Validation • Error Handling Automation aids data warehouse operations with features and functions for: • Scheduling • Documentation and Metadata • Managed Environments • Validation Testing

TDWI Data Warehouse Automation Using Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 3-1

Module 3 Using Data Warehouse Automation

Topic Page

Automation Use Cases 3-2

Case Studies 3-20

Using Data Warehouse Automation TDWI Data Warehouse Automation

3-2 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Automation Use Cases Building a New Data Warehouse

TDWI Data Warehouse Automation Using Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 3-3

Automation Use Cases Building a New Data Warehouse

SCENARIO You need to build a new data warehouse either where none exists or as a complete replacement of an existing and dysfunctional warehouse.

CHALLENGES Without automation, all of the normal data warehousing challenges exist.

Source data is messy, warehouses are hard to build, they take too long to build, and they are obsolete before they are deployed.

OPPORTUNITIES Automation opportunities are abundant in this scenario. You can

automate everything from planning to deployment and operation using any of model-driven, discovery-driven, or data-driven approaches.

Using Data Warehouse Automation TDWI Data Warehouse Automation

3-20 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Case Studies ABInBev

TDWI Data Warehouse Automation Using Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 3-21

Case Studies ABInBev

BEVERAGE INDUSTRY COMPLEXITIES

Founded in London, Ontario, in 1847 and the proud brewer of more than 60 quality beer brands, Labatt is the leading brewery in Canada and a part of Anheuser-Busch InBev, the world’s largest brewer by volume. Since the late 1990s, Labatt has evolved from a regionally federated business into a centralized one, reporting into a global parent company, Interbrew. In 2004, Interbrew merged with AmBev, creating InBev. This merger introduced another layer into what was already a complex organizational picture. Today, Labatt also markets, distributes, and sells InBev’s global brands. This requires Labatt to manage the performance of both global and local brands and report the results to its various stakeholders. Labatt’s data environment is very rich, consisting of typical internal data sources and a large number of external data sources. This adds complexity, particularly in sales and marketing, when comparing regional performance or combining information in consistent ways. Labatt’s vision was to integrate planning and performance management in one environment. The Enterprise Business Intelligence initiative was designed to meet that vision, providing simple, easy access to multidimensional information for business users to manage their business and analysts to gain greater insights through more powerful analytic tools. To support this performance management challenge, Labatt turned to a data warehouse automation solution that can quickly adapt to business change. Source: http://kalido.com/portfolio-item/abinbev/

TDWI Data Warehouse Automation Data Warehouse Automation in Action

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 4-1

Module 4 Data Warehouse Automation in Action

Topic Page

Attunity Compose in Action 4-2

Magnitude Software Kalido in Action 4-4

TimeXtender in Action 4-6

WhereScape in Action 4-8

TDWI Data Warehouse Automation Getting Started with Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 5-1

Module 5

Getting Started with Data Warehouse Automation

Topic Page

Step by Step 5-2

Human Factors 5-4

New Horizons 5-6

Next Steps 5-8

Getting Started with Data Warehouse Automation TDWI Data Warehouse Automation

5-2 © TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY

Step by Step A Process for Automation

TDWI Data Warehouse Automation Getting Started with Data Warehouse Automation

© TDWI. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. DO NOT COPY 5-3

Step by Step A Process for Automation

BUSINESS CASE Build the business case for data warehouse automation based on business benefits, not technical benefits. Focus on value-creating benefits such as speed, agility, solution quality, and cost savings.

SPONSORSHIP AND BUY-IN

Seek a sponsor who can secure the funding, resources, and political will to drive data warehouse automation. Then identify other key stakeholders and work to secure their buy-in.

SHORT LIST Identify the candidate list of vendors and products that fit your needs,

constraints, and culture. When developing the short list, also identify the criteria that you will use to make a final selection.

PROOF OF CONCEPT

Identify one or two proof-of-concept projects and ask each vendor on the short list to illustrate how they will do the work. Before executing the projects establish a baseline for comparison either by: (1) automating something that you’ve already done and have known time and cost to build it manually, or (2) by building something new for which you’ve estimated time and cost to build manually.

TECHNOLOGY SELECTION

Based on proof of concept results and previously defined selection criteria, make your technology decision and install the automation tools.

ORGANIZATIONAL CHANGE

Recognize the need for and undertake the necessary organizational changes. The processes, roles, responsibilities, and team configurations for manual data warehousing are not what you will want for automation.

TRAINING AND IMPLEMENTATION

Train people to use the automation tools, and train them to understand their new roles and responsibilities with data warehouse automation.