The 7 Practices for Highly Effective Data Warehouse Applications Bryan Rockoff, Director of Client Services, Baseline Consulting


Questions & Answers

• Use the Ask a Question window in the lower left-hand corner of your Webinar Console to pose a question to our presenters today:

Bryan Rockoff, Director of Client Services, Baseline Consulting

Jesse Fountain, Sr. Director, POC Operations

www.baseline-consulting.com

Copyright © 2006, Baseline Consulting. All rights reserved.



About the Presenter…

Bryan Rockoff
Director, Baseline Consulting

Bryan Rockoff is a Director with Baseline Consulting, an acknowledged leader in data integration and business analytics services delivery. The focus of Bryan's 25-year career has been bridging technical and business disciplines, focusing on the effective use of data to advance strategic business objectives.

He has worked with leading companies, including Warner Brothers, The Gap, JC Penney, Charles Schwab, and Microsoft on a range of information delivery challenges. Prior to joining Baseline in 2004, Bryan worked at Teradata for 16 years, where he held various management and technical leadership positions.


About Baseline Consulting:

A Management and Technology Consulting Firm specializing in Business Analytics, Data Warehousing, Data Management, and Data Integration

Baseline Consulting helps large and mid-sized businesses enhance the value of enterprise data, improve business results, and achieve self-sufficiency in managing and using data as a corporate asset. We use proven, structured approaches that are driven by business needs, create strong business and IT partnerships, and clarify the many complex data challenges companies face so they can take action.

Baseline’s only business is mastering data.

Data Mastered. Value Unleashed. ℠


Why Focus on 7 Habits?

• Change the game, shift the paradigm: a change in perception and interpretation of how the world works
• It encouraged people to rethink what they were doing and focus on the method rather than the results
• Seven allows people to focus on specific, achievable topics that will cause improvement


The “Habits”

    Bryan’s 7 Habits for DW                                           Stephen Covey’s 7 Habits
1.  Treat data as a corporate asset                                   Be Proactive.
2.  Data management is a cornerstone of a successful data warehouse   Begin with the End in Mind.
3.  Pick the right data modeling practice                             Put First Things First.
4.  Data Warehousing is for the business, not IT                      Think Win/Win.
5.  Requirements are the key to success                               Seek First to Understand, Then to be Understood.
6.  Implement BI “fresh”, don’t replicate the past                    Synergize.
7.  The right architecture for the right reasons                      Sharpen the saw.


Habit 1 - Treat Data as a Corporate Asset: Treat Data Like You Treat Other Assets

• Manage it via the same logical controls you manage your physical assets
  – All companies have security devices (cameras, badges, checks and balances) to track physical assets. Do the same for your data.
• Track who comes and goes
  – Require credentials from the users of your data, like you require people to have ID badges and visitor passes
• Background check the data
  – Contact source system data steward
    – Determine if data is being used by other systems (contact them)
    – Discuss change control and data quality activities
  – Interview the current data sponsors and source system staff to identify change/maintenance issues
• Get the data right - determine a data correction strategy
  – Data error detection is the responsibility of everyone, correction belongs to the system of record
  – Work with sponsor to determine ongoing resolution needs
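A minimal sketch of two of the controls above: requiring credentials and logging every read ("track who comes and goes"), and routing a detected error to the system of record for correction. All names here (`AccessLog`, `DataIssue`, `read_dataset`, `route_issue`) are hypothetical and chosen only for illustration; a real warehouse would rely on its DBMS's own security and auditing features.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessLog:
    """Records who read which data set, like a visitor log at a front desk."""
    entries: list = field(default_factory=list)

    def record(self, user: str, dataset: str) -> None:
        self.entries.append((datetime.now(timezone.utc), user, dataset))


@dataclass
class DataIssue:
    dataset: str
    description: str
    reported_by: str        # anyone may detect and report an error
    system_of_record: str   # only the owning system applies the correction


def read_dataset(user, credentials, dataset, log):
    """Refuse access without credentials; log every successful read."""
    if user not in credentials:
        raise PermissionError(f"{user} has no credentials for {dataset}")
    log.record(user, dataset)
    return f"rows of {dataset}"


def route_issue(issue, queues):
    """Queue the issue for the owning source system instead of patching the warehouse copy."""
    queues.setdefault(issue.system_of_record, []).append(issue)
```

The point of `route_issue` is the design choice on the slide: the warehouse team reports the error but does not fix it locally, so the correction flows from the source system to every consumer.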


Habit 2 - Data Management: A Data Management Maturity Model

Columns (Data Management Maturity): Reactive, Re-using, Organized, Centralized
Rows (Data Deployment Scope): Enterprise, Cross-Functional, Departmental, Individual
Axis labels: Efficiencies, Effectiveness

Enterprise
• Reactive: Data accessed via firefighting mode
• Re-using: Subject or functional data marts
• Organized: Proactive data requirements/development
• Centralized: MDM as a business service

Cross-Functional
• Reactive: Reliance on personal relationships for new data
• Re-using: Spreadsheets available on a shared drive
• Organized: Dedicated data development team
• Centralized: Formalized policies via data governance

Departmental
• Reactive: Data is consistently recreated or duplicated
• Re-using: Spreadsheets shared via e-mail
• Organized: Access to data experts for development
• Centralized: Data quality processes and solutions adopted

Individual
• Reactive: Individuals forced to gather their own data
• Re-using: Personal spreadsheets become de-facto data sources
• Organized: Access to data experts for assistance
• Centralized: Quick access, nimble decisions


Habit 2 - Data Management: Define Data Management Roles

Because data is a corporate asset, various data support roles are distributed across technology and business organizations. These roles are particularly important when data access crosses organizational boundaries.

Key Roles
• Source Data Steward – Supports iSchwab by recommending data sources and defining the value and meaning of source system content
• Business Data Steward – A subject matter expert who works with business users on terminology, definitions and usage support
• Data Sponsor – Business staff member who owns data definition, and participates in access and correction policy enforcement
• Data Modeler – Works with business data stewards to ensure accurate modeling, design, and implementation of their data
• Metadata Manager – Develops and maintains end-user and developer metadata

[Diagram labels: Service Domain (x3), MIS (x3), User (x6), ISchwab]


Habit 3 – Data Modeling: The Benefits of Data Modeling

• A process for identifying business rules and data relationships
  – Provides a method for business users to discuss details without being burdened by technical details
  – Focuses on gathering data requirements
• Requires ongoing maintenance because the business changes
  – As companies change and adjust business models, the data’s representation of the business will change
  – Requires minimal ongoing resources
• Separate from the physical database design activity
  – Simplifies data education for new users and developers
  – Provides basis for metadata content development


Habit 3 – Data Modeling: Maintain a Logical Data Model

The logical data model is the blueprint for delivering an integrated view of enterprise data. Although frequently confused with a physical database design, it is separate and unique. The logical data model reflects “the way the company looks” whereas the physical design supports data access and performance.

Benefits
• Represents data requirements in a stable and flexible format
  – The design reflects an enterprise perspective
  – Not affected by an individual application
• Provides a roadmap for data integration
  – Defines subject area data relationships
• Provides a mechanism for new team members to learn data

Logical versus Physical Data Models

                 LDM                        PDM
Consists of      Entities and Attributes    Tables and Columns
Rows Identified  Primary Key                Primary Index
Names Used       Business Names             Names may be limited by DBMS requirements
Normalized       Must be normalized to 3NF  Structured for access and performance considerations
Redundancy       No redundant data          May include redundancy for performance considerations
Derived data     No derived data            May include aggregation for performance considerations
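The logical/physical distinction can be illustrated with a small sketch using Python's built-in `sqlite3` module. A logical model is not itself a schema, so the first two tables only mirror what an LDM would describe (3NF, business names, no redundant or derived data). The third table shows the kind of physical structure a PDM might add: abbreviated names, a redundant customer name, and a stored aggregate. All table and column names are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Logical-style structures: 3NF, business names, no redundancy, no derived data
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT
);
CREATE TABLE sales_order (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER REFERENCES customer(customer_id),
    order_amount REAL
);
INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO sales_order VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

-- Physical-style structure: abbreviated names, redundant customer name,
-- and a derived aggregate stored for query performance
CREATE TABLE cust_ord_sum AS
SELECT c.customer_id       AS cust_id,
       c.customer_name     AS cust_nm,
       COUNT(*)            AS ord_cnt,
       SUM(o.order_amount) AS ord_amt_tot
FROM customer c
JOIN sales_order o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name;
""")

# Reports read the precomputed summary instead of re-joining the base tables.
summary = conn.execute(
    "SELECT cust_id, cust_nm, ord_cnt, ord_amt_tot FROM cust_ord_sum ORDER BY cust_id"
).fetchall()
print(summary)
```

The summary table trades the LDM's "no redundancy, no derived data" rules for faster reads, which is the trade-off the comparison describes.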


Habit 4 - DW is for the Business: The 4 Tiers of Business Analysis

• Watch as the BI environment matures:
  – Initial BI deployment focuses on standard report and metric reporting. The focus is metric-based business action.
  – The next evolution of analysis enables the business user to ask custom questions. This is frequently exception-based or drill-down analysis.
  – The third level provides business users a means of restructuring data to support unique business problems.
  – The final level illustrates a system that delivers new business insight to the user (data mining).
• Nurture all knowledge users in your user community
  – People will gravitate to their natural level of involvement
  – Train, deploy, listen, repeat


Habit 4 - DW is for the Business: Provide Data Usage Support

While companies spend extensively on BI tool infrastructures, they frequently underinvest in data usage support. One key to BI success is ensuring that users can navigate, identify, and query data to support their business question. The challenge is converting a business question into a query and identifying the data.

Details
• Metadata: A key tool for helping users understand the meaning of the data (e.g. what’s the formula for profit)
• Query Support Desk: Staff to assist new users with data and usage support (e.g. “the tool can only show you last month’s data, not this month’s”)
• User Audience/Group meetings: Allows larger organizations to better support individual users
• Training: Workshop activities focused on data analysis and interpretation (instead of tool functionality)


Habit 5 - Requirements: Divide Requirements into 4 Activities

Requirements gathering covers a broad array of details associated with a project. It includes 4 basic sets of information: Scoping, Business Requirements, Data Requirements, and Functional Requirements.

Different Requirements
• Scoping: Preliminary review of business needs and audience
• Business Requirements: Detailed identification of business actions, information needs, delivery metrics, and timeframes
• Data Requirements: Information elements, definitions, and values. A conceptual model if possible
• Functional Requirements: The details developed in concert with IT to establish a specific deliverable description


Habit 6 - Implement BI “fresh”, don’t replicate the past

• Most companies have implemented at least one data analytic environment
  – Marts, warehouses, servers under desktops, etc.
  – Excel is classified as the #1 BI tool based on the number of installs
  – Purpose-built BI environments are becoming mainstream: appliances, integrated tools, etc.
  – IT in general has been de-mystified
• BI failures need to be analyzed just like manufacturing systems
  – Root cause analysis
  – Objective fault analysis, MTBF
  – ISO 9000, TQM, Six Sigma, BPR, etc. (Quality, Quality, Quality)
• Data Quality is definable and absolute
  – Build an environment of zero tolerance data quality
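One way to read "definable and absolute" is that quality rules are written down explicitly and a load is rejected if any row breaks any rule. The sketch below illustrates that reading only; the rule names, the row layout (`customer_id`, `amount`), and the functions `violations` and `load_batch` are hypothetical, not taken from the presentation.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# "Definable": each rule is a named, explicit predicate over a row.
RULES: dict[str, Rule] = {
    "customer_id present": lambda r: r.get("customer_id") is not None,
    "amount non-negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}


def violations(rows, rules=RULES):
    """Return (row index, rule name) for every rule a row fails."""
    found = []
    for i, row in enumerate(rows):
        for name, rule in rules.items():
            if not rule(row):
                found.append((i, name))
    return found


def load_batch(rows, target):
    """"Zero tolerance": a single violation rejects the whole batch."""
    problems = violations(rows)
    if problems:
        raise ValueError(f"batch rejected: {problems}")
    target.extend(rows)
    return len(rows)
```

The list returned by `violations` doubles as input for root cause analysis, since it names the failing rule rather than just counting bad rows.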


Habit 7 - The right architecture for the right reasons

• Start with a logical design
  – Processing requirements
  – System-to-system data migration
  – General Information Storage Location
• Implement a physical architecture
  – Establish a repository for detailed data, as well as application structured data
  – Address historical and low latency staging
  – Support bidirectional data migration (data sourcing from legacy systems and applications)
• Often a hybrid approach is the best
  – It’s not a war with winners and losers.
• Consider emerging strategies
  – Data Warehouse 2.0 from Bill Inmon


Sample Architecture: Centralized

[Diagram: centralized architecture. Labels recoverable from the slide:]
• Operational Systems: Call Ctr Sales, Billing, Online Sales, Distribution, Contracts, HR, Develop, Web
• ETL (transformation)
  – Bulk ETL (High Latency, Highly Integrated & Cleansed)
  – Trickle ETL (High Speed, Native Content)
• Data Warehouse
• Reporting Environment (Analytic): Marts, Ad-hoc, Modeling, Reporting, OLAP
• CRM Applications (Operational): CRM DBMS, Campaign Mgmt, SFA, Customer Care
• Operational Reporting
• Bi-directional “trickle” feed
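The two feed styles named in the diagram can be contrasted in a few lines. This is a sketch under assumptions: the event layout (`order_id`, `customer`) and the specific cleansing steps (de-duplication, name normalization) are invented for illustration, and the warehouse is just a Python list.

```python
def trickle_feed(event, warehouse):
    """Trickle ETL: forward one operational event right away, content unchanged."""
    warehouse.append(dict(event))


def bulk_load(events, warehouse):
    """Bulk ETL: cleanse and integrate a whole batch before loading it."""
    seen = set()
    loaded = 0
    for event in events:
        key = event["order_id"]
        if key in seen:          # drop duplicates within the batch
            continue
        seen.add(key)
        cleaned = {**event, "customer": event["customer"].strip().upper()}
        warehouse.append(cleaned)
        loaded += 1
    return loaded
```

The trickle path keeps latency low by doing no work on the data ("native content"); the bulk path accepts higher latency in exchange for integrated, cleansed rows.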


Sample Architecture: Federated

[Diagram: federated architecture. Labels recoverable from the slide:]
• Operational Systems: CRM, Billing, Online Sales, Distribution, Contracts, HR, Develop, Web
• ETL (transformation): Bulk ETL (High Latency, Highly Integrated & Cleansed)
• Analytic Environment: CRM Mart; three Marts, each labeled OLAP, Ad hoc, Modeling
• Campaign Mgmt (Ad hoc, Modeling), SFA, Cust Care
• EII Layer
• Operational Reporting


In Conclusion…

• The “7 Habits for BI Success” are guidelines that should focus you as you build your new BI environment.
• Focus on incremental improvement. “Big Bang” efforts are often just that: explosions.
• Don’t be content with the status quo.
• Consider remodeling your BI environment. Always strive to do things better.
• Focus on the business value delivered and keep the end users in mind.


Master business analytics, data integration, data management, and data warehousing with Baseline Consulting.

Baseline Consulting Group
15030 Ventura Blvd. – Ste: 19-707
Sherman Oaks, CA 91403
Ph: 800-747-3709 or 818-906-7638
Fax: 818-907-6012
www.baseline-consulting.com

Thank You!

Jesse M. Fountain
Sr. Director – POC Operations

© 2006 DATAllegro, Inc. All rights reserved. DATAllegro Proprietary and Confidential

The Battle of the Bulge – Setting the stage for Technology Changes

Compelling Issues:
• Organizations expect their Enterprise Data Warehouse to double in size over the next one to three years
• Business Intelligence demands are likely to double over the next year
• The top impediment for Business Intelligence is performance for ad-hoc reporting and data mining over a growing data set
• TCO for conventional technology and infrastructure far exceeds expectations.
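For capacity planning, "double in one to three years" can be turned into an annual growth rate. Assuming steady compound growth (an assumption; the slide gives only the doubling window), the rate r satisfies (1 + r)^n = 2 for a doubling time of n years. The function name below is hypothetical.

```python
def annual_growth_rate(years_to_double: float) -> float:
    """Compound annual growth rate implied by a given doubling time."""
    return 2 ** (1 / years_to_double) - 1


for years in (1, 2, 3):
    print(f"doubling in {years} year(s): {annual_growth_rate(years):.1%} per year")
# doubling in 1 year(s): 100.0% per year
# doubling in 2 year(s): 41.4% per year
# doubling in 3 year(s): 26.0% per year
```

So the stated range corresponds to roughly 26% to 100% growth per year, which is a wide spread to size infrastructure against.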


What is a DATAllegro Data Warehouse Appliance?

DATAllegro Appliances are:
• Complete DW infrastructure - “From SQL to storage”
• One vendor bundle – DBMS, hardware, software, OS
• Improved price-performance over traditional data warehouse vendors
• Modular Commodity rack-based appliance
• Standards and Open Source emphasis

Data Warehouse Appliance – a hardware / software / OS / DBMS bundle designed to perform traditional and complex analysis functions using commodity components at a price / performance advantage over traditional approaches

William McKnight, McKnight Associates, Inc.


Node Anatomy: Commodity Hardware & RDBMS

Each node is a DBMS Engine
• High Sustainable Read Write Speeds – 800 MB/s
• High Scan Speeds – 1 TB / minute

Commodity hardware ensures reliable performance

Nodes can be added to meet expansion demands


MPP Architecture for RAIDW

[Diagram labels: ODBC/JDBC or Bulk Load; Master]

• Master rewrites query into steps that run efficiently on each database engine, with minimal/no tuning or indexes on each node and 98% sequential I/O
• Tables can be partitioned and/or replicated to spread the workload across the appliance
  – Using multi-level partitioning strategies
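The partition/replicate idea can be sketched generically. This is not DATAllegro's implementation, only an illustration of the general MPP pattern: a large fact table is hash-partitioned across nodes, a small dimension table is replicated to every node so joins stay local, and a master merges per-node partial results. The node count, row layout, and function names are all assumptions.

```python
import zlib
from collections import defaultdict

NODE_COUNT = 4  # assumed appliance size for the example


def node_for(key: str, node_count: int = NODE_COUNT) -> int:
    """Deterministic hash partitioning: the same key always lands on the same node."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(fact_rows, dimension, node_count=NODE_COUNT):
    """Partition fact rows by store; replicate the small dimension to every node."""
    nodes = [{"facts": [], "dim": dict(dimension)} for _ in range(node_count)]
    for row in fact_rows:
        nodes[node_for(row["store"], node_count)]["facts"].append(row)
    return nodes


def node_step(node):
    """Per-node step: local join to the replicated dimension, then partial aggregation."""
    totals = defaultdict(float)
    for row in node["facts"]:
        region = node["dim"][row["store"]]
        totals[region] += row["amount"]
    return totals


def master_query(nodes):
    """Master step: run the per-node step everywhere and merge the partial totals."""
    merged = defaultdict(float)
    for node in nodes:
        for region, amount in node_step(node).items():
            merged[region] += amount
    return dict(merged)
```

Because the dimension is replicated, no node needs data from another node to answer its part of the query, and each node can scan its own partition sequentially.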


About Us

• Founded in 2003
• HQ in Orange County
• Well Capitalized
  – $45m in total funding
  – $22.5m Series C
  – > $30m cash
• 95+ Staff
• Large Customers
  – Including NPD, Sears & TEOCO
• Growth > 100%

High performance while lowering the price per terabyte

One of the best startups our investors have ever seen


DATAllegro-At-A-Glance

World Wide Support through IBM Global Services

Investors

Platform & BI Partners

Customers: Retail, Financial Services, Telecom, Manufacturing


Architected for Change

Scale up as Data Grows
• Flexible Architecture provides on-going performance as data grows
• Upgrades can be measured in Nodes
• High Speed Backups
• High Speed Loads/Access are enhanced by MPP architecture

Scale up as User Concurrency Grows
• User Concurrency is scalable through workload prioritization
• Support for mixed workload
• Lower maintenance

Changes in Regulations
• Encryption for data at rest

Changes in Technology
• “Future Proofed”: enabled to take advantage of technological advancements in CPU and Storage technology

Thank you for joining us

Questions?