The 7 Practices for Highly Effective Data Warehouse Applications
Bryan Rockoff, Director of Client Services, Baseline Consulting
Questions & Answers
• Use the Ask a Question window in the lower left-hand corner of your Webinar Console to pose a question to our presenters today:
Bryan Rockoff, Director of Client Services, Baseline Consulting [email protected]
Jesse Fountain, Sr. Director, POC Operations, [email protected]
www.baseline-consulting.com
Copyright © 2006, Baseline Consulting. All rights reserved.
7 Practices for Highly Effective Data Warehouse Applications
Bryan Rockoff, Director, Client Services, Baseline Consulting
Data Mastered. Value Unleashed. ℠
About the Presenter…
Bryan Rockoff, Director, Baseline Consulting
Bryan Rockoff is a Director with Baseline Consulting, an acknowledged leader in data integration and business analytics services delivery. The focus of Bryan's 25-year career has been bridging technical and business disciplines, focusing on the effective use of data to advance strategic business objectives.
He has worked with leading companies, including Warner Brothers, The Gap, JC Penney, Charles Schwab, and Microsoft, on a range of information delivery challenges. Prior to joining Baseline in 2004, Bryan worked at Teradata for 16 years, where he held various management and technical leadership positions.
About Baseline Consulting:
A Management and Technology Consulting Firm specializing in Business Analytics, Data Warehousing, Data Management, and Data Integration
Baseline Consulting helps large and mid-sized businesses enhance the value of enterprise data, improve business results, and achieve self-sufficiency in managing and using data as a corporate asset. We use proven, structured approaches that are driven by business needs, create strong business and IT partnerships, and clarify the many complex data challenges companies face so they can take action.
Baseline’s only business is mastering data.
Data Mastered. Value Unleashed. ℠
Why Focus on 7 Habits?
• Change the game, shift the paradigm: a change in perception and interpretation of how the world works
• The 7 Habits encouraged people to rethink what they were doing and to focus on the method rather than the results
• Seven allows people to focus on specific, achievable topics that will cause improvement
The “Habits”
Bryan’s 7 Habits for DW | Stephen Covey’s 7 Habits
1. Treat data as a corporate asset | 1. Be Proactive.
2. Data management is a cornerstone of a successful data warehouse | 2. Begin with the End in Mind.
3. Pick the right data modeling practice | 3. Put First Things First.
4. Data Warehousing is for the business, not IT | 4. Think Win/Win.
5. Requirements are the key to success | 5. Seek First to Understand, Then to be Understood.
6. Implement BI “fresh”, don’t replicate the past | 6. Synergize.
7. The right architecture for the right reasons | 7. Sharpen the saw.
Habit 1 - Treat Data as a Corporate Asset: Treat Data Like You Treat Other Assets
• Manage it via the same logical controls you use to manage your physical assets
  – All companies have security devices (cameras, badges, checks and balances) to track physical assets. Do the same for your data
• Track who comes and goes
  – Require credentials from the users of your data, like you require people to have ID badges and visitor passes
• Background check the data
  – Contact the source system data steward
  – Determine if data is being used by other systems (contact them)
  – Discuss change control and data quality activities
  – Interview the current data sponsors and source system staff to identify change/maintenance issues
• Get the data right: determine a data correction strategy
  – Data error detection is the responsibility of everyone; correction belongs to the system of record
  – Work with the sponsor to determine ongoing resolution needs
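One way to picture the correction strategy is a small sketch in which anyone may report a data issue, but each issue is queued for the system of record that owns the data rather than being patched downstream. This is only an illustration under stated assumptions: the DataIssue fields, the route_issues helper, and the system names ("billing", "crm") are hypothetical and do not come from the presentation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataIssue:
    source_system: str   # system of record that owns the field (hypothetical names)
    record_id: str
    field: str
    problem: str
    reported_by: str     # anyone can report: an analyst, an ETL job, a report user


def route_issues(issues):
    """Group reported issues by the system of record that must correct them."""
    queues = defaultdict(list)
    for issue in issues:
        queues[issue.source_system].append(issue)
    return dict(queues)


issues = [
    DataIssue("billing", "INV-1001", "amount", "negative amount", "etl_job"),
    DataIssue("crm", "CUST-42", "zip_code", "missing value", "analyst"),
    DataIssue("billing", "INV-1002", "currency", "unknown code", "report_user"),
]

queues = route_issues(issues)
for system, items in sorted(queues.items()):
    print(system, [i.record_id for i in items])
```

The warehouse never edits the value itself in this sketch; it only records who found the problem and hands it to the owning system, which matches the split between detection and correction described on the slide.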
Habit 2 - Data Management: A Data Management Maturity Model
Data Deployment Scope (rows) by Data Management Maturity (columns, moving from Efficiencies toward Effectiveness): Reactive, Re-using, Organized, Centralized

Enterprise
  – Reactive: Data accessed via firefighting mode
  – Re-using: Subject or functional data marts
  – Organized: Proactive data requirements/development
  – Centralized: MDM as a business service
Cross-Functional
  – Reactive: Reliance on personal relationships for new data
  – Re-using: Spreadsheets available on a shared drive
  – Organized: Dedicated data development team
  – Centralized: Formalized policies via data governance
Departmental
  – Reactive: Data is consistently recreated or duplicated
  – Re-using: Spreadsheets shared via ...
  – Organized: Access to data experts for development
  – Centralized: Data quality processes and solutions adopted
Individual
  – Reactive: Individuals forced to gather their own data
  – Re-using: Personal spreadsheets become de-facto data sources
  – Organized: Access to data experts for assistance
  – Centralized: Quick access, nimble decisions
Habit 2 - Data Management: Define Data Management Roles
Because data is a corporate asset, various data support roles are distributed across technology and business organizations. These roles are particularly important when data access crosses organizational boundaries.
Key Roles
• Source Data Steward – Supports iSchwab by recommending data sources and defining the value and meaning of source system content
• Business Data Steward – A subject matter expert who works with business users on terminology, definitions and usage support
• Data Sponsor – Business staff member who owns data definition, and participates in access and correction policy enforcement
• Data Modeler – Works with business data stewards to ensure accurate modeling, design, and implementation of their data
• Metadata Manager – Develops and maintains end-user and developer metadata
[Figure: Key Roles (labels: iSchwab, Service Domain, MIS, User)]
Habit 3 – Data Modeling: The Benefits of Data Modeling
• A process for identifying business rules and data relationships
  – Provides a method for business users to discuss details without being burdened by technical details
  – Focuses on gathering data requirements
• Requires ongoing maintenance because the business changes
  – As companies change and adjust business models, the data’s representation of the business will change
  – Requires minimal ongoing resources
• Separate from the physical database design activity
  – Simplifies data education for new users and developers
  – Provides basis for metadata content development
Habit 3 – Data Modeling: Maintain a Logical Data Model
The logical data model is the blueprint for delivering an integrated view of enterprise data. Although frequently confused with a physical database design, it is separate and unique. The logical data model reflects “the way the company looks” whereas the physical design supports data access and performance.
Benefits
• Represents data requirements in a stable and flexible format
• The design reflects an enterprise perspective
  – Not affected by an individual application
• Provides a roadmap for data integration
• Defines subject area data relationships
• Provides a mechanism for new team members to learn data
Logical versus Physical Data Models
Characteristic | LDM | PDM
Consists of | Entities and Attributes | Tables and Columns
Rows Identified | Primary Key | Primary Index
Names Used | Business Names | Names may be limited by DBMS requirements
Normalized | Must be normalized to 3NF | Structured for access and performance considerations
Redundancy | No redundant data | May include redundancy for performance considerations
Derived data | No derived data | May include aggregation for performance considerations
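The logical versus physical distinction can be sketched in a few lines. The example below is illustrative only: the Customer and Order entities, the build_physical_rows helper, and the abbreviated column names are assumptions chosen for the sketch, not part of the presentation. The logical side keeps normalized entities with business names and no derived data; the physical side produces wide rows that repeat the customer name and carry a precomputed total.

```python
from dataclasses import dataclass


# Logical model: entities and attributes, business names, no derived data.
@dataclass(frozen=True)
class Customer:
    customer_id: int       # primary key
    customer_name: str


@dataclass(frozen=True)
class Order:
    order_id: int          # primary key
    customer_id: int       # relationship to Customer
    order_amount: float


def build_physical_rows(customers, orders):
    """Physical design: one wide row per order, shaped for access speed.

    The customer name is stored redundantly and a per-customer total is
    precomputed; the logical model allows neither.
    """
    names = {c.customer_id: c.customer_name for c in customers}
    totals = {}
    for o in orders:
        totals[o.customer_id] = totals.get(o.customer_id, 0.0) + o.order_amount
    return [
        {
            "ord_id": o.order_id,                    # DBMS-style short names
            "cust_id": o.customer_id,
            "cust_nm": names[o.customer_id],         # redundant copy
            "ord_amt": o.order_amount,
            "cust_tot_amt": totals[o.customer_id],   # derived aggregate
        }
        for o in orders
    ]


customers = [Customer(1, "Acme"), Customer(2, "Globex")]
orders = [Order(10, 1, 100.0), Order(11, 1, 50.0), Order(12, 2, 75.0)]
rows = build_physical_rows(customers, orders)
```

Because the physical rows are generated from the logical entities, the logical model can stay stable while the physical layout is reshaped for performance.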
Habit 4 - DW is for the Business: The 4 Tiers of Business Analysis
Watch as the BI environment matures:
• Initial BI deployment focuses on standard report and metric reporting. The focus is metric-based business action
• The next evolution of analysis enables the business user to ask custom questions. This is frequently exception based or drill down analysis
• The third level provides business users a means of restructuring data to support unique business problems
• The final level illustrates a system that delivers new business insight to the user (data mining)
Nurture all knowledge users in your user community
• People will gravitate to their natural level of involvement
• Train, deploy, listen, repeat
Habit 4 - DW is for the Business: Provide Data Usage Support
While companies spend extensively on BI tool infrastructures, they frequently underinvest in data usage support. One key to BI success is ensuring that users can navigate, identify, and query data to support their business question. The challenge is converting a business question into a query and identifying the data.
Details
• Metadata: A key tool for helping users understand the meaning of the data (e.g. what’s the formula for profit)
• Query Support Desk: Staff to assist new users with data and usage support (e.g. “the tool can only show you last month’s data, not this month’s”)
• User Audience/Group meetings: Allows larger organizations to better support individual users
• Training: Workshop activities focused on data analysis and interpretation (instead of tool functionality)
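As a rough illustration of the metadata point, a business glossary can be as simple as a lookup from a term to its definition and formula. Everything named here is an assumption for the sketch: the MetricDefinition type, the CATALOG contents, the describe helper, and the "revenue - cost" formula for profit are hypothetical; real definitions come from the business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    description: str
    formula: str       # human-readable formula shown to users
    steward: str       # who to ask about the definition


# Hypothetical catalog entry; a real one is agreed with the business.
CATALOG = {
    "profit": MetricDefinition(
        name="profit",
        description="Revenue remaining after costs for the period.",
        formula="revenue - cost",
        steward="finance data steward",
    ),
}


def describe(term):
    """Return a one-line explanation of a business term, or a pointer to help."""
    entry = CATALOG.get(term.lower())
    if entry is None:
        return f"No definition for '{term}'; contact the query support desk."
    return f"{entry.name} = {entry.formula} ({entry.description})"


print(describe("Profit"))
print(describe("churn"))
```

The fallback message ties the glossary to the query support desk, so a missing definition becomes a support request instead of a guess.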
Habit 5 - Requirements: Divide Requirements into 4 Activities
Requirements gathering covers a broad array of details associated with a project. It includes 4 basic sets of information: Scoping, Business Requirements, Data Requirements, and Functional Requirements.
Different Requirements
• Scoping: Preliminary review of business needs and audience
• Business Requirements: Detailed identification of business actions, information needs, delivery metrics, and timeframes
• Data Requirements: Information elements, definitions, and values. A conceptual model if possible
• Functional Requirements: The details developed in concert with IT to establish a specific deliverable description
Habit 6 - Implement BI “fresh”, don’t replicate the past
• Most companies have implemented at least one data analytic environment
  – Marts, warehouses, servers under desktops, etc.
  – Excel is classified as the #1 BI tool based on the number of installs
  – Purpose BI environments are becoming mainstream – appliances, integrated tools, etc.
  – IT in general has been de-mystified
• BI failures need to be analyzed just like manufacturing systems
  – Root cause analysis
  – Objective fault analysis, MTBF
  – ISO 9000, TQM, Six Sigma, BPR, etc. (Quality, Quality, Quality)
• Data Quality is definable and absolute
  – Build an environment of zero tolerance data quality
Habit 7 - The right architecture for the right reasons
• Start with a logical design
  – Processing requirements
  – System-to-system data migration
  – General Information Storage Location
• Implement a physical architecture
  – Establish a repository for detailed data – as well as application structured data
  – Address historical and low latency staging
  – Support bidirectional data migration (data sourcing from legacy systems and applications)
• Often a hybrid approach is the best
  – It’s not a war with winners and losers.
• Consider emerging strategies
  – Data Warehouse 2.0 from Bill Inmon
Sample Architecture: Centralized
[Figure: centralized architecture diagram. Labeled components:
• Operational Systems: Call Ctr, Sales, Billing, Online Sales, Distribution, Contracts, HR, Develop, Web
• ETL (transformation): Bulk ETL (High Latency, Highly Integrated & Cleansed); Trickle ETL (High Speed, Native Content); bi-directional “trickle” feed
• Data Warehouse
• Reporting Environment (Analytic): Marts, Ad-hoc, Modeling, Reporting, OLAP
• CRM Applications (Operational): CRM DBMS, Campaign Mgmt, SFA, Customer Care
• Operational Reporting]
Sample Architecture: Federated
[Figure: federated architecture diagram. Labeled components:
• Operational Systems: CRM, Billing, Online Sales, Distribution, Contracts, HR, Develop, Web
• ETL (transformation): Bulk ETL (High Latency, Highly Integrated & Cleansed)
• Analytic Environment: CRM Mart, SFA, Cust Care, Campaign Mgmt, three Marts, OLAP, Ad hoc, Modeling, EII Layer
• Operational Reporting]
In Conclusion…
• The “7 Habits for BI Success” are guidelines that should focus you as you build your new BI environment.
• Focus on incremental improvement. “Big Bang” efforts are often just that: explosions.
• Don’t be content with the status quo.
• Consider remodeling your BI environment.
• Always strive to do things better.
• Focus on the business value delivered and keep the end users in mind.
Master business analytics, data integration, data management, and data warehousing with Baseline Consulting.
Baseline Consulting Group
15030 Ventura Blvd. – Ste: 19-707
Sherman Oaks, CA 91403
Ph: 800-747-3709 or 818-906-7638
Fax: 818-907-6012
www.baseline-consulting.com
Thank You!
© 2006 DATAllegro, Inc. All rights reserved. DATAllegro Proprietary and Confidential
The Battle of the Bulge – Setting the stage for Technology Changes
Compelling Issues:
• Organizations expect their Enterprise Data Warehouse to double in size over the next one to three years
• Business Intelligence demands are likely to double over the next year
• The top impediment for Business Intelligence is performance for ad-hoc reporting and data mining over a growing data set
• TCO for conventional technology and infrastructure far exceeds expectations.
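To put the doubling claim into planning terms, the arithmetic below converts a doubling period into a compound annual growth rate and a projected size. It is a sketch only: the function names and the 10 TB starting size are assumptions for illustration, and steady compound growth is itself an assumption the slide does not state.

```python
def annual_growth_rate(years_to_double):
    """Compound annual growth rate that doubles a quantity in the given years."""
    return 2 ** (1 / years_to_double) - 1


def projected_size(current_tb, years_to_double, years_ahead):
    """Projected size in TB after years_ahead at that compound rate."""
    return current_tb * 2 ** (years_ahead / years_to_double)


for years in (1, 2, 3):
    rate = annual_growth_rate(years)
    print(f"double in {years} yr -> {rate:.0%} per year")

# Hypothetical 10 TB warehouse that doubles every year, three years out.
print(projected_size(10, 1, 3), "TB")
```

Doubling in one, two, or three years corresponds to roughly 100%, 41%, and 26% growth per year, so the range quoted on the slide spans very different capacity plans.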
What is a DATAllegro Data Warehouse Appliance?
DATAllegro Appliances are:
• Complete DW infrastructure - “From SQL to storage”
• One vendor bundle – DBMS, hardware, software, OS
• Improved price-performance over traditional data warehouse vendors
• Modular commodity rack-based appliance
• Standards and Open Source emphasis
Data Warehouse Appliance – a hardware / software / OS / DBMS bundle designed to perform traditional and complex analysis functions using commodity components at a price / performance advantage over traditional approaches
William McKnight, McKnight Associates, Inc.
Node Anatomy: Commodity Hardware & RDBMS
Each node is a DBMS Engine
• High sustainable read/write speeds – 800 MB/s
• High scan speeds – 1 TB/minute
Commodity hardware ensures reliable performance
Nodes can be added to meet expansion demands
MPP Architecture for RAIDW
[Figure: MPP architecture diagram. Labels: ODBC/JDBC or Bulk Load; Master]
• Master rewrites query into steps that run efficiently on each database engine, with minimal/no tuning or indexes on each node and 98% sequential I/O
• Tables can be partitioned and/or replicated to spread the workload across the appliance
  – Using multi-level partitioning strategies
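The partition-and-replicate idea can be sketched generically. The code below is not DATAllegro's implementation; it is a minimal illustration under stated assumptions, with hypothetical names (node_for, partition, total_by_region) and made-up sample data. A large fact table is hash-partitioned across nodes, a small lookup table is replicated to every node, each node scans and aggregates its own rows, and a master merges the partial results.

```python
import zlib


def node_for(key, node_count):
    """Pick a node by hashing the distribution key (crc32 is deterministic)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def partition(rows, key, node_count):
    """Spread fact rows across nodes by hash of the distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key], node_count)].append(row)
    return nodes


def total_by_region(nodes, store_region):
    """Scatter-gather: each node aggregates locally, then the master merges."""
    partials = []
    for local_rows in nodes:               # would run in parallel on real nodes
        local = {}
        for row in local_rows:             # sequential scan, no index needed
            region = store_region[row["store_id"]]   # replicated lookup table
            local[region] = local.get(region, 0) + row["amount"]
        partials.append(local)
    merged = {}
    for local in partials:                 # master combines partial results
        for region, amount in local.items():
            merged[region] = merged.get(region, 0) + amount
    return merged


sales = [{"store_id": s, "amount": a} for s, a in
         [(1, 10), (2, 20), (3, 30), (1, 5), (4, 40), (2, 15)]]
store_region = {1: "west", 2: "west", 3: "east", 4: "east"}  # small, replicated

nodes = partition(sales, "store_id", node_count=3)
result = total_by_region(nodes, store_region)
print(result)
```

Replicating the small table means the join never crosses nodes, which is the usual reason to replicate rather than partition a lookup table in a shared-nothing design.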
About Us
• Founded in 2003
• HQ in Orange County
• Well capitalized
  – $45m in total funding
  – $22.5m Series C
  – > $30m cash
• 95+ staff
• Large customers, including NPD, Sears & TEOCO
• Growth > 100%
• High performance while lowering the price per terabyte
• One of the best startups our investors have ever seen
DATAllegro-At-A-Glance
• World Wide Support through IBM Global Services
• Investors
• Platform & BI Partners
• Customers: Retail, Financial Services, Telecom, Manufacturing
Architected for Change
Scale up as Data Grows
• Flexible Architecture provides on-going performance as data grows
• Upgrades can be measured in Nodes
• High Speed Backups
• High Speed Loads/Access are enhanced by MPP architecture
Scale up as User Concurrency Grows
• User Concurrency is scalable through workload prioritization
• Support for mixed workload
• Lower maintenance
Changes in Regulations
• Encryption for data at rest
Changes in Technology
• “Future Proofed”: enabled to take advantage of technological advancements in CPU and Storage technology