Copyright © 2004, SAS Institute Inc. All rights reserved.
Building and Implementing Integrated Data Models
Nancy Wills, Director, Access, Query and Data Mgmt
Ralph Hollinshead, Manager, Solutions Data Integration
Overview
Part One: Building an Integrated Data Model
Part Two: Deploying and Scaling the Data Architecture
SAS® Banking Intelligence Solutions Framework
• Customer Retention
• Cross-Sell / Up-Sell
• Marketing Automation
• Credit Scoring
• Credit Risk
• Strategic Performance Management
• New Solutions
Banking Intelligence Architecture
• Integrated, extendable architecture
• Focused on business issues
• Based on experience
Independent Solutions
Enterprise Source Systems -> Extract and Cleanse Files -> Solution Data Marts -> Solutions
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Integrated Data Model: Not All Customers are the Same
Customer A: No Data Warehouse
• Interested in multiple SAS solutions
Customer B: With Data Warehouse
• Averse to data replication
Customer C: With Data Warehouse
• No data marts allowed (active data warehousing approach)
Customer A: Full SAS Data Architecture
Enterprise Source Systems -> Extract and Cleanse Files -> SAS Banking Detail Data Store -> Solution Data Marts -> Solutions
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Customer B: Partial SAS Data Architecture
Enterprise Source Systems -> Extract and Cleanse Files -> Customer Enterprise Data Warehouse -> Solution Data Marts -> Solutions
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Customer C: Customer Data Architecture
Enterprise Source Systems -> Extract and Cleanse Files -> Customer Enterprise Data Warehouse -> Information Maps -> Solutions
Solutions:
• SAS® Marketing Automation
Scorecard for Data Architecture Approach

Data Management Issue                              | Score
Sensitivity to data replication                    | -(0-5)
Sensitivity to H/W processor and storage budget    | -(0-5)
Existing warehouse quality                         | -(0-5)
Implementation time constraints                    | -(0-5)
Intentions to implement >1 SAS solution            | +(0-5)
Historical data requirements                       | +(0-5)

Score | Decision
-25   | No DDS. Marts only if absolutely necessary. Information maps may be appropriate.
0     | Use DDS to persist current extract from source systems. Marts hold multiple extracts up to full history.
+25   | Implement full warehouse, persist history in DDS and as much as wanted in the marts.
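The scorecard arithmetic can be sketched in a few lines of Python. This is only an illustration, assuming each issue is rated 0 to 5 and that the sign in the Score column says whether the rating is subtracted or added. The factor names, the `lean` helper, and its `cutoff` threshold are hypothetical and do not come from the slide. Six ratings of 0 to 5 with these signs can only total between -20 and +10, so the sketch reports which decision the total leans toward rather than matching the -25 / 0 / +25 anchors exactly.

```python
# Sign per issue: -1 means a higher rating lowers the total, +1 raises it.
# Names are illustrative shorthand for the six issues on the slide.
FACTORS = {
    "sensitivity_to_data_replication": -1,
    "sensitivity_to_hw_budget": -1,
    "existing_warehouse_quality": -1,
    "implementation_time_constraints": -1,
    "more_than_one_sas_solution": +1,
    "historical_data_requirements": +1,
}


def architecture_score(ratings):
    """Sum the signed 0-5 ratings; raises if a rating is missing or out of range."""
    total = 0
    for name, sign in FACTORS.items():
        rating = ratings[name]
        if not 0 <= rating <= 5:
            raise ValueError(f"{name} must be rated 0-5, got {rating}")
        total += sign * rating
    return total


def lean(total, cutoff=5):
    """Map a total to the nearest decision. The cutoff is an assumption."""
    if total <= -cutoff:
        return "no_dds"               # marts only if necessary, information maps
    if total >= cutoff:
        return "full_warehouse"       # persist history in the DDS
    return "dds_current_extract"      # DDS holds current extract, marts hold history
```

For example, a site that rates every negative issue at 5 and both positive issues at 0 totals -20 and leans toward the "no DDS" decision.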
Techniques for Data Model Integration
Detail Data Store
• Varying Industries
• General Standards
• Warehousing Techniques
Data Marts
• Approach Compared to DDS
Integrating Models at the Industry Level
Common subject areas: Customer, Supplier, Employee, GL, Account, Product, etc.
Industry-specific subject areas:
• Banking: Accounts, Account Transactions, etc.
• Telco: Subscriptions, Equipment, Networks, Calls, etc.
• Insurance: Premiums, Claims, Benefits, etc.
Detail Data Store Standards Needed for Integration
Data Types / Lengths / Classifier Codes
Naming Conventions
Standards for Data Structures
• Hierarchies
• Subtypes
• Reference Data
Data Administration Standards

Domain                | Data Type | Width | Applicable Class Codes | Comment/Example
Identifier            | Varchar   | 32    | ID                     | Typically the identifier from the source system.
Small Code            | Varchar   | 3     | CD                     | Short codes such as ADDRESS_TYPE_CD.
Medium Code           | Varchar   | 10    | CD                     | Medium-length codes such as EXCHANGE_SYMBOL_CD.
Large Code            | Varchar   | 20    | CD                     | Long codes such as POSTAL_CD.
Standard Count Code   | Numeric   | 6     | CNT                    | Standard counts such as AUTHORIZED_USERS_CNT.
Name                  | Varchar   | 40    | NM                     | Proper name. For example, LAST_NM, FIRST_NM, etc.
Short Length Text     | Varchar   | 20    | TXT                    | Short freeform text.
Medium Length Text    | Varchar   | 100   | TXT, DESC              | Longer freeform text and descriptions associated with code tables.
Indicator Field       | Character | 1     | FLG                    | Binary indicator flag (Y or N).
Surrogate Key         | Numeric   | 10    | RK, SK                 | Generated surrogate keys.
Currency Amount       | Numeric   | 18,5  | AMT                    | Standard currency amount.
Rates and Percentages | Numeric   | 9,4   | PCT, RT                | For example, exchange rates.
DateTime              | Date      |       | DT, DTTM               | Accommodates dates as well as date/time.
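One way to put these standards to work is to check column names against the class codes mechanically. The Python sketch below is an illustration only: the `CLASS_CODES` mapping is transcribed from the standards on this slide, while the `classify_column` helper and the rule that the class code is the text after the last underscore are assumptions made for the example.

```python
# Class-code suffix -> (data type, allowed widths), transcribed from the slide.
# CD and TXT appear in more than one domain, so they carry several widths.
CLASS_CODES = {
    "ID": ("Varchar", ["32"]),
    "CD": ("Varchar", ["3", "10", "20"]),
    "CNT": ("Numeric", ["6"]),
    "NM": ("Varchar", ["40"]),
    "TXT": ("Varchar", ["20", "100"]),
    "DESC": ("Varchar", ["100"]),
    "FLG": ("Character", ["1"]),
    "RK": ("Numeric", ["10"]),
    "SK": ("Numeric", ["10"]),
    "AMT": ("Numeric", ["18,5"]),
    "PCT": ("Numeric", ["9,4"]),
    "RT": ("Numeric", ["9,4"]),
    "DT": ("Date", []),      # the slide gives no width for dates
    "DTTM": ("Date", []),
}


def classify_column(name):
    """Return (class code, data type, allowed widths) for a column name.

    Assumes the class code is the token after the last underscore.
    Raises ValueError when the suffix is not a known class code.
    """
    suffix = name.rsplit("_", 1)[-1].upper()
    if suffix not in CLASS_CODES:
        raise ValueError(f"{name}: '{suffix}' is not a standard class code")
    data_type, widths = CLASS_CODES[suffix]
    return suffix, data_type, widths
```

A check like this could run over every column in a model before release, flagging names such as CUSTOMER_NAME that do not end in a standard class code.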
Detail Data Store: Data Warehousing Standards
Surrogate Keys, Point-in-Time, and Rapidly Changing Data

CUSTOMER
CUSTOMER_RK | VALID_FROM_DT | VALID_TO_DT | ACCOUNT_RK | MARITAL_STATUS_CD | FIRST_NM | LAST_NM
100         | 01JAN1999     | 29FEB2000   | 201        | S                 | John     | Smith
100         | 01MAR2000     | 31DEC4747   | 201        | M                 | John     | Smith

FINANCIAL_ACCOUNT
ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | CUSTOMER_RK | FINANCIAL_ACCOUNT_TYPE_CD | OPEN_DT
201        | 01JAN1999     | 31DEC4747   | 100         | SAVINGS                   | 01JAN2000

FINANCIAL_ACCOUNT_CHNG
ACCOUNT_RK | VALID_FROM_DT | VALID_TO_DT | BALANCE_AMT | CURRENCY_CD
201        | 01JAN1999     | 31JAN1999   | 2500.75     | USD
201        | 01FEB1999     | 28FEB1999   | 4300.25     | USD
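The point-in-time idea behind VALID_FROM_DT and VALID_TO_DT can be shown with a small Python sketch. It uses the CUSTOMER rows from this slide and assumes both dates are inclusive and that 31DEC4747 marks the open-ended current row. The `parse_date9` and `as_of` helpers are hypothetical names for this illustration, not part of the DDS.

```python
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}


def parse_date9(text):
    """Parse a DDMONYYYY string such as 01JAN1999 (day may be 1 or 2 digits)."""
    day, month, year = text[:-7], text[-7:-4], text[-4:]
    return date(int(year), MONTHS[month.upper()], int(day))


# The two CUSTOMER rows from the slide (name columns omitted for brevity).
CUSTOMER = [
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": parse_date9("01JAN1999"),
     "VALID_TO_DT": parse_date9("29FEB2000"), "MARITAL_STATUS_CD": "S"},
    {"CUSTOMER_RK": 100, "VALID_FROM_DT": parse_date9("01MAR2000"),
     "VALID_TO_DT": parse_date9("31DEC4747"), "MARITAL_STATUS_CD": "M"},
]


def as_of(rows, key_column, key, when):
    """Return the row for `key` whose validity window contains `when`, or None."""
    for row in rows:
        if (row[key_column] == key
                and row["VALID_FROM_DT"] <= when <= row["VALID_TO_DT"]):
            return row
    return None
```

Asking for customer 100 as of 15 February 2000 returns the row with MARITAL_STATUS_CD S, while any date from 1 March 2000 onward returns the row with M. The surrogate key stays the same across both versions.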
Conformed Dimensions
Tools: Extending Models
CUSTOMER
EXTERNAL_ORG
SUPPLIER
INTERNAL_ORG
INTERNAL_ORG_ASSOC
INTERNAL_ORG_ASSOC_TYPE
COMPETITORS
Change Analysis Tool
Deploying the Integrated Data Architecture
Option A: Full SAS Data Architecture
Enterprise Source Systems -> Extract and Cleanse Files -> SAS Banking Detail Data Store -> Solution Data Marts -> Solutions
Solutions:
• SAS® Cross-Sell and Up-Sell for Banking
• SAS® Customer Retention for Banking
• SAS® Credit Scoring for Banking
• SAS® Credit Risk Management
Flexible Options to Meet Customer Needs!
Populate DDS and Data Mart
Source Data: Excel, SAS, SAP, Oracle, PeopleSoft
Step 1 - Extract, cleanse, and transform source data into a flat file
Step 2 - ETL processing to load the data warehouse (DDS)
• data validation
• key creation
• slowly changing dimensions
Step 3 - Transform into the data mart model (Banking Data Mart)
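The three parts of the Step 2 load (data validation, key creation, slowly changing dimensions) can be sketched together in Python. This is a minimal illustration of a type 2 slowly changing dimension load, not the SAS ETL Studio process itself. It assumes the VALID_FROM_DT / VALID_TO_DT convention and the 31DEC4747 high date used on the Detail Data Store slide. The `SOURCE_ID` field, the `load_scd2` function, and the in-memory `key_map` are hypothetical.

```python
from datetime import date, timedelta

HIGH_DATE = date(4747, 12, 31)  # open-ended VALID_TO_DT for the current row


def load_scd2(table, key_map, records, load_date):
    """Apply incoming source records to a type 2 dimension table.

    table:   list of row dicts, changed in place
    key_map: dict of source identifier -> surrogate key, changed in place
    """
    for record in records:
        source_id = record.get("SOURCE_ID")
        if not source_id:                          # data validation
            raise ValueError("record has no SOURCE_ID")
        attributes = {k: v for k, v in record.items() if k != "SOURCE_ID"}

        if source_id not in key_map:               # key creation
            key_map[source_id] = max(key_map.values(), default=0) + 1
        surrogate_key = key_map[source_id]

        current = next((row for row in table
                        if row["CUSTOMER_RK"] == surrogate_key
                        and row["VALID_TO_DT"] == HIGH_DATE), None)
        if current is not None:
            if all(current.get(k) == v for k, v in attributes.items()):
                continue                           # nothing changed, keep row
            # Close the old version the day before the new one starts.
            current["VALID_TO_DT"] = load_date - timedelta(days=1)

        table.append({"CUSTOMER_RK": surrogate_key,
                      "VALID_FROM_DT": load_date,
                      "VALID_TO_DT": HIGH_DATE,
                      **attributes})
    return table
```

Loading a customer as single on 1 January 1999 and as married on 1 March 2000 leaves two rows with the same surrogate key, the first closed on 29 February 2000. Reloading an unchanged record adds nothing.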
Deployment Focus
Scalability and Performance
ETL flows
Physical data model
Deployment: What Did We Do?
Create and Generate Data
Deploy Hardware and Software
Populate DDS
Populate Data Mart
Analyze ETL Flows
Analyze DDS Model
Change Management
It All Starts with Data
Bought and Built Data Generators
Built Simulated Data
Applied Business Rules
Scaled: 5 GB -> 50 GB -> 500 GB -> 1 TB
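A simulated-data generator of the kind listed here might look like the Python sketch below. It is an assumption-laden illustration, not the generators the team bought or built. The column names are borrowed from the Detail Data Store slide, and the business rule (savings balances never go negative, checking accounts may be overdrawn by up to 500) is invented for the example. A fixed seed keeps every run reproducible, and raising `n_customers` scales the volume.

```python
import random
from datetime import date, timedelta


def generate_accounts(n_customers, seed=0):
    """Generate simulated account rows, one to three accounts per customer."""
    rng = random.Random(seed)  # seeded so the same inputs give the same data
    rows = []
    for customer_rk in range(1, n_customers + 1):
        for _ in range(rng.randint(1, 3)):
            account_type = rng.choice(["SAVINGS", "CHECKING"])
            # Hypothetical business rule: only checking may be overdrawn.
            lowest_balance = 0.0 if account_type == "SAVINGS" else -500.0
            rows.append({
                "ACCOUNT_RK": len(rows) + 1,
                "CUSTOMER_RK": customer_rk,
                "FINANCIAL_ACCOUNT_TYPE_CD": account_type,
                "BALANCE_AMT": round(rng.uniform(lowest_balance, 10000.0), 2),
                "OPEN_DT": date(1999, 1, 1) + timedelta(days=rng.randrange(365)),
            })
    return rows
```

Calling `generate_accounts(100)` and then `generate_accounts(1000)` mirrors the stepwise scaling on the slide at a much smaller size, with the same rules applied at each step.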
Deploy Hardware and Software
Choose Software Components
• SAS for the DDS or Data Warehouse
• Databases for the DDS or Data Warehouse
• SAS for the Data Marts
Install and Configure SAS Software
Configure Hardware
Design for Progressively Larger Deployments
Windows Server
Dell PowerEdge 1600SC
Windows 2003
Dual hyper-threaded 2.8 GHz processors
4 GB RAM
4 internal IDE drives: 60 GB C drive, 275 GB D drive
Single I/O channel
5 GB -> 50 GB of data
AIX UNIX Servers
IBM P630 eServer
• AIX 5.3
• 4 processors
• 4 I/O channels
• 8 GB RAM
• 4 x 72 GB disks
• 14-drive SCSI storage array
• 50 GB -> 500 GB of data
IBM P670 eServer
• AIX 5.3
• 16 processors
• 8 x 1-gigabit fiber I/O channels
• Dynamic logical partitioning
• 2 TB disks
• 500 GB -> 1 TB of data
Populate DDS and Data Mart
Ran ETL Flows
• Registered in SAS Metadata Repository
• Loaded Data into Tables
• Used Slowly Changing Dimension Load Process
Analyze ETL Flows
Example of SAS ETL Studio Flow Analysis
Change Management
Loaded New Release of DDS in TST Repository
Compared PRD Repository to TST Repository
Ran Batch Reports to Examine Differences
Ran Impact Analysis on Column and Table
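The comparison step above amounts to a structural diff between two versions of the model. The Python sketch below illustrates that idea only. It is not the SAS change analysis tool or its report format, and it assumes each repository can be exported as a dictionary mapping table name to column name to type. The `compare_models` function and that export shape are hypothetical.

```python
def compare_models(prd, tst):
    """Diff two models given as {table: {column: type}} dictionaries.

    prd is the production model, tst the new release under test.
    """
    diff = {
        "added_tables": sorted(set(tst) - set(prd)),
        "dropped_tables": sorted(set(prd) - set(tst)),
        "added_columns": [],
        "dropped_columns": [],
        "changed_columns": [],
    }
    # Column-level comparison only makes sense for tables present in both.
    for table in sorted(set(prd) & set(tst)):
        old, new = prd[table], tst[table]
        for column in sorted(set(new) - set(old)):
            diff["added_columns"].append((table, column))
        for column in sorted(set(old) - set(new)):
            diff["dropped_columns"].append((table, column))
        for column in sorted(set(old) & set(new)):
            if old[column] != new[column]:
                diff["changed_columns"].append(
                    (table, column, old[column], new[column]))
    return diff
```

Dropped and changed columns are the entries most likely to break existing ETL flows, so they are the natural starting point for impact analysis.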
What Did We Find?
Specific Techniques that Work Best
Recommendations
Tremendous Performance Gains!
Specific Techniques: Examples
ETL Flows
Parallel ETL flows
SAS coding techniques to use
Use hash table instead of look up
Make sure the I/O buffer size is tuned
Drop constraints
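The hash-table technique is easy to show outside SAS. In this Python sketch a dictionary plays the role that a hash table would play in a SAS ETL step: the lookup table is loaded into memory once, and each transaction is then resolved in constant time instead of by a sort and merge or a repeated search. The `hash_lookup_join` function and the sample column usage are assumptions for illustration.

```python
def hash_lookup_join(transactions, accounts):
    """Attach CUSTOMER_RK to each transaction via an in-memory hash table."""
    # Build the hash table once: ACCOUNT_RK -> CUSTOMER_RK.
    customer_by_account = {row["ACCOUNT_RK"]: row["CUSTOMER_RK"]
                           for row in accounts}
    joined = []
    for transaction in transactions:
        # One hash probe per transaction; unmatched accounts get None.
        customer_rk = customer_by_account.get(transaction["ACCOUNT_RK"])
        joined.append({**transaction, "CUSTOMER_RK": customer_rk})
    return joined
```

The approach pays off when the lookup table fits in memory and the driving table is large, since neither input has to be sorted first.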
Specific Techniques: Examples
DDS Model
Indexes – when and when not to add
Denormalized some tables
Separate tables for data with high volume changes
Partition data by usage (date ranges)
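Partitioning by date range means a query for one period only has to read the partitions that overlap it. The Python sketch below illustrates the idea with monthly partitions keyed on VALID_FROM_DT. The monthly grain, the function names, and the in-memory layout are assumptions for the example. In a real deployment each partition would be a separate table or physical partition.

```python
from collections import defaultdict
from datetime import date


def partition_by_month(rows, date_column="VALID_FROM_DT"):
    """Group rows into partitions keyed by (year, month)."""
    partitions = defaultdict(list)
    for row in rows:
        value = row[date_column]
        partitions[(value.year, value.month)].append(row)
    return dict(partitions)


def rows_in_range(partitions, start, end, date_column="VALID_FROM_DT"):
    """Return rows dated within [start, end], skipping partitions outside it."""
    first, last = (start.year, start.month), (end.year, end.month)
    hits = []
    for key in sorted(partitions):
        if first <= key <= last:   # partition pruning on the month key
            hits.extend(row for row in partitions[key]
                        if start <= row[date_column] <= end)
    return hits
```

A request for February 1999 touches a single partition regardless of how many years of history the table holds.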
Recommendations
Debugging techniques
Sorting and memory usage
Joins
Understand disk requirements
I/O optimization
Compression and performance
Above All
Write ETL
Test, Tune
Test, Tune
Test, Tune!!!!
Summary and Conclusions
Data integration is key
Different approaches for customers
Change management is vital
Performance tuning is vital
Technology evolving
Questions?