
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS


KENT GRAZIANO
@KentGraziano | kentgraziano.com

AGILE DATA WAREHOUSING: USING ORACLE DATA MODELER (SDDM) TO BUILD A VIRTUALIZED ODS

Agenda

© Data Warrior LLC

1. Bio
2. Architecture and Approach
   › What is a Virtualized ODS?
3. Using SDDM for pattern-based stage tables
4. Using views to load the stage tables
   › Building the views in SDDM
   › Using MD5 columns for Change Data Capture
5. Building ODS views in SDDM
   › Using Analytic Functions in views
6. Generating the DDL
   › SQL Server
   › Oracle

My Bio

› Senior Technical Evangelist, Snowflake Computing
› Oracle ACE Director (BI/DW)
› Certified Data Vault Master and DV 2.0 Practitioner
› Data Modeling, Data Architecture and Data Warehouse Specialist
› 30+ years in IT
› 25+ years of Oracle-related work
› 20+ years of data warehousing experience
› Former Member: Boulder BI Brain Trust (http://www.boulderbibraintrust.org/)
› Author & Co-Author of a bunch of books
› Blogger: The Data Warrior
› Past President of Oracle Development Tools User Group and Rocky Mountain Oracle User Group

Shameless Plug

Available on Amazon.com

http://www.amazon.com/Better-Data-Modeling-Enhancing-Developer-ebook/dp/B00UK75LYI/

Shameless Plug #2: Also On Amazon.com

NOW IN SPANISH TOO!

http://www.amazon.com/Check-Doing-Design-Reviews-ebook/dp/B008RG9L5E/

http://www.amazon.com/VERIFICACI%C3%93N-REALIZAR-REVISIONES-DISE%C3%91OS-MODELOS-ebook/dp/B00NUS1GFM/

Architecture & Approach

Goals

› New reporting environment

› Agile (i.e., quick delivery)

› Future Proof

Determination

› Use Data Vault 2.0

› Implement in Phases

Data Vault Definition

The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business.

It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise.

Architected specifically to meet the needs of today’s enterprise data warehouses

DAN LINSTEDT: Defining the Data Vault, TDAN.com Article

Data Vault Components

Copyright 2011 Dan Linstedt, used by permission

Phase 1: Operational BI

Goals:
1. Support immediate business needs for operational reports
2. Provides architectural component (stage layer) that supports long term data warehouse (DW) framework
3. Can be easily enhanced to accommodate information needs of other departments
4. Foundation for eliciting solid analytic BI requirements

[Diagram: XLink (data source) and eRMS (data source) -> DW Stage Layer -> Virtual Operational Data Store (ODS) -> BOBJ Operational Universe(s) -> BOBJ Operational Reports]

Phase 1: Operational BI

Data Warehouse (DW) Stage Layer

› Based on source system structures

› May simply be replicated source tables

› Refreshed several times a day

› Perform change data capture in this layer to provide persistent, historical data for future reporting needs

Virtual Operational Data Store (ODS)

› Abstraction layer between source and report tool

› Views on stage layer initially

› Provides proper modeling for building the Operational Universe(s) for BI report tool

› Includes Business Names and Joins

Phase 2: Analytic BI

Goals:
1. Provide foundation for long term analytics platform (single source of information)
2. Create purpose-built Universe for analytic needs
3. Enable managed self-service BI by making it simpler for users to find the reports they need

[Diagram components: XLink (data source); eRMS (data source); DW Stage Layer; Virtual ODS; BOBJ Operational Universe(s); BOBJ Operational Reports; Data Vault (Enterprise DW); Virtual Data Marts; BOBJ Analytics Universe(s); BOBJ Analytical Reports & Dashboards]

Phase 2: Analytic BI

Data Vault

› Provides one consistent source of information for both operational and analytic information

› Source system agnostic structures

› Easier to adapt and extend in future than 3NF or star schema

› Can be easily expanded as new data is added to the data warehouse foundation layer

› Persistent, historical capture of transaction-level data

› Allows meeting future unknown needs, as they arise

Addition of Data Vault should be transparent to BOBJ operational report users

› Modification to physical references in the universe hides the change from the users; Operational universe still looks like “modified” source system structures

› Therefore, no rework of existing reports

Phase 2: Analytic BI

› Virtual data marts also sourced from data vault
› Marts provide an abstraction layer between DW and Business Objects
  › Can be easily expanded as new data is added to the Data Vault
› Easy to create new data marts for future business needs

Analytics universe(s) sourced from new virtual data marts

› Looks like proper star schema with facts and dimensions
  › Re-organizes the data to more effectively support business reporting
› Enables long-term universe support by most common BOBJ development skill set
› Can be converted to physical data mart (if needed)
  › For performance in a future release
  › For highly complex business rules

Building Pattern-Based Stage Tables

Create Table Template
› Include reusable metadata columns

Reverse Engineer Source Table(s)
› Copy and rename

Apply Template
› Use built-in transformation script
› Alternative
  › Copy template table
  › Merge with copy of source

Re-order columns as needed

Template Table

Create Base Stage Table

01. Copy source table
02. Rename (add _stg)
03. Remove source indexes
04. Change schema assignment
05. Add or change table comment
06. Assign Stage classification (if you have one)
07. NOTE: You could script all this!

Base Stage From Source

Apply Table Template Transform

1. Use Table Template and Transformation Script
2. Tools -> Design Rules -> Custom Transformations
3. Look for “table template” delivered script
   › No change needed
4. Create table called table_template (or change script)
   › With required columns and properties to be copied
5. Select “Apply”
   › Changes all tables in design
6. Note: can script all sorts of stuff
   › Check /datamodeler/xmlmetadata/doc


Use the Merge Tool

Alternate - Merging Tables

Adding Standard Columns
› 5th button on tool bar
› Good for building denormalized reporting tables
› Also for one-offs to add standard columns

Combines Two Tables
› Click merge button, then template, then target
› Edit result as needed

a. Copy template table
b. Merge with table needing the columns

MERGED TABLES

[Screenshot callouts: Merge button; Merged tables]

Finalize Stage Table Design

01. Re-order columns
    › PRIM_KEY column is 1st
02. Add new PK constraint using PRIM_KEY column
03. Drop source PK constraint
    › Replace with Unique constraint

BUILDING THE STAGE TABLE WITH MERGE

DEMO!

Final DW Stage Table

Source table name + stg suffix

New calculated PK for each stage record

Indicator of original source system PK

Additional meta-­data columns to support change capture, load time and source
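The stage table pattern described on the preceding slides (source columns plus template metadata columns, with PRIM_KEY first) can be sketched in Python. This is only an illustration and not SDDM's transformation script: the metadata column names follow the deck's load view template, while the data types, the sample source columns, the position of the remaining metadata columns, and the function name are all assumptions.

```python
# Metadata columns named in the deck's load view template.
# The data types are assumptions for illustration only.
TEMPLATE_COLUMNS = [
    ("PRIM_KEY", "CHAR(32)"),
    ("HASH_KEY", "CHAR(32)"),
    ("HASH_DIFF", "CHAR(32)"),
    ("LOAD_DTS", "TIMESTAMP"),
    ("REC_SRC", "VARCHAR(20)"),
]


def stage_table_ddl(source_table, source_columns):
    """Return CREATE TABLE text for <source_table>_stg.

    PRIM_KEY goes first, then the source columns in their original
    order, then the remaining template columns (an assumed placement).
    """
    prim_key, *other_meta = TEMPLATE_COLUMNS
    columns = [prim_key, *source_columns, *other_meta]
    body = ",\n".join(f"    {name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE TABLE {source_table}_stg (\n"
        f"{body},\n"
        f"    CONSTRAINT {source_table}_stg_pk PRIMARY KEY (PRIM_KEY)\n"
        f")"
    )


# Hypothetical source columns (names borrowed from the deck's examples).
ddl = stage_table_ddl(
    "rmcodp",
    [("CODCODTYP", "CHAR(2)"), ("CODCODNUM", "CHAR(10)"), ("CODLNGDES", "VARCHAR(100)")],
)
print(ddl)
```

The sketch leaves out the unique constraint that replaces the source PK, along with the schema assignment, because the deck does not spell out their exact definitions.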

Build Stage Load Views

1. For db to db ELT type loading
2. Includes code for Type 2 SCD style CDC
3. Use SDDM View Builder
   › Select from source table (all columns)
   › Drag and drop
   › Alternate: Table to View wizard
   › Add code from view template
4. Show code in DDL Preview
5. Test in SQL Developer
   › Fix
   › Repeat

Table To View Wizard

Pick Tables to use

Auto create new subview diagram

Auto add PK & FK to views based on base table

View Builder

Pick Syntax

Pick Tables & Columns

Add Calcs & Aliases & Filters

Add Complex Sub queries if needed

MD5 Keys & Columns

1. Concatenate source data fields and hash to create MD5 keys & columns
2. MD5 Key Types

PRIM_KEY:
› All source fields (in table order) + LOAD_DTS
› Uniquely identifies all records in the DW
› Can serve as an SCD-2 key in virtual Dims / Facts

HASH_KEY:
› Source field(s) (in table order) used by the SOR (system of record) to ID data rows uniquely for change data capture purposes

HASH_DIFF:
› All non-CDC_KEY source fields (in table order) to track deltas for change data capture purposes
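To make the three key types concrete, here is a minimal Python sketch that builds them with `hashlib`. It is an illustration under assumptions, not the deck's database code: the helper names, the sample row, the treatment of NULL as an empty string, and UTF-8 encoding are all choices made for this sketch, so the hex values are not guaranteed to match what Oracle or SQL Server would produce for the same data.

```python
import hashlib

SEP = "^"  # separator used in the deck's examples


def md5_hex(values):
    """Trim each value, append the separator, upper-case, and MD5-hash.

    Returns a 32-character upper-case hex string. NULL (None) becomes
    an empty string, which is an assumption of this sketch.
    """
    text = "".join(("" if v is None else str(v).rstrip()) + SEP for v in values)
    return hashlib.md5(text.upper().encode("utf-8")).hexdigest().upper()


def stage_keys(row, key_cols, load_dts):
    """Compute the three MD5 columns for one source row.

    row is a dict whose insertion order stands in for table order.
    """
    cols = list(row)
    return {
        "PRIM_KEY": md5_hex([row[c] for c in cols] + [load_dts]),
        "HASH_KEY": md5_hex(row[c] for c in key_cols),
        "HASH_DIFF": md5_hex(row[c] for c in cols if c not in key_cols),
    }


# Hypothetical rows: same business key, changed description.
row_v1 = {"CODCODTYP": "BG", "CODCODNUM": "10", "CODLNGDES": "Old name"}
row_v2 = {"CODCODTYP": "BG", "CODCODNUM": "10", "CODLNGDES": "New name"}
key_cols = ["CODCODTYP", "CODCODNUM"]

k1 = stage_keys(row_v1, key_cols, "2016-01-01 00:00:00")
k2 = stage_keys(row_v2, key_cols, "2016-02-01 00:00:00")
print(k1["HASH_KEY"] == k2["HASH_KEY"])    # True: same business key
print(k1["HASH_DIFF"] == k2["HASH_DIFF"])  # False: description changed
```

HASH_KEY stays stable across versions of a row, HASH_DIFF changes whenever a non-key field changes, and PRIM_KEY differs for every version because LOAD_DTS is part of its input.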

MD5-Based Change Detection

Think Type 2 SCD (Slowly Changing Dimensions)

Old Way:
› Compare column by column
› Source value != Current value in DW table
› 20 columns, then 20 compares

New Way:
› Concatenate all columns to one string
› Convert to one char(32) string with hash function
› Compare to hashed value (HASH_DIFF) in target table
› Does not matter how many columns
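The contrast between the two approaches can be sketched in a few lines of Python. The function names and the sample address data are hypothetical; the point is only that the hash-based check is a single comparison regardless of column count.

```python
import hashlib


def hash_diff(values, sep="^"):
    """Concatenate all columns (with a separator) and return a 32-char MD5 hex string."""
    text = "".join(("" if v is None else str(v)) + sep for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


def changed_old_way(source_row, target_row):
    """Old way: one comparison per column."""
    return any(s != t for s, t in zip(source_row, target_row))


def changed_new_way(source_row, stored_hash_diff):
    """New way: one comparison, however many columns there are."""
    return hash_diff(source_row) != stored_hash_diff


target = ("Acme", "Denver", "CO", "80202")
stored = hash_diff(target)  # what the target table would hold as HASH_DIFF

unchanged = ("Acme", "Denver", "CO", "80202")
moved = ("Acme", "Boulder", "CO", "80301")

print(changed_old_way(unchanged, target), changed_new_way(unchanged, stored))
print(changed_old_way(moved, target), changed_new_way(moved, stored))
```

Both methods agree on every row here; the new way simply shifts the per-column work into computing one hash at load time.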

What Does It Look Like?

Encode using standard MD5 hash function (Oracle)
› rawtohex(sys.utl_raw.cast_to_raw(dbms_obfuscation_toolkit.md5(input_string => ...)))

Need to minimize chance of duplicates
› 12||3||45 and 1||2||345 hash to same value
› Need a separator between each
› Also handles case of null values
› Example: Col1||'^'||Col2||'^'||Col3
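The duplicate problem is easy to reproduce. The short Python sketch below hashes the slide's two value sets with and without a separator, and then does the same for a NULL that moves between columns. Treating NULL as an empty string is an assumption of this sketch, and the helper name is hypothetical.

```python
import hashlib


def md5_concat(values, sep=""):
    """Hash the concatenation of values; None is treated as an empty string."""
    text = sep.join("" if v is None else str(v) for v in values)
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


a = ("12", "3", "45")
b = ("1", "2", "345")

no_sep_collides = md5_concat(a) == md5_concat(b)              # both hash "12345"
with_sep_collides = md5_concat(a, "^") == md5_concat(b, "^")  # "12^3^45" vs "1^2^345"

# A NULL that moves between columns is also caught once a separator is present.
c = ("A", None, "B")
d = ("A", "B", None)
null_no_sep = md5_concat(c) == md5_concat(d)              # both hash "AB"
null_with_sep = md5_concat(c, "^") == md5_concat(d, "^")  # "A^^B" vs "A^B^"

print(no_sep_collides, with_sep_collides, null_no_sep, null_with_sep)
# prints: True False True False
```

The separator should be a character that does not occur in the data itself, otherwise the same ambiguity can reappear.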

Other Considerations

To generate most consistent string: standardize!

Convert data types

If 'NUMBER', 'NVARCHAR2', 'NVARCHAR', 'NCHAR'
› THEN 'TO_CHAR(' || column_name || ')'

If 'RAW'
› THEN 'ENC_BASE64(' || column_name || ')'

If 'DATE'
› THEN 'TO_CHAR(' || column_name || ', ''YYYY-MM-DD'')'

If LIKE 'TIME%'
› THEN 'TO_CHAR(' || column_name || ', ''YYYY-MM-DD HH24:MI:SS'')'
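The same standardization idea, expressed on Python values rather than generated SQL, might look like the sketch below. The mapping from Python types to the slide's Oracle types is an assumption made for illustration (for example, `bytes` standing in for RAW and `datetime` for the TIME% types), and so is the choice to render NULL as an empty string.

```python
import base64
from datetime import date, datetime
from decimal import Decimal


def standardize(value):
    """Render one column value as a consistent string before hashing."""
    if value is None:
        return ""  # assumption: NULL contributes an empty string
    if isinstance(value, datetime):  # check datetime first: it is a subclass of date
        return value.strftime("%Y-%m-%d %H:%M:%S")
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(bytes(value)).decode("ascii")
    if isinstance(value, (int, Decimal)):
        return str(value)
    return str(value).rstrip()


row = [Decimal("10.50"), date(2016, 3, 1), datetime(2016, 3, 1, 14, 5, 9), b"\x01\x02", "BG  ", None]
print("^".join(standardize(v) for v in row))
```

Whatever rules are chosen, they need to be applied identically on every load, because any change in formatting changes the hash and shows up as a false delta.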

Template View Code - SQL Server

-- SQL Server load view template columns
PRIM_KEY,              -- place holder for PK column
HASH_KEY,              -- place holder for HASH Key
HASH_DIFF,             -- place holder for CDC column
GETDATE() AS LOAD_DTS, -- current date and time
'eRMS' AS REC_SRC      -- a source system name

-- Template Where
WHERE -- supports load new keys and changes, no dups
NOT EXISTS
( SELECT 1
  FROM dw_stage.rmcodp_stg stg
  WHERE stg.HASH_KEY = upper(CONVERT([Char](32), HASHBYTES('MD5', UPPER(RTRIM(RMC.CODCODTYP) + '^' + RTRIM(RMC.CODCODNUM) + '^')), 2))
    AND stg.HASH_DIFF = upper(CONVERT([Char](32), HASHBYTES('MD5', UPPER(RTRIM(CONVERT([Char](100),RMC.CODKEYNUM)) + '^' + ) …
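The effect of the NOT EXISTS filter can be tried out without a SQL Server instance. The sketch below uses Python's built-in `sqlite3` with an in-memory database; it is an analogy, not the deck's view. The table and column names (`src`, `stg`, `descr`, and so on) are hypothetical, and `MD5_UPPER` is a Python function registered with the connection to stand in for the `HASHBYTES` expression.

```python
import hashlib
import sqlite3


def md5_upper(text):
    """Stand-in for upper(CONVERT(char(32), HASHBYTES('MD5', UPPER(...)), 2))."""
    if text is None:
        return None
    return hashlib.md5(text.upper().encode("utf-8")).hexdigest().upper()


conn = sqlite3.connect(":memory:")
conn.create_function("MD5_UPPER", 1, md5_upper)
conn.executescript("""
    CREATE TABLE src (code_type TEXT, code_num TEXT, descr TEXT);
    CREATE TABLE stg (hash_key TEXT, hash_diff TEXT, code_type TEXT,
                      code_num TEXT, descr TEXT, load_dts TEXT, rec_src TEXT);
    INSERT INTO src VALUES ('BG', '10', 'Old name'), ('BG', '20', 'Other');
""")

LOAD_SQL = """
    INSERT INTO stg
    SELECT v.hash_key, v.hash_diff, v.code_type, v.code_num, v.descr, ?, 'eRMS'
    FROM (
        SELECT MD5_UPPER(RTRIM(code_type) || '^' || RTRIM(code_num) || '^') AS hash_key,
               MD5_UPPER(RTRIM(descr) || '^') AS hash_diff,
               code_type, code_num, descr
        FROM src
    ) v
    WHERE NOT EXISTS (          -- load new keys and changes, no dups
        SELECT 1 FROM stg
        WHERE stg.hash_key = v.hash_key
          AND stg.hash_diff = v.hash_diff
    )
"""


def load(load_dts):
    """Run one load and return the stage row count afterwards."""
    conn.execute(LOAD_SQL, (load_dts,))
    return conn.execute("SELECT COUNT(*) FROM stg").fetchone()[0]


first = load("2016-01-01 00:00:00")    # both rows are new
second = load("2016-01-02 00:00:00")   # nothing changed, nothing inserted
conn.execute("UPDATE src SET descr = 'New name' WHERE code_num = '10'")
third = load("2016-01-03 00:00:00")    # one changed row adds one version
print(first, second, third)
# prints: 2 2 3
```

Re-running a load with unchanged source data inserts nothing, which is what makes the pattern safe to refresh several times a day.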

Virtual ODS

Simple database views on stage tables. Tables and columns renamed with business terms

FK added to help BOBJ Developer define proper joins

Defining The Virtual ODS Views

Start with Table to View Wizard
› On Stage Tables

Rename view

Used Excel & Metadata to create column alias
› Extract metadata for stage tables (use SDDM Search)
› Add calculated column to Excel
  › ="RMO."&E10350&" AS "&M10350&","
› Cut and paste into View Builder

Add nested table with analytic function
› To only return current rows for ODS
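The Excel formula simply glues a table alias, the physical column name, and the business name into one select-list line. For anyone who prefers a script to a spreadsheet, a hypothetical Python equivalent is shown here; the function name is invented for this sketch, and the column pairs are the ones that appear in the deck's view example.

```python
def alias_lines(table_alias, columns):
    """Build one 'ALIAS.COLUMN AS Business_Name,' line per column."""
    return [f"{table_alias}.{col} AS {name}," for col, name in columns]


# Column-to-business-name pairs as they appear in the deck's view example.
columns = [
    ("CODKEYNUM", "Code_Key_Numeric"),
    ("CODSYSTYP", "System_Value_Type"),
    ("CODLNGDES", "Description"),
]

for line in alias_lines("RMC", columns):
    print(line)
```

As with the spreadsheet version, the trailing comma on the last generated line has to be removed before the list is pasted into a view.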

Generating the Column Aliases

Analytic Function To Get Current Rows

SELECT
  CONVERT([Char](10),RMC.CODCODNUM) AS Business_Group_Code,
  RMC.CODKEYNUM AS Code_Key_Numeric,
  RMC.CODSYSTYP AS System_Value_Type,
  RMC.CODLNGDES AS Description,
  …
  RMC.LOAD_DTS AS LOAD_DTS,
  CASE
    WHEN RANK() OVER (PARTITION BY RMC.HASH_KEY
                      ORDER BY RMC.LOAD_DTS DESC) = 1
    THEN 'Y'
    ELSE 'N'
  END CURR_FLG
FROM
  DW_STAGE.RMCODP_STG RMC
WHERE
  RMC.CODCODTYP = 'BG'

BUT… Can’t Use Function In Where

01. Have to nest the query with the function as a virtual table in the FROM
02. Then use CURR_FLG in outer WHERE
03. Works in Oracle, SQL Server, and SnowflakeDB
04. Drop the final query into View Builder
    › Save
    › Generate DDL

Example: Virtual ODS View

SELECT
  SRC.Business_Group_Code,
  SRC.Code_Key_Numeric,
  SRC.System_Value_Type,
  …
  SRC.Change_Time,
  SRC.LOAD_DTS
FROM
  (SELECT
     CONVERT([Char](10),RMC.CODCODNUM) AS Business_Group_Code,
     RMC.CODKEYNUM AS Code_Key_Numeric,
     RMC.CODSYSTYP AS System_Value_Type,
     …
     RMC.CODCHGTIM AS Change_Time,
     RMC.LOAD_DTS AS LOAD_DTS,
     CASE
       WHEN RANK() OVER (PARTITION BY RMC.HASH_KEY
                         ORDER BY RMC.LOAD_DTS DESC) = 1
       THEN 'Y'
       ELSE 'N'
     END CURR_FLG -- calculated column
   FROM
     DW_STAGE.RMCODP_STG RMC
   WHERE
     RMC.CODCODTYP = 'BG'
  ) SRC -- nested virtual table
WHERE
  SRC.CURR_FLG = 'Y' -- filter on calculated column

Callouts:
› Main select for view columns
› Nested Virtual Table w/ Rank column and other transforms
› Get current rows using virtual column
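The nested-query pattern can be exercised end to end with Python's built-in `sqlite3`, which supports window functions from SQLite 3.25 onward. This is a cut-down sketch rather than the deck's actual view: the view name `business_group_v`, the reduced column list, and the sample rows are assumptions, and SQL Server specifics such as `CONVERT` are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rmcodp_stg (hash_key TEXT, codcodnum TEXT,
                             codlngdes TEXT, load_dts TEXT);

    -- Two versions of key K1, one version of key K2 (sample data).
    INSERT INTO rmcodp_stg VALUES
        ('K1', '10', 'Old name',     '2016-01-01 00:00:00'),
        ('K1', '10', 'New name',     '2016-02-01 00:00:00'),
        ('K2', '20', 'Only version', '2016-01-15 00:00:00');

    CREATE VIEW business_group_v AS
    SELECT src.business_group_code, src.description, src.load_dts
    FROM (
        SELECT rmc.codcodnum AS business_group_code,
               rmc.codlngdes AS description,
               rmc.load_dts  AS load_dts,
               CASE
                   WHEN RANK() OVER (PARTITION BY rmc.hash_key
                                     ORDER BY rmc.load_dts DESC) = 1
                   THEN 'Y'
                   ELSE 'N'
               END AS curr_flg            -- calculated column
        FROM rmcodp_stg rmc
    ) src                                 -- nested virtual table
    WHERE src.curr_flg = 'Y';             -- filter on calculated column
""")

rows = conn.execute(
    "SELECT business_group_code, description "
    "FROM business_group_v ORDER BY business_group_code"
).fetchall()
print(rows)
# prints: [('10', 'New name'), ('20', 'Only version')]
```

One thing to keep in mind with RANK(): if two versions of the same key share an identical LOAD_DTS, both rank first and both are returned as current.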

Virtual ODS View In Query Builder - Nested

Virtual ODS View In Query Builder - Outer

Generate DDL

Use DDL Preview to check

File > Export > DDL

Or click the DDL Icon

Pick the target DB type

Can switch at generate time

Same design can generate Oracle and SQL Server


Conclusion

1. With planning and good architecture you can be agile
2. Data Vault provides a good framework
3. Oracle Data Modeler provides the tool
4. Think out of the box
   › Start with virtual ODS or Data Marts
   › Support for both Oracle & SQL Server
   › And Snowflake too!

Want More In Depth Training?

SQL Developer Data Modeler Jumpstart
Online video training class with demos

Discount code GRAZIANO10S (20% off)

Go to https://kentgraziano.com/sddm1/

AVAILABLE NOW….

› On Amazon.com
› Covers a ton of stuff
› Reviewed by Kent & Jeff!

SUPER CHARGE YOUR DATA WAREHOUSE

› Available on Amazon.com

› Soft Cover or Kindle Format

› Now also available in PDF at LearnDataVault.com

› Hint: Kent is the Technical Editor

New DV 2.0 Book (includes more details on MD5)

› Available on Amazon: http://www.amazon.com/Building-Scalable-Data-Warehouse-Vault/dp/0128025107/

QUESTIONS?

CONTACT INFORMATION

KENT GRAZIANO
Snowflake Computing
www.snowflake.net

[email protected]

@KentGraziano

http://kentgraziano.com