Best Practices in Data Warehouse Testing

By: Anindya Mookerjea, Prasanth Malisetty



www.Test2008.in


Agenda

Introduction to Data Warehouse
Why require Best Practices in testing DWH
Phases in data warehouse testing
Data warehouse testing goals
Types of Data Warehouse Testing
Generic Challenges


Introduction to Data Warehouse

A data warehouse is a subject-oriented, integrated, non-volatile, time-variant collection of data in support of management's decisions.


Why require Best Practices in testing DWH

How confident can a company be about rolling out its data warehouse without testing it thoroughly?

How much testing is enough before a company rolls out its data warehouse?



Can a company remain at the top of its business if bugs are detected only in the later stages of the testing cycle?

“Undoubtedly, we need to have best practices in place for testing a data warehouse application.”


Phases in data warehouse testing

Business understanding
Test plan creation
Test case creation
Test data file for predictions
Test predictions creation
Test case execution
Deployment in production environment


Data warehouse testing goals

No data losses
Correct transformation rules
Data validation
Regression testing
One-shot / retrospective testing
Prospective testing
View testing
Sampling
Post implementation


Types of Data Warehouse Testing

Business Flow
One-Shot / Prospective
Data Quality
Incremental Load
Performance
Version
Production Checkout


‘Business Flow’ in Data Warehouse Testing

Understand Business Specification
Create Test Plan/Test Approach
Develop Test Cases and Predictions
Business Test Execution Cycle 1 (Initial Load)
Business Test Execution Cycle 2 (Incremental Load)
Business Testing Sign off

Data Mocking


One-Shot/Prospective in Data Warehouse Testing

[Flowchart: Business Requirements Specifications feed two paths, Retrospective Testing and Prospective Testing. Each path runs through Sampling of Data, Test Data, Test Cases and Predictions, and Compare and Regression Testing, ending in a "Correct?" decision (NO: Defect; YES: continue). Other nodes shown: Back out Testing, Analysis, Production Source File, MVS, Data Sources, Production Checkout, BA Sign Off, Production Checkout Sign Off.]


‘Data Quality’ Data Warehouse Testing

Business Requirement Specification
Transformation Rules Testing
Test Data
DQ Testing for Key Fields
Is DQ % > Emergency %? If yes: DQ e-mail and abort ETL
Is DQ % > Threshold %? If yes: DQ e-mail
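The two DQ decisions on this slide can be sketched as a small gate function. This is only an illustration: it assumes that DQ % means the percentage of key-field values failing a quality rule and that the emergency % is higher than the threshold %. The function names, action strings, and sample data are hypothetical.

```python
from typing import Callable, Iterable


def dq_percent(values: Iterable, is_bad: Callable[[object], bool]) -> float:
    """Return the percentage of values that fail the quality rule."""
    items = list(values)
    if not items:
        return 0.0
    bad = sum(1 for v in items if is_bad(v))
    return 100.0 * bad / len(items)


def dq_action(pct: float, threshold_pct: float, emergency_pct: float) -> str:
    """Map a DQ percentage to one of the actions named on the slide."""
    if pct > emergency_pct:
        return "email_and_abort_etl"
    if pct > threshold_pct:
        return "email"
    return "continue"


# Hypothetical key field with one missing value out of four.
customer_ids = ["C1", None, "C3", "C4"]
pct = dq_percent(customer_ids, lambda v: v is None)
print(pct, dq_action(pct, threshold_pct=10.0, emergency_pct=50.0))
```

Checking the emergency limit first means a load that breaches both limits is aborted rather than only reported.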


‘Incremental Load’ in Data Warehouse testing

Complete Initial Load BA Testing (Cycle 1, Cycle 2). R: TESTER

Source Data: data loaded in the table (Initial Load), NON Timeline Tables. R: SE

Mocked data for Incremental Load Testing in source file. R: TESTER

The diagram shows three mocked-data scenarios (labelled Condition 1, 2, and 3):

New row insert scenario (natural key does not match): change any natural key attribute in existing data.
OUTPUT: New record should be inserted in the table with the changed natural key combination. Load Date, Update Date = Job run date.

Existing row update scenario (natural key matches): change any non-natural-key attribute in existing data.
OUTPUT: Existing record should get updated with the new changes. Update Date = Job run date.

Do-not-delete scenario (natural key and all other non-natural-key attributes match): create a new record identical to the existing record.
OUTPUT: Keep record as it is. No change in Load Date and Update Date.

Compare Before and After results

Defects Found: raise an MQC ticket and communicate the issue on the open line conference to the Version DBA

BA Testing Sign-off
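The three incremental-load scenarios on this slide (insert, update, keep as is) can be turned into a prediction that a tester compares against the table after the job runs. The sketch below is a minimal reference model and does not represent the ETL itself. The column names load_dt and update_dt, the function name, and the sample rows are assumptions made for illustration.

```python
from datetime import date


def predict_incremental(target, incoming, key_fields, job_run_date):
    """Return the predicted table state after an incremental load.

    target maps a natural-key tuple to a row dict that carries
    load_dt and update_dt; incoming is a list of source row dicts.
    """
    # Copy the rows so the "before" snapshot stays untouched.
    predicted = {key: dict(row) for key, row in target.items()}
    for src in incoming:
        key = tuple(src[f] for f in key_fields)
        current = predicted.get(key)
        if current is None:
            # Natural key not matched: insert, both dates = job run date.
            predicted[key] = {**src, "load_dt": job_run_date,
                              "update_dt": job_run_date}
        elif any(current.get(col) != val for col, val in src.items()):
            # Natural key matched, another attribute changed: update.
            current.update(src)
            current["update_dt"] = job_run_date
        # Otherwise the row is identical: keep it, dates unchanged.
    return predicted


initial = date(2008, 1, 1)
run = date(2008, 2, 1)
before = {
    ("K1",): {"id": "K1", "city": "Pune", "load_dt": initial, "update_dt": initial},
    ("K2",): {"id": "K2", "city": "Delhi", "load_dt": initial, "update_dt": initial},
}
mocked = [
    {"id": "K1", "city": "Pune"},    # identical row
    {"id": "K2", "city": "Mumbai"},  # changed non-key attribute
    {"id": "K3", "city": "Goa"},     # new natural key
]
after = predict_incremental(before, mocked, ["id"], run)
```

Comparing `after` with the rows actually found in the table gives the before and after check that the slide ends with.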


‘Version’ Data Warehouse Testing

Requirement in the form of an Excel sheet (Version change sheet)

Validate the requirements against the Business Specification documents
Changes/Corrections Required
No Changes: Develop Test Plan / Approach Document

Addition of columns to objects (this involves adds to physical tables, and drops and recreates for view objects)

Deletion / expansion of columns or datatype changes to columns (this involves renames to physical objects, and drops and recreates for view objects)

Physical Tables: run DESCRIBE on the physical tables to check added columns and renames

Views: select 1-5 rows (pre- and post-version predictions) to verify that columns were added with the appropriate default value and the appropriate changes

Compare Before and After results

Defects Found: raise an MQC ticket and communicate the issue on the open line conference to the Version DBA

No Defects: BA Testing Sign-off
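The before and after comparison for a version change can be sketched as a schema diff. The example assumes that the column list of each object has already been captured (for instance from a DESCRIBE run) as a mapping of column name to datatype. The function name and the sample columns are hypothetical.

```python
def diff_schema(before, after):
    """Compare two {column_name: datatype} mappings for one object."""
    added = {c: t for c, t in after.items() if c not in before}
    dropped = {c: t for c, t in before.items() if c not in after}
    changed = {c: (before[c], after[c])
               for c in before if c in after and before[c] != after[c]}
    return {"added": added, "dropped": dropped, "changed": changed}


# Hypothetical snapshots taken before and after the version is applied.
pre_version = {"cust_id": "INTEGER", "name": "VARCHAR(50)", "fax": "VARCHAR(20)"}
post_version = {"cust_id": "INTEGER", "name": "VARCHAR(100)", "email": "VARCHAR(80)"}
result = diff_schema(pre_version, post_version)
print(result)
```

Each entry in the diff can then be checked against the version change sheet; anything the sheet does not list is a candidate defect.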


‘Production Checkout’ Data Warehouse Testing

Test that only valid values get populated into the table (as per the BRDs)

Test whether default values cross the threshold % (for given attributes)

Validate that data is not corrupted (environmental factors)

Verify results against predictions
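The first two production checkout checks, valid values and default-value percentage, might look like the sketch below. The column name, the allowed values, and the default marker "UNK" are all assumptions for illustration; real values would come from the BRDs.

```python
def invalid_values(rows, column, allowed):
    """Return the set of values in `column` that are outside the allowed set."""
    return {row[column] for row in rows} - set(allowed)


def default_rate(rows, column, default):
    """Return the percentage of rows where `column` holds the default value."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row[column] == default)
    return 100.0 * hits / len(rows)


# Hypothetical sample of loaded rows.
loaded = [{"status": "A"}, {"status": "I"}, {"status": "UNK"}, {"status": "X"}]
print(invalid_values(loaded, "status", {"A", "I", "UNK"}))
print(default_rate(loaded, "status", "UNK"))
```

A non-empty result from the first check, or a default rate above the agreed threshold %, would be raised as a checkout finding.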


Generic Challenges

Huge data volume and complexity.
Data volume and complexity impact performance and productivity.
The scope of testing is broad, as it also involves regression of data.
Different Change Requests have different testing flows and associated rules.
The data used for predictions is different and has to be mapped to the actual data, based on different business rules.
A given test scenario is rarely repeated.
Currently, all types of testing, including regression, are done by hand.


Comparison of Flow - Manual Vs Automated

All modules involve regression.

Automation tool flow: Regression data captured in automation tool; Code implementation; Execute automation script; Test results are published and saved in the required format.

Manual testing flow: Regression data captured in Excel format; Create the query to concatenate key fields and extract data; Captured data saved in Excel format; Data sorting/formatting; Data comparison.
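The comparison that this slide describes, matching rows on concatenated key fields, can be automated with a short script. The sketch below is one possible approach and does not reproduce the tool used by the authors. The "|" separator, the field names, and the sample rows are hypothetical.

```python
def keyed(rows, key_fields, sep="|"):
    """Index rows by the concatenation of their key fields."""
    return {sep.join(str(row[f]) for f in key_fields): row for row in rows}


def compare(predicted, actual, key_fields):
    """Report keys that are missing, unexpected, or mismatched in actual."""
    p = keyed(predicted, key_fields)
    a = keyed(actual, key_fields)
    return {
        "missing": sorted(p.keys() - a.keys()),
        "unexpected": sorted(a.keys() - p.keys()),
        "mismatched": sorted(k for k in p.keys() & a.keys() if p[k] != a[k]),
    }


# Hypothetical predictions and actual rows extracted from the warehouse.
predictions = [
    {"id": 1, "dt": "2008-01-01", "amt": 10},
    {"id": 2, "dt": "2008-01-01", "amt": 20},
    {"id": 3, "dt": "2008-01-01", "amt": 30},
]
extracted = [
    {"id": 1, "dt": "2008-01-01", "amt": 10},
    {"id": 2, "dt": "2008-01-01", "amt": 25},
    {"id": 4, "dt": "2008-01-01", "amt": 40},
]
report = compare(predictions, extracted, ["id", "dt"])
print(report)
```

Sorting the keys keeps the report stable between runs, which replaces the manual sorting and formatting step in Excel.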


Any Questions?
