PowerCenter Data Validation Option (DVO) Overview

© 2007 Informatica. Company Confidential. Forward-looking information is based upon multiple assumptions and uncertainties and does not necessarily represent the company’s outlook.

  • 1

    PowerCenter Data Validation Option (DVO)

    Overview

    Your Name Here

    Date

  • 2

    Data Validation Option: Deliver Trusted Data with Lower Business Risk

  • 3

    Data Validation Option Benefits

    Increased likelihood of success, lower project risk

    Significant cost savings, faster time to market

    50% source-to-target testing

    80% regression testing

    90% upgrade testing

    Ability to test all data, not just a small sample

    Ability to test in heterogeneous environments

    No need to know SQL

    Complete Audit Trail and comprehensive reporting of all testing activities

    No need to acquire additional server technology: leverage PowerCenter's scalability, platform support, and data access

  • 4

    Common Data Validation Use Cases

    Source to Target Testing

    ETL Validation

    Data Masking

    Production to Development Testing

    ETL version upgrade

    ETL Migration

    Application retirement

    Database upgrades

    Production Reconciliation and auditing

    Data Warehousing

    MDM

    Operational integration

    Audit/Balance/Control solutions

    Data is Transformed

    Data is Identical

    Data moves to Production

  • 5

    Customer Pain Point Stories

    Large North American insurance company

    They currently use manual processes: SQL queries, writing code.

    They are currently only able to validate 20% of their data.

    Large North American investment company

    They move large quantities of data into their production systems nightly.

    They are currently only able to reconcile a small fraction of the data that is going live.

  • 6

    How Testing Fits into the ETL Lifecycle

    [Diagram labels: Initial Development; PowerCenter Upgrade; Development; Investigation; Testing; Estimate: 30%; Estimate: 70%; DVO]

  • 7

    Current Data Validation Approach

    Currently, data validation is done manually by writing SQL scripts

    Data validation should take 30% of project time for a data integration project

    Because of the complexity involved in testing and the pressure to meet production deadlines, most customers admit they do not commit the time for data validation, resulting in data quality issues and higher project risk

    PowerCenter upgrades can take up to 6 months

    With DVO, data validation testing can be reduced to ONE day while providing complete test coverage

  • 8

    Problems with Manual Validation

    Time-consuming and expensive

    Time is spent writing queries and waiting for them to run

    Error-prone

    Manual inspection errors

    Coding errors

    Too slow for complete test coverage

    Time/cost pressure leads to a "try it here and there" approach

    The tester runs out of time/money before testing is done

    The usual problems associated with writing custom code

    No audit trail

    No reuse

    No methodology
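To make the manual approach concrete, the sketch below shows the kind of hand-written SQL check this slide describes, using Python's built-in sqlite3 module. The table names (src_orders, tgt_orders) and the amount column are hypothetical, and nothing here comes from DVO itself.

```python
import sqlite3


def manual_checks(conn, source, target, amount_col):
    """Compare row counts and a column total between two tables.

    Table and column names are interpolated directly, so this is only
    suitable for trusted, hard-coded names (as in a hand-written script).
    """
    queries = {
        "row_count": "SELECT COUNT(*) FROM {table}",
        "amount_sum": "SELECT COALESCE(SUM({col}), 0) FROM {table}",
    }
    results = {}
    for name, template in queries.items():
        src = conn.execute(template.format(table=source, col=amount_col)).fetchone()[0]
        tgt = conn.execute(template.format(table=target, col=amount_col)).fetchone()[0]
        results[name] = (src, tgt, src == tgt)
    return results


def build_demo():
    """Create hypothetical source and target tables; the target is missing one row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src_orders (id INTEGER, amount INTEGER)")
    conn.execute("CREATE TABLE tgt_orders (id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO src_orders VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])
    conn.executemany("INSERT INTO tgt_orders VALUES (?, ?)", [(1, 10), (2, 20)])
    return conn
```

Each new table or column needs another query like these, written and reviewed by hand, which is where the time, reuse, and audit-trail problems listed above come from.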

  • 9

    DVO Product Overview

    Tool built on top of Informatica PowerCenter

    Users define data rules using an easy-to-understand GUI

    Data is processed and evaluated using PowerCenter

    Results are displayed in the GUI and stored for later retrieval and reporting

  • 10

    Data Validation Option Architecture

    [Diagram: two groups of data systems, each listing Databases, Data Formats, Real time, Legacy, Applications; PowerCenter (PC API); DVO; Results DB]

    1. DVO is used to define Test Rules

    2. PowerCenter mappings are generated and the session is executed

    3. DVO is used to display results

    4. All results are stored in the Results DB

    5. Comprehensive Reporting on all Tests and Results
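The flow on this slide (define test rules, run them, store every result, report) can be sketched in a few lines. This is an illustration only: DVO generates and runs PowerCenter mappings, whereas the sketch runs plain SQL through sqlite3, and the Rule class, the test_results table, and the function names are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical test rule: two scalar queries whose results must match."""
    name: str
    source_sql: str
    target_sql: str


def run_rules(data_conn, results_conn, rules):
    """Execute each rule and append its outcome to a results table."""
    results_conn.execute(
        "CREATE TABLE IF NOT EXISTS test_results ("
        "run_at TEXT DEFAULT CURRENT_TIMESTAMP, rule TEXT, "
        "source_value TEXT, target_value TEXT, passed INTEGER)"
    )
    for rule in rules:
        src = data_conn.execute(rule.source_sql).fetchone()[0]
        tgt = data_conn.execute(rule.target_sql).fetchone()[0]
        results_conn.execute(
            "INSERT INTO test_results (rule, source_value, target_value, passed) "
            "VALUES (?, ?, ?, ?)",
            (rule.name, str(src), str(tgt), int(src == tgt)),
        )
    results_conn.commit()


def report(results_conn):
    """Return one PASS/FAIL line per stored result, in insertion order."""
    rows = results_conn.execute(
        "SELECT rule, passed FROM test_results ORDER BY rowid"
    ).fetchall()
    return [f"{rule}: {'PASS' if passed else 'FAIL'}" for rule, passed in rows]
```

Keeping results in their own store, separate from the data being tested, is what makes later retrieval and an audit trail possible in this sketch.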

  • 11

    We used DVO to compare 14 tables and about 30 million rows in less than 5 hours.

    The largest of the tables was 94 columns.

    When I asked our QA people how long it would take them to run the scripts and test this amount of data, they mentioned months.

    Tom Kato

    Data Integration Consultant

  • 12

    Large North American Financial Institution

    DVO for Production Table Balancing

    Financial Institution. They have a nightly load of 350 tables into their enterprise data warehouse.

    KEY BUSINESS IMPERATIVE

    They spend hundreds of millions purchasing troubled debt in the USA

    The data and risk calculations on those assets must be correct.

    Bad data could cost them millions and put them out of business

    IT INITIATIVE

    Production Data Validation

    DATA INTEGRATION PROJECT

    Data Warehouse

    THE CHALLENGE

    Business users were complaining about missing data in the systems.

    Data errors can lead to very costly bad business decisions.

    They were doing manual testing via developer-written mappings and PL/SQL

    Other products available today could not meet their requirements.

    INFORMATICA ADVANTAGE

    With DVO they were able to provide complete test coverage quickly.

    They are able to do both batch and incremental data validation.

    With DVO they have a complete audit trail.

    RESULTS/BENEFITS

    DVO found where data was missing. They found thousands of missing records due to bad coding and improperly rerun failed jobs.

    Reloaded all missing data in two weeks.

    They are looking to implement on-going incremental data validation for all new data loaded into tables.

  • 13

    North American Retailer

    DVO for Informatica PowerCenter Upgrade

    KEY BUSINESS IMPERATIVE

    North American Retailer. Must do more with less and keep Data Integration productivity high

    IT INITIATIVE

    PowerCenter Upgrade

    Needed to upgrade PowerCenter quickly, with minimal DI disruption, and with low cost

    DATA INTEGRATION PROJECT

    Data Warehouse

    THE CHALLENGE

    Upgrade from PowerCenter 8.1 to 8.5 required 1 full day of data testing (30 million rows) across old and new systems

    They projected 6 to 8 weeks for complete manual regression testing.

    INFORMATICA ADVANTAGE

    DVO install and setup in one day

    Testing reduced from 6 to 8 weeks to less than 1 day!

    Upgrade testing completed in 2 days with full coverage of data

    RESULTS/BENEFITS

    Reduced data validation test time by 85% while providing complete test coverage.

  • 14

    Questions?

    Data Validation Testing with DVO Lowers Your Overall Data Integration Business Risk

  • 15

    DVO Resources

    DVO External Web Page

    www.informatica.com/more/DVO

    Contacts

    Val Rayzman, VP DVO

    Saeed Khan, Product Management

    Roger Nolan, Product Marketing

  • 16

    Data Validation Connectivity

    Natively, DVO supports (via PC mappings):

    Flat files

    Relational DBs

    ODBC

    Via ODBC:

    SAP: iWay SAP ODBC Driver

    XML: Data Direct ODBC Driver

    PWX: Mainframe sources via PWX ODBC connectivity

    Other relational types & appliances: Netezza, Greenplum, Postgres, MySQL, etc.

    DVO does not support:

    Non-relational files

    Hierarchical structures or recurrences, unless they are flattened in some appropriate way
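As an illustration of what "flattened in some appropriate way" could mean, the sketch below turns one hierarchical record with a repeating group into flat, relational-style rows. The flatten function and the field names are hypothetical; this is one possible convention, not something DVO prescribes.

```python
def flatten(record, child_key):
    """Turn one parent record with a repeating child group into flat rows.

    Each child produces one row that repeats the parent fields; child
    fields are prefixed with the group name to avoid name collisions.
    A parent with no children still yields a single row.
    """
    parent = {k: v for k, v in record.items() if k != child_key}
    children = record.get(child_key) or [{}]
    return [
        {**parent, **{f"{child_key}_{k}": v for k, v in child.items()}}
        for child in children
    ]
```

For example, an order with two line items becomes two rows that share the same order fields, which can then be loaded into a table and compared like any other relational source.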

  • 17

    DVO Customer Use Cases

  • 18

    Specific DVO Use Case Scenarios

    Mix These Slides Into Your Deck as Appropriate for Your Customer

  • 19

    DVO: Production Validation (Reconciliation)

    Test data before moving to Production Systems

    [Diagram: Oracle DB, Microsoft SQL, IBM DB2, Mainframe Application; Informatica PowerCenter; DVO; Production Application]

    Process:

    Validate that data values are the data values expected as you move them into production

    This is typically a workflow run at night. The load only executes after the tests are passed successfully

    Validation testing catches errors in loads, transforms, and operational issues.

    Benefits

    Ensures the integrity of all data moved into production systems.
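The rule that the load only executes after the tests pass can be sketched as a simple gate. The gated_load function and its return shape are hypothetical, and in practice this logic would live in the nightly workflow rather than in a Python script.

```python
def gated_load(tests, load):
    """Run every validation test; call load() only if all of them pass.

    tests maps a test name to a zero-argument callable returning True/False.
    All tests are run (no short-circuit) so every failure is reported.
    """
    failures = [name for name, check in tests.items() if not check()]
    if failures:
        return {"loaded": False, "failed_tests": failures}
    load()
    return {"loaded": True, "failed_tests": []}
```

Running all tests before deciding, instead of stopping at the first failure, gives operators the full list of problems from a single nightly run.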

  • 20

    DVO: Source-to-Target Testing

    Data is Transformed During Move

    [Diagram: Source Tables; Informatica PowerCenter; Target Tables; DVO]

    Process:

    Validate that the data is after the move and transformation

    Validate that the data values are the values you expected

    Benefits

    Validate 100% of data

    GUI-based tool with simple-to-use operators to create tests

    Testing is done in days, not weeks, etc.
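When data is transformed during the move, a test has to state the expected transformation and compare its result with what actually landed in the target. A minimal sketch under that assumption follows; the function name, the row format (lists of dicts), and the example rule are hypothetical and do not reflect how DVO expresses tests.

```python
def source_to_target_mismatches(source_rows, target_rows, key, expected):
    """Check every source row against the target after a transformation.

    expected(src_row) returns the target column values that the
    transformation should have produced for that source row.
    """
    target_by_key = {row[key]: row for row in target_rows}
    mismatches = []
    for src in source_rows:
        want = expected(src)
        got = target_by_key.get(src[key])
        if got is None:
            mismatches.append((src[key], "missing in target"))
        elif any(got.get(col) != val for col, val in want.items()):
            mismatches.append((src[key], "value mismatch"))
    return mismatches
```

A rule such as "full_name is the upper-cased first and last name" would be passed in as the expected function, so the same comparison loop can be reused for different transformations.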

  • 21

    DVO: PowerCenter Upgrade

    Ensure output of both versions is identical

    [Diagram: Source Tables; PowerCenter 8.x; PowerCenter 8 Output; PowerCenter 9.x; PowerCenter 9 Output; DVO]

    Process:

    Validate that the data produced by both versions of PowerCenter is identical.

    Test every row, and every column of every table

    Benefits

    Validates 100% of data

    Testing is done in days, not weeks, etc.

    Much of the testing can be automated
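Testing every row and every column of two outputs amounts to a keyed comparison. The sketch below assumes both outputs fit in memory as lists of dicts and share a unique key column; diff_outputs and the report layout are hypothetical names for illustration, not DVO functionality.

```python
def diff_outputs(old_rows, new_rows, key):
    """Compare two outputs row by row and column by column.

    Returns keys present on only one side, plus the names of the
    columns that differ for each key present on both sides.
    """
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    report = {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "changed": {},
    }
    for k in sorted(old.keys() & new.keys()):
        columns = sorted(
            col
            for col in old[k].keys() | new[k].keys()
            if old[k].get(col) != new[k].get(col)
        )
        if columns:
            report["changed"][k] = columns
    return report
```

An empty report (no extra keys and no changed columns) is what "output of both versions is identical" would look like under these assumptions.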

  • 22

    DVO: ETL Migration

    Ensure successful migration to PowerCenter

    [Diagram: Source Tables; Hand coding or competitive product; PowerCenter 9.x; PowerCenter 8 Output; PowerCenter 9 Output; DVO]

    Process:

    Validate that the new mappings produce the same output as the old ETL

    Test every row, and every column of every table

    Benefits

    Validate 100% of data

    Testing is done in days, not weeks, etc.

  • 23

    DVO: Application-to-Application Migration

    Verify data move to a new application

    [Diagram: Mainframe Application; Informatica PowerCenter; Enterprise Application; DVO]

    Process:

    Validate that the data is unchanged after move to new enterprise application

    Test every row, and every column of every table

    Benefits

    Validate 100% of data

    Testing is done in days, not weeks, etc.

  • 24

    DVO: Validation for MDM

    Validate data before moving to MDM

    [Diagram: Oracle DB, Microsoft SQL, IBM DB2, Mainframe Application; Informatica PowerCenter; MDM Landing Tables; Informatica MDM; DVO]

    Process:

    Validate that data values are the data values expected as you move them into MDM

    Benefits

    Validate 100% of data

    Testing is done in hours, not weeks, etc.

    MDM Golden Record is based on valid data

  • 25

    DVO: Regulatory Compliance

    Validate that data behind the reports is correct

    [Diagram: Data Sources; Informatica PowerCenter; Data Warehouse; Business Reports; DVO; Auditor]

    Process:

    Validate that the data was moved to the data warehouse correctly

    Validate that the data values in the warehouse are as expected

    Review the audit trail of data validation tests run and their results

    Assess the completeness of the data validation test coverage

    Benefits

    Validate 100% of data

    Testing is done in hours, not weeks, etc.

    Testing that can be shared with auditors to prove compliance

    Complete audit trail of all data validation tests run and their results

  • 27

    PowerCenter Upgrade

    DVO for PowerCenter Upgrades

    [Diagram: Production Environment (8.x); Development Environment (9.x); Test Environment (9.x); ILM; DVO]