Lucius McInnis, Systems Engineer – Client Services Group
Kam Wong, Solutions Architect – iWay Software

March 22, 2012

Getting Data Ready for WebFOCUS

1

Data Quality/Business Intelligence Lexicon

2

GIGI: 1958 Romantic Musical (Image: imdb.com)

GOGO: 1960s Dance Craze (Image: target.com)

GIGO: Garbage-In-Garbage-Out

Get Rid Of The Garbage…

3

• Access

• Cleanse

• Standardize

• Monitor

• Manage

• Accurate data promotes accurate information and decisions…

4

• ERRORS

• CONFUSION

• DUPLICATION

When Business Data Is Not Managed

AGENDA

5

Fraud, Waste, and Abuse

Operations and Financial Management Information

Risk, Compliance, and Governance

Revenue Generation

Quality of Care/Service.

• The Path from Data to Information

• Access to Data

• Data Quality

• Master Data Management/Data Synchronization

• Demonstration

Path from Data to Information

6

Infrastructure

• Allow for access to data

• Real-Time and Batch Information Movement

• Reusability

Data Quality

• Allow for Real-Time Data Quality

• Correct Data Quality issues before they propagate

Master Data Management

• Centralize the management of information

• Control the information throughout the organization

Path from Data to Information

7

Infrastructure

• Allow for access to data

• Real-Time and Batch Information Movement

• Reusability

#1

Integration Approach – Start with an Integrated Infrastructure

8

Pre-packaged Integration Components

9

SFA/CRM

Amdocs/Clarify BMC/Remedy MSDynamics Oracle/Siebel Salesforce.com SAP

Data Warehouse

DB2 ETL Oracle/Essbase MS SSAS/OLAP Netezza SAP BW Teradata

B2B

Internet EDI Legacy EDI MFT Online B2B XML

ERP/Financials

Ariba I2 JD Edwards Lawson Manugistics Microsoft Oracle SAP

Industry

ACORD CIDX HL7 RNIF SWIFT 1Sync

Legacy Systems

CICS IMS VSAM .NET Java TUXEDO MUMPS

Enterprise Data Integration Scenario

10

Data Sources

Data Integration

Data Quality

Reports

Dashboards

Path from Data to Business Intelligence

11

Data Quality

• Allow for Real-Time Data Quality

• Correct Data Quality issues before they propagate

#2

The Business Value of Data Quality

12

• Improves customer-facing processes: Promotes accurate client address and household information

• Enables advanced analysis: Facilitates data mining, market prediction, fraud detection, and estimation of future client value

• Credit and behavioral scoring: Helps financial institutions improve risk management (Basel Capital Accord III, 2010)

• Assists healthcare organizations: Supports development of an Enterprise Master Patient Index (EMPI) by leveraging connectivity to legacy systems and databases

Data Quality Center – Profiling

13

Profiling – Technical (Pre-built)
• Basic Analysis
  • Minimums
  • Maximums
  • Averages
  • Counts
  • Etc.
• Patterns / Masking
• Extremes
• Quantities
• Frequency Analysis
• Foreign Key Analysis

Profiling – All
• Charting
• Grouping / Aggregate
• Drilldown / Interactive Displays
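
The basic profiling measures listed above (counts, extremes, averages, pattern masks, frequency analysis) can be illustrated with a minimal Python sketch. This is not the Data Quality Center's implementation; the function names `mask` and `profile_column` and the sample ZIP code values are assumptions made up for illustration.

```python
from collections import Counter
from statistics import mean


def mask(value: str) -> str:
    """Replace letters with 'A' and digits with '9', keeping other characters."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values: list) -> dict:
    """Basic profile of one column: counts, extremes, average length, patterns."""
    present = [v for v in values if v not in (None, "")]
    lengths = [len(str(v)) for v in present]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "avg_length": mean(lengths) if lengths else None,
        "patterns": Counter(mask(str(v)) for v in present),
        "frequencies": Counter(present),
    }


# Hypothetical sample: a ZIP code column with one missing and one malformed value.
zip_codes = ["10001", "10001", "1000A", "", "94105-1234"]
report = profile_column(zip_codes)
print(report["missing"], report["distinct"])
print(report["patterns"].most_common())
```

A pattern such as `9999A` appearing once among many `99999` values is the kind of inconsistency a profiling pass is meant to surface before any cleansing rule is written.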

Data Quality – Cleansing

14

• Parsing
  • data parsed into components (pattern based)
• Standardization
  • transformation into standard format (Jim Smith -> James Smith)
  • standard and nonstandard abbreviations (Str. -> Street)
  • language-specific replacements
• Data quality validation
  • validation against rules
  • validation against reference tables
• Large number of domain-oriented algorithms
  • Address
  • Party
  • Vehicle
  • Name
  • Identification number
  • Credit card number
  • Bank account number
• Extension by custom validation steps
  • using complex functions and rules, including
    • Levenshtein distance
    • Soundex
    • internal (Java-based) functions
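
The parse, standardize, and validate steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lookup tables (`NICKNAMES`, `ABBREVIATIONS`, `VALID_STATES`), the two-word name pattern, and the ZIP code rule are hypothetical stand-ins for the product's reference data and rules.

```python
import re

# Hypothetical reference tables; a real deployment would load managed reference data.
NICKNAMES = {"jim": "James", "bill": "William", "bob": "Robert"}
ABBREVIATIONS = {"str.": "Street", "st.": "Street", "ave.": "Avenue"}
VALID_STATES = {"NY", "CA", "TX"}


def parse_name(full_name: str) -> dict:
    """Pattern-based parsing of a 'First Last' string into components."""
    match = re.fullmatch(r"\s*(\S+)\s+(\S+)\s*", full_name)
    if match is None:
        raise ValueError(f"unrecognized name pattern: {full_name!r}")
    return {"first": match.group(1), "last": match.group(2)}


def standardize_name(full_name: str) -> str:
    """Replace a known nickname with its standard form (Jim Smith -> James Smith)."""
    parts = parse_name(full_name)
    first = NICKNAMES.get(parts["first"].lower(), parts["first"])
    return f"{first} {parts['last']}"


def standardize_street(street: str) -> str:
    """Expand known abbreviations word by word (Str. -> Street)."""
    return " ".join(ABBREVIATIONS.get(w.lower(), w) for w in street.split())


def validate_record(record: dict) -> list:
    """Return a list of issues: one reference-table check and one rule check."""
    issues = []
    if record.get("state") not in VALID_STATES:
        issues.append("state not in reference table")
    if not re.fullmatch(r"\d{5}(-\d{4})?", record.get("zip") or ""):
        issues.append("zip does not match 99999 or 99999-9999")
    return issues


print(standardize_name("Jim Smith"))        # James Smith
print(standardize_street("12 Main Str."))   # 12 Main Street
print(validate_record({"state": "ZZ", "zip": "1000A"}))
```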

Data Quality – Match & Merge

15

• Unification
  • identification of the candidate groups
    • company
    • address
    • person
    • product
    • …etc.
• Deduplication
  • best representation of the identified subject
  • golden record creation
• Identification
  • new data entries: identify the subject (person, address, etc.) to which the new record is connected (matched)
• Fuzzy logic and scoring
  • Same name + same address
  • Same name + similar address
  • Similar name + same address
  • Similar name + similar address
• Complex business rules
  • using sophisticated algorithms and functions, including
    • Levenshtein distance
    • Hamming distance
    • Edit distance
    • Data quality score values
    • Date stamps of last modification
    • Source system originating data
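
The same/similar scoring combinations above can be sketched with a plain Levenshtein distance. This is a simplified illustration, not the product's matching engine: the 0.8 similarity threshold, the normalization by string length, and the function names are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize edit distance to 0..1; 1.0 means identical after case folding."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def classify(score: float, threshold: float) -> str:
    if score == 1.0:
        return "same"
    return "similar" if score >= threshold else "different"


def match_level(rec_a: dict, rec_b: dict, threshold: float = 0.8) -> str:
    """Label a record pair with one of the name/address combinations."""
    name = classify(similarity(rec_a["name"], rec_b["name"]), threshold)
    address = classify(similarity(rec_a["address"], rec_b["address"]), threshold)
    if "different" in (name, address):
        return "no match"
    return f"{name} name + {address} address"


a = {"name": "James Smith", "address": "12 Main Street"}
b = {"name": "James Smyth", "address": "12 Main Street"}
print(match_level(a, b))  # similar name + same address
```

In practice each combination would carry its own score or weight, so that "same name + same address" pairs merge automatically while weaker combinations are routed for review.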

16

Data Quality: Issue Management

Data Quality Issue Management

17

Issue Tracker Portal – Workflow Management

18

Issue Tracker Portal – Issue Resolution (1)

19

Issue Tracker Portal – Issue Resolution (2)

20

Path from Data to Business Intelligence

21

Master Data Management

• Centralize the management of information

• Control the information throughout the organization

#3

Moving Towards MDM from Data Quality

22

1. Matching: Identifying and linking related entries within or across sets of data.

2. Merging: Creation of the golden data based on one or several reference sources and rules.

3. Propagating: Update other systems with Golden Data if required.

4. Monitoring: Deployment of controls to ensure ongoing conformance of data to business rules that define data quality for the organization.
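
Steps 2 and 3 above (merging and propagating) can be sketched as follows, assuming the records have already been matched. The survivorship rule used here (most recently modified non-empty value wins, ties broken by source priority) is an assumption chosen for illustration, as are the `SOURCE_PRIORITY` table and the record layout.

```python
from datetime import date

# Hypothetical source ranking: a lower number wins ties.
SOURCE_PRIORITY = {"CRM": 0, "ERP": 1, "WEB": 2}


def build_golden_record(matched: list) -> dict:
    """Merge matched records field by field into one golden record."""
    ranked = sorted(
        matched,
        key=lambda r: (-r["modified"].toordinal(),
                       SOURCE_PRIORITY.get(r["source"], 99)),
    )
    golden = {}
    fields = {f for r in matched for f in r["data"]}
    for field in sorted(fields):
        for record in ranked:
            value = record["data"].get(field)
            if value:  # skip empty or missing values
                golden[field] = value
                break
    return golden


def propagate(golden: dict, matched: list) -> list:
    """List the (source, field, new value) updates needed to align each source."""
    updates = []
    for record in matched:
        for field, value in golden.items():
            if record["data"].get(field) != value:
                updates.append((record["source"], field, value))
    return updates


matched = [
    {"source": "CRM", "modified": date(2012, 1, 10),
     "data": {"name": "J. Smith", "phone": "555-0100"}},
    {"source": "ERP", "modified": date(2012, 3, 1),
     "data": {"name": "James Smith", "phone": ""}},
]
golden = build_golden_record(matched)
print(golden)
print(propagate(golden, matched))
```

The list returned by `propagate` is what a closed-loop integration would send back to the source systems, and only when the organization's rules call for updating them.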

MDM Architectures

23

Consolidated

• Master is Single Version of Truth
• Data Quality at Master
• Updates occur at Sources
• Updates propagated to Master

[Diagram: a Master connected to four Sources]

Registry Style

• Multiple Versions of Truth
• Data Quality is Ongoing
• Updates occur at Sources
• Keys and Metadata in Registry
• Updates propagated to other Sources

[Diagram: a Master connected to four Sources]

• Other Styles Supported
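
The registry idea (only keys and metadata are held centrally, while the data stays in the sources) can be sketched as a cross-reference table. The class and method names below are hypothetical and do not reflect any iWay API; this is a minimal illustration of the concept.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    master_id: str
    # Maps source system name -> that system's local key for the same entity.
    source_keys: dict = field(default_factory=dict)


class Registry:
    """Holds cross-reference keys only; attribute data remains in the sources."""

    def __init__(self):
        self._entries = {}
        self._by_source_key = {}

    def link(self, master_id: str, source: str, local_key: str) -> None:
        entry = self._entries.setdefault(master_id, RegistryEntry(master_id))
        entry.source_keys[source] = local_key
        self._by_source_key[(source, local_key)] = master_id

    def resolve(self, source: str, local_key: str):
        """Return the master id for a source record, or None if unlinked."""
        return self._by_source_key.get((source, local_key))

    def targets_for_update(self, source: str, local_key: str) -> dict:
        """Other sources that should receive an update made at `source`."""
        master_id = self.resolve(source, local_key)
        if master_id is None:
            return {}
        keys = self._entries[master_id].source_keys
        return {s: k for s, k in keys.items() if s != source}


registry = Registry()
registry.link("M1", "CRM", "c-17")
registry.link("M1", "ERP", "e-902")
print(registry.resolve("ERP", "e-902"))            # M1
print(registry.targets_for_update("CRM", "c-17"))  # {'ERP': 'e-902'}
```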

Project Successes – Pathway to Maturity

24

1. Start with Data Profiling
  • Understand the data you have
  • Identify inconsistencies in the data
  • Disseminate the information about the data quality

2. Continue with Data Quality
  • Validate, standardize and cleanse for purpose
  • Automate the process
  • De-duplication (Match & Merge)

3. End with Master Data
  • Synchronize with closed loop feedback integration
  • Provide a single view for all stakeholders

4. Implement Data Governance – Issue Tracking

Getting to MDM – “Golden Data”

25

Demonstration

26

Data Management Life-Cycle

Thank You! - Questions?

27

iWay Software Because Everything Should Work Together.

WebFOCUS Because Everyone Makes Decisions.
