Bob Hoffman Technical Account Manager
Eastern Area
Boston User Group Getting Data Ready for WebFOCUS
November 10, 2011
Cooking Food On the GRILL!
• Cleansed
• Marinated/Rubbed
• Well cooked
• Serve to family and friends
• Data access
• Cleanse
• Standardize
• Monitor
• Manage
Your Data Needs Attention Also!!
• REPORT
When Reporting Data Goes Unmanaged?
• ERRORS
• CONFUSION
• DUPLICATION
Agenda
• The Path from Data to BI
• Access to Data
• Data Quality
• Master Data Management/Data Synchronization
Demonstration
Intelligence
Knowledge
Information
Data
Business Intelligence
Data For Analysis
GAP
Standardization
Cleansing
Data profiling
The Path from Data to Business Intelligence
Path from Data to Business Intelligence
Infrastructure
•Allow for access to data
•Real-Time and Batch Information Movement
•Reusability
#1
Data
Quality
•Allow for Real-Time Data Quality
•Correct Data Quality issues before they propagate
Master
Data
Management
•Centralize the management of information
•Control the information throughout the organization
#3
#2
Path from Data to Business Intelligence
Infrastructure
•Allow for access to data
•Real-Time and Batch Information Movement
•Reusability
#1
Integration Approach – Start with an Integrated Infrastructure
Pre-packaged Integration Components
SFA/CRM
Amdocs/Clarify BMC/Remedy MSDynamics Oracle/Siebel Salesforce.com SAP
Data Warehouse
DB2 ETL Oracle/Essbase MS SSAS/OLAP Netezza SAP BW Teradata
B2B
Internet EDI Legacy EDI MFT Online B2B XML
ERP/Financials
Ariba I2 JD Edwards Lawson Manugistics Microsoft Oracle SAP
Industry
ACORD CIDX HL7 RNIF SWIFT 1Sync
Legacy Systems
CICS IMS VSAM .NET Java TUXEDO MUMPS
Enterprise Data Integration Scenario
Reports Dashboards
Data Integration Data Quality
…
Data Sources
Path from Data to Business Intelligence
Data
Quality
•Allow for Real-Time Data Quality
•Correct Data Quality issues before they propagate
#2
Data Quality Center – Profiling
Profiling – Technical (Pre-built)
• Basic Analysis: Minimums, Maximums, Averages, Counts, etc.
• Patterns / Masking
• Extremes
• Quantities
• Frequency Analysis
• Foreign Key Analysis
Profiling – All
• Charting
• Grouping / Aggregate
• Drilldown / Interactive Displays
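The basic analysis, pattern masking, and frequency analysis listed above can be sketched in a few lines of Python. This is only an illustration of the ideas, not how the Data Quality Center implements them; the function names, the masking alphabet (`9` for digits, `A` for letters), and the sample data are all assumptions.

```python
from collections import Counter
from statistics import mean


def mask(value: str) -> str:
    """Reduce a value to its pattern: digits become 9, letters become A."""
    return "".join(
        "9" if ch.isdigit() else "A" if ch.isalpha() else ch
        for ch in value
    )


def profile_numeric(values):
    """Basic analysis for a numeric column; None is treated as missing."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "avg": mean(present),
    }


def profile_text(values):
    """Frequency analysis of the raw values and of their masked patterns."""
    return {
        "frequencies": Counter(values),
        "patterns": Counter(mask(v) for v in values),
    }


# Hypothetical sample columns, for illustration only.
amounts = [10, 25, None, 40]
zips = ["02110", "02110", "2110", "0211O"]

numeric_profile = profile_numeric(amounts)
text_profile = profile_text(zips)
```

Rare patterns in the pattern counts (here a four-digit ZIP code and one ending in the letter O) are the kind of inconsistency profiling is meant to surface before cleansing begins.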
Copyright 2007, Information Builders. Slide 13
Data Quality – Cleansing
Parsing
• Data parsed into components (pattern based)
Standardization
• Transformation into standard format (Jim Smith -> James Smith)
• Standard and nonstandard abbreviations (Str. -> Street)
• Language-specific replacements
Data quality validation
• Validation against rules
• Validation against reference tables
Large number of domain-oriented algorithms
• Address, Party, Vehicle, Name, Identification number, Credit card number, Bank account number
Extension by custom validation steps using complex functions and rules, including
• Levenshtein distance
• Soundex
• Internal (Java-based) functions
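As a rough illustration of the cleansing steps above, the sketch below standardizes tokens through lookup tables and validates a value against a reference table using Levenshtein distance. The lookup tables, function names, and the distance threshold are hypothetical; a real deployment would use far larger, language-specific reference data.

```python
# Hypothetical lookup tables, for illustration only.
NICKNAMES = {"jim": "James"}
ABBREVIATIONS = {"str.": "Street", "st.": "Street"}


def standardize(text, table):
    """Replace each whitespace-separated token found in the table."""
    return " ".join(table.get(tok.lower(), tok) for tok in text.split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def closest_reference(value, reference, max_distance=1):
    """Return the nearest reference value within max_distance, else None."""
    scored = [(levenshtein(value.lower(), ref.lower()), ref) for ref in reference]
    distance, best = min(scored)
    return best if distance <= max_distance else None


clean_name = standardize("Jim Smith", NICKNAMES)
clean_street = standardize("12 Main Str.", ABBREVIATIONS)
city = closest_reference("Bostn", ["Boston", "Worcester"])
```

A value that is not within the threshold of any reference entry comes back as `None`, which is where a custom validation step or an issue-tracking workflow would take over.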
Data Quality – Match & Merge
Unification
• Identification of the candidate groups: company, address, person, product, etc.
Deduplication
• Best representation of the identified subject
• Golden record creation
Identification
• New data entries: identify the subject (person, address, etc.) to which the new record is connected (matched)
Fuzzy logic and scoring
• Same name + same address
• Same name + similar address
• Similar name + same address
• Similar name + similar address
Complex business rules using sophisticated algorithms and functions, including
• Levenshtein distance
• Hamming distance
• Edit distance
• Data quality score values
• Date stamps of last modification
• Source system originating data
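The same/similar grid for name and address can be sketched as a small scoring table. This is a simplified illustration: it uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure rather than the Levenshtein or Hamming distances named above, and the threshold and score values are assumptions chosen only to show the shape of the logic.

```python
from difflib import SequenceMatcher

SIMILAR_THRESHOLD = 0.8  # hypothetical cut-off for "similar"

# Hypothetical scores for each (name, address) comparison outcome.
SCORES = {
    ("same", "same"): 1.0,
    ("same", "similar"): 0.9,
    ("similar", "same"): 0.8,
    ("similar", "similar"): 0.6,
}


def compare(a: str, b: str) -> str:
    """Classify two strings as 'same', 'similar', or 'different'."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return "same"
    if SequenceMatcher(None, a, b).ratio() >= SIMILAR_THRESHOLD:
        return "similar"
    return "different"


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Score a candidate pair; any 'different' field yields 0.0."""
    key = (
        compare(rec_a["name"], rec_b["name"]),
        compare(rec_a["address"], rec_b["address"]),
    )
    return SCORES.get(key, 0.0)


# Hypothetical records, for illustration only.
a = {"name": "John Smith", "address": "12 Main Street"}
b = {"name": "Jon Smith", "address": "12 Main Street"}
c = {"name": "John Smith", "address": "12 Main St"}
d = {"name": "Mary Jones", "address": "12 Main Street"}
```

Pairs scoring above a chosen cut-off would form a candidate group for deduplication; borderline scores are natural candidates for manual review.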
Data Quality: Issue Management
Data Quality Issue Management
Issue Tracker Portal – Workflow Management
Issue Tracker Portal – Issue Resolution (1)
Issue Tracker Portal – Issue Resolution (2)
Path from Data to Business Intelligence
Master
Data
Management
•Centralize the management of information
•Control the information throughout the organization
#3
Moving Towards MDM from Data Quality Step
1. Matching: Identification, linking related entries within or across sets of data.
2. Merging: Creation of the golden data based on one or several reference sources and rules.
3. Propagating: Update other systems with Golden Data if required.
4. Monitoring: Deployment of controls to ensure ongoing conformance of data to business rules that define data quality for the organization.
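Assuming matching has already grouped the records that describe one subject, steps 2 through 4 might look like the sketch below. The survivorship rule used here (newest non-empty value wins), the record layout, and the function names are all hypothetical; real merge rules are usually richer and may also weigh source priority or data quality scores.

```python
from datetime import date


def merge_golden(records, fields):
    """Merging: per field, keep the newest non-empty value in the group."""
    newest_first = sorted(records, key=lambda r: r["modified"], reverse=True)
    return {
        f: next((r[f] for r in newest_first if r.get(f)), None)
        for f in fields
    }


def propagation_plan(records, golden, fields):
    """Propagating: list the (source, field, value) updates still needed."""
    return [
        (r["source"], f, golden[f])
        for r in records
        for f in fields
        if r.get(f) != golden[f]
    ]


def monitor(golden, rules):
    """Monitoring: return the names of the business rules that fail."""
    return [name for name, check in rules.items() if not check(golden)]


# Hypothetical matched group, for illustration only.
group = [
    {"source": "CRM", "modified": date(2011, 9, 15),
     "name": "James Smith", "phone": ""},
    {"source": "ERP", "modified": date(2011, 3, 1),
     "name": "Jim Smith", "phone": "617-555-0100"},
]
FIELDS = ["name", "phone"]

golden = merge_golden(group, FIELDS)
plan = propagation_plan(group, golden, FIELDS)
violations = monitor(golden, {"phone_present": lambda g: bool(g["phone"])})
```

In this example the golden record takes the name from the newer CRM entry and the phone number from the ERP entry, and the plan lists one update for each source.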
MDM Architectures
Consolidated
• Master is Single Version of Truth
• Data Quality at Master
• Updates occur at Sources
• Updates propagated to Master
[Diagram: Master connected to four Sources]
Registry Style
• Multiple Versions of Truth
• Data Quality is Ongoing
• Updates occur at Sources
• Keys and Metadata in Registry
• Updates propagated to other Sources
[Diagram: Master connected to four Sources]
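To make the registry style more concrete, here is a minimal sketch of a registry that stores only cross-reference keys while the attribute data stays in the source systems. The class, its methods, and the key formats are hypothetical and are not drawn from any iWay product API.

```python
from collections import defaultdict


class Registry:
    """Holds cross-reference keys only; attribute data stays in the sources."""

    def __init__(self):
        self._members = defaultdict(set)  # master_id -> {(source, local_key)}
        self._index = {}                  # (source, local_key) -> master_id

    def link(self, master_id, source, local_key):
        """Record that a source record belongs to a master identity."""
        self._members[master_id].add((source, local_key))
        self._index[(source, local_key)] = master_id

    def peers(self, source, local_key):
        """Other source records describing the same subject, sorted."""
        master_id = self._index[(source, local_key)]
        return sorted(self._members[master_id] - {(source, local_key)})


# Hypothetical keys, for illustration only.
registry = Registry()
registry.link("M1", "CRM", "c-17")
registry.link("M1", "ERP", "e-903")
registry.link("M1", "Billing", "b-4")

targets = registry.peers("CRM", "c-17")
```

When a record changes in one source, the peer list identifies which other sources need the update propagated to them.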
Other Styles: Supported
Project Successes – Pathway to Maturity
1. Start with Data Profiling
• Understand the data you have
• Identify inconsistencies in the data
• Disseminate the information about the data quality
Getting to MDM – “The Golden Record”
2. Continue with Data Quality
• Validate, standardize and cleanse for purpose
• Automate the process
• De-duplication (Match & Merge)
3. End with Master Data
• Synchronize with closed loop feedback integration
• Provide a single view for all stakeholders
4. Implement Data Governance – Issue Tracking
Demonstration
Data Management Life-Cycle
Thank You! - Questions?
iWay Software Because Everything Should Work Together.
WebFOCUS Because Everyone Makes Decisions.