Data Quality Overview
Alex Meadows
1/28/2013
Data Quality Facts
Cost of poor data quality in US - $600 Billion
Poor Data/Lack of visibility cited as #1 reason for project cost overruns
Poor data quality costs the US Economy $3.1 Trillion a year
Implementing data quality best practices boosts revenue by 66%
Median Fortune 1000 company could increase revenue by $2.01 Billion if they improved usability of data by 10%
Source: http://www.webmastat.com/blog/2012/09/07/7-facts-about-data-quality/
What is Data Quality?
Measuring data to determine if it is fit for purpose
Fit For Purpose?
Bad data is a myth!
Two Questions
What is the data used for?
What can be measured to make sure it meets the need?
Application use vs. Reporting/Analysis
Data Quality Dimensions
Consistency
Correctness
Timeliness
Precision
Unambiguous
Completeness
Reliability
Accuracy
Objectivity
Conciseness
Usefulness
Usability
Relevance
Amount of data
Source: Data Quality Fundamentals, The Data Warehousing Institute
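Most of these dimensions only become useful once they are turned into a number. As a minimal sketch, assuming records arrive as plain dictionaries, completeness could be measured as the share of records with a non-empty value in a given field. The function name and the sample field are hypothetical, not part of the original slides.

```python
from typing import Any, Dict, List


def completeness(records: List[Dict[str, Any]], field: str) -> float:
    """Fraction of records where `field` is present and non-empty.

    Assumption: None and the empty string both count as missing.
    """
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical sample data: one of four customers has no email.
customers = [
    {"name": "Ann", "email": "ann@example.com"},
    {"name": "Bob", "email": ""},
    {"name": "Cy", "email": "cy@example.com"},
    {"name": "Di", "email": "di@example.com"},
]
email_completeness = completeness(customers, "email")
```

Whether 75% completeness is acceptable depends on what the data is used for, which is the fit-for-purpose question again.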
Measuring Data Quality
Profiling: understanding metadata
Point in time: shows what data looks like now
Automating: shows trends
Alert to new/potential issues as they happen
Potentially fix issues in near real time
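A point-in-time profile can be as simple as a handful of summary statistics per column. The sketch below is one possible shape for such a profile, assuming a column is a Python list with None marking missing values; the function name and the chosen statistics are illustrative only. Storing one profile per run is what turns a snapshot into a trend.

```python
from typing import Any, Dict, List


def profile_column(values: List[Any]) -> Dict[str, Any]:
    """Summarize a single column at one point in time.

    Assumption: missing values are represented as None.
    """
    non_null = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }


# Hypothetical column of order quantities with one missing value.
snapshot = profile_column([3, None, 1, 3])
```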
Six Sigma Principles
Statistical Process Control
Automated inspection
Visibly shows process deviation
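Statistical process control applied to data might look like the following simplified sketch: compute control limits from a baseline of known-good measurements (here, hypothetical daily row counts), then flag new measurements that fall outside them. It uses mean plus or minus three standard deviations as the limits, which is a simplification of a real control chart, and all names and figures are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], sigmas: float = 3.0) -> Tuple[float, float]:
    """Lower and upper control limits: mean +/- `sigmas` standard deviations."""
    center = mean(baseline)
    spread = pstdev(baseline)  # population standard deviation of the baseline
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(values: Sequence[float], limits: Tuple[float, float]) -> List[int]:
    """Indexes of the values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical baseline: row counts from five normal daily loads.
limits = control_limits([100, 102, 98, 101, 99])

# New loads to inspect automatically; the last two deviate from the process.
flagged = out_of_control([100, 103, 90, 105], limits)
```

Running a check like this after every load is one way to get the automated inspection and visible deviation described above.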
Data Profiling Analysis
Duplication
Pattern matching
Boolean/String/Number
Date Gap
Date/time
Day of Week
Character Set
Reference Data Matching
Value Distribution
Inter-Data Set Comparisons
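Several of these analyses are short enough to sketch directly. The example below shows one possible take on three of them (duplication, pattern matching, and date gap) using only the Python standard library. The function names and the pattern convention (digits become 9, letters become A) are assumptions made for illustration, not a reference to any particular profiling tool.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Hashable, Iterable, List


def find_duplicates(values: Iterable[Hashable]) -> List[Hashable]:
    """Duplication: values that occur more than once, sorted."""
    counts = Counter(values)
    return sorted(v for v, n in counts.items() if n > 1)


def value_pattern(value: str) -> str:
    """Pattern matching: reduce a string to its shape (digit -> 9, letter -> A)."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isalpha():
            shape.append("A")
        else:
            shape.append(ch)  # keep punctuation and spaces as-is
    return "".join(shape)


def date_gaps(dates: Iterable[date]) -> List[date]:
    """Date gap: days missing between the earliest and latest date seen."""
    present = set(dates)
    if not present:
        return []
    day, end = min(present), max(present)
    missing = []
    while day <= end:
        if day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing


# Value distribution over patterns: how many values share each shape.
pattern_counts = Counter(value_pattern(v) for v in ["555-1234", "555-9876", "5551234"])
```

A pattern count such as the last line makes the odd one out easy to spot: two values share the shape `999-9999` and one does not.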
Master Data Management
Create a gold standard for data
Distribute data so that all sources are uniform
Names
Addresses
Phone Numbers
Products
Can hook into third party sources
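Making sources uniform usually starts with agreeing on one canonical format per attribute. As a small sketch for phone numbers, assuming 10-digit US numbers and a hypothetical gold-standard format of `(NNN) NNN-NNNN`, a normalizer might look like this. Real master data management tools handle far more cases; this only illustrates the idea.

```python
import re
from typing import Optional


def normalize_us_phone(raw: str) -> Optional[str]:
    """Return the phone number in the assumed gold-standard format, or None.

    Assumptions: US numbers only, an optional leading country code of 1,
    and no extensions.
    """
    digits = re.sub(r"\D", "", raw)  # drop everything that is not a digit
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    if len(digits) != 10:
        return None  # cannot be standardized; leave for manual review
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


# Hypothetical values for the same number as held by three source systems.
sources = ["555.867.5309", "+1 555-867-5309", "(555) 867 5309"]
standardized = {normalize_us_phone(s) for s in sources}
```

All three source values collapse to a single standardized value, which is the uniformity the gold standard is meant to provide.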
Data Governance Program
Central authority for data quality control
Applies information collected from data profiling, MDM, etc. uniformly across the business
Communication channels between business and IT groups
Questions?