INTRODUCTION TO DATA QUALITY SERVICES Presentation by Tim Mitchell (Artis Consulting) www.TimMitchell.net

Today’s Agenda

Overview of DQS

Structure
  Knowledge Base
  DQS Project

Operations
  Matching
  Cleansing

Administration

SSIS Component

Shortcomings

About the Presenter

Tim Mitchell

BI Consultant, Artis Consulting

North Texas SQL Server User Group

SQL Server MVP

Contributing author, MVP Deep Dives Vol 2

Coauthor, SSIS Design Patterns

TimMitchell.net | twitter.com/Tim_Mitchell

Housekeeping

Questions

Surveys

Overview of Data Quality Services

What is DQS?

DQS is a knowledge-driven data cleansing and matching service

Built on top of SQL Server 2012

Simple yet powerful interface

What is DQS?

Replaces manual data quality work you’re already doing

Stored procedures

Triggers

Custom applications

DQS Structure

DQS Structure and Flow

[Diagram: a Knowledge Base (Domains, Composite Domains, Matching Policies) shown with several Matching Projects and Cleansing Projects]

Knowledge Base

Starting point for data quality provisioning

Uses locally customized data stores or marketplace data sources

Highly reusable and evolutionary

Key elements:
  Domains
  Matching policies

Knowledge Base

Created by:
  Knowledge discovery
  Domain management
  Matching rule

Domains

Domain = data field
  Domain rules

Composite domains
  Allow greater flexibility in domain rules
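
As a rough illustration of the idea (not DQS code, and not from the slides), a domain can be pictured as one data field with its own rules, and a composite domain as several fields checked together so that a rule can span them. The class names, the Country and PostalCode fields, and the five-digit rule below are all assumptions made up for this sketch.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Domain:
    """One data field plus the rules its values must satisfy (hypothetical model)."""
    name: str
    rules: List[Callable[[str], bool]] = field(default_factory=list)

    def is_valid(self, value: str) -> bool:
        return all(rule(value) for rule in self.rules)


@dataclass
class CompositeDomain:
    """Several domains checked together, with rules that span fields."""
    domains: List[Domain]
    cross_rules: List[Callable[[Dict[str, str]], bool]] = field(default_factory=list)

    def is_valid(self, record: Dict[str, str]) -> bool:
        fields_ok = all(d.is_valid(record[d.name]) for d in self.domains)
        return fields_ok and all(rule(record) for rule in self.cross_rules)


# Single-field rules: each one looks at a single value only.
country = Domain("Country", [lambda v: v in {"US", "CA"}])
postal = Domain("PostalCode", [lambda v: len(v) > 0])


def us_needs_five_digits(record: Dict[str, str]) -> bool:
    # Illustrative cross-field rule: it needs both fields to decide.
    if record["Country"] != "US":
        return True
    return re.fullmatch(r"\d{5}", record["PostalCode"]) is not None


address = CompositeDomain([country, postal], [us_needs_five_digits])

print(address.is_valid({"Country": "US", "PostalCode": "75001"}))    # True
print(address.is_valid({"Country": "US", "PostalCode": "K1A 0B1"}))  # False
```

The second record passes every single-field rule and fails only the cross-field one, which is the kind of extra flexibility a composite domain is meant to give.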

Data Quality Project

Create interactive projects for data matching and cleansing

Leverage one or more domains in an existing knowledge base

Somewhat reusable

Data Quality Project

Nondestructive – no changes to source of data to be cleansed

No changes to the KB either
  Separately, DQS project data can be used to improve the knowledge base

DQS Operations

Cleansing
  Process data against known entities and domain rules
  Similar to Fuzzy Lookup transform in SSIS

Matching
  Group data together
  Similar to Fuzzy Grouping transform in SSIS
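
To make the difference between the two operations concrete, here is a minimal sketch. It is not how DQS scores values internally: Python's standard `difflib.SequenceMatcher` is swapped in as a stand-in similarity measure, and the function names, the sample values, and the 0.8 threshold are assumptions chosen for illustration. Cleansing compares each value against a list of known values; matching groups records that resemble each other.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for a real matching score)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cleanse(value: str, known_values: List[str], threshold: float = 0.8) -> Tuple[str, str]:
    """Cleansing: compare one value against known values, lookup style."""
    best = max(known_values, key=lambda known: similarity(value, known))
    score = similarity(value, best)
    if score == 1.0:
        return best, "correct"
    if score >= threshold:
        return best, "corrected"
    return value, "unknown"  # source value is returned untouched


def group_matches(records: List[str], threshold: float = 0.8) -> List[List[str]]:
    """Matching: greedily group records that resemble a group's first member."""
    groups: List[List[str]] = []
    for record in records:
        for group in groups:
            if similarity(record, group[0]) >= threshold:
                group.append(record)
                break
        else:
            groups.append([record])  # no similar group found, start a new one
    return groups


cities = ["Dallas", "Houston", "Austin"]
print(cleanse("Dalas", cities))   # ('Dallas', 'corrected')
print(cleanse("Zzz", cities))     # ('Zzz', 'unknown')
print(group_matches(["Tim Mitchell", "Tim Mitchel", "Jane Doe"]))
# [['Tim Mitchell', 'Tim Mitchel'], ['Jane Doe']]
```

Both functions return new values and leave their inputs alone, in keeping with the nondestructive behavior described for data quality projects.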

DQS Administration

Monitor past activity

Set logging options

Set confidence thresholds
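
One way to picture what a confidence threshold does: each proposed correction carries a score, and the thresholds decide whether it is applied automatically, offered as a suggestion for review, or ignored. The sketch below assumes two thresholds; the parameter names and the 0.7 and 0.8 values are illustrative and are not taken from the presentation.

```python
def classify(score: float, suggest_min: float = 0.7, auto_correct_min: float = 0.8) -> str:
    """Map a confidence score in [0, 1] to an action (hypothetical thresholds)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= auto_correct_min:
        return "corrected"   # confident enough to apply without review
    if score >= suggest_min:
        return "suggested"   # shown to a data steward for approval
    return "unchanged"       # too uncertain to propose


for score in (0.91, 0.75, 0.40):
    print(score, classify(score))
# 0.91 corrected
# 0.75 suggested
# 0.4 unchanged
```

Raising the thresholds trades fewer automatic corrections for more manual review, and lowering them does the opposite.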

DQS and SSIS

SQL Server Integration Services has an integrated hook into DQS

DQS Cleansing Component

Provides automated, noninteractive data cleansing operations

Demos

Shortcomings

V1 product

No API – must use DQS client interactively

SSIS component only does cleansing

Final Thoughts

CU1 performance improvements: http://bit.ly/IKmMow

DQS videos / blogs: http://technet.microsoft.com/en-us/sqlserver/hh780961

My blog (www.TimMitchell.net)

DQS/MDS virtual chapter: masterdata.sqlpass.org

Questions?