
White Paper: Accelerate Data Integration

Automating the manual, time-consuming process of mapping data from source to target.

Alex Gorelik

Founder and Chief Technology Officer

Exeros, Inc.

June 2005

Copyright 2005 Exeros Inc.

Exeros and Exeros DataMapper are Trademarks of Exeros Inc.

Table of Contents

INTRODUCTION
APPLICATION INTEGRATION CHALLENGES
    ISLANDS OF INTEGRATION
    TIME AND COST SAVINGS WITH DATA-DRIVEN SOURCE-TO-TARGET MAPPING
EXEROS ARCHITECTURE
THE EXEROS DATAMAPPER PROCESS
    MAPPING DISCOVERY PROCESS
    TABLE MAPPING EXAMPLE
EXEROS DATAMAPPER AT WORK
    LEGACY MIGRATION
    APPLICATION INTEGRATION
    DATA MART CONSOLIDATION
    NEW DATA WAREHOUSE CONSTRUCTION
SUMMARY
ABOUT EXEROS AND THE AUTHOR


Introduction

The business environment is constantly changing. Mergers, acquisitions, consolidations, partnerships, off-shoring and a changing regulatory environment all drive an ongoing stream of data integration projects that IT organizations must efficiently execute in support of an evolving business. These projects include a wide variety of IT initiatives such as application integration, data warehousing, legacy migration and server consolidation. Each one of these initiatives comes with its own set of software solutions, DW (Data Warehousing), EAI (Enterprise Application Integration), and ETL (Extract, Transform, Load) to name a few, that are designed to automate much of the process.

And while the overall scope of these projects and their related software solutions can be quite different from one another, all of these initiatives are similar in that they start by first integrating and organizing enterprise data so it can be made available to other applications and business users. To manage this integration, data must first be mapped and transformed between systems so data that resides in Source System A can be understood when it is moved into Target System B. Unfortunately, the process by which data is mapped and transformed between application sources and targets (or between an application and a data warehouse) is still manual and very time-consuming, accounting for up to 70% of the data integration process. Many would say that due to the complexity and the human element of these projects, this process is more like an art than a science.

Exeros DataMapper™ turns art into science with the first automated, data-driven solution to this difficult problem. At customer sites, up to 80% of source-to-target maps have been found automatically, and the remainder found by Exeros DataMapper’s guided analysis.


Application and Data Integration Challenges

Let’s take a closer look at just one of these IT initiatives: application integration. Application integration has consistently stayed among the top three CIO concerns for years and consumes, on average, 40% of IT budgets. The service-to-software ratio is about 5:1, and the acceptance of packaged integration solutions in the market is fairly low. Meanwhile, there has never been so much data, so many uses for it, and so much logic and complexity built into the links between different systems.

[Figure: Effort over the project lifecycle, comparing manual effort with effort using integration tools (ETL, EAI). Phases shown: Discover Semantics, Design Source-to-Target Maps, Deploy Maps, Validate Source-to-Target Maps, Redesign Source-to-Target Maps, Re-Deploy Maps.]

As illustrated by this diagram, most of the time and effort in the design and deployment phase of an integration project is spent on discovering and designing source-to-target data maps. Domain experts, business analysts, data architects and integration technologists work together to manually analyze and map the source data to the target data. The source-to-target data maps are then deployed via the integration technology of choice (ETL, EAI or custom scripts) and the data is loaded into the targets.

At this point, most project teams discover that the semantics assumed by the domain experts do not reflect the reality in the actual data, causing iterations through validation, redesign and reimplementation. Because the amount of rework required is unpredictable and most integration technologies are not designed to accommodate frequent and rapid changes, most projects slip and exceed budget.

Islands of integration

The problem is further complicated because each kind of integration project typically does not reuse source-to-target data maps created by other projects. For example, most data warehouse development projects manually create source-to-target maps from multiple applications to a common data warehouse model. They then import these source-to-target maps into an ETL tool, which moves the data from various sources to a data warehouse target. Similarly, application integration or composite application development projects use EAI software to integrate the same applications. However, because users of EAI software are different from the ETL users, they repeat the same time-consuming source-to-target map discovery effort, because they either don’t know about or don’t trust the work done by the other team.


Not only is this current integration approach wasteful and expensive, it creates a breeding ground for error and inconsistency:

• The same source-to-target maps may be implemented differently by different teams.
• As business requirements and source applications change, integration logic is frequently not kept up to date.
• As complexity increases exponentially with more and more applications being linked together, consistency checking and quality assurance are difficult to implement or often lacking altogether.

There is an acute and immediate need for a solution that can automate the creation of initial data warehouse target schemas, source-to-target data maps, transformations and cross-reference tables between disparate systems based on the data values themselves, and keep these source-to-target maps consistent across different integration platforms.

Time and cost savings with data-driven source-to-target mapping

Exeros DataMapper is the first data integration solution that automatically discovers up to 80% of the source-to-target data maps between enterprise systems by analyzing the data values, not the metadata. Its innovative data exploration and analysis techniques enable it to automatically discover relationships and source-to-target maps for most of the structured data in the enterprise. The remaining maps (20% or more) are discovered by the data analyst using Exeros DataMapper’s guided analysis.

The dramatic reduction in the manual effort required to discover and design source-to-target data maps for a typical integration project is represented in the diagram below.

[Figure: Manual source-to-target mapping effort compared with automated effort using DataMapper, showing 5x-8x time and effort savings.]


Exeros Architecture

The Exeros solution consists of the following components:

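[Figure: Exeros architecture. Labeled elements: Discovery Engine; Mapping Studio; Databots; Interface Factory; Enterprise Data Map (Metadata Repository, Cross-Reference Tables, Profiling & Staging Database); data sources (CRM, DW, ERP, Custom Apps); connectivity (ODBC, Oracle, MSSQL, SAP, XML); integration tools (ETL, EAI, EII); outputs (Reports, Web Services, CWM, XQuery, PL/SQL).]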

• Discovery Engine: The core component that analyzes data across multiple data sources and generates source-to-target maps.
• Mapping Studio: Graphical mapping environment that displays information about data sources, structures, and mappings discovered by the Discovery Engine, as well as actual data. This allows analysts to rapidly investigate, design, and validate mappings between disparate systems.
• Databots: Automated processes that perform data sampling and profiling. Databots run separately from the Mapping Studio.
• Reporting Module: Creates HTML and Excel metadata reports that document discovered metadata, showing data lineage and relationships for all data analyzed by DataMapper.
• Cross-reference Tables: Relational tables generated by DataMapper to contain lookup values that logically connect two data sets.
• Metadata Repository: File-based repository that stores all discovered information.
• Profiling and Staging Database: Relational database that stores profiling results and data samples.
• Interface Factory: Creates transformation logic for use by integration products, expressed in the format understood by those products, such as metadata adapters, SQL, XML, CWM, and ETL APIs.


The Exeros DataMapper Process

Every enterprise has many structured data sources that are shared and managed by different systems in the enterprise. The Exeros DataMapper Discovery Engine efficiently identifies the local data models managed by different systems and generates source-to-target maps between any two data sources.

Mapping discovery process

The following diagram illustrates the data discovery and mapping process.

[Figure: Data discovery and mapping process. Data sources (RDBMS, XML, CSV, App) pass through three stages: Analyze and Profile Data; Discover Relationships Within Data Sources; Discover Maps Between Data Sources. Other labels: Source-to-Target Maps and Transformation Logic; ETL, XML, SQL; New System.]

• Metadata from disparate data sources such as relational databases, flat files, XML documents, message dumps and applications is imported into the Exeros Metadata Repository and converted to the relational model.
• The data values in each data source are analyzed to discover the relationships (such as primary and foreign keys) between the tables. Data can be analyzed directly in the source systems or sampled and staged in staging databases. Sophisticated sampling mechanisms are available to reduce the amount of data analysis without sacrificing discovery effectiveness.
• Once each data source is organized into sets of related tables, the source-to-target maps between the tables in different data sources are discovered. The analyst verifies and approves the discovered relationships interactively.
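The idea of finding keys from data values rather than from metadata can be illustrated with a minimal Python sketch. It assumes small in-memory tables (lists of dictionaries), treats a column as a candidate primary key when its values are unique and non-null, and treats a column as a candidate foreign key when all of its values occur in a candidate key of another table. The function names and sample tables are hypothetical; the paper does not describe DataMapper's actual algorithms, so this is only one plausible way to approach the step.

```python
def column_values(table, column):
    """Return the non-null values of a column as a list."""
    return [row[column] for row in table if row.get(column) is not None]


def candidate_keys(table):
    """Columns whose values are unique and non-null in every row (table must be non-empty)."""
    keys = []
    for column in table[0]:
        values = column_values(table, column)
        if len(values) == len(table) and len(set(values)) == len(values):
            keys.append(column)
    return keys


def candidate_foreign_keys(child, parent):
    """Pairs (child_column, parent_key) where every child value exists in the parent key."""
    pairs = []
    for parent_key in candidate_keys(parent):
        parent_values = set(column_values(parent, parent_key))
        for column in child[0]:
            child_values = set(column_values(child, column))
            if child_values and child_values <= parent_values:
                pairs.append((column, parent_key))
    return pairs


# Hypothetical sample data, for illustration only.
suppliers = [
    {"sid": "SE12", "city": "Oslo"},
    {"sid": "L19", "city": "Lyon"},
]
products = [
    {"id": "XKJ33", "name": "Widget12", "supplier": "SE12"},
    {"id": "XKKU3", "name": "Clip11", "supplier": "L19"},
    {"id": "N11", "name": "Pipe2", "supplier": "SE12"},
]

print(candidate_keys(products))
print(candidate_foreign_keys(products, suppliers))
```

On real data such checks would normally run on samples and tolerate a small fraction of violations, which is why the analyst's review of discovered relationships remains part of the process.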


Table mapping example

The following diagram illustrates the steps that Exeros DataMapper follows to automatically map the Product Sales table from Application 1 to the Product Sales table in Application 2. Exeros DataMapper reads the actual data values, not just metadata (such as column names), in order to identify these data relationships.

Application 1: Product Sales Table

ID      Name      Manager        SID   Q1 Sales  Q1 Returns  Q2 Sales  Q2 Returns
N11     Pipe2     Amber Green    899887772001000000NNN2
XKKU3   Clip11    Fred Sullivan  L19                         60        2
XKJ334  Widget12  John Doe       SE12  1000      20          1200      50

(Reverse Pivot)

Application 2: Product Sales Table

PID  Product   Supplier  PM           Quarter  Sales  Returns
17   Clip11    L19       F. Sullivan  Q2       60     2
12   Widget12  SE12      J. Doe       Q2       1200   50
12   Widget12  SE12      J. Doe       Q1       1000   20

DataMapper-created cross-reference table

ID     PID
XKJ33  12
XKKU3  17
N11    56

1. First, Exeros DataMapper discovers that the natural key consisting of supplier ID and product name relates the two tables. This key is stored in the SID and Name columns in Application 1 and the Supplier and Product columns in Application 2. In this case, Exeros DataMapper can find this relationship only by reading the data values (not the metadata), since the column names Name and Product cannot be logically related by themselves.
2. A cross-reference table is created between the primary keys in the two tables (ID in Application 1 and PID in Application 2). Exeros DataMapper uses the natural key discovered in step 1 to cross-reference the primary keys.
3. Exeros DataMapper discovers that the PM column in Application 2 consists of the first character of the Manager column in Application 1, followed by a ‘.’, a space, and the second token of the Manager column (for example, “John Doe” becomes “J. Doe”).
4. The values in Q1Sales, Q1Returns, Q2Sales, Q2Returns, etc., from Application 1 have been reverse pivoted (turned into rows) in Application 2. Exeros DataMapper generates a separate mapping for each set of pivoted columns that create a single row (e.g., Q1Sales and Q1Returns).
5. Finally, Exeros DataMapper discovers a filter on the Q1Sales column: only rows with non-null Q1Sales have corresponding rows in Application 2.
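The five steps above describe what is discovered; the resulting mapping can be sketched in Python to show how the pieces fit together. This is a hedged illustration, not code generated by DataMapper: the function names are hypothetical, the sample rows loosely follow the example tables, and the null filter is applied to every quarter's sales column (an assumption that generalizes the Q1Sales filter described in step 5).

```python
def abbreviate_manager(manager):
    """Step 3: first character, '.', a space, then the second token."""
    tokens = manager.split()
    return f"{tokens[0][0]}. {tokens[1]}"


def build_cross_reference(app1_rows, app2_rows):
    """Steps 1-2: match on the natural key (supplier id, product name) to pair ID with PID."""
    pid_by_natural_key = {(r["Supplier"], r["Product"]): r["PID"] for r in app2_rows}
    xref = {}
    for row in app1_rows:
        pid = pid_by_natural_key.get((row["SID"], row["Name"]))
        if pid is not None:
            xref[row["ID"]] = pid
    return xref


def reverse_pivot(app1_rows, xref, quarters=("Q1", "Q2")):
    """Steps 4-5: emit one row per quarter, skipping quarters whose sales value is null."""
    out = []
    for row in app1_rows:
        for quarter in quarters:
            sales = row.get(f"{quarter}Sales")
            if sales is None:  # assumed filter: non-null sales only
                continue
            out.append({
                "PID": xref[row["ID"]],
                "Product": row["Name"],
                "Supplier": row["SID"],
                "PM": abbreviate_manager(row["Manager"]),
                "Quarter": quarter,
                "Sales": sales,
                "Returns": row.get(f"{quarter}Returns"),
            })
    return out


# Illustrative rows in the shape of the Application 1 and Application 2 tables.
app1 = [
    {"ID": "XKKU3", "Name": "Clip11", "Manager": "Fred Sullivan", "SID": "L19",
     "Q1Sales": None, "Q1Returns": None, "Q2Sales": 60, "Q2Returns": 2},
    {"ID": "XKJ33", "Name": "Widget12", "Manager": "John Doe", "SID": "SE12",
     "Q1Sales": 1000, "Q1Returns": 20, "Q2Sales": 1200, "Q2Returns": 50},
]
app2_keys = [
    {"PID": 17, "Product": "Clip11", "Supplier": "L19"},
    {"PID": 12, "Product": "Widget12", "Supplier": "SE12"},
]

xref = build_cross_reference(app1, app2_keys)
rows = reverse_pivot(app1, xref)
for r in rows:
    print(r)
```

Keeping the cross-reference as its own lookup, separate from the pivot logic, mirrors the paper's description of cross-reference tables as standalone artifacts that other mappings can reuse.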


Types of transformations discovered by Exeros DataMapper

Exeros DataMapper can discover the following types of transformations:

Type of transformation         Example
----------------------------   -----------------------------------------------------------
Scalar
  One-to-one mapping           Target.Name = Source.Name
  Substring                    Target.ProductNumber = Substring(Source.SerialNumber, 1, 7)
  Concatenation                Target.Name = Source.FirstName || Source.LastName
  Constants                    Target.Status = 'S'
  Tokens                       Target.FirstName = token(Source.Name, 1)
  Type and date conversions
Joins                          Inner, left outer
Aggregation                    Sum, average, minimum, maximum
Reverse pivot
Cross-reference                Key, Code
Constant filters               =, !=, <, <=, >, >=, in, not in, null, not null
Conjunctions                   Units < 10000 and State in ('NY', 'CA')

Cross-references are stored in lookup tables. Exeros DataMapper can automatically generate a lookup table or use an existing lookup table.
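One way to picture data-driven discovery of scalar transformations is a generate-and-test loop: propose candidate expressions and keep those that reproduce the target value in every matched pair of rows. The Python sketch below assumes the source and target rows have already been paired and that all values are strings. The candidate catalogue, the function names and the sample data are hypothetical, and the paper does not state that DataMapper works this way.

```python
def token(value, n):
    """Return the n-th whitespace-separated token (1-based), or None if absent."""
    parts = value.split()
    return parts[n - 1] if 0 < n <= len(parts) else None


def candidate_transformations(source_columns):
    """Yield (description, function) pairs to test; a deliberately small catalogue."""
    for col in source_columns:
        yield (f"Source.{col}", lambda row, c=col: row[c])
        yield (f"token(Source.{col}, 1)", lambda row, c=col: token(row[c], 1))
        yield (f"Substring(Source.{col}, 1, 7)", lambda row, c=col: row[c][:7])
    for a in source_columns:
        for b in source_columns:
            if a != b:
                yield (f"Source.{a} || Source.{b}",
                       lambda row, a=a, b=b: row[a] + row[b])


def discover(pairs, target_column):
    """Return descriptions of candidates that match the target in every (source, target) pair."""
    source_columns = list(pairs[0][0])
    hits = []
    for description, fn in candidate_transformations(source_columns):
        if all(fn(src) == tgt[target_column] for src, tgt in pairs):
            hits.append(description)
    return hits


# Hypothetical matched rows: (source row, target row).
pairs = [
    ({"Name": "John Doe", "SerialNumber": "ABC1234-XY"},
     {"FirstName": "John", "ProductNumber": "ABC1234"}),
    ({"Name": "Amber Green", "SerialNumber": "ZZZ9876-QQ"},
     {"FirstName": "Amber", "ProductNumber": "ZZZ9876"}),
]

print(discover(pairs, "FirstName"))
print(discover(pairs, "ProductNumber"))
```

A candidate that survives on a handful of rows may still be coincidental, so in practice more rows (or an analyst) would be needed to confirm it.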


Exeros DataMapper at Work

The following are examples of the types of IT integration projects that can be accelerated by Exeros DataMapper.

Legacy Migration

In this example, a company wants to replace a legacy application with a new packaged application. Currently, this would have to be done manually, by examining metadata in both applications, developing scripts to extract legacy data and load it into the new application, and rewriting all interfaces from the legacy application to other systems such as the data warehouse. By using Exeros DataMapper, the company can significantly reduce the manual effort required.

[Figure: Legacy migration from Legacy Application to New Application and on to downstream apps and databases. 1) Sample “isotope” records entered into new application. 2) Legacy data mapped and loaded into new application. 3) New application mapped to downstream data warehouses, applications, etc., and ETL jobs created to move the data.]

1. Load sample data. The first step in this process is to load some sample data (called isotope records) from the legacy system into the new application. This is done through the standard application screens and helps clarify the business semantics of the new system. Isotope records help Exeros DataMapper pinpoint the location of the data in the new application’s schema.
2. Map legacy data to new application. Once the isotope records are in the new system, Exeros DataMapper maps them to the legacy system and generates ETL jobs or SQL code to load the rest of the legacy data into the new system. The two systems can now operate in parallel in test mode until the production deployment.
3. Map data warehouse to new application. Finally, the data warehouse, as well as any other existing interfaces, is mapped to the new system, and ETL jobs or SQL code is generated to populate the data warehouse from the new application. At this point, after the testing period, the legacy application can be safely retired.
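The role of isotope records in step 1 can be sketched as a value search: distinctive values entered through the application screens are looked up in every table and column of the new application's database to see where they landed. The Python below is a minimal sketch under that assumption; the function name, table names and values are hypothetical, and exact equality stands in for whatever matching DataMapper actually performs.

```python
def locate_isotopes(isotope_record, new_app_tables):
    """Map each legacy field to the (table, column) locations holding its isotope value."""
    locations = {field: [] for field in isotope_record}
    for table_name, rows in new_app_tables.items():
        for row in rows:
            for column, value in row.items():
                for field, marker in isotope_record.items():
                    hit = (table_name, column)
                    if value == marker and hit not in locations[field]:
                        locations[field].append(hit)
    return locations


# Hypothetical isotope record: legacy field -> distinctive value typed into the new system.
isotope = {
    "cust_name": "ZZ-ISOTOPE-NAME-001",
    "cust_phone": "555-0100-999",
}

# Hypothetical snapshot of the new application's tables after the record was entered.
new_app = {
    "PARTY": [
        {"PARTY_ID": 1, "FULL_NM": "ZZ-ISOTOPE-NAME-001"},
        {"PARTY_ID": 2, "FULL_NM": "Other Customer"},
    ],
    "CONTACT": [
        {"PARTY_ID": 1, "PHONE_TXT": "555-0100-999"},
    ],
}

print(locate_isotopes(isotope, new_app))
```

Choosing values that are unlikely to occur naturally keeps the search unambiguous, which is presumably why the records are called isotopes.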


Application Integration

This example addresses application integration where two systems, CRM and ERP, need to exchange messages involving customer records, orders, and so on. Without Exeros DataMapper, the application experts for both systems have to agree on how to map CRM messages to the corresponding ERP messages. The code (e.g., XSLT) to transform those messages needs to be written and added to the EAI system. Code and key translation tables would need to be generated by matching codes (e.g., order status code) and keys (e.g., customer ID) from CRM to ERP.

This is an extensive manual process that is not well supported by EAI systems. Finally, the messages would need to be generated and tested. Sample messages would be generated by the CRM application. These would be examined at the XML level, transformed to the ERP XML representation with their codes and keys replaced by ERP codes and keys, and applied to the ERP system. The effects of these messages would then need to be examined through the user interface to make sure they are correctly processed by the ERP system.

[Figure: Application integration between CRM (CRM database, EDI files, XML) and ERP (ERP database, EDI files, XML) through EAI. 1) Map CRM data to ERP data and vice versa as needed. 2) Generate cross-reference tables. 3) Map documents or messages. 4) Generate transformation code. Outputs shown: Cross-Reference Tables, XSLT.]

Exeros DataMapper eliminates much of the manual process by automatically matching the correct messages to each other, building key and code cross-reference tables and generating XSLT code to implement the transformation logic.

1. Map the data from one application to the other. In order to transform messages going from the CRM system to the ERP system and vice versa, Exeros DataMapper analyzes the data of the two applications and determines the transformation logic needed. Application data includes both the application database and any application interfaces such as XML web services messages, EDI messages, reports, etc.
2. Generate cross-reference tables. Once the data of the two applications are mapped to each other, Exeros DataMapper can generate and populate the cross-reference tables that relate the CRM application’s keys and codes to those of the ERP application.
3. Map documents or messages. Next, the integration architect maps a new interface between CRM and ERP systems using Exeros DataMapper. She selects a message format from the ERP system and one from the CRM for the same business entity (e.g., order). She then creates a message dump of the same orders from each system and uses DataMapper to discover the transformation logic required to convert messages in the CRM format to messages in the ERP format.


4. Generate transformation code. Finally, the integration architect selects the integration technology (e.g., XSLT, XQuery, XPath) and the Exeros DataMapper Interface Factory generates the appropriate transformation code, which is imported into the EAI designer and used to convert messages published by CRM into ERP messages.

The EAI system is now ready to process messages. It subscribes to the CRM message, uses the XSLT transformations generated in step 4 to convert it into the ERP format, uses the cross-reference tables generated in step 2 to convert any CRM keys and codes to the appropriate ERP keys and codes, and loads the message into the ERP system.
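The runtime flow just described (restructure the message, then translate keys and codes through cross-reference tables) can be sketched in a few lines. The example below uses Python's standard xml.etree.ElementTree in place of XSLT purely to stay self-contained; the message layouts, element names and cross-reference contents are all hypothetical and are not taken from any real CRM or ERP product.

```python
import xml.etree.ElementTree as ET

# Hypothetical cross-reference tables, as they might be populated in step 2.
CUSTOMER_XREF = {"C-1001": "40007"}              # CRM customer id -> ERP customer key
STATUS_XREF = {"OPEN": "10", "SHIPPED": "30"}    # CRM status code -> ERP status code


def crm_order_to_erp(crm_xml):
    """Convert a CRM order message to the (assumed) ERP layout, translating keys and codes."""
    crm = ET.fromstring(crm_xml)
    erp = ET.Element("ERPOrder")
    ET.SubElement(erp, "OrderNumber").text = crm.findtext("OrderId")
    ET.SubElement(erp, "CustomerKey").text = CUSTOMER_XREF[crm.findtext("CustomerId")]
    ET.SubElement(erp, "StatusCode").text = STATUS_XREF[crm.findtext("Status")]
    return ET.tostring(erp, encoding="unicode")


crm_message = (
    "<Order><OrderId>77</OrderId>"
    "<CustomerId>C-1001</CustomerId><Status>OPEN</Status></Order>"
)
erp_message = crm_order_to_erp(crm_message)
print(erp_message)
```

In this sketch an unknown key raises a KeyError; a production flow would more likely route such messages to an exception queue for review.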

Data Mart Consolidation

In this example, a company has a legacy system that feeds a data warehouse, a number of data marts and reporting databases, as the diagram below illustrates. These data marts were developed by outside consultants, contain a significant amount of redundant data, and have grown beyond the organization’s ability to maintain them. The company has decided that these data marts need to be integrated into the data warehouse and the existing reports migrated to run against the data warehouse so the data marts can be retired.

[Figure: Current environment. Labeled elements: Transactional System (AS/400), Extract Files, Independent Data Marts, Data Warehouse, BI Reports.]

Without Exeros DataMapper, this company’s IT organization would have to assemble a team of domain experts who understand the systems, integration experts who understand the technology used to load the data marts and the data warehouse, and business intelligence experts who understand the reporting technology to try to manually discover the source-to-target maps. Considering the number of tables, columns and the amount of data involved, this would be a significant, extremely error-prone manual effort.

By using Exeros DataMapper, this customer can greatly reduce the manual effort as well as lower the required skill and knowledge level of the team members. The domain and integration experts can rapidly zero in and spend their time on the small percentage of source-to-target maps that are not automatically discovered, instead of having to manually discover every single map. Using the Exeros Mapping Studio, the manual discovery effort is further reduced by having all the necessary information available through a single graphical interface.


The phases of this project are as follows:

[Figure: Project phases shown on the same environment (Transactional System (AS/400), Extract Files, Independent Data Marts, Data Warehouse, BI Reports): fold data marts into DW; migrate reports from data marts to DW; migrate reports from AS/400 to DW.]

1. Determine what data is already in the data warehouse and what data is missing and needs to be added from the data marts. Without Exeros DataMapper, this step would involve manual schema and data exploration and matching, combined with reverse engineering of the ETL logic used to load the data marts and the data warehouse in order to cross-reference the lineage. This reverse-engineering approach may not be viable if custom code (e.g., RPG, COBOL, Perl) has been used for ETL. Exeros DataMapper automates this step by discovering mappings between the data marts and the data warehouse and producing reports showing where each table and column from the data mart resides in the data warehouse and vice versa.
2. Augment data warehouse schema with missing data. Once the data marts have been mapped to the data warehouse, the data warehouse schema can be augmented with the missing tables and columns. PL/SQL scripts or ETL jobs can be generated by Exeros DataMapper to migrate the missing tables and columns from the data marts to the data warehouse, and new ETL jobs can be created to populate these tables from the transactional application.
3. Migrate reports from data marts to data warehouse. Next, the reports from the data marts can be migrated to the data warehouse. Without Exeros DataMapper, this would entail a manual process of determining the code that generates each report, cross-referencing it against the new data warehouse schema and writing new code to generate the report from the data warehouse. Instead, DataMapper maps the reports (stored as comma-separated files) to the data warehouse, which now contains all the data mart data, and produces the SQL necessary to generate each report from the data warehouse schema.
4. Migrate reports from AS/400. A similar process can be performed to migrate the reports run off the AS/400 transactional system. Once the new reports are created to run against the data warehouse, the data marts can be eliminated.
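Step 1 amounts to a coverage check: for each data mart column, is there a warehouse column that already holds the same values? A rough Python sketch follows, assuming columns are available as in-memory lists and using the share of distinct mart values found in a warehouse column as the score. The 0.9 threshold, the function names and the column names are illustrative assumptions, not DataMapper behavior.

```python
def overlap(a, b):
    """Fraction of distinct values in a that also occur in b."""
    a, b = set(a), set(b)
    return len(a & b) / len(a) if a else 0.0


def coverage_report(mart_columns, dw_columns, threshold=0.9):
    """For each data mart column, name the best-matching warehouse column, or None if missing."""
    report = {}
    for mart_name, mart_values in mart_columns.items():
        best_name, best_score = None, 0.0
        for dw_name, dw_values in dw_columns.items():
            score = overlap(mart_values, dw_values)
            if score > best_score:
                best_name, best_score = dw_name, score
        report[mart_name] = best_name if best_score >= threshold else None
    return report


# Hypothetical column samples.
mart = {
    "sales_mart.cust_no": [101, 102, 103],
    "sales_mart.promo_code": ["SPRING", "FALL"],
}
dw = {
    "dw.customer.customer_id": [101, 102, 103, 104],
    "dw.orders.order_id": [9001, 9002],
}

print(coverage_report(mart, dw))
```

Columns reported as None are the candidates for step 2, where the warehouse schema is augmented with the missing data.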


New Data Warehouse Construction

The following diagram illustrates how Exeros DataMapper can be used to build and deploy a data warehouse. Again, without Exeros DataMapper, this process is manual, involving many domain and application experts from different areas of the company who need to agree on how the data in each application corresponds to the data in the other applications and the proposed data warehouse schema. This process is usually very time-consuming and iterative because of the complexity and difficulty inherent in mapping large, complex systems. Assumed semantics specified by the experts are frequently not reflected in the actual data, either due to lack of understanding or because of dirty (bad) data. Multiple passes are usually required to resolve all the inconsistencies.

[Figure: New data warehouse construction. 1) First data source added to data warehouse via ETL. 2) Mappings and cross-reference tables generated between second data source and data warehouse. 3) Mappings and cross-reference tables generated between third and following data sources and data warehouse. Other labels: Cross-Reference Tables, Source-to-Target Maps, ETL.]

Deploying Exeros DataMapper greatly streamlines the process: it looks at the actual data values to discover mappings and uncovers inconsistencies so they can be addressed up front. The time required of the experts is significantly reduced because they only need to resolve inconsistencies uncovered by Exeros DataMapper and specify the mappings that it was unable to find. Since they can do this by examining the actual data using Exeros DataMapper Mapping Studio, even this manual element proceeds significantly faster and more smoothly.

1. First data source added to data warehouse. Most often, the first phase of a data warehouse project involves adding data from one source to the warehouse. Exeros DataMapper can be useful here to help identify data relationships within the first data source that may not be obvious to a designer unfamiliar with the data source (e.g., a data warehouse architect may not know every possible SAP R/3 table that could be useful in an inventory analysis data warehouse). Exeros DataMapper can also be used to validate the mappings after they have been created. This step is frequently the easiest, since the data warehouse often uses the structure and the codes of the original system.
2. Second data source added to data warehouse. In the second phase, an additional source is added to the warehouse. This is usually the most difficult part of a data warehousing project, as the new source needs to be reconciled against the existing data (from the original source). It is at this point that Exeros DataMapper becomes most useful, as it automatically discovers source-to-target maps from the second source to the data warehouse. It generates transformation logic that can be used by ETL tools or SQL scripts and creates cross-reference tables to hold data values used to relate keys and codes between the various source systems. These will be used for lookup during data loading processes.


3. Third and following data sources added to data warehouse. In the following phases, additional sources are added to the warehouse using a process similar to step 2. However, as more data sources are used in a data warehouse, the complexity increases geometrically, and Exeros DataMapper becomes even more valuable in automating and managing this complexity.
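The use of cross-reference tables for lookup during loading, mentioned in step 2, can be sketched as a reconciliation pass over rows from a newly added source. The Python below is a minimal illustration with hypothetical column names and lookup contents: rows whose keys and codes resolve are translated into warehouse terms, and the rest are set aside for an analyst.

```python
def reconcile(source_rows, key_xref, code_xref):
    """Split rows from a new source into loadable rows and rows needing analyst review."""
    loadable, unresolved = [], []
    for row in source_rows:
        dw_key = key_xref.get(row["customer_no"])
        dw_status = code_xref.get(row["status"])
        if dw_key is None or dw_status is None:
            unresolved.append(row)  # no cross-reference entry yet
            continue
        loadable.append({
            "customer_key": dw_key,
            "status_code": dw_status,
            "amount": row["amount"],
        })
    return loadable, unresolved


# Hypothetical cross-reference tables relating the second source to the warehouse.
key_xref = {"A-17": 5001}
code_xref = {"ACT": "A", "CLS": "C"}

second_source = [
    {"customer_no": "A-17", "status": "ACT", "amount": 120.0},
    {"customer_no": "B-99", "status": "ACT", "amount": 45.5},
]

loadable, unresolved = reconcile(second_source, key_xref, code_xref)
print(loadable)
print(unresolved)
```

Separating unresolved rows rather than failing the load keeps each additional source from blocking the others, which matters more as the number of sources grows.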

Summary

At customer sites, Exeros DataMapper has been shown to reduce the time and resources required to deploy IT integration projects by up to 80%. As the only software product to examine the data values themselves, instead of relying on metadata or specifications for integration planning, Exeros DataMapper is a pioneer in the IT integration marketplace.

Exeros DataMapper can accelerate time to deployment for many IT projects, including:

• Legacy migration
• Application deployment and integration
• Data consolidation
• Merger or acquisition
• Data mart consolidation
• Data warehousing
• Data warehouse creation
• Data warehouse augmentation


About Exeros and the Author

Exeros is based in Santa Clara, CA. The company was founded in 2002 and is led by a team of seasoned enterprise software executives.

Alex Gorelik serves as CTO and VP of Engineering at Exeros. Prior to starting Exeros, Alex Gorelik worked with a number of large IT organizations at companies such as IBM, Unilever and Jysk on their enterprise-wide data integration projects. Before that, Alex was a co-founder, CTO, VP of Engineering and President of Acta Technology (acquired by Business Objects in 2002), where he developed the industry-leading ActaWorks Real-Time data integration platform that combined ETL and EAI and propelled Acta into the leading visionary position in Gartner’s ETL Magic Quadrant.

Earlier, Alex was an architect at Sybase, developing Sybase’s EAI vision and architecture. He was also one of the original developers and later a manager of Sybase Replication Server, the first event-based, real-time data replication system. Prior to Sybase, Alex developed a database kernel at Amdahl.

Alex holds an MSCS from Stanford and a BSCS from Columbia University.

Exeros DataMapper, Exeros Discovery Engine, Exeros Databots, and Exeros Mapping Studio are trademarks of Exeros, Inc. All other brands or product names used herein are trademarks or registered trademarks of their respective owners.

Published in Santa Clara, California © 2005, by Exeros, Inc. All rights reserved.

Phone: 408.919.0191

web: www.exeros.com

email: [email protected]