Usage Landscape of Enterprise Open Source Data Integration

Table of Contents

Introduction
Background
Diverse Data Integration Projects
Data Integration Needs and Tools
Open Source Data Integration vs. Proprietary Solutions
Enterprise Requirements
Community Support
Community Involvement
Conclusion

Talend White Paper Usage Landscape - Enterprise Open Source Data Integration

Introduction

Enterprise data integration needs are growing exponentially over time, as is the interest in open source technologies and the adoption of open source solutions.

With this in mind, Talend conducted a survey to define the usage landscape of open source data integration and to profile users of this technology. The data used in this analysis was collected from 1,013 survey participants. Responses came primarily from the U.S. (56.5%), followed by Europe (35.2%), with the remaining 8.3% originating in the rest of the world.

Figure: Survey respondents’ demographics (US 57%, Europe 35%, Other 8%)

Background

As companies merge, acquire new applications, and build their IT platforms by incorporating disparate applications with legacy systems, information systems are becoming more and more heterogeneous. As a result, data integration tools are now indispensable if enterprise IT departments are to properly manage the flows of data across the information system.

In addition, alternative models of software deployment, such as Software as a Service (SaaS), and the need for interoperability with partners, customers, and providers all have an important impact on data integration requirements.

The global economy is imposing cost controls on IT managers, both in terms of staff and software, at a time when data integration represents an increasingly large percentage of the enterprise IT budget. Asked to do more with less, IT personnel would be better off spending cycles on tasks other than the time-consuming manual scripting needed to meet custom requirements. In fact, software resources with lower acquisition and operation costs would allow IT managers to deploy enterprise-grade solutions more easily.

In this context, open source solutions offer a very compelling argument. Open source tools can automate and maintain tasks formerly requiring manual scripts, and the existing skills of the IT implementation team easily transfer to an open source offering. In addition, IT departments don’t have to justify significant up-front fees.

Diverse Data Integration Projects

Data integration is the collective term for technologies that include ETL (Extract-Transform-Load) for business intelligence and data warehousing, and operational data integration, that is, the flows of data across operational applications and systems. These needs can range from high-throughput batch transfers of data to near-real-time, trickle-feed data flows.
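To make the ETL pattern described above concrete, the following is a minimal Python sketch. It is not taken from the survey or from any particular product. The sample data, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and the sketch relies only on the standard-library `csv` and `sqlite3` modules with an in-memory database.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file or an application export.
RAW_CSV = """customer_id,country,amount
1,us,10.50
2,fr,7.25
3,us,4.00
"""


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize country codes and convert fields to typed values."""
    return [
        (int(row["customer_id"]), row["country"].upper(), float(row["amount"]))
        for row in rows
    ]


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run the three steps against an in-memory database and return totals per country."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(RAW_CSV)))
    query = "SELECT country, SUM(amount) FROM sales GROUP BY country ORDER BY country"
    return conn.execute(query).fetchall()
```

Calling `run_etl()` returns the per-country totals read back from the target table. A batch job would typically run these three steps on a schedule over whole files, whereas a trickle-feed flow would apply the same steps to small increments as they arrive.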

Project Type

Consistent with the global data integration market distribution, whether open source or proprietary, most of the survey participants (61.5%) use open source solutions for their ETL projects, in particular for BI, data warehousing, and analytics. This can be attributed to the fact that ETL is the most mature segment of the entire data integration market.

Data Integration: The process of combining data residing at different sources and providing the user with a unified view of these data.

Figure: Types of projects for which open source data integration is used (bar chart, 0% to 70%; categories: ETL, Data Loading, Operational Data Integration: Batch, Migration, Operational Data Integration: Real Time, Database Synchronization)

Data loading (41.9%) and data migration (26.5%) are the second and fourth most popular types of project. Both of these are good candidates for open source solutions, as they are typically one-offs, with no ongoing purpose that would justify a long-term investment in an expensive proprietary tool.

Data synchronization (19.1%) is also a popular type of project conducted by open source data integration users.

Batch vs. Real-Time

Operational data integration, whether batch or real-time, is also a good fit for open source solutions. As business tempos speed up, real-time and near-real-time operational data integration projects will prevail over bulk transfer projects. As of the date of the survey, 40% of participants used open source tools to manage their batch operational data integration tasks, compared to only 22.9% for real-time projects, but the latter is a much faster-growing segment.

Data Synchronization: The process of establishing data consistency among remote sources and continually harmonizing the data over time.

Data Migration: The process of transferring data between databases, applications, or other systems, with the purpose of replacing one system with another.

Data Loading: The process of loading data into an application or database, for example prior to its deployment.

ETL vs. Operational Data Integration

Taken together, batch and real-time operational data integration projects (62.9%) are slightly better represented than ETL projects (61.5%), even though the former market segment is less mature. And, if we also add in data synchronization, the operational project share reaches 82%. The reason for this over-representation is simply that open source tools are particularly appropriate for operational projects because they meet a number of data integration requirements, whereas proprietary tools have traditionally focused on ETL. In addition, enterprises that want to diversify their data integration tools are often discouraged by the licensing costs of proprietary applications. Open source solutions offer a greater breadth of connectivity and more flexibility in terms of adoption, deployment, and maintenance.
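The combined figures quoted above can be reproduced from the per-project shares reported earlier in this paper. The short Python sketch below is only an illustration: the dictionary keys and the helper name `combined_share` are hypothetical, and the values are the percentages stated in the text. Because the individual shares total more than 100%, respondents evidently could select several project types, so a sum of shares counts category selections and is at most an upper bound on the proportion of distinct respondents.

```python
# Share of respondents per project type, as stated in the survey text (percent).
shares = {
    "ETL": 61.5,
    "Data loading": 41.9,
    "Operational data integration: batch": 40.0,
    "Migration": 26.5,
    "Operational data integration: real time": 22.9,
    "Database synchronization": 19.1,
}


def combined_share(categories):
    """Sum the reported shares for the given categories, rounded to one decimal."""
    return round(sum(shares[name] for name in categories), 1)


# Batch plus real-time operational data integration.
operational = combined_share([
    "Operational data integration: batch",
    "Operational data integration: real time",
])

# The same two categories plus database synchronization.
operational_with_sync = combined_share([
    "Operational data integration: batch",
    "Operational data integration: real time",
    "Database synchronization",
])

print(operational)            # 62.9
print(operational_with_sync)  # 82.0
```

The two printed values match the 62.9% and 82% figures given in the text, compared with 61.5% for ETL alone.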

Data Integration Needs and Tools

Although software companies are trying to provide unified integration solution packages, the data integration needs for most enterprises are so complex that they often need to multiply the number and nature of the integration software products they use.

Figure: Data integration technologies used in conjunction with open source (bar chart, 0% to 60%; categories: Manual scripting, Database utilities, Commercial software)

Survey participants reported using a combination of commercial applications, open source solutions, and database utilities to meet their data integration needs.

The statistics show that using open source and commercial solutions in combination is very common (31.2%), and that the two can, and do, coexist on the same platform. In fact, open source solutions are often complementary to an existing proprietary solution that, for whatever reason, cannot address a specific need. In some cases it may be that it’s not worth the expense of investing in a proprietary solution extension.

The high incidence of database utilities shown in the survey results (53.9%) is as expected: these utilities are a no-cost solution and are usually included with the databases. Their usefulness, however, is limited to dedicated database usage.

Applications are often stacked as needs arise, which increases connectivity issues, whether enterprises want their CRM system to communicate with their ERP module or their disparate databases to exchange information with their home-grown platform.

Faced with multiple connectivity issues, enterprises often have no option other than manual scripting to keep data flowing across their heterogeneous enterprise systems. This is why the survey results rank manual scripting as one of the technologies most frequently invoked (54.7%) by enterprises to meet their integration needs. Although this is much higher than the share for commercial packaged technologies (31.2%), it is not surprising that manual scripting is the solution of choice, as it carries the lowest initial cost.

Although manual scripting is often intended to be a short-term fix for interchange issues, once in production it often becomes a permanent solution. And, in the end, this simple stop-gap can become an entire home-grown platform. The drawback of hand coding or home-grown platforms surfaces over time in the inevitable maintenance problems that increase the total cost of ownership (TCO). The advantage, however, is that it fits a particular need that none of the available commercial or open source solutions can meet.

Open Source Data Integration vs. Proprietary Solutions

In an ongoing effort to lower their data integration software TCO, many enterprises are now considering open source solutions, not just for one-time projects, but also for their ongoing mission-critical processes, to replace or complement their expensive CPU-dependent solutions.

Figure: Decision criteria (stacked bar chart, 0% to 100%; criteria: Ease of use, Performance, Avoid lock-in, No licensing costs, Source code access; ratings: Very important, Important, Neutral, Not important)

Open source solutions are a real alternative to the proprietary world. Key players have made major strides toward improving the usability and friendliness of open source technologies, traditionally a weak spot for these applications.

In just a few short years, open source has evolved from something “geeky” into an enterprise-ready solution. Today, open source solutions are sufficiently feature-rich to meet complex user requirements. The survey results reflect these expectations.

Respondents rated ease of use (59%) and performance (53.9%) as the most important aspects of an open source data integration solution.

Surprisingly, licensing cost is not the gating criterion for enterprises turning to open source solutions. It actually comes fourth, after ease of use, performance, and avoiding lock-in (42.5%), with only 42.1% of respondents considering it very important.

Access to the source code comes last on most priority lists when enterprises are choosing open source tools.

It is a common misconception that control of the source code is important for users of open source software. Most users today understand that open source solutions are as mature as their proprietary counterparts and, therefore, don’t feel the need to enhance the code themselves.

Today, open source solutions are advantageously replacing the source code escrow of proprietary software. However, few enterprises want to allocate in-house resources (or even have the expertise) to edit, enhance, and maintain the code of their data integration applications.

Enterprise Requirements

An analysis of the survey data indicates that users expect the same performance and enterprise-scale features from open source solutions that they previously found only in proprietary products. In order of importance, these features include:

• centralized scheduling and execution dashboard
• shared repository
• administration tools

Figure: Enterprise open source data integration requirements (bar chart, 0% to 70%; categories: Scheduling tool, Dashboard, Shared repository, Administration tool)

First, 60.5% of respondents want a scheduling tool that lets them consolidate and centralize their technical processes. Second, 57.8% of users need a dashboard to centrally monitor processes as they execute. Because enterprise users often work in teams and need to share data on large-scale projects, 54.9% consider a shared repository essential. Finally, 38.4% of enterprise users want an administration tool to centrally manage users and projects.

However, not all companies have enterprise-scale requirements. Single users and SMBs might not need that sort of enterprise-grade feature. What emerges is that open source solutions address diverse needs for a variety of user profiles, whether large or small.

Community Support

As shown, enterprises want the same support with open source solutions that commercial applications provide. The major difference lies in the fact that a significant number of open source users (84.9%) would rather call on the community for help addressing issues than get support from a dedicated service. This lets them reduce the cost of support and decrease their data integration budget; the return they get from the community is comparable in quality to traditional support from a proprietary vendor.

Figure: Community vs. commercial support expectations (bar chart, 0% to 100%; categories: Community support (forums, etc.), Email-based or Web-based support, Guaranteed response times, Phone support)

Open source users value the forum and the other community tools at their disposal, as well as the peace of mind that comes from knowing that there is no pressure to upgrade or to buy new tools. The community also tends to be more responsive than traditional support services, and community tools come at no cost to the enterprise.

However, enterprise users working on mission-critical projects do need (and demand) vendor-provided, enterprise-grade technical support. This still represents a minority of the total number of users of open source data integration (20.9%), but it is a fast-growing proportion.

Community Involvement

Two-thirds of the respondents say that they are willing to actively participate in the community, and nearly half are ready to help beta-test open source products. Open source communities have a real, live QA lab of thousands at their disposal. Open source users appreciate getting support from the community and feel at ease in sharing their experiences and helping other users solve problems. Getting involved in the community helps ensure the sustainability of the open source arena and, by extension, the sustainability and the quality of the application they use.

Figure: Expectation for community contributions (bar chart, 0% to 80%; categories: Forum, Beta testing, Code contributions)

Other community tools, such as bug and feature tracking systems, are also broadly used by the community, especially for feature requests. Because the development cycle of open source applications is usually quite short, users know that the chances of getting a feature request developed and made available in the next release of an open source application are significantly greater than for a similar request in the proprietary domain. It’s a win-win situation. Community enterprises are asked to beta-test and report bugs on features that they requested previously, ensuring both quick access to these features and the quality of the developed application.

In addition, participating in the community is much less time-consuming than getting involved in the development itself. Only 10.4% of users want to contribute to code development. A closer look at this group indicates that most of them want to contribute external features, such as connectors, rather than core code.

Conclusion

The results of the survey clearly show that open source data integration solutions are mature enough for mission-critical enterprise use in every arena and that, in most areas, open source is as powerful as its proprietary counterparts.

Open source products are stable and continually evolving to meet market requirements. Their total cost of ownership is significantly lower than that of proprietary solutions, and users confirm the ease of use and performance of these products.

Open source data integration is indeed enterprise ready.

© 2009 Talend. All rights reserved.