A JDBC driver supporting Data Source Integration Jian Jia

Page 1: A JDBC driver supporting Data Source Integration

A JDBC driver supporting Data Source Integration

Jian Jia

Page 2: A JDBC driver supporting Data Source Integration

Motivation

• Manually integrating information sources is painful because of:
  – Heterogeneous data sources
  – Inconsistent / incomplete data structure information
  – Platform dependency

Page 3: A JDBC driver supporting Data Source Integration

A Solution

• Unity
  – Automates the integration process
  – Uses ODBC to access multiple data sources
  – Uses X-Spec to capture the semantic meaning of the data (a standard dictionary)
  – Written in C++
  – Runs on the Windows platform

Page 4: A JDBC driver supporting Data Source Integration

Goal of this Project

• Unity JDBC Driver
  – Embed the integration function of Unity into a JDBC driver
  – X-Spec as the dictionary
  – In Java: platform independent
  – Access multiple data sources through JDBC

Page 5: A JDBC driver supporting Data Source Integration

Migrating Unity from ODBC to JDBC

Unity (ODBC):
User Queries → Semantic Queries → Integration Module → SQL → ODBC → DB1, DB2, …, DBn
Results are returned to the user along the same path.

Unity JDBC Driver:
User Queries → Semantic Queries → Integration Module → SQL → JDBC → DB1, DB2, …, DBn
Results are returned to the user along the same path.

Page 6: A JDBC driver supporting Data Source Integration

Basic classes of JDBC API

• DriverManager registers Driver
• Driver provides Connection
• Connection creates Statement
• Statement retrieves ResultSet
• ResultSet provides ResultSetMetaData
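The chain of classes above can be sketched in a few lines of standard `java.sql` code. This is only an illustration of how the classes hand off to one another; the class name `JdbcChainSketch` and the helper `columnLabels` are made up for this example, and the JDBC URL is whatever a driver on the classpath accepts.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

final class JdbcChainSketch {

    // Runs a query and returns the column labels reported by ResultSetMetaData.
    static List<String> columnLabels(String jdbcUrl, String sql) throws SQLException {
        // DriverManager asks the registered Drivers for a Connection to this URL.
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();      // Connection creates Statement
             ResultSet rs = stmt.executeQuery(sql)) {       // Statement retrieves ResultSet
            ResultSetMetaData meta = rs.getMetaData();      // ResultSet provides its metadata
            List<String> labels = new ArrayList<>();
            for (int i = 1; i <= meta.getColumnCount(); i++) { // JDBC columns are 1-based
                labels.add(meta.getColumnLabel(i));
            }
            return labels;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.err.println("usage: JdbcChainSketch <jdbc-url> <sql>");
            return;
        }
        System.out.println(columnLabels(args[0], args[1]));
    }
}
```

If no registered driver accepts the URL, `DriverManager.getConnection` throws an `SQLException`, which is the point where a driver such as the one described in this deck would plug in.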

Page 7: A JDBC driver supporting Data Source Integration

JDBC Driver Types

• JDBC-ODBC Bridges plus ODBC drivers

• Native-API partly-Java drivers

• JDBC-Net pure Java drivers

• Native-protocol pure Java drivers

Page 8: A JDBC driver supporting Data Source Integration

Semantic Query

• An example
    SELECT [Employee].id, [Department;Employee].name
    WHERE [Employee].age > 30

  – All fields/tables that have the same semantic meaning should have the same semantic name.
  – A semantic query refers to a field by its semantic name.
  – There are no explicit relation specifications (FROM table, JOIN, UNION) in the query.
  – The X-Spec document stores all semantic names and the corresponding system names for every field/table.
  – No nested queries.
  – A semantic query is parsed to create a sub-query in standard form (SQL-92) for each data source.
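As a rough illustration of the first parsing step, the semantic names in the example query can be picked out with a regular expression. This sketch assumes that a semantic name always has the shape `[context;context…].field` as in the example; the class `SemanticNameScanner` is hypothetical and is not part of Unity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SemanticNameScanner {

    // Assumed shape of a semantic name: "[context;context...].field".
    private static final Pattern SEMANTIC_NAME =
            Pattern.compile("\\[([^\\]]+)\\]\\.(\\w+)");

    // Returns every semantic name in the query, in order of appearance.
    static List<String> semanticNames(String query) {
        List<String> names = new ArrayList<>();
        Matcher m = SEMANTIC_NAME.matcher(query);
        while (m.find()) {
            names.add(m.group());
        }
        return names;
    }

    public static void main(String[] args) {
        String query = "SELECT [Employee].id, [Department;Employee].name "
                + "WHERE [Employee].age > 30";
        for (String name : semanticNames(query)) {
            System.out.println(name);
        }
    }
}
```

Each name found this way would then be looked up in the X-Spec document to obtain the system name used by a particular data source.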

Page 9: A JDBC driver supporting Data Source Integration

Semantic Query Grammar

SELECT    ALL | DISTINCT    [Column] , [Column] , …
FROM      Tables
WHERE     Search Condition
GROUP BY  Columns
ORDER BY  [Column] DESC | ASC , …

Page 10: A JDBC driver supporting Data Source Integration

Semantic Query Grammar

Search Condition

• Expression [NOT] LIKE "[%] String"
• Column IS [NOT] NULL
• (Expression)
• Expression  = | < | > | <>  Expression
• Conditions are combined with OR, AND, NOT

Page 11: A JDBC driver supporting Data Source Integration

Parsing X-Spec

• X-Spec Table 1, X-Spec Table 2, X-Spec Table 3, …, X-Spec Table n
• Each X-Spec table contains:
  – X-Spec Field 1, X-Spec Field 2, …, X-Spec Field k
  – X-Spec Key 1, X-Spec Key 2, …, X-Spec Key j
  – X-Spec Joins …

Page 12: A JDBC driver supporting Data Source Integration

Parsing Semantic Query

Semantic query → Query Translator

• PASS ONE: split the semantic query into S_list, F_list, C_list, GroupBy_list and OrderBy_list; collect the selected X-Spec fields.
• PASS TWO: map semantic names to system names; build the sub-queries.
• Output:
  – Sub Query 1, Sub Query 2, …, Sub Query n
  – Selected Fields (Sys_Name), used for integration

Page 13: A JDBC driver supporting Data Source Integration

Sub-Query Generation

• S-List (Selection List): only those semantic fields that are present in the data source are replaced by their system names and added to the corresponding sub-query's selection list.
• C-List (Condition List): an expression is added to a sub-query's condition list only when all of its semantic arguments are in the data source.
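The two rules can be sketched as follows. This is a minimal illustration under stated assumptions, not Unity's actual translator: the class `SubQuerySketch`, the `Condition` holder and the naive text substitution are invented for this example. The field and table names follow the deck's simple example (F1 to F5, tableA, tableB), while the two concrete conditions are hypothetical stand-ins for C1 and C2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubQuerySketch {

    // A condition plus the semantic names it mentions (hypothetical representation).
    static final class Condition {
        final String text;
        final List<String> semanticArgs;
        Condition(String text, List<String> semanticArgs) {
            this.text = text;
            this.semanticArgs = semanticArgs;
        }
    }

    // Builds the sub-query for one data source; returns null if no selected field lives there.
    static String build(String table, Map<String, String> toSystemName,
                        List<String> sList, List<Condition> cList) {
        // S-List rule: keep only the semantic fields this source knows, by system name.
        List<String> select = new ArrayList<>();
        for (String semantic : sList) {
            String system = toSystemName.get(semantic);
            if (system != null) select.add(system);
        }
        if (select.isEmpty()) return null;

        // C-List rule: keep a condition only if every semantic argument is in this source.
        List<String> where = new ArrayList<>();
        for (Condition c : cList) {
            if (toSystemName.keySet().containsAll(c.semanticArgs)) {
                String text = c.text;
                for (String arg : c.semanticArgs) {
                    // Naive textual substitution; a real translator would rewrite parsed tokens.
                    text = text.replace(arg, toSystemName.get(arg));
                }
                where.add(text);
            }
        }

        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", select))
                .append(" FROM ").append(table);
        if (!where.isEmpty()) sql.append(" WHERE ").append(String.join(" AND ", where));
        return sql.toString();
    }

    // Two data sources modelled on the deck's simple example; the conditions are made up.
    static List<String> demo() {
        List<String> sList = Arrays.asList("F1", "F2", "F3", "F4", "F5");
        List<Condition> cList = Arrays.asList(
                new Condition("F3 > 10", Arrays.asList("F3")),
                new Condition("F4 = 0", Arrays.asList("F4")));
        Map<String, String> db1 = new LinkedHashMap<>();
        db1.put("F1", "f1a"); db1.put("F3", "f3a"); db1.put("F5", "f5a");
        Map<String, String> db2 = new LinkedHashMap<>();
        db2.put("F1", "f1b"); db2.put("F2", "f2b"); db2.put("F4", "f4b");
        return Arrays.asList(build("tableA", db1, sList, cList),
                             build("tableB", db2, sList, cList));
    }

    public static void main(String[] args) {
        for (String sql : demo()) System.out.println(sql);
    }
}
```

Because each condition mentions fields from only one source, it lands in exactly one sub-query. A condition whose arguments span both sources would be left out of both and would have to be evaluated during integration.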

Page 14: A JDBC driver supporting Data Source Integration

Sub-Query Generation

• Inputs: S-List, C-List, OrderBy-List
• Outputs: Sub-S-Clause, Sub-From-Clause, Sub-Where-Clause, Sub-Order By-Clause

Page 15: A JDBC driver supporting Data Source Integration

Inside JDBC Driver

• Semantic query → Query Translator → Sub Query 1, Sub Query 2, …, Sub Query n, plus Selected Fields (Sys_Name)
• Sub Query i → JDBC i → DBi → ResultSet i with its ResultSetMetaData (for i = 1 … n)
• ResultSet 1, ResultSet 2, …, ResultSet n → Integration (Join / Union) → ResultSet and ResultSetMetaData

Page 16: A JDBC driver supporting Data Source Integration

Integration Method

• JOIN
  – Merge join by global keys
  – MultiValue field: inconsistent data
• UNION
  – Simply append one ResultSet to another
  – No need to match keys

Page 17: A JDBC driver supporting Data Source Integration

A Simple Example

• Semantic query (for two data sources)
    SELECT F1, F2, F3, F4, F5
    WHERE C1 AND C2

• Sub-query for DB1
    SELECT f1a, f3a, f5a
    FROM tableA
    WHERE C1

• Sub-query for DB2
    SELECT f1b, f2b, f4b
    FROM tableB
    WHERE C2

Page 18: A JDBC driver supporting Data Source Integration

Example Result

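• Join result
  Assume KEY_A and KEY_B are the two system names of the global KEY in the two data sources.

DB 1: columns KEY_A, f1a, f3a, f5a; rows with keys 1, 2, 3
DB 2: columns KEY_B, f1b, f2b, f4b; rows with keys 2, 3, 4

Integrated result (x = value present):

KEY  F1  F2    F3    F4    F5
1    x   NULL  x     NULL  x
2    x   x     x     x     x
3    x   x     x     x     x
4    x   x     NULL  x     NULL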
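The join result can be reproduced with a small in-memory sketch. It assumes each source's rows are already keyed by the global key and labelled with semantic names. It uses a sorted map rather than a true sort-merge pass, and where both sources supply the same field it simply keeps the first value instead of building a multi-value field. The class `KeyJoinSketch` and the cell values are placeholders, not data from the slides.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class KeyJoinSketch {

    // Joins per-source rows on the global key. Each source maps key -> (semantic field -> value).
    static Map<Integer, Map<String, Object>> joinByKey(
            List<String> fields, List<Map<Integer, Map<String, Object>>> sources) {
        Map<Integer, Map<String, Object>> result = new TreeMap<>();
        for (Map<Integer, Map<String, Object>> source : sources) {
            for (Map.Entry<Integer, Map<String, Object>> row : source.entrySet()) {
                Map<String, Object> merged = result.get(row.getKey());
                if (merged == null) {
                    merged = new LinkedHashMap<>();
                    for (String f : fields) merged.put(f, null); // absent fields stay NULL
                    result.put(row.getKey(), merged);
                }
                for (Map.Entry<String, Object> cell : row.getValue().entrySet()) {
                    // Simplification: keep the first non-null value seen for a field.
                    if (merged.get(cell.getKey()) == null) {
                        merged.put(cell.getKey(), cell.getValue());
                    }
                }
            }
        }
        return result;
    }

    // Helper: builds one row from alternating field names and values.
    static Map<String, Object> row(Object... fieldValuePairs) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < fieldValuePairs.length; i += 2) {
            r.put((String) fieldValuePairs[i], fieldValuePairs[i + 1]);
        }
        return r;
    }

    // DB 1 holds keys 1..3 (F1, F3, F5); DB 2 holds keys 2..4 (F1, F2, F4). Values are placeholders.
    static Map<Integer, Map<String, Object>> demo() {
        Map<Integer, Map<String, Object>> db1 = new TreeMap<>();
        for (int k = 1; k <= 3; k++) {
            db1.put(k, row("F1", "f1a-" + k, "F3", "f3a-" + k, "F5", "f5a-" + k));
        }
        Map<Integer, Map<String, Object>> db2 = new TreeMap<>();
        for (int k = 2; k <= 4; k++) {
            db2.put(k, row("F1", "f1b-" + k, "F2", "f2b-" + k, "F4", "f4b-" + k));
        }
        return joinByKey(Arrays.asList("F1", "F2", "F3", "F4", "F5"), Arrays.asList(db1, db2));
    }

    public static void main(String[] args) {
        for (Map.Entry<Integer, Map<String, Object>> e : demo().entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Keys found in only one source (1 and 4 here) keep NULL in the columns that the other source would have supplied, which matches the shape of the result table.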

Page 19: A JDBC driver supporting Data Source Integration

Example Result

• Union

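DB 1: columns f1a, f3a, f5a (three rows)
DB 2: columns f1b, f2b, f4b (three rows)

Integrated result (x = value present):

F1  F2    F3    F4    F5
x   NULL  x     NULL  x      (from DB 1)
x   NULL  x     NULL  x      (from DB 1)
x   NULL  x     NULL  x      (from DB 1)
x   x     NULL  x     NULL   (from DB 2)
x   x     NULL  x     NULL   (from DB 2)
x   x     NULL  x     NULL   (from DB 2)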
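Union integration is simpler and can be sketched the same way: rows are appended source by source, and any semantic field a source does not provide is padded with NULL. As before, the class `UnionSketch` and the cell values are placeholders introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnionSketch {

    // Appends the rows of each source, padding fields that a source lacks with null.
    static List<Map<String, Object>> union(List<String> fields,
                                           List<List<Map<String, Object>>> sources) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (List<Map<String, Object>> source : sources) {
            for (Map<String, Object> row : source) {
                Map<String, Object> padded = new LinkedHashMap<>();
                for (String f : fields) {
                    padded.put(f, row.get(f)); // null when the source has no such field
                }
                result.add(padded);
            }
        }
        return result;
    }

    // Three rows from each of two sources; DB 1 has F1, F3, F5 and DB 2 has F1, F2, F4.
    static List<Map<String, Object>> demo() {
        List<Map<String, Object>> db1 = new ArrayList<>();
        List<Map<String, Object>> db2 = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            Map<String, Object> a = new LinkedHashMap<>();
            a.put("F1", "f1a-" + i);
            a.put("F3", "f3a-" + i);
            a.put("F5", "f5a-" + i);
            db1.add(a);
            Map<String, Object> b = new LinkedHashMap<>();
            b.put("F1", "f1b-" + i);
            b.put("F2", "f2b-" + i);
            b.put("F4", "f4b-" + i);
            db2.add(b);
        }
        return union(Arrays.asList("F1", "F2", "F3", "F4", "F5"), Arrays.asList(db1, db2));
    }

    public static void main(String[] args) {
        for (Map<String, Object> row : demo()) System.out.println(row);
    }
}
```

No key matching takes place, so the result has one row per input row: six rows for the two three-row sources in this sketch.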

Page 20: A JDBC driver supporting Data Source Integration

Future Work

• Operations across data sources
• Complete algorithms for result integration
• Automated updates on heterogeneous data sources
• Implement GROUP BY and FROM in semantic queries

Page 21: A JDBC driver supporting Data Source Integration

Questions?