© 2016 IBM Corporation
Introducing Big SQL Federation
Created by C. M. Saracco, IBM Silicon Valley Lab, June 2016
Executive summary
§ What’s Big SQL federation?
− Integration technology for Hadoop and remote data sources
− Transparently query Big SQL (Hadoop) and RDBMS tables with standard SQL
− Query optimization, security mapping, other critical features built in
§ Why federate?
− Not always practical to move / replicate data from one source to another
− Hadoop programmers need access to corporate RDBMS data to enhance analytics, integrate public and proprietary data, etc.
§ What’s supported?
− Big SQL tables (and views) in DFS, HBase, or Hive warehouse
− RDBMS tables (and views) in Oracle, Teradata, MS SQL Server, DB2, Informix, Netezza, . . .
− Query data across all sources (project, restrict, join, union, wide range of sub-queries, wide range of built-in functions)
− INSERT INTO … SELECT FROM …
− Issue data-source specific SQL
− Collect statistics and inspect detailed data access plan
− . . . .
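As a rough illustration of the cross-source queries and INSERT INTO … SELECT FROM … listed above, the sketch below joins a Hadoop table with a remote RDBMS table. All object names (SALES_HDP, CUST_NICK, CUST_COPY_HDP, and their columns) are hypothetical. CUST_NICK is assumed to be a nickname already defined for a remote table; nicknames are covered later in the deck.

```sql
-- Hypothetical objects, for illustration only:
--   SALES_HDP      a Big SQL (Hadoop) table
--   CUST_NICK      a nickname that points to a remote RDBMS table
--   CUST_COPY_HDP  a Big SQL (Hadoop) table with matching columns

-- Join local Hadoop data with remote RDBMS data in one standard SQL statement
SELECT c.cust_name, SUM(s.amount) AS total_amount
FROM sales_hdp s
JOIN cust_nick c ON c.cust_id = s.cust_id
WHERE s.sale_year = 2016
GROUP BY c.cust_name;

-- Copy rows from the remote source into a local Hadoop table
INSERT INTO cust_copy_hdp (cust_id, cust_name)
SELECT cust_id, cust_name
FROM cust_nick;
```

The application only sees table-like names; where each one physically lives is resolved by the federation server.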
Agenda
§ Overview
− Key features
− When to federate
§ Technology
− Architecture
− Set up, usage examples
− Supported data sources
§ Summary
Big SQL query federation = virtualized data access
Transparent
§ Appears to be one source
§ Programmers don’t need to know how / where data is stored
Heterogeneous
§ Accesses data from diverse sources
High Function
§ Full query support against all data
§ Capabilities of sources as well
Autonomous
§ Non-disruptive to data sources, existing applications, systems
High Performance
§ Optimization of distributed queries
[Figure: SQL tools and applications reach the data sources through virtualized data]
When to federate…
Physical integration not always a requirement/option
Barriers
§ Budget
§ Resources
§ Time
§ Ownership

§ Too ad hoc, temporary
§ Too proprietary
§ Too recent
§ Too big
Federation architecture and components
[Figure: federation server (Big SQL) with its wrapper, servers, nicknames, and federation catalog]
Federated server: Big SQL database enabled for federation.
Wrapper: library that allows access to a particular class of data sources or protocols (Net8, DRDA, etc.). Contains information about data source characteristics.
Server: represents a specific data source.
Nickname: a local alias to data on a remote server (e.g., a specific table or view).
Federation catalog
§ Stores information about
− Wrappers, servers, nicknames
− Server attributes
− Nickname attributes
− Remote functions
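One way to look at what the federation catalog holds is to query the catalog views for wrappers, servers, and nicknames. The sketch below uses the SYSCAT views documented for DB2 for Linux, UNIX, and Windows. Assuming that Big SQL exposes the same views is an inference from its DB2 heritage, not something the deck states.

```sql
-- Assumption: the DB2 LUW catalog views below are available in Big SQL.

-- Registered wrappers and the client library each one loads
SELECT wrapname, library
FROM syscat.wrappers;

-- Defined servers (data sources) and the wrapper each one uses
SELECT servername, servertype, serverversion, wrapname
FROM syscat.servers;

-- Nicknames and the remote objects they point to
SELECT tabschema, tabname, servername, remote_schema, remote_table
FROM syscat.nicknames;
```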
Federation in practice
§ Admin enables federation
§ Apps connect to Big SQL database
§ Nicknames look like tables to the app
§ Big SQL optimizer creates global data access plan with cost analysis, query push down
§ Query fragments executed remotely
[Figure: an application connects to bigsql; in the federation server (Big SQL), the cost-based optimizer works over a table and nicknames to produce local + remote execution plans; wrappers and client libraries send the remote fragments to the remote sources in their native dialect]
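To see how the optimizer splits a query into local and remote work, the access plan can be captured with the DB2 explain facility. The sketch below assumes that Big SQL supports the standard DB2 explain tables and the EXPLAIN statement, and it reuses the NICK1 nickname from the deck's Oracle example. In DB2 federation plans, the part of the query pushed down to the data source usually appears under a SHIP operator.

```sql
-- Assumption: the standard DB2 explain facility is available in Big SQL.

-- One-time setup: create the explain tables in the SYSTOOLS schema
CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C',
     CAST(NULL AS VARCHAR(128)), CAST(NULL AS VARCHAR(128)));

-- Capture the access plan without running the query
EXPLAIN PLAN FOR
SELECT * FROM NICK1 WHERE COL1 < 10;

-- List the captured operators, most recent explain first;
-- a SHIP operator marks a fragment sent to the remote source
SELECT operator_id, operator_type, total_cost
FROM systools.explain_operator
ORDER BY explain_time DESC, operator_id;
```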
Creating and using federated objects (example)
-- Create wrapper to identify client library (Oracle Net8)
CREATE WRAPPER ORA LIBRARY 'libdb2net8.so';

-- Create server for Oracle data source
CREATE SERVER ORASERV TYPE ORACLE VERSION 11 WRAPPER ORA
  AUTHORIZATION "orauser" PASSWORD "orauser"
  OPTIONS (NODE 'TNSNODENAME', PUSHDOWN 'Y', COLLATING_SEQUENCE 'N');

-- Map the local user 'orauser' to the Oracle user 'orauser' / password 'orauser'
CREATE USER MAPPING FOR orauser SERVER ORASERV
  OPTIONS (REMOTE_PASSWORD 'orauser');

-- Create nickname for Oracle table / view
CREATE NICKNAME NICK1 FOR ORASERV.ORAUSER.TABLE1;

-- Query the nickname
SELECT * FROM NICK1 WHERE COL1 < 10;
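The deck also mentions issuing data-source specific SQL. In DB2 federation this is normally done with a pass-through session. The sketch below assumes that Big SQL supports the same SET PASSTHRU statement and that this is the feature the deck refers to. It reuses the ORASERV server and the ORAUSER.TABLE1 table from the example above. ROWNUM is Oracle syntax, shown only to make the point that the statement is not parsed by Big SQL.

```sql
-- Assumption: DB2-style pass-through sessions are available in Big SQL.

-- Open a pass-through session to the Oracle server defined earlier
SET PASSTHRU ORASERV;

-- This statement is sent to Oracle as written, in Oracle's own dialect
SELECT COL1 FROM ORAUSER.TABLE1 WHERE ROWNUM <= 5;

-- Close the session and return to normal federated processing
SET PASSTHRU RESET;
```

Inside a pass-through session, statements refer to remote object names rather than nicknames, because the data source resolves them directly.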
Data sources supported by Big SQL Federation Server
§ Current list of supported data sources available at https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495

Data Source          | Supported Versions                                    | Notes
DB2®                 | DB2 for Linux, UNIX, and Windows 9.7, 9.8, 10.1, 10.5 |
                     | DB2 for z/OS 8.x, 9.x, and 10.x                       |
Oracle               | 11g, 11g R1, 11g R2, 12c                              |
Teradata             | 12, 13, 14                                            | Not supported on POWER systems.
Netezza              | 4.6, 5.0, 6.0, 7.2                                    | Not supported on POWER systems.
Informix             | 11.5                                                  |
Microsoft SQL Server | 2012, 2014                                            |