Upload
simba-technologies
View
110
Download
0
Embed Size (px)
Citation preview
Open ConnectivityBI, Integration, and Apps on Couchbase using ODBC
and JDBCJune, 2015
• Who I am• Driver(s) Overview• Relational Model• Demos• ODBC in-depth• JDBC in-depth• QA
Contents
• Worked in the data access space for eight years and counting• ODBC, OLEDB, JDBC, ADO.NET, ODBO, XMLA
• Core developer for the current generation of Simba’s data access technologies
• Collaborated at an Engineering Level with Simba ISV Customers to design and implement data drivers that are today being shipped world wide
Kyle at a glance
• Simba connects people to data.• HQ’ed in Vancouver, BC.• 100ish employees.• Founded in 1991.• In 1992, Simba co-authored the original ODBC standard with Microsoft.• Simba produces the SimbaEngine® SDK and drivers for the leading
data sources on multiple platforms.
Simba Technologies at a glance
Simba Technologies at a glance
• Partnership to create read/write ODBC and JDBC drivers• ODBC 3.80• JDBC 4.0 and 4.1
• Allow easy access to data within Couchbase from your favourite BI and ETL tools
Why is Simba here?
What is an ODBC / JDBCDriver?N1QL mode to alloweasy and advancedanalytics
• Couchbase is NoSQL• Dynamic schema, documents vary within a bucket
• ODBC and JDBC are SQL• Expect a fixed schema, each column is one type
• Must map from dynamic schema data to fixed schema data
Schema(less)
• SQL• Catalog, Schema, Table
• Couchbase• Namespace, Keyspace
Schema => NamespaceTable => Keyspace (sort of)
Relational Mapping
http://www.prabathsl.com/2013/02/document-oriented-database_14.html
Sample JSON Document:{“Id” : 1, “Name”: “Couchbase”, “Values” : [V1,V2]}
Simple Flattening
Id Name Values[0] Values[1]
1 Couchbase V1 V2
Sample JSON Document:{“Id” : 1, “Name”: “Couchbase”, “Values” : [V1,V2]}
Parent
Child
Re-Normalization
Id Name1 Couchbase
Id Index Value
1 0 V1
1 1 V2
Demos
• C API• Versions: 2.x, 3.0, 3.52, 3.80, etc…• Non-Windows platforms are ODBC 3.52
• Driver Managers• Windows, iODBC, unixODBC, etc…
• All functions have return codes• SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_ERROR, etc…
ODBC Technicals
• SQLHENV, SQLHDBC, SQLHSTMT, SQLHDESC
• Relationship is one-to-many
ODBC: Handles
• Allocate with SQLAllocHandle, free with SQLFreeHandle• Ensure you set the version of ODBC in use
SQLSetEnvAttr(hEnv, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3_80, SQL_IS_INTEGER)
ODBC: Environment
• Allocate with SQLAllocHandle, created from SQLHENV• Maintains the actual connection to Couchbase• Create child statement objects to do work• Disconnect with SQLDisconnect, free with SQLFreeHandle
ODBC: Connection
Open a connection using SQLDriverConnectSQLDriverConnect(
hDbc, windowHandle, connStr, SQL_NTS, connStrLen&outStrLen,SQL_DRIVER_COMPLETE)
Last parameter allows driver to prompt for information if connStr doesn’t contain all necessary information.
ODBC: Connection
• Specify a Driver or DSN• Driver: Must specify all options in connection string• DSN: Can specify options in connection string
Example:“DSN=Couchbase;UID=kylep;PWD=testPassword;”
ODBC: Connection String
• Allocate with SQLAllocHandle, created from SQLHSTMT• Used for issuing queries, retrieving catalog metadata• Free with SQLFreeHandle
ODBC: Statement
• Use SQLExecDirect for one-off queries• Use SQLPrepare and SQLExecute for repeated queries
SQLExecDirect(hStmt, “<query>”, SQL_NTS)
orSQLPrepare(hStmt, “<query>”, SQL_NTS)SQLExecute(hStmt)
ODBC: Querying
• After execution, there is a cursor on the results, positioned before the first row
• Use SQLFetch to move the cursor• SQLGetData can be used to fetch cell by cell• SQLBindCol can also be used, much more efficient
ODBC: Results
• Reuse connections so driver caches are effective
• Use SQLBindCol over SQLGetData
• Use array fetches with SQLBindCol
• Use SQLPrepare once, SQLExecute multiple times with parameters for loading data
• Use parameter arrays when binding parameters with SQLBindParameter
• Bind types to match reported parameter or column types
ODBC: Performance Tips
SQLGetData vs. SQLBindCol
10 million rows, 3 columns of wide char, integer, decimal values
ODBC: Performance Tips
Method Time (s)SQLGetData 141.428SQLBindCol 6.102SQLBindCol (array[100]) 2.747
• Is the Java version of ODBC• Versions: 3.0, 4.0, 4.1, etc…• JDBC version is tied to Java version
• Simba will supply JDBC 4.0 and 4.1 versions• There is a driver manager, but role is very limited• Errors are reported via exceptions, warnings via getWarnings()
JDBC Technicals
• Relationship is again one-to-many
JDBC: Object Hierarchy
• Never used directly by your code• Must be loaded by referencing using Class.forName()• FQCNs• com.simba.couchbase.jdbc4.Driver• com.simba.couchbase.jdbc41.Driver
• URL• “jdbc:couchbase://<host>:<port>/<schema>;UseN1QLMode=0/1”
JDBC: Driver
• ExampleString url = “jdbc:couchbase://localhost:8093/default;”Class.forName(“com.simba.couchbase.jdbc4.Driver”);Connection con = DriverManager.getConnection(url);
• With JDBC 4.0 and later, Class.forName() can be omitted
JDBC: Driver
• Used directly by your code• FQCNs• com.simba.couchbase.jdbc4.DataSource• com.simba.couchbase.jdbc41.DataSource
• Can be used to programmatically set connection properties by using functions instead of a connection string
• Allows use of advanced features such as PooledConnection• Used less often than Driver
JDBC: DataSource
• Can create child statement objects for queries• Can create DatabaseMetaData objects for metadata via getMetaData()
• Common pitfalls• Not closing connections; use a finally block to ensure it is closed• Not checking warnings, ever; call getWarnings() regularly• Creating a new connection for every operation; reuse connections where
possible
JDBC: Connection
• DatabaseMetaData• Created by Connection objects• Actually only one object per connection, cached and reused • Provides access to catalog metadata• getCatalogs(), getTables(), getColumns(), etc…
• Provides access to database metadata• getDatabaseProductVersion(), getIdentifierQuoteString(), etc…
JDBC: Database MetaData
• Created by Connection objects• Three different types• Statement – for one-off querying• PreparedStatement – for queries with parameters• CallableStatement – for stored procedures with output
parameters*
• Statement objects will eventually dispose of themselves once out of scope, but best practice is to close() them when done
JDBC: Statement Objects
• Cannot use parameters• Use execute(), executeQuery(), or executeUpdate() to
execute SQL or N1QL queries
JDBC: Statement
execute() ExampleStatement stmt = conn.createStatement();try {
if (stmt.execute(“select * from beer-sample”)) {ResultSet rs = stmt.getResultSet();rs.close();
}}finally {
stmt.close();}
JDBC: Statement
• For use with parameters• Can get metadata about results before execution with getResultSetMetaData()
• Can get metadata about parameters using getParameterMetaData()
• Use set*() functions to provide parameter values
JDBC: PreparedStatement
• When loading data, use batches• Set all required parameters for one execution• Call addBatch() to add the current set of parameters• Call executeBatch() to execute all added batches at once
• Set parameters as reported types to avoid conversion overhead in the driver
• Reuse the statement for multiple executions
JDBC: PreparedStatement
• Represents query results or catalog metadata• Describe the result set using getMetaData()• Move through result using next()• Can use isAfterLast(), isFirst(), isLast(), etc. to check cursor
position.
• Retrieve cell values using get*() methods• The driver supports all conversions between types listed by the JDBC spec• Try to retrieve as requested type to avoid conversion overhead• Remember to check wasNull() after calling get*() method
JDBC: ResultSet
ResultSet ExampleResultSet rs = stmt.executeQuery(“<query>”);try { int numColumns = rs.getMetaData().getColumnCount();
while (rs.next()) { for (int i = 0; i < numColumns; ++i) { System.out.println(rs.getString(i)); }}
}finally {
rs.close();}
JDBC: ResultSet
Q & A
simba.com