50
DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications John Goodson Vice President, DataDirect, R&D

DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

Embed Size (px)

DESCRIPTION

DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications. John Goodson Vice President, DataDirect, R&D. Agenda. Topic: Understanding how drivers work and what factors influence performance Topic: Guidelines for improving performance Topic: Review. - PowerPoint PPT Presentation

Citation preview

Page 1: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

John GoodsonVice President, DataDirect, R&D

Page 2: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

2 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Agenda

Topic: Understanding how drivers

work and what factors influence

performance

Topic: Guidelines for improving

performance

Topic: Review

Page 3: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

3 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

At The End, You Should Understand:

How driver implementations trade off performance for functionality

How to design for performance

How to avoid pitfalls in some drivers

Page 4: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

4 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

ODBC

Provides a common API to all data stores Cross platform Initially a Microsoft standard based on a

draft pre-release of the X/Open SQL Access Group CLI

ODBC 3.0 aligns with the ISO and ANSI CLI standards

Page 5: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

5 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

JDBC

JDBC is a Java™ interface for accessing SQL data sources

Based on ODBC, Open Group SQL CLI, and ANSI/ISO CLI

JDBC 1.0 designed for basic usage with emphasis on ease of use

JDBC 2.0 enhanced for advanced features and server-side use

JDBC 3.0 goal was to “round out” API by providing missing functionality

Page 6: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

6 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Performance Issues

Developing code that works is hard enough without having to make it work fast!

Tough to know all the specifics of a single backend

No exceptions are thrown to say your code is running too slow

Most performance “help” is DBMS server focused … not middleware focused

Page 7: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

7 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

How Do Drivers Work?

Each database vendor defines a communication protocol of how the database server will communicate with clients

This communication protocol is defined in terms of packets that will be transferred using a network transport (i.e. information will be encoded into a specific database defined format that will be sent over the network (tcp-ip) to/from clients)

The database server only understands requests that are sent in this format

Page 8: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

8 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

How Do Drivers Work – Part 2

Every database vendor defines their packet descriptions and their model differently– Oracle and DB2 are cursor based models– SQL Server and Sybase are streaming models– Progress OpenEdge is a cursor based model

Every JDBC request will most likely result in many packets being sent to/from the database– Why?– (1) packets are fixed size and data might not fit in

one packet– (2) JDBC API does not map 1:1 with database

communication protocol

Page 9: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

9 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Why Do We Care How Drivers Work?

The two slowest things in a database system are– Network i/o– Disk i/o

The goal is to keep applications using the CPU and not waiting on movement of data across a network or to/from a storage device.

Page 10: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

10 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Will Applications Do?

– Connect– Look for data– Read data – Change data– Delete data– Make changes permanent (commit)– Disconnect

Page 11: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

11 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Happens When We Connect?

Login credentials verified by dbms Shared memory (allocated and) assigned to

connection– Depending on db between 24k and 64k

Various queues updated with connection presence

Connection recorded in audit files on disk Database context established Code page set Transaction information initialized Connection properties applied Initialization statements executed

Page 12: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

12 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Connecting

Connecting is the single slowest thing you can do in a well tuned application– Lots of network i/o’s … disk i/o’s … CPU cycles

Using connection pooling, we can make the slowest (network i/o and disk i/o intensive) operation in your application one of the fastest (using no network i/o or disk i/o)

With JDBC, Connection pooling requires use of JDBC DataSources … don’t connect via Driver Manager

Page 13: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

13 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Will Applications Do?

Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect

Page 14: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

14 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Looking for Data – Catalog Functions/DatabaseMetaData

Compared to other xDBC methods, catalog calls/DatabaseMetaData methods that return ResultSet objects are slow– getTables, getColumns, getProcedures, …

xDBC drivers have to execute complex queries – and many times multiple queries – in order to return the results mandated by the xDBC specs.

Page 15: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

15 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Getting a List of Tables

What you want: a list of all tables that I have starting with the prefix NYC

ResultSet WSrs = WSc.getTables (null, null, “NYC%”, null);

Page 16: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

16 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Getting a List of Tables (part 2)

What a driver thinks you want: A list of all tables, views, synonyms, system tables, aliases, or temporary tables owned by any schema in any catalog/database starting with the prefix NYC

ResultSet WSrs = WSc.getTables (“cat1”, “johng”, “NYC%”, “TABLE”);

Page 17: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

17 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Avoid Search Patterns for DatabaseMetaData Methods Using null arguments or search patterns to

Catalog/DatabaseMetaData methods results in time-consuming queries and, potentially, increases network traffic due to unwanted results.

Common places to see null that should be “fixed”– Schema name (most all)

– Table type (getTables/SQLTables)

– Catalog name (most all)

Page 18: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

18 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Oracle SQL … getProcedures … yuck!select a.owner, a.object_name, INSTR(a.object_type, 'PROC'), '1' from

sys.all_objects a where a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID'

union all select c.owner, c.synonym_name, INSTR(a.object_type, 'PROC'), '1' from sys.all_objects a, sys.all_synonyms c

where c.table_owner = a.owner and c.table_name = a.object_name and a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID'

union all select distinct b.owner, CONCAT(b.package_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b

where b.package_name IS NOT NULL group by b.owner, CONCAT(b.package_name, '.' || b.object_name)

union all select distinct c.owner, CONCAT(c.synonym_name, '.' || b.object_name), min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c

where c.table_owner = b.owner and c.table_name = b.package_name and b.package_name IS NOT NULL group by c.owner, CONCAT(c.synonym_name, '.' || b.object_name)

union all select distinct c.owner, c.synonym_name, min(b.position), max(b.overload) from sys.all_arguments b, sys.all_synonyms c

where c.owner = b.owner and c.table_owner=b.package_name and c.table_name=b.object_name group by c.owner, c.synonym_name

Page 19: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

19 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

getTypeInfo

Even getTypeInfo may result in a network round-trip– Do this once and cache the information

Page 20: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

20 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Use Prepared Statement Metadata Instead of getColumns (when possible)

getColumns/SQLColumns– driver sends a potentially complex SQL query to the server– SQL is preprocessed and executed on the server– driver must then obtain result set descriptions, bind results,

and retrieve rowset from the server– Usually at least 3 network i/o’s (probably more)

ResultSetMetaData/SQLDescribeCol– Simple query, which evaluates to no rows, sent to server –

NOT EXECUTED– Only result set information obtained– Usually 2 network i/o (prepare + close)

Page 21: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

21 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Example

PreparedStatement pstmt = WScon.prepareStatement (“select * from StockPurchasePlan where 0 = 1”);

ResultSetMetaData rsmd = pstmt.getMetaData();

int count = rsmd.getColumnCount;

// this code would replace

// getColumns (“ct1”, “johng”,

// “StockPurchasePlan”, “%”);

Page 22: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

22 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Will Applications Do?

Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect

Page 23: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

23 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Prepared Statements

Many times our applications do the same things over and over again by specifying SQL that is parameterized

SELECT firstname, lastname, address, phone FROM employees WHERE empid = ?

UPDATE balance SET balance = balance + ? WHERE acctid = ?

Page 24: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

24 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Prepared Statements (2)

Many times prepared statements are not as efficient as executing the SQL directly (using the Statement object in JDBC/SQLExecDirect in ODBC)

Use prepared statements if the query is to be executed more than twice in this session OR if it will be executed frequently by the system (JDBC)

Execute directly if the query is to be executed once or twice

Page 25: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

25 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Prepared Statement Side Effect

Side effect: There is no way to control what gets put in the prepared statement pool. If you create a prepared statement, then it’s going into the pool and might remove some other statement you want to stay in the pool.

Page 26: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

26 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Even More on Prepared Statements

Prepare, Execute, Describe– Always preferred– Saves network roundtrip with most databases

Prepare, Describe, Execute– Always slower

Page 27: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

27 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Stored Procedure Invocation

Use parameter markers for arguments to stored procedures … do not use literal arguments

{call getFunds (?,?,?)} // good

{call getFunds (1234, ‘JPG’, 53.23)} // bad

Page 28: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

28 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Why Not Use Literal Arguments?

The spec says not to use literals

More code for the programmer, but executions can be done via RPCs (remote procedure calls) instead of executed as language events

Use of RPCs improves performance by– Eliminating parameter processing on server– Parsing done on server

Page 29: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

29 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Example: {call foo (12,’abc’)} Server gets a generic SQL packet

– Parse SQL

– Semantically validate Does procedure foo exist? Is foo defined to have 2 arguments?

– SQL is always a Unicode character string Look up data type for foo’s parameter 1 (int) Convert ’12’ to an integer Look up data type for foo’s parameter 2 (varchar) Convert ‘abc’ to varchar

– Execute

Page 30: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

30 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Example:

{call foo (?,?)}

Server gets a specific stored procedure packet where the packet structure contains the arguments in the defined server format for the parameters– Execute only

Page 31: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

31 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Understand Cursor Types

Forward only cursors are always the fastest cursor type

(most of the time) static/insensitive scrollable cursors will retrieve all rows for the query to the client

(most of the time) keyset/sensitive scrollable cursors will retrieve all keys to the client, then each fetch (next) will result in an execute

Page 32: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

32 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Forward-Only Cursors driver will pre-fetch rows from the server

For small result sets, can fit entire contents of the result set in one network roundtrip

Side effects:– First fetch will be slower than subsequent fetches for

application– Cursor position is not where you “think” it is– Read Committed data might have already been read

Page 33: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

33 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Scroll Insensitive/Static Cursors

drivers will usually cache the entire contents of the result set on the client

“Lazy fetch” model … only fetch the data when you have to.

If you request a scroll sensitive cursor and the driver can’t support a scroll sensitive cursor, then it will downgrade to scroll insensitive

Not scalable!

Page 34: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

34 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Keyset/Scroll Sensitive Cursors

drivers usually cache a set of keys to the client

Example, ‘select * from table’ with Oracle– Results in the driver executing: ‘select rowid from

table’– Retrieves all rowids– Each call to SQLFetch/rs.next() will result in the driver

executing ‘select * from table where rowid = ?’ and retrieving the result

Page 35: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

35 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Use Batches Instead of Prepared Statements

Bulk load of data has typically been done by preparing an INSERT statement and executing many times.

Preparing an INSERT statement and executing 100 times:– 101 network round-trips

Using batch queries/addBatch() to compound 100 inserts:– 1 network round-trip

Network round-trips are more expensive than DBMS CPU cycles

Page 36: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

36 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Arrays

In ODBC, use SQLParamOptions to execute multiple parameterized queries with one network roundtrip

Not as easy in JDBC; however, use batch queries and eventually all drivers will optimize to use parameter arrays

Page 37: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

37 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Retrieving JDBC Data The hard work is not just positioning to the data, follow

a few more guidelines when putting data into buffers

Use getter methods with result column ordinals instead of with column names.

String fullname = rs.getString (3); // good

String fullname = rs.getString (“lname”); // bad

Page 38: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

38 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

JDBC Retrieving Data (2)

Only use getObject() unless it’s absolutely necessary.

String children = (String) rs.getObject (2); // bad

Int children = rs.getInt (2); // fast

Page 39: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

39 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Avoid Long Data (Including LOBs)

Select * from table– Easy! Most people do this!

– Slow! Even if you don’t ask for column data, a

performance penalty has been paid as the data has already been moved across the network to the client

Particularly bad when the table contains long varchar/long varbinary/LOB data

Compounded because the driver is trying to array fetch … which is might not be able to do

Page 40: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

40 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

CLOB/BLOB

Using CLOB and BLOB objects sparingly

Although you might think that only parts of the LOB are being transported across the network, most database wire protocols retrofit LOBs into existing packet structures

JDBC drivers emulate getClob/getBlob by retrieving the entire object locally

Page 41: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

41 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Various other Guidelines Use getGeneratedKeys()

– Saves network i/o’s

Don’t call rs.last() or fetch absolute last

Don’t write code that relies on the number of result rows from a query– Drivers have no idea how many rows a query will

return

– Drivers have to fetch all rows to know how many rows the query will return

Page 42: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

42 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

More guidelines Understand:

– setMaxRows

– setMaxFieldSize

Both limit network i/o by requesting that the server not send data across the wire

Side effects:– setMaxRows also applies to updates/deletes in some

databases

– These options will persist in the connection

Page 43: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

43 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Will Applications Do?

Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect

Page 44: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

44 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Transactions

Committing transactions is expensive!– Network i/o’s– Numerous non-sequential disk i/o’s

Don’t run in auto-commit mode

Distributed txs are at least 4 times as slow as local transactions

Use DTC/JTA only if required!

Page 45: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

45 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

What Will Applications Do?

Connect Look for data Read data Change data Delete data Make changes permanent (commit) Disconnect

Page 46: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

46 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Summary

Avoid:– Network i/o– Disk i/o

Think about what drivers are doing to process your requests

Stop lazy programming ... Fill in the null arguments

Page 47: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

47 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Summary

Avoid network I/O

Avoid disk I/O

Avoid sloppy programming

Page 48: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

48 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Questions?

Page 49: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

49 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation

Thank you for your time!

Page 50: DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications

50 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation