#IDUG
Continuous Availability and Scalability with DB2 for z/OS and WebSphere

Pallavi Priyadarshini, Architect, JCC – DB2 Connect, IBM, [email protected]
Maryela Weihrauch, Distinguished Engineer, DB2 for z/OS, IBM, [email protected]




Agenda

• DB2 z/OS concepts for availability

• Client configurations

• Enabling Sysplex Workload Balancing in WAS environments

• Algorithms for Workload Balancing and Automatic Client Reroute

• Recommended WAS, JCC and DB2 settings


DB2 Data Sharing - Availability

Goal: Continuous availability across any planned or unplanned outage

• Elimination of single point of failure
• Remove all causes for planned outages
  – rolling “on-line” upgrades and maintenance
  – online schema evolution
  – online utilities
• On a failure:
  – Isolate failure to lowest granularity possible
  – Automate recovery and recover fast

Challenge: the distributed application server needs to find the (best) available path to the data.


DVIPA and Sysplex Distributor – the Concept

• Needs to be configured at network layer and used for initial connections to DB2 group

• Group DVIPA provides a virtual TCPIP address into the Data Sharing Group

• Sysplex Distributor routes the connection request to the most available member based on WLM recommendation

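[Figure: a client connects to the group DVIPA Vx on port 446 (steps 1 to 3); the Sysplex Distributor (SD) distributes the connection to DB2B. Members DB2A (z/OS 1) and DB2B (z/OS 2) share a coupling facility (CF).]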


DVIPA and Sysplex Distributor – Consideration

• Benefits:
  – Connections are successful as long as one member is up
  – Connection-level workload balancing between members
  – Setup is isolated to the z/OS environment
• Drawbacks:
  – SD on one LPAR may route to a member on a different LPAR, which results in slightly higher response time compared to direct member access.
  – Information about availability of data sharing members is only considered at creation of a "new" connection, but application servers typically maintain long-running connections (e.g., WAS connection pool)

Both DVIPA and Sysplex Workload Balancing at the application server need to be enabled to ensure the highest availability characteristics in a distributed environment.


DB2 Client – Panoramic view

[Figure: DB2 client stack. Language environments (PHP, Python/Jython, Ruby/JRuby, JavaScript/node.js, Scala, Java) and their adapters and APIs (Zend framework adapters, SqlAlchemy/Django adapter, Rails adapter, Lift, node-odbc, JSON driver, JDBC API, pureQuery API, JSON API, Hibernate, JPA) sit on top of either the DB2 CLI and ODBC driver (C) or the DB2 JCC JDBC driver (Java), which connect to DB2.]


DB2 Client Configuration to Access DB2 z/OS – Strategy

• Clients access DB2 z/OS directly (will not change licensing model)

• Sysplex Workload Balancing is supported by DB2 Connect gateway, JCC type 4, and Data Server Driver for CLI and .NET applications

[Figure: C-based clients (CLI, Ruby, Perl, .NET) and Java-based clients (JDBC, SQLJ, pureQuery, ObjectGrid, Data Web services via JCC type 4) connect over DRDA to the members of the DB2 z/OS data sharing group (DB2 Group, with CF), either directly or through a DB2 Connect gateway.]


JCC Sysplex Support in WAS environment
• Set DataSource IP as DVIPA (initial connection to Sysplex Distributor)
• Connects applications to a DB2 data sharing group as though it were a single database server
• Global JCC property OR DataSource specific setting (enableSysplexWLB)

[Screenshot callout: Group DVIPA of data sharing group]


JCC Sysplex Support in WAS environment
• Custom properties available on Data Source


JCC Sysplex Support in WAS environment
• Set DataSource property enableSysplexWLB to TRUE


JCC Sysplex Support in WAS environment

• Enables Workload Balancing (WLB)
  – Workload spread amongst different members, based on server lists dynamically provided by Work Load Manager (WLM)
  – Transactions distributed amongst best available members for maximum throughput and scalability
• Enables Automatic Client Reroute (ACR)
  – On by default when WLB enabled
  – Enables client application to recover from loss of communication so that application continues with minimal interruption
  – When one member of the group fails, connection should be re-established to the next best available member of the group


Basic concepts underlying EnableSysplexWLB

• Server lists

– Available members identified by member IP address and port

– Member priority (Weight) that indicates capacity of each member to handle work relative to others

– Weights obtained from z/OS Workload Manager

– Returned whenever a connection is established or reused

• Connection pool

– Logical connection from the application to the driver

– No physical network connection while in pool

• Transport pool

– Physical connection from driver to DB2

– Global pool of Transport objects per JVM

– Connection reuses/changes transport at transaction boundary

– EnableSysplexWLB automatically enables use of transport pool


WebSphere Application Server - End-to-end Example

[Figure: inside the application server JVM, the application obtains logical connections (LogicalConn 1 to 3) from the DB connection pool through the resource adapter and JCA connection manager. The DB2 Universal Driver for JDBC/SQLJ maps them onto pooled connections (Transport 1, Transport 2) to the DB2 data sharing group (with CF), disconnecting at commit/rollback.]

• Always use connection pool closest to application

• Enable Sysplex Workload Balancing for best availability.


Without enableSysplexWLB – Application outage

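[Figure: before and after views of a WAS connection pool of size 2. App1 and App2 use logical connections C1 and C2, which hold transports T1 and T2 through the JCC driver to members DB2 A and DB2 B of the data sharing group. In the second view a member is lost (marked X) and the affected application receives -4498 / StaleConnectionException.]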


With enableSysplexWLB – Application continuity

[Figure: before and after views of the same setup with SysplexWLB=true. App1 and App2 use logical connections C1 and C2 with transports T1 and T2 to members DB2 A and DB2 B. In the second view, after a DDF shutdown on one member (marked X), a new transport T3 appears; the figure is labeled "-30108 network failure" and "-20542 SQL failure".]


Changed driver defaults for Sysplex properties
• V9.7FP6/V10.1FP2 and above recommended for Sysplex support
• Only set enableSysplexWLB to true
• Stay with driver defaults as much as possible
• Better defaults for Global properties
  – db2.jcc.maxTransportObjects
    • Max # of connections to DB2 server across all datasources
    • Default value of 1000 (previous default -1, unlimited transports)
    • # of transports the driver can go up to = # logical connections * # ds members
  – db2.jcc.maxTransportObjectIdleTime
    • Maximum elapsed time in seconds before an idle transport is dropped
    • Default value is 60 sec
  – db2.jcc.maxTransportObjectWaitTime
    • Number of seconds the client waits for a transport to become available
    • Default value is 1 sec (previous default = -1, unlimited wait)


Changed driver defaults for Sysplex properties
• Global properties (continued)
  – db2.jcc.maxRefreshInterval
    • Max elapsed time in sec before server list is refreshed
    • Default is 10 sec (previous default 30 sec)
• Better defaults for Connection and DataSource properties
  – maxRetriesForClientReroute
    • Maximum number of connection retries for automatic client reroute
    • Default is 1 (previous default -1, unlimited; previous recommendation was to set to # of ds members)
    • Cycling through all members of a group counts as 1 retry (previously, a retry against each member counted as 1)
  – retryIntervalForClientReroute
    • Number of sec between consecutive connection retries
    • Default 0 sec


Work Load Balancing

• Initial connection to DB2 data sharing group using distributed IP address
• Initial connection returns a member IP list with WLM weights
• Subsequent connections go directly to members: any following connection uses the client Sysplex workload balancing algorithm to determine which member to use
• Driver asks for a new server list after every maxRefreshInterval secs
• Workload (i.e., active transactions) should be distributed in the ratio of the member weights returned by the server
• A single logical connection from an application can be directed to one member for one transaction and to a different member for another transaction
• At the start of each transaction, the best member is computed


Work Load Balancing

• At the end of a transaction, server determines whether logical connection can execute next transaction on a different member. This decision, called reuse permission, is conveyed to client in the commit or rollback reply.

• DB2 returns an indicator as part of COMMIT/ROLLBACK if the transport can be reused, and returns SET statements to replay connection state at reuse. Conditions for reuse:
  – No open WITH HOLD cursor
  – No declared global temporary tables exist
    • Declared global temp tables that were used must be explicitly or implicitly dropped
  – No reference to packages bound with KEEPDYNAMIC YES
• If reuse is permitted, the transport is given back to the transport pool at the end of the transaction
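The reuse conditions listed on this slide can be read as a simple predicate. The sketch below is only an illustration: `TransactionEndState` and its field names are hypothetical, and in practice the decision is made by the DB2 server and conveyed to the client in the commit or rollback reply, not computed by application code.

```python
from dataclasses import dataclass


@dataclass
class TransactionEndState:
    """Hypothetical snapshot of connection state at COMMIT/ROLLBACK."""
    open_with_hold_cursors: int = 0      # cursors declared WITH HOLD that are still open
    declared_temp_tables: int = 0        # declared global temp tables not yet dropped
    uses_keepdynamic_yes: bool = False   # any package bound with KEEPDYNAMIC YES


def reuse_permitted(state: TransactionEndState) -> bool:
    """Return True if all three conditions from the slide hold."""
    return (
        state.open_with_hold_cursors == 0
        and state.declared_temp_tables == 0
        and not state.uses_keepdynamic_yes
    )
```

A connection that violates any one condition stays bound to its transport, so it cannot be balanced to another member at the next transaction boundary.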


Work load balancing (WLB) algorithm
• DB2/WLM reports the following to the driver
  – M1 : priority=70 IPADDR=/148.171.100.23 port=7330
  – M2 : priority=30 IPADDR=/148.171.100.21 port=7330
  – Member 1 has a weight of 70, i.e. ratio is 70/100 = 0.7
  – Member 2 has a weight of 30, i.e. ratio is 30/100 = 0.3
  – Goal is for 70% of the work to be allocated to member 1, and 30% to member 2
• Next transaction should be assigned to a member where
  – (Member’s currently active transactions) / (Total active transactions for the group) <= (Member priority) / (Total of all member priorities)
• With reference to WLB
  – No of active transactions on a member = No of transports in use


Work load balancing (WLB) algorithm

• What does the new algorithm do?
  – Member 1 has 35 active connections (in a UoW)
  – Member 2 has 10 active connections
  – Current ratios:
    • Member 1 = active connections / total connections = 35 / 45 = 0.78 (which is > 0.7 target ratio)
    • Member 2 = 10 / 45 = 0.22 (which is < 0.3 target ratio)
  – Next transaction would be assigned to member 2.
  – Weight ratios are re-calculated each time a new member list is received
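The selection rule and the worked numbers above can be reproduced with a few lines of Python. This is a minimal sketch, not the driver's implementation: the function name and the tie-breaking choice (pick the member furthest below its target share) are assumptions, and weights are assumed to be positive.

```python
def pick_member(weights, active):
    """Pick the member for the next transaction.

    weights: member name -> WLM priority from the server list
    active:  member name -> transports currently in use (active transactions)
    """
    total_weight = sum(weights.values())
    total_active = sum(active.get(m, 0) for m in weights)

    def deficit(member):
        target = weights[member] / total_weight
        current = active.get(member, 0) / total_active if total_active else 0.0
        return target - current  # positive means the member is below its target share

    # Assumption: among members at or below target, take the one furthest below.
    # Deficits sum to zero (or to 1 when idle), so the maximum is never negative,
    # which means the chosen member always satisfies the "<=" rule from the slide.
    return max(weights, key=deficit)


weights = {"M1": 70, "M2": 30}
print(pick_member(weights, {"M1": 35, "M2": 10}))  # M1 is at 0.78 > 0.7, M2 at 0.22 < 0.3
```

With the slide's numbers the call selects M2; with no active transactions at all it selects M1, the member with the larger weight.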


Transport Pool Behavior
• Every time a connection needs a transport, the WLB algorithm first provides the ‘best member’
• The logical connection requests the pool for a transport to this 'best member' using the Database Name, Server, Port, SSL, Server Type, Auth Mechanism as a composite key
• Every logical connection remembers the transport object it last used against a given member
• It passes in the key to the last transport (if any) that the connection used for this member. The algorithm attempts to use this transport object when it needs to go to the member
• Until maxTransportObjects is hit, the pool prefers to create a new transport if one is not available


Transport Pool behavior
• This leads to:
  – maxPossibleTransports = # of data sharing members X # of logical connections
  – If maxPossibleTransports > maxTransportObjects, the number of transports stays at maxTransportObjects and the excess demand results in logical connections being put on a ‘wait queue’.
  – When maxTransportObjectWaitTime (default 1 second) expires, the waiting connection receives SQLCODE -4210, SQLSTATE 57033
    • “[jcc][t4][10391][12247][3.64.82] Timeout getting a transport object from pool. ERRORCODE=-4210, SQLSTATE=57033”
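The sizing arithmetic on this slide can be checked quickly before choosing pool sizes. The helper below is a hypothetical sketch (the function name and return shape are not part of any driver API); it treats a negative `max_transport_objects` as unlimited, matching the previous default of -1 mentioned earlier in the deck.

```python
def transport_demand(members, logical_connections, max_transport_objects=1000):
    """Return (max_possible, effective_cap, may_queue) for a JVM.

    members:               number of data sharing members
    logical_connections:   logical connections across all data sources in the JVM
    max_transport_objects: db2.jcc.maxTransportObjects (negative = unlimited)
    """
    max_possible = members * logical_connections
    if max_transport_objects < 0:
        return max_possible, max_possible, False
    effective_cap = min(max_possible, max_transport_objects)
    # When demand can exceed the cap, connections may wait and time out with -4210.
    may_queue = max_possible > max_transport_objects
    return max_possible, effective_cap, may_queue


print(transport_demand(2, 2))     # two members, pool of size 2
print(transport_demand(4, 300))   # demand can exceed the default cap of 1000
```

In the second call the worst-case demand is 1200 transports against a cap of 1000, so under load some logical connections could land on the wait queue.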

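[Figure: WAS connection pool of size 2 with SysplexWLB=true. App1 and App2 use logical connections C1 and C2; the JCC driver holds four transports (T1 to T4) to members DB2 A and DB2 B, two marked Active and two marked Idle.]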


Transport Pool Statistics

• For monitoring driver Sysplex behavior (global transport object pool)

• Configuration property settings
  – db2.jcc.dumpPool=DUMP_SYSPLEX_MSG|DUMP_POOL_ERROR
  – db2.jcc.dumpPoolStatisticsOnSchedule=60
  – db2.jcc.dumpPoolStatisticsOnScheduleFile=/home/WAS/logs/srv1/poolstats
• Example fields
  – npr – total number of requests made to the pool since the pool was created
  – nsr – number of times pool returned an object (successful requests)
  – lwroc – light weight reuse
  – hwroc – heavy weight reuse
  – lbt – longest time thread waited to get an object
  – tpo – transport object count

• Application API also exists to gather transport pool statistics (DB2PoolMonitor class)


ACR Behavior in a Data Sharing Group

• Seamless – On a connection failure, the driver silently re-establishes the connection to another member. The calling application remains unaware of this.
• Non-Seamless – On a connection failure, the driver reports an exception (SQLCODE -30108) to the calling application if the connection could be established to an alternate member.
• When working with a Data Sharing Group, the driver always evaluates the failing transaction for a Seamless failover. Unless certain conditions are met, Seamless failover cannot be performed.
• Conditions that prevent seamless failover include a failure on the second or a later SQL statement of a transaction, held cursors, and open LOB streams.


Behavior at Connection Errors
• If first SQL stmt in transaction fails and reuse OK (Seamless reroute)
  – No errors reported back to application
  – SET statements associated with the logical connection are replayed with first SQL on another transport
• If subsequent SQL fails and reuse OK (Non-seamless reroute)
  – -30108 reuse error returned to application (transaction is rolled back)
  – Up to application to retry transaction
  – Reconnection happens only when application resubmits (different behavior than older releases)
  – SET statements are replayed to recover connection state
• If subsequent SQL fails and reuse not OK (Non-seamless reroute)
  – -30081 connection failed error returned to application
  – Connection returned to initial (default) state; application needs to reestablish connection state and retry transaction

• If all members in the member list are tried and none seems to be available, the initial data source url (distributed IP address) is retried to make sure that really no member is available.
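From the application's point of view, the cases on this slide reduce to a small decision table plus a retry loop for -30108. The sketch below is an assumption-laden illustration: `RerouteError` is a hypothetical stand-in for the driver's exception type, and the combination "first SQL fails, reuse not OK" is not described on the slide, so it is treated here like the other reuse-not-OK case.

```python
class RerouteError(Exception):
    """Hypothetical stand-in for a driver exception that carries an SQLCODE."""

    def __init__(self, sqlcode):
        super().__init__(f"SQLCODE {sqlcode}")
        self.sqlcode = sqlcode


def reroute_outcome(first_sql_in_txn, reuse_ok):
    """Return the SQLCODE the application sees, or None for a seamless reroute."""
    if first_sql_in_txn and reuse_ok:
        return None      # seamless: SET statements replayed, no error surfaced
    if reuse_ok:
        return -30108    # transaction rolled back; application may retry it
    return -30081        # assumption for any reuse-not-OK case: state must be rebuilt


def run_with_retry(transaction, max_attempts=3):
    """Re-run a whole transaction when the driver reports -30108."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except RerouteError as err:
            # Only -30108 is safe to retry blindly; anything else propagates.
            if err.sqlcode != -30108 or attempt == max_attempts:
                raise
```

The retry wrapper re-runs the entire unit of work, since the slide states that the transaction is rolled back and that reconnection happens only when the application resubmits.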


WebSphere Connection Pooling
• Connection object pool maintained by WebSphere
  – Saves creating/destroying Connection objects, which is relatively expensive
    • Connection object holds references to other Java objects like PreparedStatement objects that would be destroyed with the Connection object
• Connection Pooling Properties
  – Max Connections
    • max connections from JVM instance
  – Min Connections
    • lazy minimum number of connections in pool
  – Reap Time
    • How often cleanup of pool is scheduled, in seconds
  – Unused Timeout
    • How long to let a connection sit in the pool unused
  – Aged Timeout
    • How long to let a connection live before recycling
  – Purge Policy
    • After StaleConnection, does the entire pool get purged or only the individual connection


Recommended WebSphere Connection Pool Properties with SysplexWLB

• Disable Reap Time, Unused Timeout and Aged Timeout by setting values to zero

• Actual physical connections are handled by driver while connections handled by WAS are logical connections.


End-to-end Pooling

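[Figure: end-to-end pooling chain. App threads use the WAS connection pool; the JCC driver holds the transport pool; DB2 for z/OS holds DDF connections and DBATs. Pooled resources along the chain are marked Idle/Pooled, Inactive, and Pooled.]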


Important DB2 zparm Values
• CONDBAT – Max. # of distributed connections into DB2 system
  – Includes inactive and active connections; may be large because of many inactive connections
  – Default is 10,000
• MAXDBAT – Max. # of database access threads (DBATs) that can be active concurrently
  – In many installations, max. value determined by available storage in DBM1 (check IFCID 225)
  – Default is 200
• IDTHTOIN – time in sec an active server thread remains idle before it is canceled
  – Inactive connections are not subject to idle thread timeout
  – Strongly recommended to not set to 0; default of 120 secs works well
• To control connections waiting on a DBAT, new parms introduced:
  – MAXCONQW – how long in seconds a transport/client connection should wait for a DBAT before reroute
  – MAXCONQN – how many transports/client connections should wait for a DBAT before reroute


WAS/DB2 - Tuning Considerations

DB2 Zparms:
• MAXDBAT
• CONDBAT
• MAXCONQN
• MAXCONQW
• IDTHTOIN

Application Server Connection Pool:
• max connection
• unused timeout

DB2 C-based / JCC client:
• enableSysplexWLB
• maxTransportObjects
• maxTransportObjectIdleTime
• maxTransportObjectWaitTime


Summary

• Highly available environment for strategic, enterprise applications involves:
  – DB2 data sharing group enabled with DVIPA and Sysplex Distributor
  – Application servers/DB2 Client drivers enabled for Sysplex Workload Balancing
  – Applications are well-behaved
• DB2, Driver and WAS teams have worked together to come up with optimal out-of-box behavior for most scenarios
• Some tuning may be needed for HA features of DB2, Drivers and WAS


