RAC Deep Dive for Developers: building efficient and scalable RAC-aware applications
Transparent Application Failover Technology
Oracle RAC DD4D
Dmitry Volkov / Igor Melnikov
Keywords: LMS, ASM, Real Application Clusters, Service, 3-way messaging, GRD, Cache Fusion, TAF, FAN, FCF, DBMS_PIPE, OCR, Interconnect, Clusterware, CRS_STAT, CRSCTL, Voting Disk, VIP, ONS, GC current block
Agenda
• Current node failure
• Failure handling
• Application’s response to the loss of a session
• Demonstrations
Current node failure
Session Failure
• The connection to the cluster node can be lost
  • The cluster node has crashed
  • A network problem has occurred
• The application receives an error, for example: “ORA-03114: not connected to ORACLE”
• Even if the problem is resolved quickly, the user still has to exit and restart the application
  • Some work is lost
  • This is unacceptable in many cases (in business-critical applications)
Transparent Application Failover: Connection Loss Handling
• During a SQL call (OCI call), the client detects that the connection was lost and reconnects to another node
• RAC and Transparent Application Failover (TAF) protect the application:
  • The reconnect to another node is automatic and transparent to the application
  • The application and its queries can continue working
Transparent Application Failover (TAF)
Figure: a normal situation (NodeA, NodeB)
Figure: a failure situation (Node A broken, users reconnect)
Transparent Application Failover: Connection Loss Handling
• Two methods
  • BASIC: a new connection is established at failover time
  • PRECONNECT: a backup connection to another node is established in advance
• Two types
  • SESSION: the session is recovered
  • SELECT: cursors are saved
TAF - BASIC Method
TAF - PRECONNECT Method
Transparent Application Failover: TAF Setup on a Client
CRM =
  (DESCRIPTION=
    (FAILOVER=ON|TRUE|YES)
    (ADDRESS=(PROTOCOL=tcp)(HOST=sales1-server)(PORT=1521))
    (CONNECT_DATA=
      (SERVICE_NAME=CRM)
      (FAILOVER_MODE=(TYPE=select)(METHOD=basic)(RETRIES=20)(DELAY=15))))
• FAILOVER=ON|TRUE|YES: the three values are alternatives; use one of them
• RETRIES=20: Oracle Net attempts to reconnect up to 20 times
• DELAY=15: Oracle Net waits 15 seconds before trying to reconnect again
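The effect of RETRIES and DELAY can be estimated with a few lines of Python. This is only a sketch: the helper names parse_failover_mode and max_retry_wait_seconds are made up for this illustration, the descriptor uses FAILOVER=ON as one of the three alternatives, and the result counts only the waiting time between attempts, not the time each connection attempt itself takes.

```python
import re

# Connect descriptor from the slide, with FAILOVER=ON chosen from ON|TRUE|YES.
DESCRIPTOR = """
(DESCRIPTION=(FAILOVER=ON)(ADDRESS=
 (PROTOCOL=tcp) (HOST=sales1-server) (PORT=1521))
 (CONNECT_DATA=(SERVICE_NAME=CRM) (FAILOVER_MODE=
 (TYPE=select) (METHOD=basic)(RETRIES=20)(DELAY=15))))
"""


def parse_failover_mode(descriptor: str) -> dict:
    """Return the key=value pairs found inside (FAILOVER_MODE=...)."""
    match = re.search(
        r"\(\s*FAILOVER_MODE\s*=\s*((?:\([^()]*\)\s*)+)\)",
        descriptor,
        re.IGNORECASE,
    )
    if match is None:
        return {}
    pairs = re.findall(r"\(\s*(\w+)\s*=\s*([^()\s]+)\s*\)", match.group(1))
    return {key.upper(): value for key, value in pairs}


def max_retry_wait_seconds(mode: dict) -> int:
    """Upper bound on the total DELAY time: RETRIES * DELAY.

    Missing parameters are treated as 0 here; Oracle Net's own defaults
    are deliberately not modelled in this sketch.
    """
    return int(mode.get("RETRIES", 0)) * int(mode.get("DELAY", 0))


mode = parse_failover_mode(DESCRIPTOR)
print(mode)
print(max_retry_wait_seconds(mode), "seconds of waiting at most")
```

With the values from the slide, the delays alone add up to at most 20 × 15 = 300 seconds before the client gives up.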
Transparent Application Failover (type SESSION)
DEMONSTRATION
TAF Facility
• TAF restores or reestablishes:
  • Client-server connections
  • Special SQL statements
  • Active cursors (SELECT statements) that have started fetching a row set
• TAF does not save and does not protect:
  • Active transactions (ORA-25402: transaction must roll back)
  • Server-side program variables of PL/SQL packages
  • Applications not using OCI8
  • ALTER SESSION settings (all ALTER SESSION statements are lost)
TAF Abilities (continued)
• It automatically
  • Opens a new session
  • Rolls back the active transactions
• You have to rerun manually
  • ALTER SESSION … commands
  • Transactions (update, insert…)
• Exception handling
  • ORA-254xx
What Happens after the Failover
• For all active transactions (insert, update and delete commands)
  • An ORA-25402 exception is raised: transaction must roll back
  • The application must roll back
  • The application must repeat the transaction
• FAILOVER_TYPE=SESSION
  • If only SELECT statements were in progress, the user does not have to reconnect
• FAILOVER_TYPE=SELECT
  • In case of large queries the user will not notice a failover
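The roll-back-and-repeat rule for active transactions can be sketched as a small retry loop. The code below is a simulation only and does not use a real Oracle driver: FailoverRollbackRequired, FakeConnection and run_transaction are hypothetical names, and the fake connection simply raises an ORA-25402-style error once, in the middle of the first transaction.

```python
class FailoverRollbackRequired(Exception):
    """Hypothetical stand-in for a driver error that carries ORA-25402."""

    code = 25402


class FakeConnection:
    """Simulated connection that fails over once, mid-transaction."""

    def __init__(self):
        self.committed = []
        self._pending = []
        self._failover_pending = True

    def execute(self, statement):
        if self._failover_pending and self._pending:
            # The node died with uncommitted work. TAF has reconnected,
            # but the transaction itself is not preserved.
            self._failover_pending = False
            raise FailoverRollbackRequired("ORA-25402: transaction must roll back")
        self._pending.append(statement)

    def rollback(self):
        self._pending.clear()

    def commit(self):
        self.committed.extend(self._pending)
        self._pending.clear()


def run_transaction(conn, statements, max_attempts=3):
    """Run all statements as one transaction, repeating it after a failover.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            for statement in statements:
                conn.execute(statement)
            conn.commit()
            return attempt
        except FailoverRollbackRequired:
            # The application must roll back, then repeat the whole transaction.
            conn.rollback()
    raise RuntimeError("transaction did not succeed after failover retries")


conn = FakeConnection()
work = ["UPDATE emp SET name = 'Smith' WHERE empno = 7369",
        "INSERT INTO audit_log VALUES ('rename')"]
attempts = run_transaction(conn, work)
print(attempts, conn.committed)
```

The important design point is that the retry unit is the whole transaction, not the single statement that failed: after ORA-25402 nothing from the interrupted transaction survives on the new session.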
Transparent Application Failover (type SELECT)
DEMONSTRATION
TAF - type SELECT
SQL> SELECT * FROM emp where dep_no = :x;
empno   name
------- -------
7369    Smith
7499    Allen
7521    Ward
7566    Jones
7654    Martin
7698    Blake
Figure: connection breaks (Instance 1, Instance 2)
• Oracle Client stores:
  • bind variables
  • count of fetched rows
  • CRC of fetched rows
  • SCN at the time the query began
• Oracle Client, after the connection breaks:
  • re-executes the query as of that SCN (the time the lost query ran)
  • invisibly fetches the 3 rows already returned
  • calculates a new CRC
  • compares the old CRC with the new CRC
TAF in TYPE=SELECT mode
• The client stores the status of the query and the results so far
  • Bind variables
  • Count of fetched rows
  • CRC of fetched rows
  • SCN at the time the query began
• The query is replayed when the connection is established again
  • The query is re-executed as of the old SCN
• Oracle Client
  • Calculates a new CRC of the fetched rows
  • Compares it with the old CRC
  • If they are not equal, an exception is raised (the order of the records has changed!)
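The replay logic for TYPE=SELECT can be illustrated with a short simulation. This is not Oracle client code: ReplayableCursor, ReplayMismatch and rows_crc are names invented for the sketch, the "query" is a plain Python callable that stands in for re-executing the statement as of the stored SCN, and CRC-32 from zlib is an assumed checksum (the slides do not say which CRC the client uses).

```python
import itertools
import zlib


class ReplayMismatch(Exception):
    """Raised when the replayed rows differ from the rows already delivered."""


def rows_crc(rows, crc=0):
    """Running CRC-32 over the text form of each row."""
    for row in rows:
        crc = zlib.crc32(repr(row).encode("utf-8"), crc)
    return crc


class ReplayableCursor:
    """Client-side state kept for a SELECT: binds, fetched count, CRC."""

    def __init__(self, run_query, binds):
        self._run_query = run_query      # stands in for "execute as of the old SCN"
        self._binds = dict(binds)
        self._rows = iter(run_query(self._binds))
        self.fetched = 0
        self.crc = 0

    def fetch(self):
        row = next(self._rows)
        self.fetched += 1
        self.crc = rows_crc([row], self.crc)
        return row

    def failover(self):
        """Replay the query, silently skip delivered rows, verify their CRC."""
        replayed = iter(self._run_query(self._binds))
        skipped = list(itertools.islice(replayed, self.fetched))
        if len(skipped) != self.fetched or rows_crc(skipped) != self.crc:
            raise ReplayMismatch("replayed rows differ from rows already fetched")
        self._rows = replayed


EMP = [(7369, "Smith"), (7499, "Allen"), (7521, "Ward"),
       (7566, "Jones"), (7654, "Martin"), (7698, "Blake")]

cursor = ReplayableCursor(lambda binds: list(EMP), {"x": 10})
before = [cursor.fetch() for _ in range(3)]   # three rows reach the application
cursor.failover()                             # connection breaks, query is replayed
after = [cursor.fetch() for _ in range(3)]    # fetching resumes at the fourth row
print(before + after == EMP)
```

If the replayed query returned the rows in a different order, or fewer rows than were already delivered, failover() would raise ReplayMismatch instead of resuming, which corresponds to the exception case in the list above.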
Types and Methods combined
Axes: server resources and reconnect speed (Method); client resources and resume functionality (Type)
• Type=SESSION, Method=BASIC: the client is automatically logged into the surviving node of the cluster
• Type=SELECT, Method=BASIC: the query is replayed on the surviving node and the remaining records are returned
• Type=SESSION, Method=PRECONNECT: the session is activated on the surviving node
• Type=SELECT, Method=PRECONNECT: the session is activated on the surviving node, the query is replayed and the remaining records are returned
Transparent Application Failover
• It is built into Oracle Client
  • Starting from version 8.0.6
  • Generally it does not depend on RAC and may be used for:
    • A non-RAC database (single instance)
    • High Availability Clusters
    • A replicated database
    • A standby database
    • RAC
• Failover continues as long as the service is available
Use services when using TAF
• Issue with the default service
  • The default service is registered with the listener when the database is in mount mode
  • Connections can get redirected to a database in mount mode and fail
• Additional services register only when the database is open
  • Any new service is controlled by the clusterware
  • The clusterware will not enable a service until the database instance is really available for connections
• Don’t use the default service for automatic reconnect to a standby DB
Transparent Application Failover: API for Developers
TAF Callback
• If necessary, the application may define a callback function that is called at the time of failure
• The handler may be used:
  • To display a message to the user
    • For example: “Please wait”
  • To restore the session state
  • To repeat the work
“C” TAF Callback (OCI ≥ 8.0.6): “C” Example
/* Implement the callback procedure */
sb4 callback_fn(svchp, envhp, fo_ctx, fo_type, fo_event)
::
{
  switch (fo_event) {
  case OCI_FO_BEGIN: {
    /* Take action */
    printf(" Failing Over ... Please stand by \n");
    ::
/* Register TAF callback procedure */
OCIFocbkStruct failover;
::
failover.callback_function = &callback_fn;
if (OCIAttrSet(srvh, OCI_HTYPE_SERVER, &failover, 0,
               OCI_ATTR_FOCBK, errh) != OCI_SUCCESS)
::
Java TAF Callback (OCI ≥ 9.0.1): Java Example
import oracle.jdbc.OracleConnection;
import oracle.jdbc.OracleOCIFailover;
::
// Instantiate the callback class
CallBack fcbk = new CallBack();
::
// Register TAF callback function
((OracleConnection) conn).registerTAFCallback(fcbk, msg);
::
// Implement the callback class
class CallBack implements OracleOCIFailover {
  public int callbackFn(Connection conn, Object ctxt, int type, int event) {
    ::
    // React
    switch (event) {
    case FO_BEGIN:
ODP.NET Callback: C# Example (for .NET applications)
// Implement the callback
public static FailoverReturnCode OnFailover(object sender, OracleFailoverEventArgs eventArgs)
{
  switch (eventArgs.FailoverEvent)
  {
    case FailoverEvent.Begin:
      Console.WriteLine(" \nFailover begin - Failing Over");
      break;
    case FailoverEvent.End:
      Console.WriteLine("Failover ended ... ");
      break;
    ::
// Register TAF callback function
con.Failover += new OracleFailoverEventHandler(OnFailover);
Delphi win32/.NET Callback: TAF is supported in Devart ODAC (not supported in DOA!)
TTAFDemo = class
public
  … … …
  procedure OraSessionFailover(Sender : TObject;
    FailoverState : TFailoverState;
    FailoverType : TFailoverType;
    var Retry : Boolean);
end;

var
  v_xSession : TOraSession;
begin
  … … …
  FSession.OnFailover := OraSessionFailover;
Using TAF Callback
DEMONSTRATION
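The callback pattern used in the C, Java, C# and Delphi fragments can be tried out without a database. The following Python sketch only simulates the flow: SimulatedSession, SessionStateKeeper and FailoverEvent are hypothetical names, only the BEGIN and END events from the slides are modelled, and "session state" is reduced to a dictionary of ALTER SESSION settings that a failover wipes out.

```python
from enum import Enum


class FailoverEvent(Enum):
    """Only the two events shown on the slides are modelled."""

    BEGIN = "begin"
    END = "end"


class SimulatedSession:
    """Stand-in for a database session whose settings are lost on failover."""

    def __init__(self):
        self.settings = {}
        self.messages = []
        self._callback = None

    def register_failover_callback(self, callback):
        self._callback = callback

    def alter_session(self, name, value):
        self.settings[name] = value

    def simulate_failover(self):
        self._notify(FailoverEvent.BEGIN)
        self.settings = {}               # the new session starts with defaults
        self._notify(FailoverEvent.END)

    def _notify(self, event):
        if self._callback is not None:
            self._callback(self, event)


class SessionStateKeeper:
    """Callback that informs the user and restores ALTER SESSION settings."""

    def __init__(self, wanted_settings):
        self.wanted_settings = dict(wanted_settings)

    def __call__(self, session, event):
        if event is FailoverEvent.BEGIN:
            session.messages.append("Failing over ... please wait")
        elif event is FailoverEvent.END:
            # TAF does not replay ALTER SESSION, so the application does it.
            for name, value in self.wanted_settings.items():
                session.alter_session(name, value)
            session.messages.append("Failover ended")


wanted = {"NLS_DATE_FORMAT": "YYYY-MM-DD"}
session = SimulatedSession()
for key, val in wanted.items():
    session.alter_session(key, val)
session.register_failover_callback(SessionStateKeeper(wanted))
session.simulate_failover()
print(session.settings, session.messages)
```

Keeping the desired settings in one place (here the wanted dictionary) means the same data drives both the initial session setup and the restore step after failover.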
Server-side TAF: TAF Setup for the Service
begin
  dbms_service.modify_service(
    service_name        => 'oltp',
    aq_ha_notifications => true,
    failover_method     => dbms_service.failover_method_basic,
    failover_type       => dbms_service.failover_type_session,
    failover_retries    => 60,
    failover_delay      => 3);
end;
• Centralized on the server:
  • Solves the problem of maintaining tnsnames.ora on clients
  • The server-side TAF settings take priority over the client-side ones
  • The PRECONNECT method is not supported
Setup of the TCP/IP stack for TAF: For Linux Platforms
• Add to /etc/sysctl.conf
  # idle time (seconds) before the first keepalive probe is sent
  net.ipv4.tcp_keepalive_time=10
  # interval (seconds) between keepalive probe packets
  net.ipv4.tcp_keepalive_intvl=5
  # number of keepalive probe packets sent before the connection is considered dead
  net.ipv4.tcp_keepalive_probes=5
  # number of attempts to retransmit SYN packets
  net.ipv4.tcp_syn_retries=1
  # number of failed retransmission attempts before the connection is dropped
  net.ipv4.tcp_retries2=3
• Run the sysctl -p command
• Add the ENABLE=BROKEN attribute to tnsnames.ora
  TAF =
    (DESCRIPTION =
      (ENABLE = BROKEN)
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(host = rac1-vip)(PORT = 1521))
        (ADDRESS = (PROTOCOL = TCP)(host = rac2-vip)(PORT = 1521))
        (FAILOVER = true))
      (CONNECT_DATA =
        (failover_mode = (type=session)(method=basic)(retries=2))
        (SERVICE_NAME = racdb.ru.oracle.com)))
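To see what the keepalive values mean in practice, the time until a silently dead peer is detected on an idle connection can be estimated as tcp_keepalive_time + tcp_keepalive_intvl × tcp_keepalive_probes. The sketch below applies this rough formula to the settings above; parse_sysctl and keepalive_detection_seconds are names invented for the illustration, and the Linux default values used for comparison (7200, 75, 9) are quoted from memory of the kernel documentation, so treat them as an assumption.

```python
SYSCTL_CONF = """
net.ipv4.tcp_keepalive_time=10
net.ipv4.tcp_keepalive_intvl=5
net.ipv4.tcp_keepalive_probes=5
net.ipv4.tcp_syn_retries=1
net.ipv4.tcp_retries2=3
"""

# Assumed Linux defaults, for comparison only.
LINUX_DEFAULTS = {
    "net.ipv4.tcp_keepalive_time": 7200,
    "net.ipv4.tcp_keepalive_intvl": 75,
    "net.ipv4.tcp_keepalive_probes": 9,
}


def parse_sysctl(text: str) -> dict:
    """Parse key=value lines, ignoring blank lines and '#' comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = int(value.strip())
    return settings


def keepalive_detection_seconds(settings: dict) -> int:
    """Approximate time to declare an idle connection dead.

    idle time before the first probe + (probe interval * number of probes)
    """
    return (
        settings["net.ipv4.tcp_keepalive_time"]
        + settings["net.ipv4.tcp_keepalive_intvl"]
        * settings["net.ipv4.tcp_keepalive_probes"]
    )


tuned = parse_sysctl(SYSCTL_CONF)
print("tuned:   ", keepalive_detection_seconds(tuned), "seconds")
print("defaults:", keepalive_detection_seconds(LINUX_DEFAULTS), "seconds")
```

With the tuned values the estimate is 10 + 5 × 5 = 35 seconds, compared with more than two hours under the assumed defaults, which is why the keepalive settings matter for how quickly a client with ENABLE=BROKEN notices a lost node.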
TAF - Summary
• Transparent Application Failover is a powerful technology for increasing the availability of applications
  • It enables users to work continuously
  • It automatically continues to run queries after failures
  • Developers may extend TAF functionality with callback functions
  • It may be used for other purposes, for example for planned hardware shutdowns or software updates
Igor Melnikov
Senior Consultant, Oracle CIS
Email: [email protected]
Phone: +7 (495) 641 14 00
Direct: +7 (495) 641 14 42
Mobile: +7 (915) 205 26 27