11.02.2013 Page 1 of 18

SAP Note 83000 - DB2/390: Backup and Recovery Options

Note Language: English Version: 19 Validity: Valid Since 03.06.2004

Summary

Symptom
Too disruptive backup and quiesce procedures, inefficient recovery

Other terms
COPY, RECOVER, conditional restart, prior point in time, BACKUP SYSTEM, RESTORE SYSTEM

Reason and Prerequisites
This note addresses the following questions:

o What kind of backup and recovery procedures should be implemented in the SAP on DB2 for OS/390 environment?

o What is specific to the SAP System in terms of backup and recovery, and what should we pay special attention to?

o Which part of the SAP database do we need to recover in the case of recovery to the current state?

o How do we prepare for a prior point in time recovery, and can we avoid quiescing the system, which is in some cases too disruptive for standard SAP operations?

o Which tools should we use in the backup and recovery procedures?

Solution

DB2 V8
----------------------------------------------------------------
The backup and recovery enhancements that are introduced in DB2 V8, e.g. the new utilities BACKUP SYSTEM and RESTORE SYSTEM, are covered in the documentation "SAP on DB2 for z/OS: Database Administration Guide 6.40".

General
---------------------------------------------------------------

Backup and Recovery are processes and procedures that ensure an SAP database can be reinstated with minimal disruption in operations after any kind of hardware, software, operational or environmental errors or outages. Being a crucial factor in the system availability and reliability, they deserve a careful assessment of their requirements, understanding of the processes, and skillful planning, developing and practicing of the procedures.

Some of these processes are done automatically by DB2 without any outside intervention, such as recovering the SAP database to its consistent state just before an OS/390 system crash or DB2 abnormal termination. That automatic recovery happens at the next DB2 start. For other processes, DB2 provides integrated tools, sometimes optionally enhanced with S/390 hardware and software features, that can be used for building efficient and reliable backup and recovery procedures.

The backup and recovery procedures need to be set up by DBAs for each individual SAP database. Their characteristics depend on:

o System availability requirements

o Database change rate

o Database size

o Hardware and software resources

The higher the system availability requirements, the database change rate and the database size, the more advanced the hardware and software resources that are needed and the more demanding the backup and recovery procedures that are required.

It is important to bear in mind that an SAP database includes all the tablespaces, indexes and DB2 catalog and directory entries (practically all the catalog and directory tablespaces and indexes) that are pertinent to the SAP system. From the viewpoint of operational and semantic integrity, an SAP database as a whole needs to be considered a single unit of recovery. In other words, if a single SAP tablespace needs to be recovered to a point in time, all other SAP tablespaces and indexes need to be either also recovered to the same point in time, or already be at the state that they had at that time. A prior point in time recovery is an example where the entire SAP database might need to be recovered, while a recovery to the current state is an example where only the damaged tablespaces and indexes must be recovered, because the rest is already at the current level.

Some general recommendations relevant to backup and recovery in SAP environments are:

o Planning

- Understand DB2 backup and recovery processes.

- Assess the factors that influence the characteristics of the backup and recovery procedures.

- Develop procedures for all kinds of backup and recovery situations that might arise in your installation.

- Practice these procedures at a time when this does not interfere with normal operations.

- The recovery to a prior point in time is more complicated if you allow critical, non-SAP applications to be managed by the same DB2 subsystem (or DB2 data sharing group) as your SAP database. We strongly recommend dedicating a DB2 subsystem or DB2 data sharing group to an SAP database only.

o Operations

- Use dual logging for the active log, archive log, and bootstrap data sets.

- Place the copies of the active log data sets and bootstrap data sets on different DASD volumes.

- Do not discard archive logs that are more recent than the earliest consistent copy of any SAP tablespace, or even older than that depending on your needs for a prior point in time recovery.

- Produce object-based copies (image copies of tablespaces) even if your main backup/recovery strategy relies on volume-based backups.

- Consider producing multiple backup copies.

- Keep back level backups to extend the interval when a prior point in time recovery is possible as well as to avoid the impact of possibly damaged (inconsistent) backup data sets.

- Avoid backing up tablespaces that contain inconsistent data (intra-page data inconsistency). Use the COPY utility option CHECKPAGE and the DSN1COPY, DSN1CHKR and CHECK INDEX utilities to detect such inconsistencies in the user and catalog/directory tablespaces.

- Make backups of the DB2 catalog and directory, especially after activities that involve a lot of DDL, such as initial load, major transports, release upgrade.

- To speed up recovery use more and larger active logs, consider archiving to disk, or be sure to have enough tape drives. Also, keep the buffer pools and log buffers at the values recommended for SAP (basically: large).

The following sections present a summary of backup and recovery options that could be used in the SAP environment. The summary is not meant to be detailed reference material, for which the following DB2 for OS/390 books are recommended:

o Administration Guide, Section 4 - Operation and Recovery

o Utility Guide and Reference

o Command Reference

o Data Sharing: Planning and Administration

The web pages http://www.storage.ibm.com/hardsoft/diskdrls/technology.htm and http://service.sap.som/split-mirror contain the documentation on the Split Mirror Backup/Recovery Solution, a sophisticated method for generating very fast online backups, quick system recoveries and disaster recovery options.

Backup
---------------------------------------------------------------

The appropriate backup procedure is a key factor for the data recovery.

Namely, in simple terms, the recovery process consists of selecting a backup of the tablespace and applying all the changes (recorded in the log) that occurred between the time the backup was taken and the time to which the recovery is requested. Theoretically, DB2 could recover a tablespace from the log only, but in practice that capability should not be counted on.

The main characteristics of a backup procedure are:

o its frequency, i.e. how often an object is backed up, and

o the tools used to produce backups.

The optimal backup procedure is a trade-off between its usage of resources (CPU, DASD, tapes) and increased contention with other concurrent activities in the system on the one hand, and the speed of recovery on the other hand. The shorter the log apply phase, the faster the recovery: frequent backups of all the data pages, containing committed rows only, provide the fastest recoveries. However, that comes at a cost in resources and impedes the concurrent activity in the system to an extent that might not be acceptable.

There are basically two main types of backups, which are referred to in this document as the online backup and the offline backup.

Online Backup
-------------
The online backup of an object (tablespace, partition, index, volume) is a copy of the object during which continuous, concurrent read/write activity on the object is allowed. Therefore, except for some processor and DASD overhead, the online backup has no impact on the concurrent SAP activities. As it can contain uncommitted data, such a backup alone is never enough for the object's recovery: DB2 complements it with the log.

There are two types of online backups: object-based and volume-based. The object-based backups are image copies of DB2 tablespaces, partitions and selected indexes. The volume-based backups are copies of the volumes on which DB2 objects reside.

The DB2 COPY SHRLEVEL(CHANGE) utility is an efficient tool for creating object-based online backups. The utility generates backups of DB2 tablespaces and partitions, and since DB2 V6 indexes as well. Other interesting COPY options are:

o FULL, which specifies whether a full or incremental image copy is to be created. The incremental image copy is a copy of only those data pages that have been changed since the last backup.

o CHANGELIMIT, which allows you to let DB2 decide whether to take a full or incremental image copy, depending on the number of pages changed since the last image copy. Unless you regularly take full image copies, we recommend using the CHANGELIMIT specification where only one value is specified in order to avoid the situation where no full image copy exists.

o COPYDDN and RECOVERYDDN, which allow you to create up to 4 identical copies of the tablespace.

o CHECKPAGE (since DB2 V6), which checks consistency within each page and makes a subsequent DSN1COPY check superfluous. Note, however, that the checks performed by CHECK INDEX and DSN1CHKR (for tablespaces that contain internal links) are not done by CHECKPAGE.

o PARALLEL (since DB2 V6), which can significantly improve COPY performance by parallel copying of objects specified in the same COPY control statement.
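As an illustration of how these options combine, the following utility control statements are a minimal sketch rather than a ready-to-run job. The database, tablespace and DD names (SAPDB01, TSPACE1, TSPACE2, COPY1 to COPY4) are hypothetical, the DD names must match DD statements in your own JCL, and the exact keyword syntax should be checked against the Utility Guide and Reference for your DB2 release. Lines beginning with -- are explanatory comments; remove them if your utility level does not accept comments in the input stream.

```
-- Online full image copies of two tablespaces, copied in parallel,
-- each with a primary and a backup copy and with page checking.
-- All object and DD names are hypothetical.
COPY TABLESPACE SAPDB01.TSPACE1
       COPYDDN(COPY1,COPY2)
     TABLESPACE SAPDB01.TSPACE2
       COPYDDN(COPY3,COPY4)
     PARALLEL(2)
     CHECKPAGE
     SHRLEVEL CHANGE

-- Let DB2 choose between a full and an incremental copy by giving
-- a single CHANGELIMIT value.
COPY TABLESPACE SAPDB01.TSPACE1
       COPYDDN(COPY1)
       CHANGELIMIT(10)
     SHRLEVEL CHANGE
```

The two statements are alternatives for different jobs; they are shown together only for comparison.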

The volume-based online backups require availability of disk subsystems capable of generating very fast volume copies, such as IBM ESS or RVA, EMC Symmetrix, HDS Lightning, and StorageTek SVA. Coupled with a feature to suspend and resume DB2 logging activity on demand (available since DB2 V6), these disk subsystems provide an additional, very efficient means of taking online backups of the entire SAP database. Here is an outline of the procedure:

o Suspend DB2 logging activity by issuing the DB2 command SET LOG SUSPEND.

At this time DB2 initiates a system checkpoint (in non-data sharing environments only), writes to DASD any unwritten log buffers, updates the BSDS with the high-written RBA, and acquires the log-write latch to prevent any further log records from being created. This will prevent any further updates to the database. A highlighted message (DSNJ372I) will be issued showing that logging has been suspended. The scope for this command is single-subsystem only, so the command will have to be entered for each member when running in a data sharing environment.

Note that, although the SET LOG SUSPEND command initiates a checkpoint, it does not wait for all the data pages to be written out. Because of this, you can improve the performance of subsequent recoveries by minimizing the number of data pages that are not written out at the time the disk volume copies are taken. This can be done by triggering a checkpoint 5-10 minutes earlier. An on-demand checkpoint is triggered by issuing the DB2 command SET LOG LOGLOAD(0). This additional checkpoint will also address the 32K-page consistency exposure described in SAP note 363189.

o Take disk volume copies of all the volumes containing DB2 user and system data (logs, BSDS, ICFs).

Note that this step should be done fast, which is possible when exploiting modern DASD subsystems. Otherwise, suspending update activity for too long can cause timing related events such as lock timeouts or IRLM diagnostic dumps when delays are detected. The system backup obtained in this way can be used for starting DB2. This start will be performed as in the case of a DB2 restart after an abnormal system termination: the inflight units of recovery will be rolled back, which brings the SAP database to a consistent state.

o Resume DB2 logging activity by issuing command SET LOG RESUME.

At this time the log-write latch is released and updates to the database are resumed. The console message will be deleted.
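The procedure above can be summarized as the following command sequence. This is a sketch only: the leading hyphen assumes the default DB2 command prefix of your subsystem, the lines in parentheses are manual steps rather than commands, and in a data sharing group the SUSPEND and RESUME commands have to be issued on every member.

```
-SET LOG LOGLOAD(0)
     (wait 5-10 minutes)
-SET LOG SUSPEND
     (wait for message DSNJ372I, then take the fast volume copies)
-SET LOG RESUME
```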

Offline Backup
--------------

The offline backup is a copy of a quiesced object (no uncommitted units of recovery accessing the object) during which concurrent write activity on the object is not allowed. As a result, all the data is in a committed state, which means that this backup alone could be used for the object's recovery, but only to the point in time at which the offline backup was taken.

Note that the 'offline' attribute does not mean that either the DB2 subsystem or the SAP System must be offline: the concurrent read/write activity can continue on all other objects, as well as read only on the object being backed up. However, for the objects that are heavily updated by many concurrent users and in that sense considered 'central' to the SAP database, quiescing and preventing the write access during the offline backup can be extremely disruptive to the entire SAP System. That particularly applies if the method for taking the offline backup is not fast enough.

Like online backups, the offline backups can also be object-based and volume-based. The DB2 COPY SHRLEVEL(REFERENCE) utility is a tool to create object-based offline backups. The options mentioned in the section about SHRLEVEL(CHANGE) are applicable to SHRLEVEL(REFERENCE) as well. However, the most interesting option for this purpose is CONCURRENT, which can significantly reduce the time during which the tablespace is unavailable for write activity.

Volume-based offline backups are normally used in the context of creating the entire SAP database offline backup. They include all the tablespaces, indexes, DB2 catalog and directory, logs and other control objects taken during the time when no write activity is allowed in the system.

SAP database offline backups are very restrictive ways of backing up the data and should be used only where the SAP operations can afford such an outage (e.g. if there are some periods of no activity). On the other hand, the SAP database offline backups provide excellent prior point in time recovery targets.

DB2 and OS/390 offer a number of ways to implement such a backup. Here is one example of how the offline backup could be obtained:

o Quiesce the SAP database, i.e. ensure that no users with update intentions are present in the system. You can achieve that by the following commands:

     START DATABASE (*) ACCESS(RO)
     START DATABASE (DSNDB01) ACCESS(UT)
     START DATABASE (DSNDB06) ACCESS(UT)

o Run the DB2 COPY utility:

     COPY TABLESPACE tspace1
     COPY TABLESPACE tspace2
     .
     .   List all the SAP and DB2 catalog and directory tablespaces.
     .   Note that since DB2 V6 you can make the process more
     .   efficient by copying indexes and specifying multiple
     .   objects on the same COPY invocation.
     .
     COPY TABLESPACE DSNDB01.SYSUTIL
     COPY TABLESPACE DSNDB06.SYSCOPY
     COPY TABLESPACE DSNDB01.SYSLGRNX

In order to speed up the process, create a number of COPY jobs that can be run in parallel. Since DB2 V6 you can take advantage of intra-COPY parallelism by specifying a list of objects to be copied in the same COPY invocation and using the option PARALLEL. The elapsed time can be improved significantly if you group the tablespaces in a way that reduces DASD path contention.

Instead of using the COPY utility that operates on a tablespace and index level, you can make copies of all the volumes on which the SAP data reside by using the disk system fast backup capabilities (where available). Where these advanced features are not available, the DFSMSdss DUMP and COPY functions can be used instead. The volume copies are not registered within DB2, so the recovery process is not fully controlled by DB2, which makes it more demanding for the DBAs, but the speed with which these backups are done warrants the additional effort at recovery time.

o Restart the DB2 subsystem and databases to allow normal access.

Another example of creating an offline backup of the R/3 database uses the CONCURRENT COPY:

     COPY TABLESPACE tspace1
          TABLESPACE tspace2
     .
     .   List all the R/3 and DB2 catalog and directory tablespaces
     .   except SYSUTIL, SYSCOPY and SYSLGRNX.
     .   Note that since DB2 V6 you can copy indexes as well.
     .
          CONCURRENT
     COPY TABLESPACE DSNDB01.SYSUTIL CONCURRENT
     COPY TABLESPACE DSNDB06.SYSCOPY CONCURRENT
     COPY TABLESPACE DSNDB01.SYSLGRNX CONCURRENT

Note that this method does not need separate quiesce and restart steps: the database activity will be quiesced and made available again automatically. However, bear in mind that the concurrent copy might fail in the phase of 'hardening' the data (physical copy), i.e. after the logical copy has successfully completed and the R/3 database has been made available for read and write. In this case the copy as a whole has not succeeded and the offline backup has not been created. After removing the cause of the problem you need to repeat the process.

Backup Procedure: recommendations
---------------------------------
As described earlier, each SAP installation should choose a backup procedure that is optimal to its particular needs and conditions. The following are recommendations for a backup procedure that should be used initially by all the SAP installations and adjusted later according to the specific needs and conditions.

o After a successful SAP installation, upgrade, migration, system copy or a prior point in time recovery, take an offline backup of the SAP database. Consider this a mandatory step.

o If the operations schedule allows, take offline backups of the SAP database occasionally. If there is not enough time available to back up the entire SAP database, make offline backups of the heavily updated and critical tablespaces.

This is not a mandatory, but nevertheless a highly recommended step. Namely, the offline backups are very efficient targets for a prior point in time recovery, especially if taken by a tool that copies both tablespaces and indexspaces.

o For any SAP, catalog and directory tablespace regularly create online backups. How often the backup should be taken, and whether it is full or incremental, depends on the tablespace's change rate. As a starting point, we recommend that you run the backup jobs every 1-2 days and specify CHANGELIMIT(10). Once you categorize your tablespaces into heavily, moderately, or lightly updated, you can change the frequency of their backups to daily, weekly, or monthly, respectively. You can use transaction ST10 to categorize tables by their access patterns.

Producing the regular online backups is a mandatory step of any backup procedure.

Note that with the CHANGELIMIT option you might end up with full backups that are created only rarely, which is not efficient from the recovery point of view. Because of that, make sure to have a full backup created periodically by specifying FULL(YES) or CHANGELIMIT(0). Also consider running the MERGECOPY utility, which consolidates a full and a number of incremental backups into a new, more recent full backup.

Since DB2 V6 you have an option to copy indexes as well. The index recovery time can be significantly improved if the recovery is based on the index copy rather than on the index rebuild. In general, we do not recommend copying every SAP index, especially not small indexes. However, large indexes with a lower update rate should be considered for copying, optimally whenever the underlying tablespace gets copied.

o Where available, use the disk system capabilities of generating fast volume copies, combined with the SET LOG SUSPEND/RESUME commands, to produce fast online system backups.

If produced at a time when there are no long-running jobs in the system that do not commit, these backups provide for very efficient prior point in time recoveries. For the highest flexibility in recovery scenarios, separate DB2 data, logs, BSDS, catalog/directory and ICF catalog on different sets of volumes that would allow independent restore for any of these objects.

It is very important to understand that these volume-based backups cannot replace the object-based backups (image copies produced by the COPY utility). Namely, there are cases where the recovery will require an image copy and the volume-based backup would not be sufficient. Also, in the case of a single object recovery to currency, an image copy of that object is the best basis for such a recovery.

When recovering to any prior point in time, the volume-based backups are used with the RECOVER utility's LOGONLY option. As this option requires that indexes are defined with COPY YES (regardless of the image copies not actually being used for the recovery), you should consider defining all the indexes with COPY YES. Otherwise you would need to use the REBUILD INDEX utility instead of RECOVER LOGONLY for the indexes.
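The recommendations above on periodic full copies, MERGECOPY and COPY YES indexes can be sketched as follows. This is an illustration under assumptions, not a tested job: the object names (SAPDB01.TSPACE1, SAPR3.INDEX1) and the DD name COPY1 are hypothetical, MERGECOPY additionally needs a work data set in the JCL, and the syntax should be verified against the manuals for your DB2 release. Lines beginning with -- are explanatory comments.

```
-- Utility control statement: a periodic full online image copy.
-- Object and DD names are hypothetical.
COPY TABLESPACE SAPDB01.TSPACE1
       COPYDDN(COPY1)
       FULL YES
     SHRLEVEL CHANGE

-- Utility control statement: consolidate the latest full copy and
-- the incremental copies taken since then into a new full copy.
MERGECOPY TABLESPACE SAPDB01.TSPACE1
          NEWCOPY YES
          COPYDDN(COPY1)

-- SQL statement (not a utility statement): define an index so that
-- it can be copied and recovered, including with RECOVER LOGONLY.
ALTER INDEX SAPR3.INDEX1 COPY YES;
```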

A separate procedure should be developed for a disaster recovery. There are numerous ways and tools to implement the procedure. The IBM Disk Storage Systems web page at http://www.storage.ibm.com/hardsoft/diskdrls/technology.htm describes an advanced approach that involves PPRC and fast volume backups. The disaster recovery topics exceed the scope of this note and are not documented here. Please refer to the corresponding IBM publications.

Recovery
--------------------------------------------------------------

DB2 provides a versatile tool to recover its data: the RECOVER utility. It allows you to recover DB2 objects of various granularity: tablespaces, indexes, partitions, individual data sets, individual pages. The RECOVER utility can recover data to:

o the state captured in a particular backup (the TOCOPY option),

o the state at the time corresponding to a Relative Byte Address (the TORBA option) or a Log Record Sequence Number (the TOLOGPOINT option); the former is used in non-data sharing and the latter in data sharing environments,

o the current state by not specifying any of the above options.
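The three kinds of recovery target listed above correspond to RECOVER statements along the following lines. The statements are alternatives, not a sequence to be run together, and they are a sketch only: the object name, the image copy data set name and the hexadecimal log points are hypothetical placeholders. Lines beginning with -- are explanatory comments.

```
-- To the current state (no target option specified):
RECOVER TABLESPACE SAPDB01.TSPACE1

-- To the state captured in a particular image copy
-- (hypothetical data set name):
RECOVER TABLESPACE SAPDB01.TSPACE1
        TOCOPY HLQ.SAPDB01.TSPACE1.COPY1

-- To a point in the log: an RBA in non-data sharing, an LRSN in
-- data sharing (hypothetical values):
RECOVER TABLESPACE SAPDB01.TSPACE1 TORBA X'0000A1B2C3D4'
RECOVER TABLESPACE SAPDB01.TSPACE1 TOLOGPOINT X'B5D2C0F1A2B3'
```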

The RECOVER utility also has the LOGONLY option, which allows you to recover the data using the log only, starting with a backup that is created outside of the DB2 control (e.g. storage subsystem fast copy capabilities).

For improved performance the RECOVER utility supports both inter- and intra-RECOVER parallelism. The inter-RECOVER parallelism means submitting multiple RECOVER jobs concurrently. The intra-RECOVER parallelism is even more efficient, especially since DB2 V6, where both the restore and log apply phase utilize parallelism. In order to exploit it, specify multiple objects on the same RECOVER execution. The option PARALLEL is used to request parallelism in the restore phase. The log apply phase will be parallelized depending on the amount of storage allocated for the process. The value is given in the system parameter LOGAPSTG. For prior point in time recoveries set the value temporarily to 100MB (in normal circumstances set it to 10MB).

Since DB2 V6 indexes can be recovered either by rebuilding them (the REBUILD INDEX utility) or by recovering them (the RECOVER utility, provided the index is defined with the COPY YES option).
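A minimal sketch of intra-RECOVER parallelism and of an index rebuild, assuming hypothetical object names, might look as follows. Lines beginning with -- are explanatory comments, and the degree of parallelism shown is only an example value.

```
-- Several objects in one RECOVER execution; PARALLEL requests
-- parallelism in the restore phase.
RECOVER TABLESPACE SAPDB01.TSPACE1
        TABLESPACE SAPDB01.TSPACE2
        PARALLEL(2)

-- Indexes that are not defined with COPY YES are rebuilt from the
-- recovered data instead.
REBUILD INDEX (ALL) TABLESPACE SAPDB01.TSPACE1
```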

Depending on the time to which we want the data to be recovered there are two types of recoveries: to the current state and to a prior point in time.

Recovery to the current state
-----------------------------
A recovery to the current state is generally less demanding and is usually needed more often than a prior point in time recovery. A typical example where a recovery to the current state is necessary is a DASD volume failure that resulted in a loss of all or some of the data on the volume. The procedure in this case is to find out which tablespaces and indexes had resided on the volume and recover only these tablespaces and indexes, or even only the partitions or individual data sets that are affected. The rest of the system is already at the current state (from the viewpoint of operational and semantic integrity) and need not be recovered.

For recovery to the current state both backups known to DB2 (taken by the COPY utility) and those unregistered in DB2 (such as volume-based backups) can be used. Which is more efficient depends on the recovery case. E.g. if the entire volume needs to be replaced, a volume-based backup as the basis for the RECOVER LOGONLY process is the most efficient (provided the backup captures all the data sets that were residing on the faulty volume). On the other hand, if a single tablespace needs to be recovered (especially if it crosses multiple volumes), the recovery will be most efficient if it uses a valid image copy (taken by the COPY utility) of the tablespace.
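The two cases described above could be expressed, in sketch form and with hypothetical object names and partition number, as follows. Lines beginning with -- are explanatory comments.

```
-- Only partition 3 of a partitioned tablespace resided on the
-- failed volume; recover just that data set from its image copy
-- and the log.
RECOVER TABLESPACE SAPDB01.TSPACE3 DSNUM 3

-- The data sets were first restored from a volume-based backup
-- outside of DB2; DB2 then only applies the log.
RECOVER TABLESPACE SAPDB01.TSPACE1 LOGONLY
```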

Recovery to a prior point in time
---------------------------------
This type of recovery is used to reinstate the SAP database at some previous point in time. All the changes that had occurred after that time will be lost and the system will appear as it was at that time in the past. Obviously, the decision to bring the system back in time must be carefully considered. The typical situation when a prior point in time recovery might be needed is an application program logic error that introduced unwanted changes into the system that could not be 'reverse engineered'. Namely, in some cases the prior point in time recovery and the loss of data associated with it can be avoided by writing so called 'compensating transactions'. Note, however, that this can be done only by highly skilled specialists with a deep expertise in both the SAP as an integrated system and the problem application area. In all other cases taking the entire system to a prior point in time is the only safe course of action.

There are a number of different methods to accomplish a prior point in time recovery of an SAP database. Depending on which time is selected as the recovery target point and whether volume-based backups are available, the recovery methods can be categorized as follows:

(1) Recovery to the state at the time an offline backup of the SAP database was created.

(2) Recovery to the state at the time the SAP database was quiesced.

(3) Recovery to any prior point in time using object-based backups.

(4) Recovery to the state at the time a volume-based online backup of the SAP database was created.

(5) Recovery to any prior point in time using volume-based backups.

Common for most of these techniques is that the SAP data that is not stored in DB2 (such as SAP application-based archived data) cannot be recovered to the same prior point in time. In principle that is not a problem as this data is not considered a recoverable resource from the database perspective.

You can speed up any of the methods by splitting the job into multiple parallel recovery streams and avoiding DASD path contention. Keep in mind that the REUSE option of the RECOVER and REBUILD utilities will significantly reduce the overall recovery elapsed time. Also, the SAP system should be stopped and access to DB2 either restricted to the recover jobs only, e.g. by START DB2 ACCESS(MAINT), or completely denied (STOP DB2), depending on the recovery method used.

Which of the recovery methods will be used depends on how fast the data must be available again, how far prior to the point when the system got damaged we can afford to bring the system to, availability of offline backups, whether indexspaces were included in such a backup, availability of quiesce points etc.

1. Recovery to the state at the time an offline backup of the SAP database was created

This is the simplest and fastest prior point in time recovery method, but it is also the most restrictive.

Firstly, it requires creating the offline backups of the SAP database, which potentially causes long periods of system unavailability that some SAP installations cannot tolerate. Secondly, depending on the backup frequency, it can bring the system much further back than necessary. E.g. if the offline backups of the SAP database are scheduled weekly on Sundays, and the data was damaged on Friday, all the changes made from the last Sunday to Friday would be needlessly lost.

This method can be implemented by the TOCOPY option of the RECOVER utility, storage subsystem based volumes restore, DFSMSdss RESTORE etc. depending on how the offline backup was created. E.g. if you created an offline SAP database backup by making the storage subsystem driven copies for all the involved volumes, restoring the copies of these volumes and starting DB2 completes the recovery.

2. Recovery to the state at the time the SAP database was quiesced

Recovery to the state at the time the SAP database was quiesced depends on the existence of so called 'system quiesce points'. These are the points in time when there are no uncommitted update transactions in the system. Such a point is represented by the corresponding RBA (or LRSN). There are a number of ways to quiesce the SAP database, listed here from the least to the most restrictive to the concurrent SAP activity:

- ARCHIVE LOG MODE(QUIESCE) TIME(n) command

- QUIESCE utility

- START DATABASE ACCESS(RO) command for all DB2 databases

- STOP DB2 MODE(QUIESCE)

The least disruptive method to quiesce a DB2 system is the ARCHIVE LOG command. While other methods can be used they are not recommended due to serious impact on the concurrent online transactions.

After ARCHIVE LOG MODE(QUIESCE) TIME(n) is issued the new update transactions wait for the command completion. The command is successfully completed if all the update transactions that were running at the time when the command was issued were committed before the time specified in the TIME option expired. Otherwise the command fails and the quiesce point has not been established.

The TIME value should be 5-10 seconds lower than the resource timeout installation parameter value. That will prevent timeouts on the transactions that are queued after the ARCHIVE LOG command. A convenient way to specify the TIME value is to omit the option and accept the default that corresponds to the QUIESCE PERIOD installation parameter, provided the parameter is set according to the above consideration.
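As a worked example of this rule: assuming, purely for illustration, a resource timeout of 60 seconds, a TIME value between 50 and 55 seconds satisfies the recommendation. The command below is a sketch for that hypothetical setting; the leading hyphen assumes the default DB2 command prefix.

```
-ARCHIVE LOG MODE(QUIESCE) TIME(50)
```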

Note that the ARCHIVE LOG command also has the potential of being quite disruptive to the concurrent SAP activity. The command can repeatedly fail (due to concurrent long running, non-committing transactions). This results in an inability to establish quiesce points and cumulatively has a bad effect on the concurrent update activity.

The RBA value that corresponds to a quiesce point established in this way is recorded in the bootstrap data sets (BSDS) and on the MVS console.

Once a quiesce point has been established the SAP database can be recovered to that point should such a need arise. The TORBA or TOLOGPOINT options of the RECOVER utility should be used. The recovery process can be significantly sped up if only the objects that have been changed since the recovery target time are recovered. See the method (3) for more details on this concept.

Note that this recovery method is more advantageous than the method (1) because establishing a quiesce point is less disruptive to the SAP system than creating an offline backup of the SAP database. Therefore, it is more likely that the recovery will be possible to a point that is closer to the required time and the loss of data will be reduced. On the other hand, establishing frequent quiesce points may pose a severe performance exposure which is not acceptable for some SAP installations. This is why the recovery methods described in (3), (4), and (5) are generally more attractive as they are based on online backups and do not require system quiesce points.

3. Recovery to any prior point in time using object-based online backups.

This recovery method uses the DB2 conditional restart technique. It is the least obstructive to everyday operations in terms of creating all the prerequisites for a prior point in time recovery of the SAP database.

The main characteristic of the method is that neither offline backups nor quiesce points need to be provided, which makes it the prime choice in truly 24x7 SAP environments. It can also bring the system closest to the time when the SAP database is known to be semantically and operationally consistent.

The recovery method assumes that a set of valid object-based backups (i.e. tablespace, partition and index image copies taken by the COPY utility) is present.

Conceptually, the recovery method includes the steps described below. Note that this is not a detailed, ready-to-run process, but the most important points and considerations your recovery procedure will need to take into account.

o Find out which RBA or LRSN approximates the time T you want to bring the system back to. Let's call this RBA the target RBA.

From now on, whenever the term RBA is used it implies LRSN as well, unless explicitly stated otherwise.

The process of finding the target RBA differs depending on whether you are operating in a data sharing environment or not. For data sharing, translate the time T (given as a timestamp) into its STCK format; that's the LRSN you want to restart the data sharing group to. For non-data sharing, find out which log data set covers the interval that contains time T by using the print log map (DSNJU004) utility. Run DSN1LOGP SUMMARY on the log data set identified above and find out which RBA is closest to the time T. Make sure the RBA is a multiple of 4096.

If possible, try to avoid selecting a target RBA that falls in a long-running unit of recovery and would cause lengthy backouts.

o In data sharing, delete CF structures.

o Create a list of objects that need to be recovered.

Namely, it is likely that for a large number of objects the current DASD contents are identical to the contents at the time corresponding to the target RBA. In other words, a lot of objects have not changed between the target RBA and currency. The objects that have changed need to be recovered. You can find these either by running the REPORT RECOVERY utility for all the tablespaces in the system or by running a DSN1LOGP SUMMARY report. The log needs to be scanned from the last checkpoint before the target RBA (from the checkpoint's begin RBA, provided that the checkpoint completed) to currency.

In addition to these you'll need to recover the objects that were REORGed with NO LOG (or LOADed with NO LOG, but use of the LOAD utility in SAP environments is rare), or that were dropped since the target RBA. The objects that were created since the target RBA can be ignored from the consistency viewpoint, but you might want to identify them as well in order to delete (AMS DELETE) the corresponding orphan data sets.

The reorged objects can be identified by selecting the matching SYSCOPY rows (before the catalog is recovered to the target RBA).

The objects that were dropped or created since the target RBA can be found by matching the result of a select of all the rows from SYSTABLESPACE and SYSINDEXSPACE with the corresponding underlying data sets in the ICF. The select must be done after the DB2 catalog is recovered to the target RBA. The dropped or created objects can be found more efficiently if you regularly trace the DROP and CREATE events (DB2 performance trace IFCID 62).

Finding only the objects that have to be recovered can very significantly reduce the total elapsed time for the system recovery. You should prepare and test this procedure in advance (execs to create REPORT or DSN1LOGP input job specifications based on the current data, analyze the output and create appropriate RECOVER and REBUILD specifications).

o Copy the BSDS and all the logs that contain RBAs later than the target RBA. This will allow you to repeat the recovery in case you decide you want to recover the data again, but to a later point in time.

o Use DSNJU003 to create a conditional restart record. Set ENDRBA (ENDLRSN) to the target RBA and leave all other CRESTART options at their defaults. If DB2 does not find an appropriate checkpoint record for ENDRBA in the BSDS, you can use the CHKPTRBA option of the CRESTART statement to specify a checkpoint. Using the DSN1LOGP SUMMARY(ONLY) option, you can find a valid checkpoint for ENDRBA in message DSN1153I.

o Start DB2, but first update the system parameters (panel DSNTIPS) and specify DEFER ALL.

This option means that all the objects that were in the started state at the target RBA will not be started at the next DB2 start, i.e. will not go through the normal restart process. Note, however, that DEFER does not affect processing of the log during restart, i.e. DB2 still processes the appropriate log range but the logged operations are not applied to the deferred start data sets.

During the start a number of pages might be placed in the LPL/GRECP. These pages will be removed from the LPL/GRECP in the course of the corresponding tablespace and index recoveries that are done subsequently in the procedure.

o RECOVER catalog and directory tablespaces and indexes (only those identified as needing recovery) in exactly the prescribed order to the 'current point in time', i.e. with neither TOCOPY nor TORBA/TOLOGPOINT. The order and some other special considerations are described in the 'Recovering Catalog and Directory Objects' section of the 'RECOVER TABLESPACE' chapter in the 'DB2 for OS/390: Utility Guide and Reference'.

o RECOVER the selected tablespaces and indexes with COPY YES (only those identified as needing recovery) to the 'current point in time', i.e. with neither TOCOPY nor TORBA/TOLOGPOINT.

o REBUILD the remaining indexes on the tablespaces recovered in the previous step.

o Do not forget to reinstate RESTART ALL in DSNTIPS. This will allow DB2 to do normal restart processing during the subsequent subsystem starts.

o It is possible that some of the SAP transactions that use the asynchronous update protocol are not fully rolled back after the system recovery, but they can be identified with transaction SM13 and appropriate action taken. If you want to decide what to do with these transactions instead of the SAP system deleting them at start-up time, set rdisp/vbreorg to 0 for the first SAP start-up after a prior point in time recovery (the default is 1).

o Take an offline backup of the SAP database.
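The first step of the procedure above (finding the target RBA or LRSN) involves two small conversions that are easy to get wrong by hand. The following Python sketch illustrates them under stated assumptions: the function names are hypothetical, the LRSN is taken to be the high-order 6 bytes of the 8-byte STCK value (the format used by the DB2 releases this note covers), leap seconds and any LRSN offset of the data sharing group are ignored, and rounding the RBA down rather than up is a choice made here, not something the note prescribes.

```python
from datetime import datetime, timezone

# The TOD (STCK) clock counts from 1900-01-01 00:00:00 UTC.
TOD_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def timestamp_to_lrsn(t: datetime) -> str:
    """Translate a timezone-aware timestamp into a 6-byte LRSN (12 hex digits)."""
    if t.tzinfo is None:
        raise ValueError("pass a timezone-aware timestamp")
    delta = t - TOD_EPOCH
    micros = (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds
    stck = micros << 12          # bit 51 of the TOD clock ticks once per microsecond
    return f"{stck >> 16:012X}"  # assumed LRSN format: high-order 48 bits of STCK


def align_rba(rba: int) -> int:
    """Round an RBA down to a multiple of 4096 (assumption: earlier, not later)."""
    return rba & ~0xFFF


if __name__ == "__main__":
    t = datetime(2004, 6, 3, 9, 4, 10, tzinfo=timezone.utc)  # hypothetical time T
    print("ENDLRSN candidate:", timestamp_to_lrsn(t))
    print("ENDRBA candidate: ", f"{align_rba(0x0000_1234_5678):012X}")
```

The values produced this way are only candidates; they still need to be checked against the DSNJU004 and DSN1LOGP output, in particular to avoid a target that falls inside a long-running unit of recovery.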

In some very special cases, which should never be exercised without deep expertise in SAP basis and applications (and normally with direct SAP involvement), a prior point in time recovery consists of writing compensating transactions for some tables and recovering only a subset of the SAP database. In such cases the conditional restart method cannot be used on the target system, but it can still play a role in the overall recovery. Namely, the conditional restart recovery method can be performed on a system that is a clone of the target system, and only selected tablespaces brought back to the target system.

4. Recovery to the state at the time a volume-based online backup of the SAP database was created

As the online, volume-based backup includes every relevant DB2 system and user data set, you can use such a system backup for starting DB2. All the volumes (data, log, BSDS, ICF) need to be restored and DB2 started normally. This start will be performed as in the case of a DB2 restart after an abnormal system termination: the inflight units of recovery will be rolled back, which brings the SAP database to a consistent state.

After DB2 comes up, you can use AMS to reformat the volumes that were added after the recovery point time. This does not affect consistency of the system, but removes orphaned data sets and extents.

Note that in data sharing environments you must force a group restart by purging all coupling structures from the coupling facility using the SETXCF FORCE command before any DB2 data sharing members are started.

This is a very simple yet powerful way of recovering an SAP system. Obviously, you need to be careful about when such a volume-based backup is taken (avoid doing so during long-running units of recovery). Also, using this method, the system can be recovered only to the specific point when the backups were taken. For recovering to an arbitrary point in time using the volume-based online backups, you need to follow the procedure described in the next section.

5. Recovery to any prior point in time using volume-based online backups

This method is similar to the one described in section (3). It uses the DB2 conditional restart technique, it does not require quiesce points and it can bring the system closest to the time when the SAP database is known to be semantically and operationally consistent.

The recovery method assumes that a valid online volume-based backup is available.

Conceptually, the recovery method includes the steps described below. Note that this is not a detailed, ready-to-run process, but the most important points and considerations your recovery procedure will need to take into account.

o Find out which RBA or LRSN (the so-called target RBA) approximates the time T you want to bring the system back to. See method (3) for details.

o In data sharing, delete CF structures

o Find the volume backup taken most recently before the target RBA and restore the DB2 data only (do not restore the DB2 logs and BSDS!).

As the volume backup is taken under a log suspend, let's call the associated RBA the log suspend RBA.

o Create a list of objects that need to be recovered. Namely, it is likely that for a large number of objects the current DASD contents (the contents after the volume restore) are identical to the contents at the time corresponding to the target RBA. In other words, a lot of objects have not changed between the time the volume backup was taken (the log suspend RBA) and the target RBA.

The objects that have changed need to be recovered. You can find these either by running the REPORT RECOVERY utility for all the tablespaces in the system or by running a DSN1LOGP SUMMARY report. The log needs to be scanned from the last checkpoint before the log suspend RBA (from the checkpoint's begin RBA, provided that the checkpoint completed) to the target RBA.

In addition to these you'll need to recover the objects that were REORGed (LOADed) with NO LOG, or that were created since the log suspend RBA. The objects that were dropped since the log suspend RBA can be ignored from the consistency viewpoint, but you might want to identify them as well in order to delete (AMS DELETE) the corresponding orphan data sets.

The reorged objects can be identified by selecting the matching SYSCOPY rows (after the catalog is recovered to the target RBA).

The objects that were dropped or created since the log suspend RBA can be found by matching the result of a select of all the rows from SYSTABLESPACE and SYSINDEXSPACE with the corresponding underlying data sets in the ICF. The selects must be done after the catalog is restored. The dropped or created objects can be found more efficiently if you regularly trace the DROP and CREATE events (DB2 performance trace IFCID 62).

Finding only the objects that have to be recovered can very significantly reduce the total elapsed time for the system recovery. You should prepare and test this procedure in advance (execs to create REPORT or DSN1LOGP input job specifications based on the current data, analyze the output and create appropriate RECOVER and REBUILD specifications).

o Use DSNJU003 to create a conditional restart record. Set ENDRBA (ENDLRSN) to the target RBA. Leave all other CRESTART options at their defaults. If DB2 does not find an appropriate checkpoint record for ENDRBA in the BSDS, you can use the CHKPTRBA option of the CRESTART statement to specify a checkpoint. Using the DSN1LOGP SUMMARY(ONLY) option, you can find a valid checkpoint for ENDRBA in message DSN1153I.

o Start DB2, but first update the system parameters (panel DSNTIPS) and specify DEFER ALL. During the start a number of pages might be placed in the LPL/GRECP. These pages will be removed from the LPL/GRECP in the course of the corresponding tablespace and index recoveries that are done subsequently in the procedure. For data sharing, SCA and LOCK will be rebuilt from the logs (this is a group restart).

o Use RECOVER LOGONLY to recover the tablespaces and COPY YES indexes (only those identified as needing recovery) to the 'current point in time', i.e. with neither TOCOPY nor TORBA/TOLOGPOINT.

Recover catalog and directory tablespaces and indexes first (observe the prescribed order). The objects that need to be recovered because of a NO LOG REORG or because they were created after the backup time cannot be recovered by LOGONLY. For these you need to use a recovery based on image copies.

o REBUILD the remaining indexes on the tablespaces recovered in the previous step.

o Do not forget to reinstate RESTART ALL in DSNTIPS. This will allow DB2 to do normal restart processing during the subsequent subsystem starts.

o As in method (3), it is possible that some of the SAP transactions that use the asynchronous update protocol are not fully rolled back after the system recovery, but they can be identified with transaction SM13 and appropriate action taken. If you want to decide what to do with these transactions instead of the SAP system deleting them at start-up time, set rdisp/vbreorg to 0 for the first SAP start-up after a prior point in time recovery (the default is 1).

o Take an offline backup of the SAP database.
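The object-list step of this method boils down to a few set operations, which the following Python sketch makes explicit. It is an illustration only: the function name, the input sets and the object names are hypothetical, and the inputs would in practice come from REPORT RECOVERY or DSN1LOGP output, SYSCOPY, and the catalog/ICF comparison described above. The treatment of objects that were both created and dropped between the log suspend RBA and the target RBA is an assumption of this sketch.

```python
def plan_method5_recovery(changed, nolog_reorged, created, dropped):
    """Split objects into LOGONLY recoveries, image-copy recoveries and orphans.

    All arguments are collections of object names (e.g. 'DB.TS') observed
    between the log suspend RBA and the target RBA.
    """
    changed, nolog_reorged = set(changed), set(nolog_reorged)
    created, dropped = set(created), set(dropped)

    # NO LOG REORG/LOAD and newly created objects cannot be recovered LOGONLY.
    from_image_copy = (nolog_reorged | created) - dropped
    # Everything else that changed can be rolled forward from the restored volumes.
    logonly = changed - from_image_copy - dropped
    # Dropped objects that exist on the restored volumes leave orphan data sets.
    # Assumption: objects created and dropped inside the window leave nothing.
    orphans = dropped - created

    return {
        "recover_logonly": sorted(logonly),
        "recover_from_image_copy": sorted(from_image_copy),
        "orphan_data_sets": sorted(orphans),
    }


if __name__ == "__main__":
    plan = plan_method5_recovery(
        changed={"A.TS1", "A.TS2", "A.TS3"},
        nolog_reorged={"A.TS2"},
        created={"A.TS4"},
        dropped={"A.TS3"},
    )
    for step, objects in plan.items():
        print(step, objects)
```

Keeping the three result lists separate mirrors the procedure: the first feeds the RECOVER LOGONLY specifications, the second the image-copy-based RECOVER specifications, and the third the optional AMS DELETE clean-up.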

There are numerous optimizations of these processes. For example, if for a given recovery target point there is a volume-based backup taken shortly after the target point, you can restore the volumes (effectively creating a new 'currency') and then use the object-based recovery (method 3) to recover only the hopefully very few objects that were changed between the target RBA and the new currency.
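Choosing the volume backup to restore, either the latest one before the target RBA (method 5) or the earliest one after it (the optimization just described), can be sketched as below. This is a hypothetical Python helper: the VolumeBackup record, its field names and the sample RBAs are assumptions for illustration, and what counts as "shortly after" remains an operational judgment that the code does not make.

```python
from typing import Iterable, NamedTuple, Optional


class VolumeBackup(NamedTuple):
    label: str            # hypothetical identifier of the backup
    log_suspend_rba: int  # RBA recorded when the log was suspended


def backup_before(backups: Iterable[VolumeBackup], target_rba: int) -> Optional[VolumeBackup]:
    """Latest backup whose log suspend RBA is not later than the target RBA."""
    candidates = [b for b in backups if b.log_suspend_rba <= target_rba]
    return max(candidates, key=lambda b: b.log_suspend_rba, default=None)


def backup_after(backups: Iterable[VolumeBackup], target_rba: int) -> Optional[VolumeBackup]:
    """Earliest backup taken after the target RBA (candidate for the optimization)."""
    candidates = [b for b in backups if b.log_suspend_rba > target_rba]
    return min(candidates, key=lambda b: b.log_suspend_rba, default=None)


if __name__ == "__main__":
    backups = [
        VolumeBackup("MON", 0x1000),
        VolumeBackup("TUE", 0x5000),
        VolumeBackup("WED", 0x9000),
    ]
    target = 0x6000
    print("method 5 base:", backup_before(backups, target))
    print("optimization candidate:", backup_after(backups, target))
```

Comparing the log distance from the target RBA to each of the two candidates gives a rough indication of which path is likely to involve fewer object recoveries.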

Header Data

Release Status: Released for Customer
Released on: 03.06.2004 09:04:10
Master Language: English
Priority: Recommendations/additional info
Category: Consulting
Primary Component: BC-DB-DB2 DB2 for OS/390

Valid Releases

Software Component   Release   From Release   To Release   and Subsequent
SAP_APPL             30        31H            31I
SAP_APPL             40        40A            40B
SAP_APPL             45        45A            45B
SAP_APPL             46        46A            46B
SAP_APPL             46C       46C            46C
SAP_BASIS            46        46D            46D

Related Notes

Number Short Text

363189 DB2/390: Volume Copies Consistency

194757 DB2/390: CHECKPAGE Option on COPY Utility

127303 DB2/390: REUSE Option for Selected DB2 Utilities

108469 DB2/390: One DB2 subsystem manages only one R/3 system

Attributes

Attribute Value

Transaction codes COPY

Transaction codes DB2

Transaction codes FULL

Transaction codes MEAN

Transaction codes OMIT

Transaction codes SM13

Transaction codes ST10

Transaction codes SUCH

Transaction codes TIME

Operating system OS/390

Database system DB2/390