
Elizabeth Chamberlain
Mike Dickinson

Buckinghamshire Chilterns University College

Disaster Planning

Or

“Don’t panic Captain Mainwaring!”

Disaster Planning

Unix Sun Solaris system + Oracle database

Live & Test/Backup servers

200,000 items, 3 branches (½ hour apart)

Reasons for a Disaster Recovery plan

Disasters we have (nearly) had!

Thoughts on the backup process

Process of recovery/restore

Potential banana skins

Open forum

Reasons for Disaster Planning

Business continuity – the disaster always happens in the wrong place at the wrong time!

To avoid headless chicken condition

Risk assessment is being carried out throughout the organisation

Validation from external bodies

Previous ‘disasters’ or ‘near disasters’

To improve communication to users

Disasters we have (nearly) had!

3 July 2003 – Partial power failure in main machine room

10 July 2003 – Complete air conditioning failure in main machine room

26 August 2003 – Nachi virus struck BCUC

6 January 2004 – Complete power failure

Implications of these events

BCUC cut off from the outside world (some for several days)

Key contact & address data not available (mainly during power failure events)

Need to run key business processes – e.g. payroll, BACS run

General inconvenience

Thoughts on the backup process

Do we need to have a system?

How long will the server be out of action?

Understand the time required (test)

Understand your backup regime

Plan the detail

Recovery Process

Unscheduled

Put users on Standalone

Retrieve most recent full backup tape

Restore data to backup server

Modify server-specific settings (e.g. iLink URL, OPAC URLs, WorkFlows config, Self-issue)

Run missed reports + other actions

Test

Upload Standalone transactions

Return to “normal” operation
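The unscheduled recovery steps above only work if they are carried out in order, so one way to keep track during an incident is a small checklist that refuses out-of-order completion and timestamps each step. The following is a minimal sketch: the step names come from the slide, while the `Runbook` class and its methods are hypothetical and not part of any library system.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple

# Step names copied from the slide; everything else here is illustrative.
UNSCHEDULED_STEPS = [
    "Put users on Standalone",
    "Retrieve most recent full backup tape",
    "Restore data to backup server",
    "Modify server-specific settings",
    "Run missed reports + other actions",
    "Test",
    "Upload Standalone transactions",
    'Return to "normal" operation',
]


@dataclass
class Runbook:
    """Hypothetical ordered checklist with a timestamped log."""
    steps: List[str]
    completed: List[Tuple[str, datetime]] = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        # The next step is the first one not yet logged as completed.
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def complete(self, step: str, when: Optional[datetime] = None) -> None:
        # Reject steps performed out of sequence.
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"out of order: expected {expected!r}, got {step!r}")
        self.completed.append((step, when or datetime.now()))
```

The timestamped log doubles as the documentation for a test run: the gaps between entries show how long each step actually took.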

Approximate Timings

Standalone/retrieve backup tape: ½ hour

Restore data to backup server: 2-4 hrs

Modify settings & run reports: ½-2 hrs

Testing: ½ hour

Uploading standalone data: ¼ hour

Total: 3¾-7¼ hours
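The total can be checked by summing the best-case and worst-case figures for each step. A minimal sketch, using exact fractions so the quarter-hours add up without rounding; the figures are those on the slide, while the list and function names are illustrative.

```python
from fractions import Fraction as F

# (step, minimum hours, maximum hours), copied from the slide.
UNSCHEDULED = [
    ("Standalone/retrieve backup tape", F(1, 2), F(1, 2)),
    ("Restore data to backup server", F(2), F(4)),
    ("Modify settings & run reports", F(1, 2), F(2)),
    ("Testing", F(1, 2), F(1, 2)),
    ("Uploading standalone data", F(1, 4), F(1, 4)),
]


def total_range(steps):
    """Return (best case, worst case) total hours for a list of steps."""
    lo = sum((s[1] for s in steps), F(0))
    hi = sum((s[2] for s in steps), F(0))
    return lo, hi


lo, hi = total_range(UNSCHEDULED)
print(float(lo), float(hi))  # 3.75 7.25
```

Most of the spread between best and worst case comes from the restore itself (2 to 4 hours), which is why timing a test restore matters.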

Restore Process

Scheduled

Stop all activities – users on Standalone

Run full backup

Restore data to live server

Modify server-specific settings back

Test

Upload Standalone transactions

Return to “normal” operation

Approximate Timings

Run full backup: 1 hour

Restore data to live server: 2-4 hrs

Modify settings: ½ hour

Testing: ½ hour

Uploading standalone data: ¼ hour

Total: 4¼-6¼ hours
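Because the restore back to the live server is scheduled, the worst-case total can be used to work backwards from the time the service must be live again. A minimal sketch: the 6¼-hour figure is from the slide, while the function name and the 08:30 opening time are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

WORST_CASE_HOURS = 6.25  # upper bound from the slide (6¼ hours)


def latest_start(must_be_live_by: datetime,
                 worst_case_hours: float = WORST_CASE_HOURS) -> datetime:
    """Latest time to begin the scheduled restore and still finish on time."""
    return must_be_live_by - timedelta(hours=worst_case_hours)


# Hypothetical example: the service must be back for an 08:30 opening.
opening = datetime(2004, 7, 1, 8, 30)
print(latest_start(opening))  # 2004-07-01 02:15:00
```

Planning against the worst case rather than the 4¼-hour best case leaves room for a slow restore without eating into opening hours.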

Potential banana skins

WorkFlows configuration

OPACs, Self-issue & other equipment

Communicate with users (live/backup)

Test & document then test & document

Report suspension