IT Major Incident Management Response and Communication: A Collaborative Approach to IT Major Incident Management. Nancy Proctor, Chris Wright, Kevin Chenoweth. October 2012


Page 1: IT Major Incident Management Response and Communication


IT Major Incident Management Response and Communication

A Collaborative Approach to IT Major Incident Management

Nancy Proctor, Chris Wright, Kevin Chenoweth

October 2012

Page 2: IT Major Incident Management Response and Communication


Collaborative Approach to IT Major Incident Management

• Objective: Expeditious Return to Normal Operations

– Managed

– Measured

– Deliberate

– Safe

• Collaborative effort: IT; Hospital Administration; VUMC Communications; (Inpatient) Systems Support Services; (Outpatient) Operations Systems Engineering, Operational Units; Others as needed

Page 3: IT Major Incident Management Response and Communication


IT Major Incident Definition*

• High impact, or potentially high operational impact

• Requires a response that is above and beyond that given to normal incidents. Typically, requires:

– Cross departmental coordination

– Management escalation

– Mobilization of additional resources

– Increased communications

*ITIL Incident Management

Page 4: IT Major Incident Management Response and Communication


IT Major Incident Phased Response and Escalation Points

• Phase 1: From IT incident awareness to activation of the IT Technical Conference Call Bridge

• Phase 2: From activation of the IT Technical Conference Call Bridge to activation of the Administrative Conference Call Bridge

• Phase 3: From activation of the Administrative Conference Call Bridge to Return to Normal Operations
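The phase boundaries above are defined entirely by which conference call bridge has been activated. As a minimal sketch, assuming an open incident and two boolean flags for bridge state (the `Phase` enum and `current_phase` helper are illustrative names, not part of the presentation), the current phase could be derived like this:

```python
from enum import Enum


class Phase(Enum):
    """Response phases from the slide (names are illustrative)."""
    PHASE_1 = 1  # incident awareness, no bridge active yet
    PHASE_2 = 2  # IT Technical bridge (CCB#1) active
    PHASE_3 = 3  # Administrative bridge (CCB#2) active


def current_phase(ccb1_active: bool, ccb2_active: bool) -> Phase:
    """Map bridge state to a response phase for an open incident.

    Hypothetical helper: the slides describe the phase boundaries
    but do not define any such function.
    """
    if ccb2_active:
        return Phase.PHASE_3
    if ccb1_active:
        return Phase.PHASE_2
    return Phase.PHASE_1
```

The Administrative bridge is checked first because, under this reading, its activation marks Phase 3 regardless of whether the Technical bridge is still open.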

Page 5: IT Major Incident Management Response and Communication

Communication Protocols

• Early “Heads Up” Advisory to Hospital Administrative Coordinators (via text page)

• Technical IT Conference Call Bridge (CCB#1)

– Reserved for IT troubleshooting and internal IT communications only

• Administrative Conference Call Bridge (CCB#2)

– Reserved for IT and Hospital Administration communication, and coordination of Operational response

– Activated by IT on request from Hospital Administration

• Medical Center Alerts and Communications

– IT and Operations jointly approve content of alerts and medical center communications
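The joint-approval rule for alerts can be pictured as a simple two-party sign-off. The following is only a sketch under stated assumptions: the `AlertDraft` class and the group labels `"IT"` and `"Operations"` are hypothetical, and the slides do not describe how approval is actually recorded.

```python
from dataclasses import dataclass, field

# Slide rule: IT and Operations jointly approve alert content.
# The string labels are assumptions made for this example.
REQUIRED_APPROVERS = frozenset({"IT", "Operations"})


@dataclass
class AlertDraft:
    """A draft alert awaiting sign-off (hypothetical structure)."""
    text: str
    approvals: set = field(default_factory=set)

    def approve(self, group: str) -> None:
        """Record approval from one of the required groups."""
        if group not in REQUIRED_APPROVERS:
            raise ValueError(f"unknown approver group: {group}")
        self.approvals.add(group)

    def ready_to_send(self) -> bool:
        """True only when every required group has approved."""
        return REQUIRED_APPROVERS <= self.approvals
```

A draft approved by IT alone is not ready to send; it becomes ready once Operations also approves.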

Page 6: IT Major Incident Management Response and Communication

Roles and Responsibilities

Incident Manager

• Helpdesk Manager On Call assumes the role of Incident Manager*

– Coordinates initial incident response and IT impact assessment

– Liaises with Helpdesk, workgroups, and Informatics Admin On Call

– Communicates IT impact assessment and status updates to Informatics Admin On Call (via DR Admin on the Administrative Conference Call Bridge, if activated)

– Maintains list of impacted systems and status

*Currently the Helpdesk Manager, but this may change

Page 7: IT Major Incident Management Response and Communication

Roles and Responsibilities

Informatics Administrator On Call (IC AOC)

– Provides IT leadership and direction

– Communicates impact assessment and status updates to Hospital Administration (ACs and AOC, depending on “phase”)

– Joins the appropriate Conference Call Bridge, depending on the Incident Management phase

– Provides input to, and approves content of, enterprise communications

– Serves as IT counterpart to the Hospital Administrator On Call

Page 8: IT Major Incident Management Response and Communication

Roles and Responsibilities

Systems Support Services/Operations Systems Engineering

• Receive initial notification from Helpdesk

– Receive notification when Technical CCB is opened; SSS Primary On-Call will join the bridge to receive status briefing

– Escalate to Systems Support Management as necessary

– Assist with impact assessment(s), workarounds, end user communications, resource requirements, issue verification, and recovery verification

– Provide advice re need for House-wide downtime

– Assess need for StarPanel banner message; provide input to IT AOC and Hospital ACs on content

– Receive notification when Administrative CCB is activated

• Interact with Informatics Center Admin On Call, Hospital ICs, IT Leadership Liaison, Hospital AOCs on CCB #2 (Admin Bridge Line)

Page 9: IT Major Incident Management Response and Communication

Roles and Responsibilities

Hospital AC (Administrative Coordinator)

• Receives initial “heads up” text alert, notification of IT Conference Call Bridge activation (but does not join call), and IT impact assessment from IC AOC

• Conducts operational impact assessment

• Determines need to activate Administrative Conference Call Bridge (CCB#2)

• Determines need for overhead announcements, alerts and enterprise communications

• Collaborates with IC AOC on content of announcements, alerts, and enterprise communications

• Engages other Operational (non-IT) resources as necessary

• Determines need to escalate within Hospital Administration hierarchy

Page 10: IT Major Incident Management Response and Communication

Roles and Responsibilities

Hospital AOC (Administrator On Call)

• Receives initial notification and/or updates from Hospital Administrative Coordinator

• Joins Administrative Conference Call Bridge

• Leads Operational Response Activities

• Establishes communication with IC AOC via Administrative Conference Call Bridge

• Determines need to activate Emergency Operations Center (EOC)

• Determines need for overhead announcements, alerts and enterprise communications

• Collaborates with IC AOC on content of announcements, alerts and enterprise communications

• Engages other Operational (non-IT) resources as necessary

Page 11: IT Major Incident Management Response and Communication

Phase 1 flowchart (steps 1 to 11)

1 Incident/Event reported

1a Helpdesk alerted to incident

1b Automated (monitored) event alerts to workgroups

2 Major Incident?

2a No

2b Yes

3 Standard Incident Response

4 Incident Resolved?

4a Yes

4b No

5 Close Incident

6 HD notifies: HD Manager, IT Workgroups

7 HD and Workgroups “work” the incident

8 Need for CCB#1?

8a Yes

8b No

9 Need to “Alert” Hosp ACs, IC AOC, SSS?

9a Yes

9b No

10 Send Alert

11 Helpdesk initiates CCB#1 activation

VUMC Helpdesk defines a Major Incident as an incident that has generated 3 or more end-user reports about the same issue
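The Helpdesk threshold lends itself to a small worked example. This is a minimal sketch, assuming each end-user report has already been tagged with an issue key; the slide does not say how reports are judged to be "about the same thing", so the keys and the `major_incidents` function are hypothetical.

```python
from collections import Counter

# From the slide: 3 or more end-user reports about the same thing.
MAJOR_INCIDENT_THRESHOLD = 3


def major_incidents(report_issue_keys):
    """Return the issue keys reported at least MAJOR_INCIDENT_THRESHOLD times.

    `report_issue_keys` holds one (assumed) issue key per end-user report.
    """
    counts = Counter(report_issue_keys)
    return {issue for issue, n in counts.items() if n >= MAJOR_INCIDENT_THRESHOLD}


# Example data invented for illustration only.
reports = ["email-down", "email-down", "printer-jam", "email-down"]
print(major_incidents(reports))  # {'email-down'}
```

Here three reports share the key `email-down`, so it crosses the threshold, while the single `printer-jam` report would follow the standard incident response.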

Page 12: IT Major Incident Management Response and Communication

Phase 2 flowchart (steps 11 to 20)

11 Helpdesk initiates CCB#1 activation (from Phase 1)

12 Helpdesk contacts Computer Operations to activate CCB#1

13 Computer Operations activate and host CCB#1

14 HD notifies: HD Manager, IC AOC, IT Workgroups, Systems Support

15 HD alerts Hospital ACs

16 IT Impact Assessment

17 Hospital AC conducts Operational Impact Assessment

18 IC Admin and Hospital ACs confer

19 Initiate Admin Conf Call Bridge #2?

19a NO

19b YES

20 Helpdesk initiates activation of Conf Call Bridge #2

Page 13: IT Major Incident Management Response and Communication

Phase 3 flowchart (steps 20 to 31)

20 Helpdesk initiates activation of Conf Call Bridge #2 (from Phase 2)

21 HD contacts CompOps to activate CCB#2

22 CompOps activate CCB#2

23 Hosp AC, IC Admin, SSS join Admin CCB #2

24 IT Incident Manager on CCB#1 provides updates to CCB#2

25 IC Admin provides updates to Hosp AC

26 Hosp AC, IC AOC, IT Liaison, SSS collaborate to manage incident

27 IT Major Incident Resolved?

27a YES

27b NO

28 Operational Impact resolved?

29 Continue to work Operational impact

29a YES

29b NO

30 Close CCB#1 and CCB#2

31 Return to Normal Operations
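The closing decisions in steps 27 to 30 can be read as two checks made in sequence. The sketch below rests on assumptions: the extracted flowchart lists the nodes and YES/NO labels but not where each branch leads, so the branch targets here are an inferred reading, and `next_step` is a hypothetical helper.

```python
def next_step(it_incident_resolved: bool, operational_impact_resolved: bool) -> str:
    """Return the assumed next Phase 3 step label (hypothetical helper)."""
    if not it_incident_resolved:
        # Assumption: a NO at step 27 returns to collaborative management (step 26).
        return "26 Collaborate to manage incident"
    if not operational_impact_resolved:
        # Assumption: a NO at step 28 leads to step 29.
        return "29 Continue to work Operational impact"
    # Both resolved: close the bridges, then return to normal operations (step 31).
    return "30 Close CCB#1 and CCB#2"
```

Under this reading, the bridges close only after both the IT incident and its operational impact are resolved, which matches the stated objective of a managed, deliberate return to normal operations.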