Service Desk Incident Triage Matrix


Incident Management Process:

24x7 Response and Control

April 6, 2005

V1.12



Revision History

Version  Date  Author
    Notes

1.08  23 Feb 2005  Nan McKenna
    (Initial tracked version)

1.09  15 Mar 2005  Erik Cummings
    Extract return to work as Appendix C, add proposed 15/30 minute response times.
    Add Revision History page.

1.10  22 March 2005  Erik Cummings
    Differentiate between Initial PCG Incident Classification and Final Incident Classification.
    Added PCG Process Flowchart

1.11  23 March 2005  Bruce Campbell
    Updated Revision History Table
    From header, removed Draft
    In header, body of document, moved Operations Excellence to top left margin, placed Incident Management Process top, right margin
    Re-applied styles, numbering, and organization
    Added On-Call to Appendix E
    Added Management On-Call to Appendix F
    Turned off numbering in Appendixes E & F
    Re-organized appendixes so that process flow diagrams were one-after-the-other
    Updated references to the various appendixes throughout document
    Reworded Section 2.3 a Note

1.12  04 April 2005  Erik Cummings
    Removed Appendix C (PCG Process)
    Renumbered Appendix D as C, along with any references to it.
    Changed Appendix D (now C, Communications Matrix). Removed Contact Action, 1st and 2nd Level Notification columns. Added Client Comm Interval and SME Work Started columns.
    Added new Appendix D Priority and Internal Response Time Commitments
    Added new definitions Priority, Impact, Urgency

Table 1 Revision History

8/8/2008 v1.12 Page ii



List of Tables

Table 1 Revision History

Table 2 Detailed Incident Control Process

Table 3 Explanation of High-Level Incident Management Process Flow

Table 4 Incident Level Classification Matrix

Table 5 Return-To-Work Guidelines


Operations Excellence Incident Management Process

1.0 Executive Summary

1.1 Document Contents

1.a. This document describes the processes that the new Production Control Group will use to respond to, manage, and resolve incidents quickly and efficiently. It includes on-call definitions and guidelines, escalation processes, process flow diagrams, and data tables. It also sets general expectations, defines roles and responsibilities, and provides general guidelines.

1.2 Intended Audience

1.a. This document is intended for executive-level and management personnel and for ITSS personnel, including everyone involved in this process, such as Subject Matter Experts (SMEs), Technical Leads, Line Managers, Systems Administrators, DBAs, project leaders, and facilities personnel.



2.0 Background

It is expected that most services supported by ITSS are available 24x7. As a result, it is in the best interest of the ITSS Shared Services workgroups, and of ITSS as a whole, to establish a combined staff, the Production Control Group (PCG), dedicated to proactively managing and responding to events as they occur.

Eventually, the role of the PCG will include evaluating incidents and, depending on the severity of the event, escalating them to upper management. In some situations, the more experienced technical personnel will take action to effect repairs and/or restore services.

As the PCG acquires experience, and as ITSS adds monitoring and troubleshooting capability, the PCG will assume additional incident response responsibilities.

2.1 Primary Responsibilities of the Production Control Group

1.a. Managing and controlling a widespread service outage, including incident reporting and escalation.

2.2 Incident reporting and escalation techniques will:

1.a. Specify a point of contact (owner) for all issues and ensure that services are restored through the prudent use of departmental resources, including documentation of the incident from its beginning to its resolution.

1.b. Effectively manage the communication of information within ITSS when there are issues that actually or potentially impact ITSS-supported services or facilities.

1.c. Proactively respond to issues that impact ITSS-supported services and facilities; evaluate, classify, escalate, and manage service restoration efforts as efficiently and expeditiously as possible, up through incident resolution.

2.3 Additional Responsibilities of the Production Control Group

1.a. Note: It is anticipated that any single shift of the PCG will NOT be consumed by continuously resolving issues. Because of this, the supplemental duties and tasks detailed below will be assigned.

1 Assist offsite Subject Matter Experts by performing requested tasks, such as visual inspections of hardware and cycling the power on equipment as instructed.

2 Manage and prepare magnetic media for rotation, offsite shipment, and storage, including organizing and filing transmittal logs.

3 Control building and facility access, and escort vendors to restricted areas for the purposes of inspection, maintenance, and repair of equipment.

4 Monitor building/facility/data center environmentals, such as air conditioning, fire suppression system, lighting, and so on, and log the times and results of the monitoring activity.

5 After normal working hours, perform 1st tier triage of reported issues, then classify and escalate as necessary.

6 Receive and log calls from end users, generate Remedy tickets, and escalate as necessary.

7 Set up Video/Telephone conferences.



8 Accept and sign for emergency delivery of replacement parts from vendors.

9 Perform other tasks deemed necessary by department supervision.

3.0 Roles and Definitions

Account Manager: A member of the ITSS Account Management team in Client Support who is responsible for the relationship with one or several key clients (e.g. GSB, H&S, Libraries).

Client: A primary paying customer of ITSS services and support.

End User: A person who directly uses a service. An end user may be internal or external to ITSS. End users are directly impacted during an outage, and generally have an established relationship with the Client or Service Owner.

Impact: Level of effect on the Stanford campus. This is relative to the campus as a whole, not specifically to the client. (Values = Campus-Wide, Major School or Dept wide, Minor Group or Single User, and Non-Service Affecting)

Incident Manager: The Shared Services Line Manager who is designated as responsible for a specific incident.

Incident/Event/Problem/Issue: For the purposes of this document, these terms are used interchangeably and mean a failure of any component of any system or service.

ITSS Client Support: Group which does client relations, account management, functional analysis, sales & marketing, documentation, software licensing, end user training, and Help Desk and CRC support.

ITSS Engineering and Projects: Group which does technology R&D, service enhancements, and new product and service projects.

ITSS Shared Services: Group which does operations.

ITSS Strategic Planning: Includes the technology strategy & architecture and finance groups.

Line Manager: Workgroup managers in ITSS Shared Services.

On-Call Subject Matter Expert (SME): An SME (see below) who is designated to be available to respond to reported outages, triage the incident, perform the tasks needed to restore services, assist other workgroups in the restoration process, or determine which other members of their own workgroup are needed to assist in service restoration.

Operations Owner: The ITSS staff person who has the ultimate authority for a service, including its functionality and approval for any changes to the service.

Priority: Level of response and effort directed towards resolving an incident. It is determined by the inherent service level commitment of the service, as well as a combination of Urgency and Impact. Priority is sometimes referred to as severity. (Values = Urgent, High, Medium, Low)

Product Manager: Owns product quality and client satisfaction for a service.

Production Control Group (PCG): Group which will perform monitoring and basic problem determination and evaluation, escalation, communication and, in some cases, incident resolution.

Subject Matter Expert (SME): Any technical ITSS staff person whose job requires extensive technical knowledge of network and service components and their related requirements. SMEs are considered experts and possess a detailed knowledge of service functionality, restoration, and component/service repair.

Satellite Operations Center (SOC): The SOC is a partner with the University Emergency Operations Center (EOC) during Level 2 (major building fire, extended power outage) or Level 3 (major earthquake or extensive flooding) emergencies. The ITSS SOC team provides real-time field information to the EOC and coordinates and directs emergency responses.

Urgency: The end user's or client's assessment of the importance and/or urgency of the issue as it affects their ability to perform their work. This value is provided by the customer. (Values = Urgent, High, Medium, Low)



4.0 Process Review

4.1 Process Outline

1.a. Note: There are six major steps in this process, from the time of incident detection through root cause analysis and the implementation of preventative measures.

4.2 Incident Detection and Reporting

1.a. An incident can be detected or reported by:

1 An end user

2 A client

3 An SME

4 Automated monitoring

1.b. It is important that information be shared between and among groups.

1.c. The process for reporting problems differs between normal working hours (8:00 A.M. to 5:00 P.M., M-F) and after hours.
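The split between normal working hours and after hours can be sketched as a small check. This is a minimal illustration only: the function names are hypothetical, the time is assumed to be local campus time, holidays are ignored, and treating exactly 5:00 P.M. as after hours is an assumption the document does not state.

```python
from datetime import datetime, time

BUSINESS_START = time(8, 0)   # 8:00 A.M.
BUSINESS_END = time(17, 0)    # 5:00 P.M.


def is_normal_working_hours(moment: datetime) -> bool:
    """True when `moment` falls within 8:00 A.M. to 5:00 P.M., Monday to Friday."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Friday is 4
    # Assumption: the end boundary is exclusive, so 5:00 P.M. sharp is after hours.
    return is_weekday and BUSINESS_START <= moment.time() < BUSINESS_END


def reporting_path(moment: datetime) -> str:
    """Name the reporting path that applies at `moment` (labels are illustrative)."""
    return "normal-hours" if is_normal_working_hours(moment) else "after-hours"
```

For example, a call at 9:30 A.M. on a Wednesday would follow the normal-hours path, while a call at 10:00 P.M. the same day, or at any time on a Saturday, would follow the after-hours path.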

4.3 Incident Level Classification: See Appendices C and D

1.a. This includes assigning a severity level to the incident, and its subsequent entry into the Remedy incident tracking system.

4.4 Incident Notification

1.a. This includes notification to an ITSS Incident Manager and clients. It also includes outage information posted on the SU Web site and Cable TV, informational messages left on the designated voice mail box, email sent to designated personnel, and other client notification as deemed appropriate.

4.5 Incident Escalation

1.a. This includes escalation to the ITSS Incident Manager, and any subsequent escalation calls deemed necessary. Note that the severity level will dictate who in the management chain of command to contact, and when to provide them with status reports. Additionally, the PCG will determine whether or not the incident needs to be escalated to the SOC.

4.6 Incident Resolution

1.a. This covers work performed during the incident itself, with responsibilities as follows:

1.b. The Incident Manager is responsible and accountable for the overall recovery effort, performing the following functions:

1 Establishing recovery priorities

2 Coordinating and delegating responsibilities as they relate to the recovery effort

3 Issuing requests for additional resources



4 Ensuring the participation of critical internal and external support groups and vendors, such as the recall of media from the off-site storage vendor, or the purchase of replacement parts and equipment

5 Reviewing and approving tactical plans

6 Communicating incident status to ITSS management/executives as needed

7 Working with Client Support to approve and authorize the release of information to other schools and departments

1.c. SMEs and Line Managers are responsible for analyzing technical problems and making technical decisions, implementing tactical plans, and communicating with other SMEs as well as the Incident Manager.

1.d. The PCG is responsible for coordination of the incident resolution effort and for communication as deemed necessary.

4.7 Post-Incident Activities

1.a. This covers the activities after the incident is resolved.

1 Ensure that any post-incident cleanup is completed.

2 Perform root cause analysis of the incident.

3 To avoid similar future incidents, determine which process improvements and preventative measures can be put into place.

4 Implement changes in process or technical support as appropriate.

5 Ensure that the PCG receives feedback and input from the user community.

6 Perform client follow-up and ensure that an incident response quality survey form is available for end-user and client feedback.



5.0 Detailed Incident Control Process

5.1 Detailed Process Flow Explanation Table. Reference Appendix A

Columns: Process #, Process Name, Detailed Description, Action By

Incident Detection and Reporting

1  Problem Reporting: End Users
   End users call 5-HELP or use the web at http://helpsu.stanford.edu/. Telephone calls are directed to the ITSS Help Desk, where the problem is evaluated. If the Help Desk (any tier) determines that this is an urgent incident, the call/ticket should be escalated directly to the PCG.
   Action by: End User

2  Problem Reporting: Clients
   In most cases, clients should call 5-HELP or use the web at http://helpsu.stanford.edu/. In some special cases, clients may have direct access to the PCG for reporting problems and receiving updates. In this case, skip to step 12.
   Action by: Client

3  Problem Reporting: End Users After Hours
   If an end user calls 5-HELP after hours, the user will get the recorded phone tree. Users can choose to get through to the PCG directly, or leave a recorded message. For after-hours calls, the PCG will determine whether the call is urgent. If the issue is not urgent, the PCG will enter a ticket in Remedy for review the following business day.
   Action by: End User, PCG

4  Problem Reporting: Monitoring to SMEs
   In some cases, monitoring may notify an SME of a problem before a user, client, or the PCG is aware of it. If the issue is urgent, escalate directly to the PCG for coordination and entry into Remedy.
   Action by: SME

5  Problem Reporting: Monitoring to PCG
   Monitoring reports information directly to the PCG.
   Action by: PCG

6  Resolve?
   The Help Desk assesses whether the ticket can be resolved at this point. If so, the Help Desk will resolve and close it.
   Action by: Help Desk

7  Urgent?
   If the ticket cannot be resolved, the Help Desk determines whether the ticket should be forwarded to the SME/Help Desk Tier 2 or to the PCG.
   Action by: Help Desk

8  Forward To SME
   If the case does not appear to be severity Urgent/High, forward to the SME.
   Action by: Help Desk

9  Resolve Quickly?
   Can the case be resolved by the SME, and is it Severity Level Medium/Low?
   Action by: SME

10 Enter Solution In Remedy
   If the SME can quickly resolve the case, enter the solution in Remedy and close the ticket.
   Action by: SME

11 Forward To PCG
   If the SME determines that there is impact beyond a simple fix and the Severity Level is Urgent/High, notify the PCG.
   Action by: SME/PCG

Classification

12 Assign Severity Level
   Assign a severity level to the incident, using the standard ITSS categories (see Appendix C and D). The severity levels govern:
      Level of action to be taken by the Production Control Group
      Notification and escalation guidelines
      Time intervals in which to provide status reports
      Time intervals in which to initiate escalation and management decision processes
   Action by: PCG

13 Enter In Remedy
   Enter a ticket for the incident into the Remedy Help Desk application.
   Action by: PCG

Table 2 Detailed Incident Control Process
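The Help Desk and SME decision points in steps 6 through 11 can be sketched as two small routing functions. This is an illustrative sketch, not part of the documented process: the names `Route`, `help_desk_route`, and `sme_route` are hypothetical, and two behaviors are assumptions because Table 2 does not cover them (an Urgent/High case always goes to the PCG, and a Medium/Low case that cannot be resolved quickly simply stays with the SME).

```python
from enum import Enum


class Route(Enum):
    RESOLVE_AND_CLOSE = "resolve, enter solution in Remedy, close ticket"
    FORWARD_TO_SME = "forward to SME / Help Desk Tier 2"
    FORWARD_TO_PCG = "forward directly to the PCG"
    SME_CONTINUES = "SME keeps working the ticket"  # assumption: not in Table 2


def help_desk_route(can_resolve: bool, is_urgent: bool) -> Route:
    """Steps 6 to 8: the Help Desk resolves the ticket or forwards it by urgency."""
    if can_resolve:
        return Route.RESOLVE_AND_CLOSE
    return Route.FORWARD_TO_PCG if is_urgent else Route.FORWARD_TO_SME


def sme_route(can_resolve_quickly: bool, severity: str) -> Route:
    """Steps 9 to 11: the SME closes quick Medium/Low fixes or notifies the PCG."""
    if severity in ("Urgent", "High"):
        # Assumption: Urgent/High always reaches the PCG, even if a fix looks quick.
        return Route.FORWARD_TO_PCG
    if can_resolve_quickly:
        return Route.RESOLVE_AND_CLOSE
    return Route.SME_CONTINUES
```

Keeping the two functions separate mirrors the "Action By" column: the Help Desk owns the first decision and the SME owns the second.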



6.0 High-Level Incident Process Explanation

6.1 Detailed Process Explanation: See Appendix B

Columns: Process Name, Detailed Description, Action By

Notification

SME
   Notify appropriate SME(s) if necessary, using the AMCOM on-call system.
   Action by: PCG

Update itss-service-alerts@lists
   Send a message to the itss-service-alerts@lists mailing list.
   Action by: PCG

Post Messages To Web, Phone, TV
   Message information will include: the date and time, a brief description of the problem, and, if available, the estimated time of resolution/restoration.
      Web: Update status on down.stanford.edu
      Telephone: In the event of a major network failure, update the designated voicemail box: 7-DOWN
      SU Cable TV: ITSS can have pre-worded messages set for broadcast, where the group can just fill in the blanks.
   Action by: PCG

Escalation

Notify Line Manager
   Contact the Shared Services Line Manager of the affected system. If a Line Manager is unavailable, use the AMCOM system to determine the backup.
   Action by: PCG

Determine Incident Manager
   If the incident falls into the area of a single Line Manager, that Line Manager will contact the Incident Manager. If multiple Line Managers are involved, they must determine a single Incident Manager.
   Action by: Shared Services Line Managers

Send Email
   Send the first email to the appropriate lists/clients, based on Service Level Agreements. Use the [email protected] list for campus-wide outages; the Incident Manager should approve any messages which go to this list.
   Action by: PCG, Incident Manager

Escalate To Senior Management
   The Severity Level (see Appendix C and D) will determine the escalation to management.
   Action by: PCG

Resolution

Incident Management
   The Incident Manager will take ownership of the problem and manage the incident. Responsibilities:
      Establish priorities
      Coordinate and delegate responsibilities in regard to the recovery effort
      Request additional internal or external resources
      Ensure and manage the participation of critical internal and external support groups and vendors
      Review and approve tactical plans
      Communicate incident status to ITSS management/executives as needed
      Work with Client Support to release information as needed to clients/users across campus

Resolve Incident
   SMEs are responsible for analyzing technical problems, implementing tactical plans, and communicating with other SMEs and with the PCG.
   Action by: SMEs



Post Resolution Information To Web, Phone, TV
   Message information will include: the date and time, a brief description of the problem, and, if available, the estimated time of resolution/restoration.
      Web: Update status on down.stanford.edu
      Telephone: In the event of a major network failure, update the designated voicemail box: 7-DOWN
      SU Cable TV: ITSS can have pre-worded messages set for broadcast, where the group can just fill in the blanks.
   Action by: PCG

Post Incident Analysis

Complete Cleanup Tasks
   Determine whether cleanup is required, and identify who will own and perform the additional clean-up tasks.
   Action by: SME, PCG

Root Cause Analysis
   It is the responsibility of the manager of the PCG to initiate root cause analysis, collecting as much information as possible, and to ensure that any information which will help in resolving future incidents is entered into the related Remedy ticket for future use.
   Action by: PCG Manager

Incident Prevention
   Determine processes which can be implemented to prevent a repeat of the incident.
   Action by: Shared Services Managers, SMEs

Client/User Follow-up
   Ensure selected members of the recovery team make follow-up calls to the affected users to solicit their constructive comments. Share results of the analysis with workgroups and clients where appropriate.
   Action by: PCG

Quality Survey
   ITSS will make an on-line survey available for user/client feedback, and for ITSS staff. The PCG is responsible for tallying survey results and making them available to the appropriate ITSS staff and managers.
   Action by: PCG

Table 3 Explanation of High-Level Incident Management Process Flow
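The status messages described in Table 3 carry three pieces of information: the date and time, a brief description of the problem, and, if available, the estimated time of resolution/restoration. A minimal sketch of such a pre-worded, fill-in-the-blanks message follows; the class name, field names, and output wording are hypothetical and are not taken from any ITSS template.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class StatusMessage:
    reported_at: datetime                             # date and time of the notice
    description: str                                  # brief description of the problem
    estimated_restoration: Optional[datetime] = None  # omitted when not yet known

    def render(self) -> str:
        """Fill in the blanks of a pre-worded message (wording is illustrative)."""
        lines = [
            f"Date/time: {self.reported_at:%Y-%m-%d %H:%M}",
            f"Problem: {self.description}",
        ]
        if self.estimated_restoration is not None:
            lines.append(
                f"Estimated restoration: {self.estimated_restoration:%Y-%m-%d %H:%M}"
            )
        return "\n".join(lines)
```

The same rendered text could be reused for the web page, the voicemail script, and the cable TV notice, so that all three channels stay consistent.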



7.0 Outstanding Issues

7.1 A common paging system is required

1.a. AMCOM for manual paging

1.b. What to use for automated paging from monitoring systems?

7.2 Definition of Service Hours

7.3 Definition of availability, outage, and service degradation

7.4 Service-level procedures for client notification


Appendix A Incident Management Process Flowchart

Reference Table 2 Detailed Incident Control Process

1.a. Note that the circled numbers in the flowchart correspond to the process numbers in Table 2.

[Flowchart: Incident Detection & Reporting. Swimlanes: End User, Client, Help Desk, SME, Automated Monitoring, PCG. Steps 1 through 13: Report Problem (HelpSU/5-HELP, after-hours 5-HELP calls, or directly to PCG), Resolve?, Urgent?, Forward To SME (Help Desk Tier 2) For Additional Analysis, Forward Directly To PCG, Resolve Quickly?, Enter Solution In Remedy, Determine Severity Level, Enter Incident Ticket In Remedy.]

Figure 1 Incident Detection and Reporting



Appendix B High-Level Incident Management Process Flow

[Flow diagram. Phases: Detection/Reporting, Classification, Notification, Escalation, Resolution, Post Incident Activities. Participants and systems shown: End User, Client, HelpSU/5-HELP, Self-Service, Help Desk Tier 1, Monitoring, Subject Matter Expert, Production Control Group, PCG Manager, Line Manager, Duty Manager, Account Manager, SOC/EOC Liaison, Remedy Database, System Status, 7-DOWN. Labeled actions: Notify, Update, Classify Incident Level & Enter in Remedy, Communicate, Update With Solution Information, Emergency.]

Figure 2 High-Level Incident Management Process Flow




Appendix C Incident Level Communications Matrix

Columns: Level, Description, Incident Examples, Client Update Interval, SME Work Started w/in:*

Urgent
   Description: A major service outage with significant and immediate business impact and no workaround.
      Large number of users
      Outage of significant length
      No available workaround
      Mission/business critical
   Incident examples:
      Fire suppression system activation in data center
      Loss of electrical power
      Entire network switch, closet and/or building outages
      Failure of 1 or more high priority services, e.g. Exchange, Oracle Financials, HRMS, PeopleSoft
      Large denial of service attacks; successful hacking; loss or altering of data; theft of data; simultaneous virus infections
      SU telephony systems
   Client update interval: Initial: immediate. Notification on-going: hour
   SME work started within: 30 minutes

High
   Description: A major service outage or degradation with significant business impact and an unsustainable workaround.
      Multiple users
      Work performance reduced
      Mission/business critical
   Incident examples:
      Failure of storage system (storage area network, SAN)
      Failure of a server of a sensitive client or user
      Severely degraded performance
      Smaller denial of service attacks
   Client update interval: Initial: immediate. Notification on-going: 1 hour
   SME work started within: 1 hour

Medium
   Description: A service outage or degradation with an acceptable workaround.
      Service-affecting
      Minimal performance degradation
      Affects non-critical business function
   Incident examples:
      Cannot connect to the internet, send or receive email
      Hardware failure, cannot access data, cannot print
      Degraded performance
   Client update interval: As applicable. By SME working issue.
   SME work started within: 4 business hours

Low
   Description: Non service-affecting.
      Cosmetic problem
      System enhancement
   Incident examples:
      Previously requested enhancements to a system
   Client update interval: Upon issue resolution or as applicable. By SME working issue.
   SME work started within: 1 business day

Table 4 Incident Level Classification Matrix

* Note: This column indicates the maximum amount of time that will elapse before a technician begins working on an incident. Times will generally be much faster for all severities.



Appendix D Priorities and Internal Response Times

Note: The following table refers to Priority, not to Urgency or Impact. Priority is a combination of Urgency, Impact, and the existing Service Level Commitments for the service in question. This is an important distinction: Urgency is offered by the customer, whereas Priority is assigned by the Helpdesk, PCG, and/or SME involved, from a system-wide perspective.

Usage: These Priority levels (and the associated Urgency and Impact values) are used to track incidents as they are reported and worked on. Priority, Urgency, and Impact each relate directly to a Remedy ticket field.

Columns: Priority, Description, Committed Service Hours, PCG Call Initiate, SME Call Response, Escalation Interval, SME Work Started

Urgent
   Description: A major service outage with significant and immediate business impact and no workaround.
      Large number of users
      Outage of significant length
      No available workaround
      Mission/business critical
   Committed service hours: 24x7
   PCG call initiate: Immediate
   SME call response: 15 minutes
   Escalation interval: 10 minutes
   SME work started: 30 minutes

High
   Description: A major service outage or degradation with significant business impact and an unsustainable workaround.
      Multiple users
      Work performance reduced
      Mission/business critical
   Committed service hours: 24x7
   PCG call initiate: Immediate
   SME call response: 15 minutes
   Escalation interval: 10 minutes
   SME work started: 1 hour

Medium
   Description: A service outage or degradation with an acceptable workaround.
      Service-affecting
      Minimal performance degradation
      Affects non-critical business function
   Committed service hours: 8-5, M-F
   PCG call initiate: Ticket assignment/email
   SME call response: As appropriate (work begins, work update, work completed)
   Escalation interval: Standard SME Group Remedy settings
   SME work started: 4 business hours

Low
   Description: Non service-affecting.
      Cosmetic problem
      System enhancement
   Committed service hours: 8-5, M-F
   PCG call initiate: Ticket assignment/email
   SME call response: As appropriate (work begins, information required, work completed)
   Escalation interval: Standard SME Group Remedy settings
   SME work started: 1 business day
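The response commitments above lend themselves to a simple lookup structure. The sketch below is illustrative only: the names `Commitment`, `COMMITMENTS`, and `sme_work_deadline` are hypothetical. The Medium and Low "SME Work Started" values are expressed in business hours and business days, which would need a business calendar to convert into clock times, so this sketch leaves them as `None` rather than guessing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Commitment:
    service_hours: str
    sme_call_response: Optional[timedelta]    # None: "as appropriate"
    escalation_interval: Optional[timedelta]  # None: standard SME Group Remedy settings
    sme_work_started: Optional[timedelta]     # None: stated in business hours/days


COMMITMENTS = {
    "Urgent": Commitment("24x7", timedelta(minutes=15), timedelta(minutes=10), timedelta(minutes=30)),
    "High": Commitment("24x7", timedelta(minutes=15), timedelta(minutes=10), timedelta(hours=1)),
    "Medium": Commitment("8-5, M-F", None, None, None),  # 4 business hours
    "Low": Commitment("8-5, M-F", None, None, None),     # 1 business day
}


def sme_work_deadline(priority: str, reported_at: datetime) -> Optional[datetime]:
    """Latest clock time for SME work to start, or None if it needs a business calendar."""
    limit = COMMITMENTS[priority].sme_work_started
    return None if limit is None else reported_at + limit
```

For instance, an Urgent incident reported at 2:00 A.M. would have an SME work-start deadline of 2:30 A.M. under this table, and a High incident would have until 3:00 A.M.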




Appendix E On-Call Guidelines

Guideline Purpose

To generally define and standardize:

On-call duties and responsibilities

A methodology for communications and engagement of problem determination and resolution

On-call scheduling

Response expectations/guidelines and general escalation processes in the event the 24x7 on-site group is engaged in an on-going event or incident.

System-generated notifications will continue to be handled within the required time frames by the individual SME groups.

Duties

Requirements for on-call responsibility must be identified in the appropriate job descriptions, including: carrying a pager, cell phone, availability of the employee's home phone number, and email.

Responsibilities

Share on-call responsibilities with other members of the work group

Begin working on the event as soon as notified

This may require working from home or traveling to work. The decision to make a physical appearance at work depends on the circumstances of the event, such as swapping hardware components or an on-site appearance by a vendor.

Communications

Teleconference Phone Bridge: Telecom will have a teleconference number available to technical personnel and the PCG. This will be used when the expertise of multiple SMEs is required to resolve an incident. It will also allow the technical staff to communicate as a group. Additionally, the PCG will be able to determine the status of the incident first-hand and keep management informed without management actually being involved in the conference call.

The AMCOM system will be the primary contact information/procedures lookup and paging tool for the 24x7 on-site groups.

Staff will provide and track individual work group on-call schedules.



The work group establishes the rotation.

Members of the work groups are responsible for keeping the contact and coverage information in the on-call database current.

Communications Elements

Required communications devices: pager or cell phone, personal phone.

Additional communications devices as recommended by the SME groups: DSL, Treo, wireless laptop, email.

Notification Protocol

Initial outgoing page

Re-page in 10 minutes

If a call-back is NOT received from the designated on-call SME within 15 minutes, begin escalation to the next on-call person, including re-contacting the primary on-call person and the on-call Shared Services manager on all subsequent pages.

Recipient to confirm garbled pages and follow the call-back protocol.
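The timing in the notification protocol above can be sketched as a single decision function. This is a hedged illustration: the function names and the returned step labels are hypothetical, and treating the 10-minute and 15-minute marks as inclusive is an assumption.

```python
from typing import List

REPAGE_AFTER_MINUTES = 10    # re-page if no call-back by this point
ESCALATE_AFTER_MINUTES = 15  # begin escalation if still no call-back


def next_paging_step(minutes_since_first_page: int, callback_received: bool) -> str:
    """Return the next action after the initial outgoing page."""
    if callback_received:
        return "stop"       # the on-call SME has called back
    if minutes_since_first_page >= ESCALATE_AFTER_MINUTES:
        return "escalate"
    if minutes_since_first_page >= REPAGE_AFTER_MINUTES:
        return "re-page"
    return "wait"


def escalation_recipients() -> List[str]:
    """Everyone contacted on each page once escalation has begun."""
    return [
        "next on-call person",
        "primary on-call SME",
        "on-call Shared Services manager",
    ]
```

With these thresholds, a page that is unanswered after 12 minutes would trigger a re-page, and one unanswered after 15 minutes would start escalation.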

Initial Communications Tracking

Use AMCOM system for initial communications tracking

Response Protocol

15 minute call-back

Within 30 minutes, be actively engaged in problemdetermination and resolution

Actively engaged via:

Home system

Wireless laptop

On-site

SME groups may establish accelerated response profilesbased upon their response criticality

Scheduling

By SME group design

SME schedule to be established and published in the AMCOM system

SME contact instructions to be included



Appendix F Management On-Call Guidelines

Return-To-Work Guidelines

These guidelines are for Management to consider if an on-call representative has worked extended hours due to an outage/issue. They should be used to ensure there is always an effective on-call representative, while protecting the on-call SME from overly extensive work time.

If the primary on-call SME has already worked consecutive extended hours, or multiple shifts, and a new event has occurred:

Either the manager will provide a backup and notify the backup of their modified on-call status, or the entire group of SMEs will decide on an alternate SME to be used in this situation.

To allow staff members who are involved with an after-hours call-out on Sunday through Thursday to obtain adequate rest, the following is provided as a sample set of guidelines for a return-to-work policy:

On-Call SME works until    Report to work no later than
0200                       1100
0300                       1200
0400                       1300
0500                       Take rest of day off

Table 5 Return-To-Work Guidelines
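The sample rows in Table 5 follow a visible pattern: the report time is nine hours after the SME stops working, until 0500, when the rest of the day is taken off. The sketch below encodes that pattern. The nine-hour rule for times between the listed rows is an inference from the table, not a stated policy, and the function name is hypothetical; times outside the 0200 to 0500 range are rejected because the table does not cover them.

```python
from datetime import time
from typing import Optional

REST_HOURS = 9               # inferred from the rows: 0200 -> 1100, 0300 -> 1200, 0400 -> 1300
TABLE_START = time(2, 0)     # earliest row in Table 5
DAY_OFF_FROM = time(5, 0)    # working until 0500 means the rest of the day off


def report_no_later_than(worked_until: time) -> Optional[time]:
    """Latest report time, or None when the SME takes the rest of the day off."""
    if not TABLE_START <= worked_until <= DAY_OFF_FROM:
        raise ValueError("Table 5 only covers call-outs ending between 0200 and 0500")
    if worked_until == DAY_OFF_FROM:
        return None
    total_minutes = worked_until.hour * 60 + worked_until.minute + REST_HOURS * 60
    return time(total_minutes // 60, total_minutes % 60)
```

Because these are sample guidelines for managers to consider, any real use would need the thresholds confirmed by the work group's own policy.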
