Key Performance Indicators RP 754

KEY PERFORMANCE INDICATORS

One way of measuring the culture of a facility or company is through the use of Key Performance

Indicators, or KPIs. The organization will select a few parameters that, it is believed, will give a

clear picture of the overall culture. These KPIs are then measured on a regular basis to determine

if the performance is improving or declining. KPIs can be used in all aspects of a business,

including sales, human resources, finance and occupational safety. There is also increasing

interest in using them to measure process safety performance.

There should only be a small number of KPIs but they should provide a credible measurement as

to trends. Generally they will be normalized. For example, rather than simply reporting the

number of process safety incidents, it is more useful to report the number of incidents per

employee. Doing so facilitates comparisons between different sites and organizations.
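
The effect of normalization can be illustrated with a short Python sketch. The site names, incident counts and headcounts below are invented for illustration only; they do not come from any real facility.

```python
def incidents_per_employee(incidents: int, employees: int) -> float:
    """Return the number of incidents per employee."""
    if employees <= 0:
        raise ValueError("employees must be positive")
    return incidents / employees


# Hypothetical figures: Site A reports more incidents, but is far larger.
sites = {
    "Site A": {"incidents": 12, "employees": 1200},
    "Site B": {"incidents": 5, "employees": 250},
}

normalized = {
    name: incidents_per_employee(d["incidents"], d["employees"])
    for name, d in sites.items()
}

for name, rate in normalized.items():
    print(f"{name}: {rate:.3f} incidents per employee")
```

On raw counts Site A looks worse, but per employee Site B has twice the incident rate of Site A.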

When evaluating any management system it is important to quantify the results where possible.

Quantification allows management to measure progress over time, and it also allows different

facilities and companies to be compared with one another. In practice, the quantification of risk

management programs is difficult, particularly with respect to the more intangible elements such

as employee participation and management of change.

In order to establish reliable quantification measures, a consistent set of terms and reporting

standards is required. In the area of occupational safety considerable standardization has already

been achieved through the use of measures such as the number of first aid cases or recordable

injuries. Although different organizations will apply these terms slightly differently from one

another there is sufficient consensus to allow for their use across broad swathes of industry. For

process safety it is much more difficult to come up with comparable yardsticks. Hence

comparisons between different facilities may lack validity and credible trend lines are difficult to

develop.

LAGGING AND LEADING INDICATORS

KPIs can be either lagging or leading. A lagging indicator is like a rear view mirror; an event has

occurred. A leading indicator is one that suggests that an event may occur sometime in the future.

Many events are both lagging and leading. For example, a control valve may have a small leak

through its packing. This is a lagging indicator, and it has minor consequences, such as the need

to complete an environmental report. But it is also a leading indicator. The small leak could ignite

and lead to a large explosion and fire.

Lagging Indicators

Lagging (sometimes called trailing) indicators are widely used to measure performance —

particularly for occupational safety and equipment reliability. These indicators include well-

established parameters such as lost time accidents, first aid cases and recordable injuries.

Figure 2.3 illustrates how the indicators are tracked over time. Lagging indicators are widely

used because, assuming that there are enough events to ensure statistical significance, they

allow management to establish baselines, measure trends and to compare results with other

facilities and companies.

Figure 2.3: Lagging Indicators

Lagging indicators are used to generate KPIs. One oil company, for example, has set the

following KPIs for itself (some are monthly, others quarterly and the remainder annual).

Fatalities;

Days away from work;

Recordable injuries (as a function of exposure hours);

Recordable illnesses;

Spills from primary containment (even if secondary containment was effective);

Spills affecting the environment (failure of all containment barriers);

Volume of oil spilled that is not recovered;

Greenhouse gas emission equivalents;

Total hydrocarbon emissions;

Total SOx and NOx emissions;

Total discharges to water;

Total hazardous waste; and

Energy use.

Lagging indicators by themselves do not provide much explicit guidance to management as to

what needs to be done to keep improving safety. The events themselves have to be analyzed

using some type of root cause analysis. Also, lagging indicators tend to react quite slowly to

system changes.

OSHA Recordable Rate

Companies in the United States pay particular attention to the OSHA recordable rate.

Onshore facilities are required to report this number anyway, so it provides a reliable

means of comparing different organizations with one another. An OSHA recordable

injury is an occupational injury or illness that meets one of the following criteria:

Death;

Loss of consciousness;

Days away from work;

Restricted work activity or job transfer; or

Medical treatment beyond first aid.

It is calculated for the previous three years and is defined as:

OSHA Recordable Rate = [Number of Recordable Cases x 200,000] / Total Hours Worked

The OSHA Lost Workday Incident Rate is similar:

OSHA Lost Workday Incident Rate = [Number of Lost Workday Cases x 200,000] / Total Hours Worked

A lost workday case (equivalent to a lost time injury) is one where an individual misses

more than one day of work due to an injury sustained while at work. It is another widely

used criterion for measuring occupational safety.
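
The two rates can be sketched in Python as follows. The factor of 200,000 corresponds to 100 employees working 40 hours a week for 50 weeks a year. The three-year history is hypothetical, and pooling the three years into a single rate is an assumption about how the three-year calculation is applied.

```python
HOURS_BASE = 200_000  # multiplier used in both OSHA rate formulas


def incident_rate(cases: int, hours_worked: float) -> float:
    """Return cases x 200,000 / total hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return cases * HOURS_BASE / hours_worked


# Hypothetical history: (recordable cases, lost workday cases, hours worked) per year.
history = [
    (4, 1, 410_000),
    (3, 1, 395_000),
    (2, 0, 395_000),
]

total_recordable = sum(rec for rec, _, _ in history)
total_lost_workday = sum(lost for _, lost, _ in history)
total_hours = sum(hours for _, _, hours in history)

recordable_rate = incident_rate(total_recordable, total_hours)
lost_workday_rate = incident_rate(total_lost_workday, total_hours)

print(f"Recordable rate: {recordable_rate:.2f}")
print(f"Lost workday incident rate: {lost_workday_rate:.2f}")
```

With these invented figures, 9 recordable cases over 1,200,000 hours give a recordable rate of 1.5 cases per 200,000 hours worked.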

Process Safety

It is difficult to identify effective lagging indicators for use with process safety. The most

obvious difficulty is that major process safety incidents do not occur frequently enough to

develop a statistically significant trend such as that shown in Figure 2.3. If many facilities

and companies pool their data it may be possible to develop some trending

results. However, such results are always open to doubt, not least because different

organizations define terms differently. For example, the Baker report (Baker 2007)

provides a list of events that fall under the term “fire”. That list includes “a fault in a

motor control centre”. It is questionable as to how many organizations would call such an

event a “fire” unless it resulted in flames and/or smoke.
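
One way to see why rare events resist trending is to assume, purely for illustration, that major incidents at a site follow a Poisson process. Both the Poisson model and the incident rates in the sketch below are assumptions that do not come from the text.

```python
import math


def prob_no_events(expected_events: float) -> float:
    """Probability of zero events when the event count is Poisson-distributed."""
    if expected_events < 0:
        raise ValueError("expected_events must be non-negative")
    return math.exp(-expected_events)


# Hypothetical underlying rates, in major incidents per year.
rate_before = 0.2  # one major incident every five years on average
rate_after = 0.1   # the underlying incident rate has genuinely halved

years = 3
p_clean_before = prob_no_events(rate_before * years)
p_clean_after = prob_no_events(rate_after * years)

print(f"P(no incidents in {years} years) at old rate: {p_clean_before:.2f}")
print(f"P(no incidents in {years} years) at new rate: {p_clean_after:.2f}")
```

Under these assumptions a three-year period with no major incidents is more likely than not at either rate, so a clean record over that period cannot show whether performance has actually improved.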

An additional difficulty is that many process safety events — particularly those that are

near misses — may simply not be recognized for what they are. For example, an operator

and a mechanic may fix a leaking pump seal, not realizing how close they were to having

a major accident.

Leading Indicators

Leading indicators are forward-looking. They provide management with an assessment of

their process safety program. Most leading indicators measure some type of activity, although

minor incidents, such as the leaking control valve discussed above, could also fall into the

leading indicator category.

The following are some examples of leading indicators.

Number of field visits and inspections;

Number of safety audits;

Number of safety communications and safety meetings;

Percentage of incidents investigated;

Number of near miss responses;

Number of positive rewards and recognition given;

Safety communications;

Claims reporting analysis;

Safety committee activities; and

Number of safe behaviors observed.

Although topics such as these provide useful guidance as to how much progress is being

made, there are serious limitations. In particular it is difficult to make a quantitative link

between these topics and the risk associated with process safety events. For example, a

manager cannot know what impact, say, doubling the number of field visits will have

compared with doubling the number of positive awards ceremonies.

KPI Limitations

Although leading and lagging indicators provide useful guidance with respect to safety

performance, they do have limitations, as discussed below.

Activity and Quality

It is important to distinguish between activity and quality when analyzing process safety

data (Hopkins 2000). For example, managers may be encouraged to close out findings

from hazards analyses and incident investigations more quickly than they did in the past.

If they do so then this leading indicator will show that the management of process safety

is improving. However, if those managers achieve the improvement by not implementing

the findings as thoroughly as before, the net effect may be a reduction in performance.

They score an ‘A’ for effort, but a ‘B’ for results.

Quality of Reporting

The effectiveness of both lagging and leading indicators depends heavily on the quality

of incident reporting. For lagging indicators this is not a major problem. As already

noted, indices such as OSHA recordables are widely understood and are consistent across

industries. However, minor incidents are often not reported for the following reasons:

Fear of looking bad;

Desire to “tough it out” or to appear manly;

Not realizing how serious the injury might be (for example a small scratch may

have allowed toxic chemicals to enter a person’s blood stream);

Desire to keep the numbers “looking good”.

With regard to leading indicators, the quality of the reported data is likely to be worse

than it is for lagging results because it relies on the reporting of unsafe conditions and

near misses — not on actual events. Hence the value of the reported results is likely to be

patchy and inconsistent. Some people are very diligent about reporting such events;

others are not. Therefore it is difficult to establish a consistent estimating system —

particularly between different companies.

A related difficulty is that leading indicators may be subject to mere formal compliance

and pencil whipping. For example, training may be identified as a KPI so the work force

receives more training in PSM-related topics. However management cannot be sure that

the additional training will actually result in changed performance on the job.

Management Elements

The reporting of leading and lagging indicators provides no guidance as to which of the

elements of process safety contributed to the system failure. In Chapter 14 — Incident

Investigation and Root Cause Analysis — it was noted that different incident

investigators can come up with widely disparate perceptions as to the root causes of

events. Similarly with leading and lagging indicators — different people will have different

perceptions as to the causes of events.

Using the example of the leaking control valve once more, the leak through the packing

could be attributed to problems in any of the following areas:

Asset Integrity. The valve was not properly maintained.

Hazards Analysis. The team failed to identify that the wrong packing material

was being used.

Training. The maintenance technicians had not been properly trained in the

installation of the packing.

API RP 754

Following the fire and explosion at BP’s refinery in Texas City, Texas, the

Chemical Safety Board (CSB) conducted an investigation. One of the recommendations from

that investigation called for the American Petroleum Institute (API) and the United Steel

Workers (USW) to work together to develop an ANSI standard for leading and lagging

indicators. The recommendation reads:

Work together to develop two new consensus American National Standards Institute

(ANSI) standards. In the first standard, create performance indicators for process safety

in the refinery and petrochemical industries. Ensure that the standard identifies leading

and lagging indicators for nationwide public reporting as well as indicators for use at

individual facilities. Include methods for the development and use of the performance

indicators.

Although the USW later withdrew from the program the API continued with the development

of a standard which became Recommended Practice 754, Process Safety Indicators for the

Refining and Petrochemical Industries. It was published in April 2010 (API 2010). RP 754 is

based on the same concepts as those found in OGP Report No. 456 (OGP 2011).

RP 754 was written for the refining and petrochemical industries but it can be used in any

hydrocarbon processing industry such as offshore oil and gas. However, the following are not

considered to be within the scope of the standard.

a) Releases from pipeline transfer operations occurring outside the process or storage

facility fence line;

b) Marine transport operations, except when the vessel is connected to the process for

the purposes of feedstock or product transfer;

c) Truck or rail operations, except when the truck or rail car is connected to the process

for the purposes of feedstock or product transfer, or if the truck or rail car is being

used for on-site storage;

d) Vacuum truck operations, except on-site truck loading or discharging operations, or

use of the vacuum truck transfer pump;

e) Routine emissions that are allowable under permit or regulation;

f) Office, shop and warehouse building events (e.g. office fires, spills, personnel injury

or illness, etc.);

g) Personal safety events (e.g. slips, trips, falls) that are not directly associated with on-

site response to a loss of primary containment (LOPC) event;

h) LOPC events from ancillary equipment not connected to the process (e.g. small

sample containers);

i) Quality assurance (QA), quality control (QC) and research and development (R&D)

laboratories (pilot plants are included);

j) Retail service stations; and

k) On-site fueling operations of mobile and stationary equipment (e.g. pick-up trucks,

diesel generators, and heavy equipment).

The performance indicators identified by RP 754 should meet the following principles:

Indicators should drive process safety performance improvement and learning;

Indicators should be relatively easy to implement and easily understood by all;

Indicators should be statistically valid at one or more of the following levels:

industry, company, and site; and

Indicators should be appropriate for industry, company or site level benchmarking.

Tiers

RP 754 suggests that process safety performance can be measured through the use of four

tiers of indicators. These tiers represent a transition from leading to lagging indicators. Tier 1

is the most lagging, Tier 4 is the most leading. They are shown in Figure 2.3.

Figure 2.3: Performance Triangle

Figure 2.3 is a performance triangle similar to that shown in Chapter 1. Events in the bottom

section occur more frequently than in the top section and generally have a lower

consequence.

It is assumed that there is a direct correlation between the tiers, i.e., that a shift in

performance at one level will have a corresponding change at the level above. However, as

discussed in Chapter 1, it is important to watch for false assumptions. For example, a newly

invigorated incident reporting program may lead to more Tier 4 incidents being recorded,

even if there has been no actual performance change.

Tiers 1 and 2 are suitable for nationwide public reporting, and thus have a tightly defined

scope. Any Tier 1 or Tier 2 Process Safety Event begins with an unplanned or uncontrolled

release of any material, including non-toxic and non-flammable materials, resulting in one or

more consequences described in the RP. These events are referred to as a Loss of Primary

Containment (LOPC), which is defined as follows.

An unplanned or uncontrolled release of any material from primary containment,

including non-toxic and non-flammable materials (e.g. steam, hot condensate, nitrogen,

compressed CO2 or compressed air).

Tiers 3 and 4 are intended for internal use at individual sites.

Performance is quantified through use of the Process Safety Event (PSE) rate, which is

calculated as follows:

PSE Rate = [Total PSE Count x 200,000] / Total Workforce Hours

Each Tier has its own PSE rate.
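
The PSE rate calculation can be sketched in Python as shown below. The event log and the workforce hours are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def pse_rate(pse_count: int, workforce_hours: float) -> float:
    """PSE Rate = [Total PSE Count x 200,000] / Total Workforce Hours."""
    if workforce_hours <= 0:
        raise ValueError("workforce_hours must be positive")
    return pse_count * 200_000 / workforce_hours


# Hypothetical one-year event log: each entry is the tier assigned to an event.
event_tiers = [1, 2, 2, 2, 2]
workforce_hours = 2_000_000  # hypothetical total workforce hours for the year

counts = Counter(event_tiers)
rates = {tier: pse_rate(counts[tier], workforce_hours) for tier in (1, 2)}

for tier, rate in rates.items():
    print(f"Tier {tier} PSE rate: {rate:.2f}")
```

Keeping a separate count per tier, as here, is what allows each tier to have its own rate.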

The tiers are defined as follows.

Tier 1 — Process Safety Event

A Tier 1 event is a loss of primary containment (LOPC) with the greatest

consequence, as defined by RP 754. These include:

An employee, contractor or subcontractor “days away from work” injury and/or

fatality; or

A hospital admission and/or fatality of a third-party; or

An officially declared community evacuation or community shelter-in-place; or

A fire or explosion resulting in greater than or equal to $25,000 of direct cost to

the Company; or

A pressure relief device discharge to the atmosphere (directly or via a

downstream destructive device such as a flare) that results in one or more of

the following consequences:

o Liquid carryover;

o Discharge to a potentially unsafe location;

o An on-site shelter in place;

o Public protective measures such as a road closure; and

o Release of materials greater than the threshold quantities.

Tier 2 — Process Safety Event

Tier 2 events are similar to Tier 1 but have a lower consequence. They include:

An employee, contractor or subcontractor recordable injury; or

A fire or explosion resulting in greater than or equal to $2,500 of direct cost to

the Company; or

Pressure relief discharges but with different threshold quantities.
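
The Tier 1 and Tier 2 criteria listed above can be expressed as a simple screening function. This is a simplified sketch, not the full RP 754 definition: the `LopcEvent` record and its field names are hypothetical, and the pressure relief and threshold quantity criteria are left out because the quantities are not given here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LopcEvent:
    """Hypothetical record of the consequences of a loss of primary containment."""
    days_away_injury_or_fatality: bool = False
    third_party_hospital_admission_or_fatality: bool = False
    community_evacuation_or_shelter: bool = False
    recordable_injury: bool = False
    fire_explosion_direct_cost: float = 0.0  # direct cost to the company, in dollars


def classify(event: LopcEvent) -> Optional[int]:
    """Return 1 or 2 for a Tier 1 or Tier 2 event, or None if neither applies."""
    if (
        event.days_away_injury_or_fatality
        or event.third_party_hospital_admission_or_fatality
        or event.community_evacuation_or_shelter
        or event.fire_explosion_direct_cost >= 25_000
    ):
        return 1
    if event.recordable_injury or event.fire_explosion_direct_cost >= 2_500:
        return 2
    # Not Tier 1 or 2 on these criteria; it may still count at Tier 3.
    return None


print(classify(LopcEvent(fire_explosion_direct_cost=30_000)))
print(classify(LopcEvent(recordable_injury=True)))
print(classify(LopcEvent()))
```

The Tier 1 checks run first so that an event meeting both sets of criteria is reported at the higher-consequence tier.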

Tier 3 — Challenge to Safety Systems

Tier 3 events typically represent challenges to the barriers that prevent near misses from

turning into actual events. They are events that stop short of Tiers 1 or 2. Examples

include:

Safe operating limits excursions;

Demands on safety systems such as pressure safety relief valves;

Primary containment inspection or testing results outside acceptable limits; and

Other Loss of Primary Containment (LOPC) events that are less than what is

required for Tier 2.

Tier 3 indicators are intended for internal use; the results will not normally be shared with

other organizations.

Tier 4 — Operating Discipline and Management System Performance

Tier 4 indicators provide measurements of operating discipline and the management

system performance. Like Tier 3 they are site-specific and will not generally be used to

compare the performance of different companies.

Examples of Tier 4 items are:

A process safety action item is closed on schedule;

Training is completed on schedule;

Safety critical equipment items are inspected; and

Emergency response drills are completed.

Data Submission

In order to encourage consistent reporting, the API has published a Guide to Reporting

Process Safety Events along with a matching spreadsheet. The Guide provides information

for the reporting of Tier 1 and Tier 2 events, a glossary defining the terms used in

RP 754, and guidance on the selection of categories (such as types of refining process)

to be used in reporting.

Selection of KPIs

The performance indicators provided above for Tiers 1 and 2 are useful for comparing

facilities with one another. However events at this level occur only rarely and do not provide

an adequate statistical basis whereby a company can improve its own performance and

implement a continuous improvement program.

Table 2 in the OGP document provides many examples of Tier 3 and 4 KPIs. They are

divided into the following categories:

Management and workforce engagement;

Hazard identification and risk assessment;

Competence of personnel;

Operational procedures;

Inspection and maintenance;

Plant design;

Safety instrumentation and alarms;

Start-ups and shutdowns;

Management of change;

Permit to work;

Contractor management;

Emergency management; and

Compliance with standards.

Further detail is provided in each category. For example, under Emergency Management, the

Tier 3 indicator is:

Number of emergency response elements that are not fully functional when activated in

a) a real emergency

b) an emergency exercise

The corresponding Tier 4 indicators are:

Number of emergency exercises on schedule and total staff time involved

% of staff who have participated in an emergency exercise

Number of emergency equipment and shutdown devices tested versus schedule
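
Tier 4 indicators of this kind are simple ratios, as the following Python sketch suggests. The staff and equipment figures are hypothetical, and expressing "tested versus schedule" as a percentage is one possible interpretation of that indicator.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Hypothetical site data for one reporting period.
staff_total = 80
staff_in_exercise = 60
devices_scheduled = 40
devices_tested = 34

pct_staff_participated = percent(staff_in_exercise, staff_total)
pct_devices_tested = percent(devices_tested, devices_scheduled)

print(f"Staff who took part in an emergency exercise: {pct_staff_participated:.0f}%")
print(f"Emergency equipment tested versus schedule: {pct_devices_tested:.0f}%")
```

As noted under Activity and Quality, such percentages measure whether the activity took place, not how well it was carried out.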