Meridium Basics Failure Event Coding


Understanding the Basics of Failure and Event Coding for EAM and CMMS

by Ralph Hanneman, Senior Consultant, Meridium




Purpose

The purpose of this paper is to discuss failure and event coding in enterprise software systems. It is aimed at asset-intensive manufacturing and industry on a broad scale and is not intended for those wanting to track failures in the products they manufacture or for those looking to track production and operational losses. It can be used as a guide in the development of failure codes or in the assessment of the effectiveness of current failure codes.

While the development of failure codes during software implementation projects can be problematic, a clearer understanding of failure coding can eliminate much of the confusion and expedite implementation efforts. For example, a global pharmaceutical company undergoing an SAP implementation was able to develop its entire set of failure codes in under two weeks.

Background

In today’s world, the machines and systems used in manufacturing and industry are complex, comprising thousands of equipment assets at any one facility. Understanding why equipment failures occur is necessary to the development of strategies to improve profit and reduce risk.

With limited exceptions, most corporations today have standardized on a select few CMMS (computerized maintenance management system) or EAM (enterprise asset management) systems. Predominant systems include SAP, Oracle’s eAM, IBM's Maximo and Ventyx (formerly Indus) Passport. In terms of maintaining equipment assets at a facility, these systems are used to request, plan, approve, schedule and execute work. There is usually some form of integration with purchasing, spare parts, contracted services, resources, permits, etc.

Equipment failure information - collected at various points in the repair process and documented by those who use the system (equipment operators, maintenance supervisors, planners, technicians, etc.) - is a key component in corporate reliability efforts. However, these reliability efforts are often hampered by incomplete or inaccurate failure information. The reasons for this are twofold: technically, there need to be effective code sets in the system forming the foundation for reporting failure events; organizationally, the system users need to fully understand the benefits of the coding process and how the usage of code sets will ultimately profit everyone.

This white paper will address the technical aspects related to effective code sets.

Work Process

To begin with, let’s note the basic phases in the table below to correct an equipment failure. These steps apply regardless of the system which may be in use.

It is obvious that there are a number of hand-offs between the groups involved in this process and that specifics may be overlooked due to the overall timeline. This, coupled with the fact that “accurate” reporting of a failure event is not inherently a top priority for anyone other than the reliability engineers, often leads to difficulties in capturing critical failure data. Clearly, the development of a failure coding system which is well-organized, user-friendly and clear





will encourage broad system adoption, resulting in the capture of improved failure-related data.

Organized System of Data

When analyzing the failures of equipment, it is important to know not only “what” took place, but also “where” and “when” it happened. Using system date-time stamps, it is easy to determine the “when” associated with an event. However, given the size and scope of modern manufacturing enterprises, it is of equal importance to have a system for organizing data around the location (where) of the asset as well as the events (what) that occurred. The term for this organized system of location, equipment and event data is taxonomy. In terms of equipment reliability, taxonomy is commonly used to refer to the organization of equipment into a hierarchy and the relationships of equipment to various categories as well as specific characteristics for the equipment asset. These are all useful when sorting and grouping work history records.

An asset is something which has a value to the corporation. While physical equipment is definitely considered an asset, in some corporations the locations which organize the equipment into systems are also considered assets. This may be due to operating criteria, financial rules or other considerations. Because of this, in terms of asset performance management, either equipment or locations may be described as assets.

A hierarchy (shown in the diagram on page 4) is the organization of data into a structure which represents both the summary and the arrangement. Often equipment is organized by locations in the hierarchy. These locations form the higher levels in the hierarchy, while information about the equipment, including the components, forms the lower levels. In the EAM / CMMS system, the equipment record represents the physical asset which needs to be operated, maintained or repaired. The location record is used to represent the address within the facility. This address can take several forms - it might be a spatial position in the facility, or an organizational unit within a specific department, but in most cases it represents a position or location in the process (with top-most levels representing the plant locations in the organization).

The arrangement aspect of the hierarchy shows where the assets are located. The summary aspect of the hierarchy provides a structured and consistent means of reporting by different levels within the hierarchy. In the hierarchy, the costs and other key figures for each level represent the aggregated values for all the subordinate (child) levels down to the lowest (leaf) level. For example, a report of work order costs for a system would show the costs for all the work performed on the equipment which belongs to the specific system. Generally, it is best to have the equipment positioned in the lowest level in the hierarchy. This is where work actually takes place. Additional information about the performance requirements for the equipment’s installation point in the process should be maintained as characteristics on the location record; this is often called the location specification.

Assuming that this hierarchy is process based, a group of assets belonging to the same system would all have the same system location record as their parent record in the EAM / CMMS system. Similarly, all the systems which belong to the same area would have the same area location record as their parent record in the EAM / CMMS system. The hierarchy can easily be used to represent other structures besides the standard process model. In transmission and distribution systems, the hierarchy can be structured on a circuit model. In mining operations, the hierarchy can be based on areas having mobile assets using the fleet concept. The model can be easily developed to accurately represent the actual business process involved in the company.
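The summary aspect described above, in which each level's figures aggregate everything beneath it, can be illustrated with a minimal Python sketch. The record IDs, the parent map and the cost values are hypothetical and are not taken from any particular EAM / CMMS; the sketch only assumes that every record names its parent record.

```python
from collections import defaultdict

# Hypothetical records: each entry names its parent (None for the top level).
parent_of = {
    "SITE": None,
    "AREA-1": "SITE",
    "SYSTEM-1A": "AREA-1",
    "SYSTEM-1B": "AREA-1",
    "PUMP-101": "SYSTEM-1A",
    "MOTOR-101": "SYSTEM-1A",
    "VALVE-201": "SYSTEM-1B",
}

# Hypothetical work order costs, recorded at the equipment (leaf) level.
direct_costs = {"PUMP-101": 1200.0, "MOTOR-101": 800.0, "VALVE-201": 500.0}


def roll_up(parent_of, direct_costs):
    """Add each cost to the record itself and to every ancestor above it."""
    totals = defaultdict(float)
    for node, cost in direct_costs.items():
        current = node
        while current is not None:
            totals[current] += cost
            current = parent_of[current]
    return dict(totals)


totals = roll_up(parent_of, direct_costs)
print(totals["SYSTEM-1A"], totals["AREA-1"], totals["SITE"])
```

Because costs are written at the leaf level and summed upward, a report at the system, area or site level needs no extra bookkeeping. A cost written to the wrong record would still appear in the site total but would be missing from the correct system's total.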

Note: In SAP, the location record is termed functional location and the equipment asset record is termed equipment. Collectively, they are described as technical objects. The term technical object is also used to describe other objects like bills of materials (BOMs), measuring points, etc.





Factors Affecting System Hierarchy

Adjunct components are a common area of confusion. These are components which are separate but form a part of something else. They usually do not rise to the level of unique equipment and may be found in the margins of the equipment systems. For example, let’s take a coupling which is used to connect a motor to a pump. Opinions may vary on whether the coupling belongs with the motor, the pump or something else. To ensure consistency, equipment boundary definitions are used to make sure that everyone has the same understanding about this. Companies may adopt different approaches to developing and documenting equipment boundary definitions. When well-documented and understood by operations, maintenance and engineering, these provide a means of communicating the specifics surrounding the failure event.

When it comes to selecting an EAM / CMMS, companies consider multiple factors. Make sure to understand any considerations inherent in the EAM / CMMS system which could affect the development of the hierarchy.

Meridium knows first-hand that one size doesn't fit all. Company cultures, processes and equipment systems vary. Over the years, Meridium has consulted with clients on their taxonomies and has the experience to assist corporations in developing a taxonomy which includes detailed failure codes.

It is important to consider what data you will get out of the system based on its design. Generally, these structures have been designed to support reporting of key figures or measures such as costs, hours of downtime, etc. according to different views or dimensions of the data. Apart from an array of financial data and other reporting variations, reporting information falls into two basic categories - operational and managerial.

Operational (transactional) reports are used to perform work. Common examples are “work order backlog” or “daily stores receipts.” Managerial reports are used to manage and improve performance. Common managerial report examples are “work order costs by department,” “work orders - planned and actual costs,” or “downtime by equipment type.” Meridium normally consults with clients to develop their KPIs (key performance indicators) during the software implementation process.

Failure Event Codes: Bridge to Effective Reliability Analysis

For reliability practitioners, failure event codes from work history records are the bridge to effective analysis. Work history is the term used in Meridium to describe the completed work requests and work orders. It is the history of work which is useful in reliability analytics. This is event data, not location data. Work history event data is what was actually documented during the process of maintaining or repairing an equipment asset. Since the repair process may span a period of time and involve a number of different parties, the timely entry of data into the system is crucial.

Use the system to document the minimum necessary information about the events as they occur. If it wasn't documented, then it didn't “happen,” at least in terms of data which may be used in failure analysis at some point in the future. A failure event needs to be recorded at the correct equipment, but sometimes this doesn’t occur. Sadly, some system users find it easier to record the failure at a convenient place in the system rather than the correct place. The actual failure data can be easily missed if the work history is written anywhere in the hierarchy other than at the equipment asset which malfunctioned.

Typical location/equipment asset hierarchy



The way that people document an event can be highly variable. Even the basic description of the malfunction can vary considerably from one author to another. Factors which may influence a simple narrative description include available time, motivation, attention to detail, and understanding of the system and component interaction. It is not effective to filter, sort or group work history event records solely by the use of narrative descriptions. This is why failure event codes are very important. The use of codes in the system ensures a consistent way of documenting the key aspects of the event according to pre-defined categories. These codes are used in mining data in the system, which in turn makes the subsequent analysis possible. It may be useful to supplement a specific code entry with brief comments or other text. This is possible with some systems and is helpful in more detailed analysis of the failure events.

Reliability practitioners need quality data as a starting point for their analysis efforts. Although it is technically possible for someone to re-code a failure event after the fact, this isn't the best solution. This is because the work order usually has to be re-opened in order to make these sorts of changes in the EAM / CMMS, which usually affects the quality and accuracy of the information. Plus, this re-work isn't efficient. Ideally, it is best to have the data entered correctly closer to the event by those who investigated and corrected the failure.

A more complete picture of the hierarchy showing additional details about the locations and equipment as well as event data may be found at right.

Failure Codes and Standards

There have been several efforts over the years to create taxonomies and equipment failure codes. One that has been adopted internationally is ISO 14224: Petroleum, petrochemical and natural gas industries - collection and exchange of reliability and maintenance data for equipment.1 This international standard is also published by the American Petroleum Institute as API 689. ISO 14224 has its roots in OREDA©, an organization sponsored by nine oil and gas companies in an effort to collect and exchange reliability data among its participants.2 The standard offers a good reference methodology for the development of taxonomy, equipment boundaries and failure event codes. It also contains guides to interpret and calculate reliability and maintenance data, as well as key performance indicators.

Another initiative which has undergone significant development is the ExxonMobil Enterprise Equipment Taxonomy.3 This taxonomy documents a structured classification of all significant equipment, equipment components and maintenance activities that may be found in a given petrochemical facility. It was developed by EMRE, the research and engineering arm of ExxonMobil (XOM), and is licensed by Meridium. EMRE developed the taxonomy to be a standardized method for classifying, measuring and tracking equipment specifications and performance across ExxonMobil operations worldwide. Through use of the taxonomy, ExxonMobil has simplified its own internal data collection while reducing maintenance costs.

Taxonomy Overview of Location, Equipment and Event Data





In the early 1990s, another initiative resulted in ISO 15926: Industrial automation systems and integration - integration of life-cycle data for process plants including oil and gas production facilities.4 The standard specifies a life cycle view of the information requirements of process industries. A part of this standard is the Reference Data Library, which holds technical class descriptions of main equipment items.5, 6

A more recent effort addressing asset management best practices may be found in PAS 55 - Asset management.7 This publicly available specification is published by the British Standards Institution and contains relevant guidance for utilities and transport organizations.

Results Driven Failure Coding

It’s apparent that there are a number of standards which may be used in the development of failure codes for industry and manufacturing. At the same time, there may be any number of consultants offering failure code packages. In terms of detail, code sets found today may range from simple to complex.

Some purists would argue that successful failure coding requires the initial development of the most comprehensive and complete set of codes possible. However, developing the necessary taxonomy and failure coding to support the depth and breadth of this strategy requires significant time and resources, which can often be counterproductive to building support for the effort.

Also, it is easy during EAM / CMMS system implementations to fall into the trap of building unnecessarily extensive code libraries. This happens for a number of reasons:

• These projects demand considerable attention to detail

• There is a risk-averse culture which accompanies these large-scale projects

• The functional stakeholders assigned perceive that post-implementation changes will be difficult

On the other hand, I would argue for a more practical approach, one which is driven by results. While it is important to develop a model which is scalable across the enterprise, it is also important to match the effort to the desired results.

Consider that the chief aim of failure event coding is to have data in a computerized maintenance management system which facilitates finding the assets which are experiencing the highest failure rates - the bad actors - so that strategies can be developed to mitigate future failures, resulting in improved safety, cost, environmental responsibility and productivity.
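As a rough illustration of how coded work history supports finding bad actors, the following Python sketch counts failure events per equipment asset and ranks the assets. The equipment IDs and the `is_failure` flag are hypothetical; the sketch assumes each work history record already indicates whether it was a failure event. It ranks by raw failure count over the same period, which is a simplification of a true failure rate.

```python
from collections import Counter

# Hypothetical work history records: (equipment ID, is_failure flag).
work_history = [
    ("P-101", True), ("P-101", True), ("P-101", False),
    ("V-201", True), ("P-102", True), ("P-101", True), ("V-201", True),
]


def bad_actors(history, top_n=3):
    """Rank equipment by number of failure events, highest first."""
    failures = Counter(eq for eq, is_failure in history if is_failure)
    return failures.most_common(top_n)


ranking = bad_actors(work_history, top_n=2)
print(ranking)
```

Non-failure records (such as the third P-101 entry) are excluded from the count, which is why a reliable failure indicator on each record matters for this kind of ranking.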

Given these key business drivers, it is not crucial that an organization have the absolute best set of codes to begin with. It is more important to have a good set of codes which will be used across the organization while at the same time allowing the user community to suggest codes which should be added.

Over the course of time, the code sets will be refined as part of the continuous improvement cycle. This interaction of the user community with the reliability practitioners will foster a naturally occurring development process that will result in greater participation, understanding and adoption.

Methodology and Failure Coding Essentials

Meridium has found the methodology outlined in ISO 14224 to be an effective means by which to anchor stakeholders during code development. However, since it's often not possible for a company to simply adopt ISO 14224, Meridium assists companies with failure coding development. Not only can this speed the development process, but it also helps to ensure that the failure code data integrated into Meridium from the EAM / CMMS system is a valuable input for reliability analytics.

First, let's review some essentials of the asset hierarchy. This is the organization of the location records and equipment asset records in the EAM / CMMS.

There needs to be an agreed upon structure for the





hierarchy. This structure supports the locations and equipment (technical objects), failure data and maintenance data. At the very least, the structure includes the location records and the equipment records. There may be at least four location levels for the technical objects, for example, site - area - unit - location. Some companies have more, and ISO 14224 defines five levels for the use / location hierarchy structure.

In any software system, a data record usually has an identification (ID) field and a description field. This is true for location and equipment asset records. When it comes to the location record, there is usually some sort of naming convention or indicator for the structure of the ID field. In the design of this structure, it is best to segment each level by some kind of delimiter, usually a “-” (dash), although “_” (underscores) and “.” (periods) have been used.

The location record may also have another field on the data record to represent the parent or superior location. This is used in some systems (like SAP) to visually represent the hierarchy tree.

The structure used should make it easy to understand the arrangement of any object in the hierarchy. Typically this is done by consistently including the data for the parent level in any child level. This means that all records representing an area should have the area segment preceded by the value representing the site in the ID field, and so forth down to the location. For example:

SITE01-AREA01-UNIT01-LOC01

SITE01-AREA01-UNIT01-LOC02
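Because each ID carries its parent levels, the parent of any record can be derived from the ID alone. The following Python sketch assumes the dash-delimited convention shown in the two examples above; the function names are illustrative and are not part of any EAM / CMMS.

```python
def parent_location(location_id, delimiter="-"):
    """Return the parent ID, or None for a top-level (site) record."""
    head, sep, _ = location_id.rpartition(delimiter)
    return head if sep else None


def ancestors(location_id, delimiter="-"):
    """List ancestor IDs from the immediate parent up to the site."""
    result = []
    current = parent_location(location_id, delimiter)
    while current is not None:
        result.append(current)
        current = parent_location(current, delimiter)
    return result


print(parent_location("SITE01-AREA01-UNIT01-LOC01"))
print(ancestors("SITE01-AREA01-UNIT01-LOC02"))
```

This only works if the delimiter never appears inside a segment, which is one practical reason to pick a single delimiter and apply it consistently.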

Using the location reference on the work orders helps later on when analyzing data. Often wildcard characters (e.g. * or %) can be used when querying the work order records in a system. By means of the structure shown above, it’s relatively easy to find all the work order costs for Area01 using an entry like “SITE01-AREA01*.”
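A wildcard query of this kind can be sketched in Python using the standard library's `fnmatch` module, which supports the `*` wildcard. The work order records and their costs are hypothetical, and a real EAM / CMMS would run the equivalent query in its own reporting tools.

```python
from fnmatch import fnmatchcase

# Hypothetical work orders, each written against a location ID.
work_orders = [
    {"location": "SITE01-AREA01-UNIT01-LOC01", "cost": 1200.0},
    {"location": "SITE01-AREA01-UNIT02-LOC01", "cost": 500.0},
    {"location": "SITE01-AREA02-UNIT01-LOC01", "cost": 300.0},
]


def total_cost(orders, pattern):
    """Sum the cost of all work orders whose location matches the pattern."""
    return sum(o["cost"] for o in orders if fnmatchcase(o["location"], pattern))


area01_cost = total_cost(work_orders, "SITE01-AREA01*")
print(area01_cost)
```

Note that a pattern such as “SITE01-AREA01*” would also match a hypothetical “SITE01-AREA010”. Fixed-width segments, or a pattern that includes the delimiter (“SITE01-AREA01-*”), avoid that ambiguity.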

Recall that the location represents an address in the process, while the equipment represents the physical object which has to be repaired or maintained. Locations often have process-specific design considerations, usually found in the P&ID (process and instrumentation drawing). So, the location record is a good place to maintain key data related to the process, like system pressures, static head, etc.

On the other hand, the equipment may move from one place to another in the process. A pump, instrument or valve which fails may be replaced with another unit while the failed unit is repaired for later use elsewhere in the process. The characteristics of these unique equipment assets should be maintained on the specific equipment record. Since the equipment asset may be moved around in some fashion during its lifetime in the plant, it is not necessary to have a structure for the equipment record ID field. Most systems have an internal number sequencer which auto-numbers equipment records. Although external numbering assignments are often possible, they aren't really necessary since all that is needed is a unique identifier in the ID field.

There’s a separate hierarchy for equipment assets which is based on their function. Typically there are one or more fields on the equipment record which may be used to represent this hierarchy structure.

This structure normally begins with the equipment category, such as fixed, rotating, etc. Next is the equipment asset class, such as heat exchanger, pump, valve, etc. Then it is down to the specific equipment asset type (see examples below).

This categorization of equipment assets, down to the equipment type, is useful when grouping work history records to understand where the failures are occurring and in support of other initiatives within the company.

Detection Phase: Writing the Work Request/Work Order

When failures occur, they are usually first recognized at the equipment level by operating or production units. The initial failure event is observed and reported during the detection phase.

When did the failure occur?

When a malfunction takes place, if the work request is entered promptly into the system, then accurate time periods are automatically recorded. The same is true of the time that it takes to restore the equipment from the malfunction. These time periods are needed later on to determine the MTBF (mean time between failures) and MTTR (mean time to repair). In SAP, the system proposes the malfunction start and end dates / times of the notification (work request) based on the creation and completion dates / times. These proposed values can be updated by the user as needed.
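To show why accurate malfunction start and end times matter, the following Python sketch computes MTTR and MTBF for a single asset. The timestamps and the observation period are hypothetical. The sketch follows one common convention, MTTR as average downtime per failure and MTBF as operating time in the period divided by the number of failures; organizations and standards define these measures in slightly different ways.

```python
from datetime import datetime, timedelta

# Hypothetical malfunction (start, end) times for one equipment asset.
malfunctions = [
    (datetime(2019, 1, 10, 8, 0), datetime(2019, 1, 10, 12, 0)),  # 4 h down
    (datetime(2019, 2, 20, 6, 0), datetime(2019, 2, 20, 14, 0)),  # 8 h down
    (datetime(2019, 3, 15, 9, 0), datetime(2019, 3, 15, 15, 0)),  # 6 h down
]
period_start = datetime(2019, 1, 1)
period_end = datetime(2019, 4, 1)


def total_downtime(events):
    """Sum the malfunction durations."""
    return sum((end - start for start, end in events), timedelta())


def mttr_hours(events):
    """Mean time to repair: average downtime per failure (needs >= 1 event)."""
    return total_downtime(events).total_seconds() / 3600 / len(events)


def mtbf_hours(events, period_start, period_end):
    """Mean time between failures: uptime in the period per failure."""
    uptime = (period_end - period_start) - total_downtime(events)
    return uptime.total_seconds() / 3600 / len(events)


print(mttr_hours(malfunctions))
print(mtbf_hours(malfunctions, period_start, period_end))
```

If a work request is entered a day late and the proposed start time is not corrected, both figures shift, which is the practical reason for prompt entry.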

Where did the failure occur?

The location of the failure is captured by entering the work request in the system using the correct technical object (location or equipment record). Often system users find it easier to simply write the work request to some location that is easy to remember, rather than searching for the correct place in the system. It is better to use the lowest appropriate level in the hierarchy, such as the record which includes the actual maintainable item, rather than the higher level location or subsystem. Also, be sure to indicate in some fashion that a failure exists. This is done either by using a specific work request / order type, setting an indicator on the work request / order or entering additional failure information on the work request / order. Having this information in the system makes it possible to distinguish component replacements which were due to failure-related events from those which were non-failure replacements.

Reliability analysis can be adversely affected by treating a data point as a failure event when it should have been treated as something else, such as a component replacement without failure (for other reasons).

It may be best to use a specific work request type or order type to report equipment failures or malfunctions. This will facilitate future analysis of the work history data; otherwise, it will be more difficult to understand the costs and labor needed to make repairs. These work requests / orders can be segregated from other activities such as preventive maintenance (PM), predictive maintenance (PdM), routine work, improvements, etc. In some systems, there is a unique indicator to show that a failure exists. For example, in SAP, the work request (notification) has a “breakdown indicator.” In fact, the notification in SAP is used to store all the technical history for the repair, while the maintenance work order is the controlling document used to plan, estimate and execute the work.

How was the failure discovered?

The failure event was discovered in some fashion; this is the method of detection. The failure may have been discovered because something affected the production of the manufactured product, or it may have been observed during normal rounds or routine tests, or discovered in some other way, like chance observation. This information is crucial to determine if the existing strategies are effective or if new strategies may be needed.

What is the symptom?

This first level of failure is called the failure mode in ISO 14224. It is the visible symptom at the equipment asset level. When this takes place, a work request or work order is usually entered in the EAM / CMMS.

Similar to a parent taking their child to the clinic, all that is known at this time is the symptom. No detailed reporting or analysis is to occur at this point. For example, an operator may report that the pump failed to start. This is done by describing the problem and selecting the correct code for the failure mode in the system. Since the symptoms are somewhat generic, the same list of codes may typically be used for all equipment types.

What was the effect of the failure?

Usually, the person reporting the failure has some idea of the effect of the failure on the organization. There may have been no effect, or the failure may have affected the environment, safety or production. This information is not only useful in prioritizing any subsequent mitigating actions but is also an important aid in compliance reporting to third parties like environmental or safety regulatory agencies.

If possible, indicate the degree of failure or functional loss. The equipment experienced a malfunction, but to what extent did the equipment malfunction? Was it a “complete” failure, such as when the pump fails to start? Was it a “partial” failure, such as when the pump cannot maintain the desired flow rate? Was it a “potential” or “latent” failure? While these may not seem like actual failures because they result from event conditions which do not trigger an active fault, it is likely that they will do so at a future point, as in the example of corrosion in a piping system. Categorizing these events correctly supports developing better mitigation strategies moving forward.

Keep the guidelines simple, yet effective.

When something malfunctions:




• Write the work request / order to the equipment asset

• Indicate that a failure exists

• Select the correct code for the failure mode

• Note who found the failure and how the failure was discovered

• Describe the problem

• Optionally, indicate the operational effect and the degree of failure

Many of the EAM / CMMS systems can make entering data a mandatory requirement in specific fields. This is good when limited to the minimum requirements, but can create problems if there are too many required fields. Do not require the system users to enter additional failure information, like maintainable item or failure mechanism, when creating the work request / order. At this point, they only know the observable symptom; requiring more frustrates them and results in bogus entries into the system just to get past the required field entry. This additional information should be entered later by those making the repairs. It will be more accurate and meaningful during later analyses.
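The split between what is required at creation and what is completed later can be sketched as a simple record with two separate checks. This is a hypothetical Python data structure, not the schema of SAP or any other EAM / CMMS; the field names are assumptions chosen to mirror the guidelines listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WorkRequest:
    # Required when the request is created (detection phase).
    equipment_id: str
    is_failure: bool
    failure_mode: str
    detection_method: str
    description: str
    # Optional at creation.
    operational_effect: Optional[str] = None
    degree_of_failure: Optional[str] = None
    # Entered later by those making the repair (correction phase).
    maintainable_item: Optional[str] = None
    failure_mechanism: Optional[str] = None


def missing_at_creation(req):
    """Return required detection-phase fields that are still empty."""
    required = ("equipment_id", "failure_mode", "detection_method", "description")
    return [name for name in required if not getattr(req, name)]


def missing_at_completion(req):
    """Return correction-phase fields still empty on a failure request."""
    if not req.is_failure:
        return []
    later = ("maintainable_item", "failure_mechanism")
    return [name for name in later if not getattr(req, name)]


req = WorkRequest(
    equipment_id="P-101",
    is_failure=True,
    failure_mode="Fail to start",
    detection_method="Operator rounds",
    description="Pump did not start",
)
print(missing_at_creation(req), missing_at_completion(req))
```

Keeping the two checks separate means the person reporting the symptom is never blocked by fields that only the repairing technician can fill in accurately.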

Correction Phase: Doing the Work and Updating the Work Request/Work Order

During the investigation / correction phase, the maintenance technician will usually find that some component failed which resulted in the malfunction of the equipment asset. This component is also known as a maintainable item, or object part (in SAP). It is important to distinguish components that were directly replaced because of the failure event from those components that were replaced for other reasons. Sometimes other parts are replaced during the repair process, for example when the failed component caused a secondary component failure or when there was an opportunistic component replacement. If this additional information is not recorded, then there is no difference between replacement event data and failure event data in the system. Where this occurs, it can certainly distort an analysis and requires extra effort to scrub the data in order to make it useful.

If components are replaced as a result of inspections or other forms of predictive maintenance indicating a potential, latent or incipient failure event, then this should be documented accordingly, because it is still a failure.

Typically, someone in the maintenance department enters the information required in this phase; it could be the technician, the maintenance planner or supervisor. This information can be collected at various points in the process, but should be reviewed for accuracy just prior to completing the work request / order.

While the codes which may be needed for the failure modes are somewhat generic or abstract, the codes needed for the maintainable item need to be more specific and tailored to the equipment type. There needs to be a good balance to these code sets. Keep in mind the need for just enough detail. It is easy to go overboard and build out the complete bill of material for an equipment asset as part of the maintainable item codes. The effort should be carefully considered before doing so - in many cases it is not necessary. Bear in mind that for the purpose of failure event reporting, the maintainable items which make up equipment asset types are codes which describe components found in the equipment. So, “bearing” is a code value which would apply to all the bearings in the equipment. This may be further divided by bearing type, such as ball bearing, roller bearing, roller thrust bearing, etc. Keep the results in mind. If subsequent analysis indicates that bearing failures are prevalent among the facility's asset population, then perhaps a review of the predictive / preventive maintenance strategies involved is in order. At the same time, the failure history records should be well kept at this maintainable item level, because it is the failed components which are driving the equipment asset malfunctions.

Reliability practitioners usually have a good perspective on the detail needed for these code sets. Sometimes the code sets are developed in-house by plant maintenance and engineering resources; not wanting to overlook anything, they may develop detailed code lists which are more itemized than necessary. In order to get results from the work history, the selection list of codes needs to be manageable. Too many codes in the selection list are confusing - it will not be easy for the system users to make the correct selections. In this situation, the easiest thing for the user would be to select an item at the top of the list and move on, negatively impacting the quality of the data in the system. From a practical standpoint, consider the quantity of codes which are visible on the screen. In SAP, for example, 21 items are visible on the screen; more than that results in having to scroll down to see additional items.
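A quick check of list length can be run against a draft code library before it is loaded. The sketch below assumes the draft is held as a mapping from list name to code descriptions; the list names and contents are invented for illustration, and the limit of 21 is simply the SAP figure quoted above, so it would need adjusting for other systems or screen layouts.

```python
VISIBLE_ROWS = 21  # figure quoted above for SAP; an assumption for any other system

# Hypothetical draft code lists keyed by list name.
draft_lists = {
    "pump maintainable items": [f"item {n}" for n in range(1, 31)],
    "activity": ["replaced", "adjusted", "modified"],
}


def oversized_lists(code_lists, limit=VISIBLE_ROWS):
    """Return the lists whose length exceeds what fits on one screen."""
    return {
        name: len(codes)
        for name, codes in code_lists.items()
        if len(codes) > limit
    }


too_long = oversized_lists(draft_lists)
print(too_long)
```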

Something occurred which resulted in the failure of the maintainable item. For example, it may have been corrosion, fatigue or leakage. The manner in which the component failed is described as the failure mechanism. These codes are more broadly categorized and may be developed at the equipment asset category level. There may be a few notations within the higher level categories of mechanical, electrical, instrument and so forth. ISO 14224 provides an excellent reference in this regard.

Often confusion or other problems arise around the cause of the failure. The actual cause may not be fully understood. Also, coding the cause in the system may assign blame to others in operations, maintenance or management. But understanding the failure cause is important to determine not only the necessary actions but also the extent of those actions. If the equipment is routinely being operated well outside of its intended operating parameters, it may be necessary to consider many factors which may influence this. The codes used to document the cause are those which best categorize the underlying or root cause of the failure. The specific cause may not be well known at the time and in some cases it may take a formal root cause analysis, using methodologies like PROACT® for Meridium, to fully investigate and document the root cause. Again, the notations found in ISO 14224 are an excellent aid to developing these codes.

When restoring the equipment asset to its normal function, a specific activity is performed. Activity codes are somewhat generic in nature, and include things like replaced, adjusted, modified, etc. Reliability practitioners use this in determining the nature of mitigating activities for the equipment asset. If a particular controlling mechanism is going out of tolerance over a specific time span (perhaps due to its operating environment) then periodic inspections and calibrations may be warranted.

Tasks should be mentioned in passing and may be documented in the system on the work order, but these are activities which may be performed in the future, and are usually not needed for failure analysis. Often the most uncomplicated approach is to create another work request in the system for the future activity. This new work request will be managed separately, whereas a closed work request / work order has no further visibility.

During the correction phase:

• Update the work request / order during key steps in the process

• Select the correct codes for:

    • Maintainable item

    • Failure mechanism

    • Failure cause

    • Activity
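The four codes selected during the correction phase can be pictured as one small record that is checked before the work request / order is completed. The following Python sketch is only an illustration: the class name, field names and the allowed code values are assumptions made for the example, not fields from any specific EAM / CMMS.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical code sets; a real library would be far larger.
ALLOWED = {
    "maintainable_item": {"bearing", "seal", "coupling"},
    "failure_mechanism": {"corrosion", "fatigue", "leakage"},
    "failure_cause": {"operated outside design limits", "installation error", "unknown"},
    "activity": {"replaced", "adjusted", "modified"},
}


@dataclass
class CorrectionCodes:
    """Codes captured while the failure is being corrected."""
    maintainable_item: Optional[str] = None
    failure_mechanism: Optional[str] = None
    failure_cause: Optional[str] = None
    activity: Optional[str] = None


def coding_problems(codes):
    """List missing or unrecognized codes; an empty list means ready to close."""
    problems = []
    for field_name, allowed in ALLOWED.items():
        value = getattr(codes, field_name)
        if value is None:
            problems.append(f"{field_name}: missing")
        elif value not in allowed:
            problems.append(f"{field_name}: unknown code {value!r}")
    return problems


complete = CorrectionCodes("bearing", "fatigue", "unknown", "replaced")
partial = CorrectionCodes(maintainable_item="bearing", failure_mechanism="rust")
print(coding_problems(complete))
print(coding_problems(partial))
```

A review step like this fits the advice to check the codes for accuracy just before the work request / order is completed.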

Data automatically recorded in the system

As mentioned earlier on the topic of failure date, some information is automatically captured in the EAM / CMMS system. This includes dates and times for specific processing stages of the work request / work order. There may be many types of date time stamps in the system. Basically, the document creation date and completion date may be used to determine the overall timeframe that the asset was unavailable. Sometimes there are data fields to specifically document the malfunction start and finish dates and times, as in the case of SAP.

It is also important to understand the costs of the maintenance or repairs. This is another reason why it is important to create the work request / work order using the correct technical object. Some find it expedient on occasion to circumvent established approval processes by using multiple work orders to make a single repair, or to make a modification to existing equipment over a period of time. This should be guarded against because it distorts the true occurrences of the failure events.

Production costs are useful in making decisions about the priority of strategies and to understanding the potential improvement benefits. While these costs may not be directly available in the EAM / CMMS, having a way to relate these costs to the failure events is invaluable input for reliability practitioners.

Event Data for Reliability Analysis - Summary Requirements

Although somewhat subjective, the example on page 10 summarizes the recommended data needed for reliability analysis and where the data may be captured.

[Example: Recommended Event Data for Reliability Analysis]

Event Data for Reliability Analysis - EAM/CMMS Comparisons

Terms used in describing event data for reliability analysis may vary between EAM / CMMS systems. This section will map the necessary event data for reliability analysis to corresponding fields in several EAM / CMMS systems and show the data entry method.

Many EAM / CMMS systems are designed to support both configuration and customization. Configuration is the structure provided by the software provider which enables the company to tailor the software solution to their specific business processes. Configuration is integrated throughout the various modules of the software solution. It can include the fields that will be displayed on a specific screen as well as the data in the pick or selection list for those fields. Customization goes beyond configuration and extends the software solution using specific additional programming. Some of the data needed for reliability analysis are easily provided using configuration, while other data may require customization.

Coding structure differences

Coding structure may vary between different EAM / CMMS systems. The solutions may range from a static list of values to a dynamic interrelated selection list. It's important to understand the technical structure used before developing the set of failure codes down to the equipment asset type.

SAP

SAP uses catalogs, code groups and codes to develop the items which may be used on the notification when coding the technical history of work. Specific collections of these are assigned to catalog profiles. Technical objects (functional location or equipment records) may have a single catalog profile assigned to them. When a notification is raised in the system, it is referenced to a technical object. It then uses the codes assigned to the catalog profile from the technical object. Options are available when the desired data for reliability does not map over directly to a corresponding field in SAP notifications.
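The relationship between catalogs, code groups, codes, catalog profiles and technical objects can be pictured with plain data structures. The Python sketch below is a conceptual model only: it does not use any SAP table, transaction or API, and every catalog name, code group, profile and technical object ID in it is made up for the example.

```python
# Catalog -> code group -> codes (all values are invented examples).
catalogs = {
    "object part": {
        "PUMP-PARTS": ["bearing", "seal", "impeller"],
        "MOTOR-PARTS": ["bearing", "winding"],
    },
    "damage": {
        "MECH-DAMAGE": ["corrosion", "fatigue", "leakage"],
    },
}

# A catalog profile picks the code groups that apply, per catalog.
catalog_profiles = {
    "CENTRIFUGAL-PUMP": {
        "object part": ["PUMP-PARTS"],
        "damage": ["MECH-DAMAGE"],
    },
}

# Each technical object carries a single catalog profile.
technical_objects = {"PUMP-101": "CENTRIFUGAL-PUMP"}


def codes_for_notification(technical_object, catalog):
    """Return the codes offered on a notification raised against the object."""
    profile = catalog_profiles[technical_objects[technical_object]]
    codes = []
    for group in profile.get(catalog, []):
        codes.extend(catalogs[catalog][group])
    return codes


print(codes_for_notification("PUMP-101", "object part"))
print(codes_for_notification("PUMP-101", "damage"))
```

Because the profile sits on the technical object, the person raising the notification sees only the codes relevant to that equipment type, which keeps the selection list short.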

Indus PassPort

Indus PassPort (also known as Asset Suite) supports the capture of failure codes at the work order task level. Any number of failure modes and root causes can be captured by task, plus reason category and reason code. All of these codes are user-defined and the failure mode codes are qualified by equipment type while the root cause codes are qualified by failure mode. In addition, a Trouble/Breakdown flag is set to indicate the task which represents the failure.




IBM Maximo

Maximo is designed to support a series of cascading pick lists where the values at the current level are determined by the selection at the higher level. After the user selects the code for problem, the system presents the set of values for cause. The selection used for cause determines the set of values for remedy.
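A cascading pick list of this kind can be modeled as a nested mapping in which each selection narrows the next list. The Python sketch below illustrates the idea only; it is not Maximo's data model or API, and the problem, cause and remedy values are hypothetical.

```python
# problem -> cause -> remedies (all values are invented examples).
failure_hierarchy = {
    "will not start": {
        "power supply fault": ["reset breaker", "replace fuse"],
        "seized bearing": ["replace bearing"],
    },
    "excessive vibration": {
        "misalignment": ["realign coupling"],
    },
}


def cause_options(problem):
    """Causes offered once a problem has been selected."""
    return sorted(failure_hierarchy[problem])


def remedy_options(problem, cause):
    """Remedies offered once a problem and a cause have been selected."""
    return list(failure_hierarchy[problem][cause])


print(cause_options("will not start"))
print(remedy_options("will not start", "seized bearing"))
```

A structure like this keeps each list short, since the user only ever sees the values that are valid for the selection already made at the level above.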

Oracle eAM

Oracle eAM uses an asset group hierarchy which includes a predefined item template to create the navigational category of assets in conjunction with optional intermediate levels. Several options exist to include the location / use information in the taxonomy. Oracle has a variety of solutions and provides considerable flexibility in setting up these codes. The Activity Type and Activity Cause fields are found on the work order header. Other information may be entered using Quality Collection Plans and Oracle FlexFields.

Library of Failure Codes

Often it is helpful to use a workbook / spreadsheet to build the complete library of failure codes outside of the EAM / CMMS system to make sure the code keys and descriptions follow the naming convention and adhere to any enterprise data standards within the company.

In years past, code keys were often developed using 3 or 4 letter abbreviations, which were supplemented with a description. For example, “FTS” would be used to indicate that the equipment “failed to start.” This isn't really necessary and is somewhat counterproductive. Originally, the 3 letter codes were used because of system limitations which made it difficult to display the descriptive text. In the EAM / CMMS system the codes are typically sorted in the selection list by the key value, so if the codes were developed using a 3 letter abbreviation, then the letters used determine the order of placement in the list. Many systems today support multiple languages - in software terms this is called “localization” - so a 3 letter code for failed-to-start may be totally meaningless in another language. The most straightforward approach is to use an alphanumeric sequence as the keys for the codes within a specific category. Be sure to allow for the insertion of additional codes later on in the existing list. The best way to do this is to originally number the items in steps of ten (0010, 0020, 0030, etc.). This way a new item may be inserted between the existing items by using an intermediate number (e.g. 0015).
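The numbering scheme can be sketched in a few lines of Python. The function names, the four-digit width and the step of ten are assumptions chosen to match the example keys above; a real code library might use a different width or a category prefix.

```python
def initial_keys(count, step=10, width=4):
    """Generate zero-padded keys spaced by `step`: 0010, 0020, 0030, ..."""
    return [f"{n * step:0{width}d}" for n in range(1, count + 1)]


def key_between(lower, upper, width=4):
    """Return an unused key midway between two existing keys."""
    lo, hi = int(lower), int(upper)
    if hi - lo < 2:
        raise ValueError(f"no unused key between {lower} and {upper}")
    return f"{(lo + hi) // 2:0{width}d}"


keys = initial_keys(3)
new_key = key_between("0010", "0020")
print(keys, new_key)
```

Because the keys are zero-padded to a fixed width, sorting them as text gives the same order as sorting them as numbers, so an inserted key such as 0015 lands between 0010 and 0020 in the selection list.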

How Do I Use This Information?

• To provide everyone in your organization with a basic understanding of failure coding

• As a starting point for individuals responsible for the development of failure codes

• To augment the experience of those guiding failure code implementations - aiding stakeholders, champions and ultimately, the business sponsors as they make decisions during implementation efforts

• To measure the meaningfulness of your current failure codes

• To aid in the review process if you currently have failure codes in your system, but their effectiveness is limited

In closing, please keep in mind that having and using good failure event codes is a necessary beginning. The quality and accuracy of this data directly affects the ability of skilled practitioners to maximize the value of improvement opportunities.




About the author

Ralph Hanneman, CMRP

Senior Consultant, Meridium, Inc.

Ralph is a Senior Consultant for Meridium, Inc. who began his career in the Navy where he obtained considerable experience in the maintenance and operation of all facets of shipboard electrical systems, as well as the Navy’s preventive maintenance methodology. He has nearly 20 years maintenance management and project engineering experience in pulp & paper, automotive parts manufacturing and heavy construction (tunnel boring).

Ralph has substantial experience in many facets of SAP gained with five years of ERP implementation projects in Plant Maintenance roles. He is experienced in SAP’s BI solution, in particular Plant Maintenance reporting. Since joining Meridium in early 2007, Ralph has been involved in pilots and enterprise implementations of Meridium with PEMEX, BMA Coal, Hydro One, Hess, Rio Tinto, Bruce Power, Samarco Mineração and Flint Hills Resources in consulting and project management roles.

He has conducted failure code development workshops for clients using SAP and Oracle eAM.

Contact Meridium to develop a plan to learn more about how failure and event coding impact your Asset Performance Management initiative.

The views and statements made in this document are based on the experience and opinions of Meridium consultants and have not been evaluated by any authors or governing body.

References

1 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=36979

2 http://www.sintef.no/static/tl/projects/oreda/

3 http://www.meridium.com/news_events/articles/articles.asp?article_ID=14

4 http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=29556

5 http://en.wikipedia.org/wiki/ISO_15926_WIP

6 http://projects.dnv.com/reference_data/RD7Browser/doc/general.aspx

7 http://www.bsi-global.com/en/Shop/Publication-Detail/?pid=000000000030077936
