The Risk is in the Management

Do you use a risk management program to conserve asset resources?
Does your employer foster a site environment where risk management is a routine part of job planning, preparation and execution?

Did you answer No to the above questions?

Risk Management was once thought to be the sole product of the site safety department.
Maintenance and Operation professionals now understand the importance of a risk management process to aid in protecting, conserving, and extending the reliability of critical assets. Failure to effectively manage the risks of asset failure can add costs to an operating unit at any plant, site, or installation.

Managing risks related to asset maintenance and operation requires good judgment and professional expertise, because it is both an art (or vocation) and a science with its own well-developed technological hierarchy. The objective of managing risk is not to remove all risk but to eliminate unnecessary or avoidable risk; the process must therefore allow individuals to make informed decisions about which risks to accept at each operational level. Managers should compare standard Risk Management principles with historical asset data and their personal experience, then consider how, when, and why those principles apply to specific situations within their area of functional responsibility.

Both Managers and Craft/Techs manage risk on a daily basis. Craft/Techs continuously search for hazards within their areas of expertise during daily job performance and routinely recommend the proper controls to reduce risks. Potential hazards and resulting risks vary as operating circumstances and parameters change. Management knowledge, gained from these experienced Craft/Techs, coupled with additional subject matter training can influence the extent and success of risk reduction measures.

Have you ever heard of SFMEA, RCFA, Maintenance Optimization, or RCM?
These are all tools that can be employed to help a site preserve asset resources. Programs like these can provide the means to identify, assess, and implement controls of risks and potential hazards to critical assets. Specific parts of these tools also help compile information necessary for making decisions to help balance PM/PdM program costs with increased operating benefits. What does each have in common with the others? They all ask the same questions as the basic Risk Management model. In the following table note the similarities of each process step, or decision level.

Step 1
Basic Risk Management: 1. Identify the failure hazards.
SFMEA: Potential Failure Mode; Potential Effects of Failure; Severity; Potential Causes
RCFA: 1. Define the Problem
Maintenance Optimization: 1. Determine Operational Criticality.
RCM: 1. What are the functions and associated desired standards of performance of the asset in its present operating context (functions)? 2. In what ways can it fail to fulfil its functions (functional failures)?

Step 2
Basic Risk Management: 2. Assess failure hazards to determine risks to site operations based on probability of occurrence.
SFMEA: Probability of Occurrence
RCFA: 2. Analyze the Problem
Maintenance Optimization: 2. Understand and Predict Equipment Behavior
RCM: 3. What causes each functional failure (failure modes)? 4. What happens when each failure occurs (failure effects)? 5. In what way does each failure matter (failure consequences)?

Step 3
Basic Risk Management: 3. Develop controls to reduce or avoid failures and make decisions reference the level of acceptable risk.
SFMEA: Detection
RCFA: 3. Develop Solutions
Maintenance Optimization: 3. Develop Maintenance Solutions to Improve Future Behavior
RCM: 6. What should be done to predict or prevent each failure (proactive tasks and task intervals)?

Step 4
Basic Risk Management: 4. Implement controls.
SFMEA: Improvements
RCFA: 4. Implement Solutions
Maintenance Optimization: 4. Implement Solutions
RCM: 7. What should be done if a suitable proactive task cannot be found (default actions)?

Step 5
Basic Risk Management: 5. Supervise performance and evaluate
SFMEA: Current Process Controls
RCFA: 5. Monitor Results
Maintenance Optimization: 5. Monitor Results and Adjust Maintenance Tactics
RCM: (Auditing the Continuous Improvement of proactive failure reduction tasks.)

Figure 1 - Program Comparison

Most of the processes referenced above also have a big “M” in their acronym. Its meaning varies from one individual to another. The commonality of these programs points to the real definition of that big “M”: all require Management. The acute risk to our plant, site, or installation critical assets is failing to use a process to manage them.

The Risk Management Process is composed of five (5) basic tasks or process steps.

1. Identify Failure Hazards,
2. Assess Failure Hazards,
3. Develop Controls and Make Risk Decisions,
4. Implement Controls,
5. Supervise and Evaluate (performance of the control measures).

Tasks 1 and 2 comprise the risk assessment. In Task 1, Managers and Craft/Techs identify the failure modes and hazards which may be encountered during operation of plant, site, or installation critical assets. Task 2 is a determination of impact of each failure incident and resulting loss of operational function.

Tasks 3 through 5 are activities to help the Manager effectively reduce the occurrence, mitigate the consequences, and manage risk incidents. In these steps, managers balance asset failure risks against the costs of performing RBI (risk-based inspections), increased-frequency PM procedures, and expanded PdM programs. They also implement the appropriate actions required to eliminate unnecessary failure risks during asset operation. The planning, preparation, and performance of repair, replacement, and preventive maintenance activities are carefully evaluated during these steps along the risk management path. Lastly, control activities are monitored and evaluated for their effectiveness, and valuable lessons learned are collected for use by others.

To apply the Basic Risk Management model:
1. Identify the Failure Hazards - A hazard is an actual or potential condition in which a failure results in the loss of an operating function, or in damage to or loss of an asset and its related components, in an operational environment.

2. Assess the Failure Hazards - Asset risk is defined as the combination of the probability of failure and the consequences (severity) of that occurrence. We can define probability as the likelihood of a failure occurring, and severity as a measure of the impact of the failure on the plant, site, or installation operating functions. The calculated asset risk increases with a higher probability of failure and a greater impact on the operation.
A Risk Assessment requires each potential failure incident, hazard, or mode be evaluated in relation to the probability of an incident occurring, and the severity (or impact upon the plant, site, or installation) of that incident or failure.

This activity is heavily dependent upon the use of asset history, lessons learned in the field, intuitive analysis, the Manager’s and Craft/Tech’s experience and sound judgment. Incomplete, inaccurate, undependable, or contradictory information creates doubt and uncertainty when determining the probability and severity of a failure incident. Assessment of risk requires good judgment.

Figures 2 and 3 are tools that can be employed to perform an asset risk assessment.

Risk Assessment Tool 1A is a simplified matrix which can be used by the Manager, or Craft/Tech, to enter the estimated degree of severity and probability for each failure incident or hazard.
Numerical values have been assigned to each of the standardized descriptors. Multiplying the severity number by the probability number will yield a product between 1 and 25. Comparing to the attached key will indicate the estimated risk of failure. The larger the number, the higher the risk.
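
The multiplication described for Tool 1A can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the original tool: the function names are hypothetical, the numerical values are those shown in Figure 2, and the key's thresholds of 15, 10, and 5 are assumed to be inclusive lower bounds, in line with the statement that a larger number means a higher risk.

```python
# Numerical values for the standardized descriptors, as shown in Figure 2.
PROBABILITY = {"Frequent": 5, "Likely": 4, "Occasionally": 3, "Seldom": 2, "Unlikely": 1}
SEVERITY = {"Catastrophic": 5, "Critical": 4, "Marginal": 3, "Negligible": 2, "None": 1}


def risk_score(severity: str, probability: str) -> int:
    """Multiply the severity value by the probability value (result is 1 to 25)."""
    return SEVERITY[severity] * PROBABILITY[probability]


def risk_level(score: int) -> str:
    """Map a score to a risk level.

    Assumption: the key's thresholds (15, 10, 5) are inclusive lower bounds.
    """
    if score >= 15:
        return "Very High"
    if score >= 10:
        return "High"
    if score >= 5:
        return "Moderate"
    return "Low"


score = risk_score("Critical", "Occasionally")
print(score, risk_level(score))  # 12 High
```

A site could replace the threshold values with its own without changing the rest of the calculation.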

Risk Assessment Tool 1B is a similarly designed table that can be used by the Manager, or Craft/Tech, in much the same manner. Estimate the level of severity and probability of occurrence, then read right and up. The point where the failure severity row and the probability of occurrence column intersect defines the level of failure risk for a particular asset.

Defining the levels of Probability of Failure occurrence:

Frequent - Failures happen often.
Likely - A failure will occur several times during the functional life of the asset.
Occasionally - Sporadic incidents of failure.
Seldom - Remote chance of an isolated failure.
Unlikely - An asset failure is not impossible but highly improbable.

The degrees of Failure Severity are:

Catastrophic - Total loss of asset functionality. Implied threat to related assets, systems, and property.
Critical - Significant reduction in asset, system, or plant operational capability. Significant collateral damage to adjacent assets, components, property, or environmental systems.
Marginal - Possibility of minor impact upon plant, site, or installation operational activities and requirements.
Negligible - Little or no impact on asset, system, or plant operation or capability. Little or no collateral asset, property, or environmental damage.
None - No impact.

The risk assessment tool examines potential failure occurrences in terms of probability and severity to determine the level of risk.

Assessing The Risk of Failure

                           Probability of Failure Occurrence
Level of Failure Severity  Frequent  Likely  Occasionally  Seldom  Unlikely
(Value)                    5         4       3             2       1
Catastrophic (5)           25        20      15            10      5
Critical (4)               20        16      12            8       4
Marginal (3)               15        12      9             6       3
Negligible (2)             10        8       6             4       2
None (1)                   5         4       3             2       1

Very High Risk ≥ 15
High Risk ≥ 10
Moderate Risk ≥ 5
Low Risk < 5

Figure 2 – Risk Assessment Tool 1A

Assessing The Risk of Failure

                           Probability of Failure Occurrence
Level of Failure Severity  Frequent  Likely  Occasionally  Seldom  Unlikely
Catastrophic               VH        VH      VH            H       M
Critical                   VH        H       H             M       L
Marginal                   VH        H       M             M       L
Negligible                 H         M       M             L       L
None                       M         L       L             L       L

VH = Very High Risk
H = High Risk
M = Moderate Risk
L = Low Risk

Figure 3 – Risk Assessment Tool 1B
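
Tool 1B can also be expressed as a simple lookup table. The sketch below copies the ratings from Figure 3 row by row; the names `RISK_TABLE` and `lookup_risk` are hypothetical and chosen only for illustration.

```python
# Column order follows the probability headings in Figure 3, left to right.
PROBABILITIES = ["Frequent", "Likely", "Occasionally", "Seldom", "Unlikely"]

# One row per severity level, copied from Figure 3.
RISK_TABLE = {
    "Catastrophic": ["VH", "VH", "VH", "H", "M"],
    "Critical":     ["VH", "H", "H", "M", "L"],
    "Marginal":     ["VH", "H", "M", "M", "L"],
    "Negligible":   ["H", "M", "M", "L", "L"],
    "None":         ["M", "L", "L", "L", "L"],
}


def lookup_risk(severity: str, probability: str) -> str:
    """Return the rating where the severity row meets the probability column."""
    column = PROBABILITIES.index(probability)
    return RISK_TABLE[severity][column]


print(lookup_risk("Marginal", "Seldom"))  # M
```

Because the ratings are stored as data, a site that adjusts individual cells to suit its own risk tolerance only needs to edit the table.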

3. Develop Controls and Make Risk Decisions
After identifying and assessing each failure hazard, Managers and Craft/Techs must develop one or more risk controls that will aid in avoiding, preventing, or reducing the risk (probability and/or severity) of a failure incident. While developing controls, Managers must consider the reason for the failure, not just the incident or its impact on asset functions and operation.

Failure controls generally fall into three (3) categories: risk avoidance, reliability based technology, and educational. Risk Avoidance may include engineering and/or redesign of the asset installation and operational profile to remove any risk threat from operation and use of the equipment. Reliability based activities can include optimized PM procedures, PdM technologies, RCFA (Root Cause Failure Analysis), and SFMEA (Simplified Failure Mode Effects Analysis). RBI (Risk-based inspection) is an application of basic risk principles to manage inspection programs for critical plant, site, or installation assets. Educational and Training type controls provide knowledge and skill based programs to ensure implemented procedures and tasks are performed to specific standards.

To make a meaningful Risk Decision, a Risk Assessment should be conducted soon after development and implementation of the above referenced program controls. These results are then used to aid the decision making process pertaining to the amount of risk the Manager is willing to accept for the operation of a critical asset or system. A key activity of this task is to specify Who, What, Where, When, and How each control is to be used.

4. Implement Risk Controls
The number of higher failure risk assets is generally a small percentage of total plant assets. Implement the new or additional PM and PdM tasks when and where needed and focus efforts on the most critical items. Institute a formalized pro-active planning and scheduling function to ensure all resources required to perform the newly implemented activities will be available. The site CMMS should be configured to record and report KPIs (key performance indicators) required for implementation and continuance of a risk reduction or avoidance program. Do not discount or neglect interaction with MRO. Improve the skills of the workforce through asset, maintenance and reliability training.
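
One way to focus effort on the most critical items is to rank assets by their assessed risk score and work from the top of the list. The following is a minimal sketch under stated assumptions: the asset names and ratings are invented for illustration, the `Asset` class and `prioritize` function are hypothetical, and the cut-off of 10 is an assumed threshold that a site would set for itself.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    severity: int     # 1 (None) to 5 (Catastrophic)
    probability: int  # 1 (Unlikely) to 5 (Frequent)

    @property
    def score(self) -> int:
        # Severity multiplied by probability, as in Risk Assessment Tool 1A.
        return self.severity * self.probability


def prioritize(assets, threshold=10):
    """Return assets scoring at or above the threshold, highest risk first."""
    critical = [a for a in assets if a.score >= threshold]
    return sorted(critical, key=lambda a: a.score, reverse=True)


# Hypothetical assets and ratings, for illustration only.
assets = [
    Asset("Boiler feed pump", 5, 3),
    Asset("Office HVAC unit", 2, 3),
    Asset("Conveyor gearbox", 4, 4),
    Asset("Yard lighting", 1, 2),
]

for asset in prioritize(assets):
    print(asset.name, asset.score)
```

In this made-up example only two of the four assets clear the threshold, which mirrors the point that higher-risk assets are usually a small share of the total.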

5. Supervise and Evaluate
The Manager is responsible for evaluating the effectiveness of the implemented controls and programs in reducing or removing the failure potential.
Managers and first-line supervisors must ensure that subordinates understand how to execute risk controls. Craft/Techs continuously assess risks during the workday and should maintain communication with Managers. Both groups should guard against complacency to ensure that risk control and mitigation standards are not relaxed, circumvented, or violated.

Managers must continuously supervise and monitor asset PM/PdM and other inspection activities to ensure they are effective and can keep risks at an acceptable level. Use the asset history from the site CMMS as a source of information to indicate which controls failed and why. Often, a completely different procedure may prove more effective and require implementation.

The level of failure risk remaining for each asset after implementation of best practice controls is called residual risk. As new controls for failure hazards are identified and selected, a risk assessment is again performed and the levels of asset risk are revised. The process can be repeated until the risk of asset failure is acceptable or cannot be reduced. Management must be fully committed to continuous improvement of the plant, site, or installation’s efforts to reduce the risk of failure.
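
The repeat-until-acceptable cycle described above can be sketched as a loop that re-scores the asset after each control is applied. Everything specific here is an assumption made for illustration: the `reassess` function, the control names, the amount by which each control lowers severity or probability, and the acceptable score of 4.

```python
def reassess(severity, probability, controls, acceptable=4):
    """Apply controls one at a time, re-scoring after each.

    Stops when the score is acceptable or no controls remain.
    Each control is (name, severity_reduction, probability_reduction).
    Returns the residual score and the names of the controls applied.
    """
    applied = []
    score = severity * probability
    for name, severity_cut, probability_cut in controls:
        if score <= acceptable:
            break
        # Ratings cannot drop below 1, the lowest value on either scale.
        severity = max(1, severity - severity_cut)
        probability = max(1, probability - probability_cut)
        score = severity * probability
        applied.append(name)
    return score, applied


# Hypothetical controls and effects, for illustration only.
controls = [
    ("Vibration monitoring (PdM)", 0, 2),
    ("Redesigned coupling guard", 1, 0),
    ("Operator training", 0, 1),
]

residual, used = reassess(severity=4, probability=4, controls=controls)
print(residual, used)
```

The score left over when the loop ends corresponds to the residual risk; if it is still above the acceptable level, the list of candidate controls has been exhausted and new controls would need to be identified.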

Risk Management must not be thought of as an add-on to the maintenance management function, but as an integral part of departmental work planning, preparation, and execution.

Asset risk management is a well-defined, sustainable process, not a one-time staged event.