Out of Box Maritime Thinker

2.6.1 FAILURE FINDING MAINTENANCE

Maintenance tasks focused on identifying failures are used to uncover equipment malfunctions that go unnoticed during regular crew operations, such as concealed defects. Since these issues are hidden, proper maintenance is crucial. Otherwise, a subsequent failure and its consequences must transpire before the equipment problem becomes evident. For instance, the inability of a backup electrical generator to start upon power loss might only become apparent when the primary generator fails, causing a power outage. Unfortunately, traditional condition monitoring or scheduled maintenance tasks are often inadequate for managing hidden failures. Instead, failure-detection maintenance tasks, which usually involve functional equipment testing, are employed to ensure the equipment's readiness to perform its designated function(s) when needed.

2.6.1.1 HIDDEN FAILURES – STATISTICS

A failure-finding task aims to minimise the risk of multiple failures to an acceptable level by controlling the occurrence rate of such failures. For example, when assuming that multiple failures can only arise from the simultaneous combination of a specific initiating event and the unavailability of a safety or backup system, the occurrence frequency of multiple failures can be described using the following formula:

Maintaining equipment unavailability below the level required to keep the multiple failure occurrence frequency sufficiently low is crucial, ensuring an acceptable failure risk to achieve an acceptable failure occurrence frequency. For instance, if a specific event's acceptable multiple failure occurrence frequency is 0.01 per year, the initiating event's failure frequency (e.g., ) is 0.1 per year. The acceptable unavailability for the hidden failure is 0.1.

Failure-finding tasks are efficient in addressing hidden failures because they either (1) verify the equipment's functionality or (2) enable the detection of equipment failure, necessitating repairs. After conducting the task, the unavailability of the safety or backup system is effectively "reset" to zero (or close to zero). As time passes, the unavailability increases until the equipment fails or undergoes retesting. The failure rate remains constant, implying that the failure probability increases linearly (or nearly linearly over most reasonable periods) with a slope equal to the failure rate (i.e., the failure probability is the product of the failure rate and elapsed time). The impact of failure-finding tasks can be seen in the below figure.

2.6.1.2 APPLICABILITY AND EFFECTIVENESS OF FAILURE-FINDING TASK

For a failure-finding work to be deemed effective, the following criteria must be met:

· There should be no available or cost-efficient condition-monitoring or planned maintenance task capable of detecting or preventing the failure.

· The task must be technically viable to execute. Performing at the necessary interval should be feasible without disrupting a stable system.

· The task must reduce the failure probability (and thereby the risk) to an acceptable level. The tasks must be conducted at intervals that ensure the probability of multiple failures maintains an acceptable risk level. Risk acceptance criteria should be established and documented.

· The task must not elevate the risk of multiple failures (e.g., when testing a relief valve, overpressure should not be generated without the relief valve is operational).

· The task must ensure that entire protective systems are tested, not individual components.

· The task must be cost-effective. Over time, the expense of performing the task should be lower than the total cost associated with the consequences of failure.

2.6.1.3 TASK INTERVAL FOR DETERMINING FAILURE-FINDING MAINTENANCE

The interval for failure-finding tasks can be established through the following:

· Mathematical approaches employing reliability equations, or

· Utilising general guidelines designed to maintain an acceptable risk level.

Regardless of the method employed, the primary objective is to ensure that the unavailability of a safety or backup system remains low enough so that the frequency of multiple failure occurrences is adequately reduced to achieve an acceptable risk level. An acceptable occurrence frequency must be defined for a specific consequence of multiple failures. For instance, a 0.01 per year occurrence frequency may be acceptable for a €1 million operational loss, while a 0.1 per year frequency could be suitable for a $100,000 operational loss. In both scenarios, the risk is equivalent (€10,000 per year).

2.6.1.3.1 FAILURE-FINDING TASK INTERVAL - MATHEMATICAL DETERMINATION

The intervals for failure-finding tasks addressing the highest-risk hidden failures often necessitate mathematical determination; this is typically achieved by assuming the hidden failure is random and, thus, best modelled using the exponential distribution. This assumption is generally valid for the following reasons:

· If the failure exhibits a wear-in failure characteristic, a one-time change or a condition-monitoring task is usually employed to manage the failure.

· If the failure demonstrates a wear-out failure characteristic, a condition-monitoring or planned maintenance task should be applied to manage the failure.

To establish a failure-finding task interval, the equations for the frequency of multiple failures and the unavailability of the hidden failure are combined as follows:

The subsequent additional assumptions are frequently accurate and will result in simplification:

· The failure distribution is exponential.

· The product of the conditional failure rate and the test interval time (λ × test interval) is less than 0.1.

· The time required to perform a failure-finding task is short compared to the system's availability duration.

· The time needed to conduct a system repair is short compared to the system's availability duration.

· Multiple failures can only arise from the simultaneous combination of the specified initiating event and the unavailability of the backup or safety system.

2.6.1.3.2 USING GUIDELINES TO DETERMINE THE FAILURE-FINDING TASK INTERVAL

Guidelines are established and documented to determine the failure-finding task interval. This process typically involves the following steps:

· Setting up criteria for determining the required unavailability of the hidden failure, considering the risk associated with the hidden failure

· Estimating the Mean Time To Failure (MTTF) of the hidden failure

· Identifying the test interval using a table derived from the equation ( T )

*Risk of Hidden Failure*	*Unavailability Required*
Very High	< 0.0001
High	> 0.0001 to 0.001
Moderate	> 0.001 to 0.01
Low	> 0.01 to 0.05

Failure-finding - Interval Rules – Example

*Unavailability Required*	*Failure-finding Interval (as % of MTTF)*
0.0001	0.02
0.001	0.2
0.01	2
0.05	10

Failure-finding - Interval Based on MTTF – Example

When using the guideline-based approach, users should be cognizant of the assumptions made while developing the rules and task intervals. Confirming the validity of these assumptions in the context of the specific system or equipment being assessed is essential. By doing so, users can ensure that the determined task intervals accurately reflect real-world conditions and contribute to an effective failure management strategy.

2.6.2 RISKS CONSIDERATION

Risk can be divided into two components: the frequency of loss events and the severity of their consequences.

Frequency, often expressed in loss events per year, can be determined using historical data (if many events have occurred) or calculated through risk analysis tools (if limited data records exist).

Consequences can be represented as a combination of a loss event's impact on various aspects, such as:

· Capital investment: Damage to equipment and cost of repair

· Community: Effects on the public

· Directional control: Total loss or diminished manoeuvrability

· Explosion or fire: Damage to equipment and/or the vessel

· Loss of containment: Quantity of harmful substances released into the environment (including cleanup costs)

· Operations: Loss of hire, outage time for functions like drilling, position mooring (station keeping), hydrocarbon production and processing, loading or unloading operations

· Propulsion: Complete loss or reduced propulsive capability

· Safety: Number of individuals affected (injuries or fatalities)

After identifying the risk associated with a loss event, ship designers, operators, insurers, and regulators should implement preventive and/or mitigative measures to reduce the risk to an acceptable level.

The operating modes of a vessel or marine structure, such as ocean transit, cargo discharge, and others, may be associated with different types of loss events. As a result, identifying the operating mode is a crucial initial step in addressing risks for marine vessels and structures. By understanding the specific operating mode, appropriate risk mitigation measures can be tailored to the unique challenges presented in each situation.

Operating modes represent the various contexts and environments in which the vessel functions. Therefore, hazards can be identified based on these operating modes. Initiating events are specific equipment failures, human errors, or external occurrences (e.g., lightning strikes) that could lead to undesired events. Preventative measures consist of engineered safeguards (e.g., alarms) or management systems (e.g., personnel training) designed to prevent an initiating event from escalating into an undesired event. Undesired events are the immediate consequences of an initiating event and hazard affecting the vessel, such as collisions or allisions. By recognising these factors and their interplay, appropriate risk mitigation strategies can be developed for each operating mode.

FIGURE 1 The General Risk Model

2.6.2.1 VESSELS AND INNATE RISKS

Loss events associated with each operating mode can be comprehensively postulated. Many loss events are already well-known and implicitly or explicitly identified in classification society rules and International Maritime Organization (IMO) regulations. These typically include events such as structural failure, loss of stability, loss of propulsion, and fire. Take "loss of propulsion" as an example. Preventative measures must be implemented to ensure propulsion machinery reliability, thus minimising the risk of propulsion loss. Additionally, mitigative measures should be in place to address potential consequences like collision, grounding, or pollution in the event of propulsion loss. The extent of preventative and mitigative measures depends on the risk level and acceptable risk threshold.

Until now, classification society rules and IMO regulations have primarily focused on risk reduction for vessels through hardware design, facility provision, and in-service inspection. However, some areas traditionally have not been explicitly covered, such as:

· Specificity regarding the vessel's operating modes

· Quantification of risk levels and definition of acceptable risk criteria

· Inclusion of operational measures (e.g., safety management and crew training) as tools to reduce risks, especially in measures to mitigate consequences

· Emphasis on maintenance as a means of preventing loss events

Risk acceptance criteria are discussed in general terms in the following subsection.

2.6.2.2 RISK CHARACTERISATION

A risk matrix is an effective method for characterising the risk of loss events. It consists of a grid with cells corresponding to defined consequence (severity) and frequency categories, where loss events can be placed. The consequence and frequency categories are designed to be broad enough to quickly determine the appropriate risk cell for a loss event yet sufficiently narrow to offer varying degrees of resolution for decision-making. In addition, frequency and consequence levels on a risk matrix are often graduated by order of magnitude, providing a clear visual representation of the risks associated with different events; this allows stakeholders to prioritise their risk management efforts based on each event's potential impact and likelihood.

Severity levels can be defined for various types of loss consequences based on the list of examples. Table 1 showcases five (5) consequences (directional control, propulsion, loss of containment, fire/explosion, and safety) and outlines four severity levels for each consequence. Before the risk analysis, an appropriate severity level term for the consequence should be chosen and defined. For each severity level, several example descriptors are listed. Some descriptors are repeated between adjacent severity levels. Some studies use numerals (e.g., 1, 2, 3, 4) to represent the severity levels.

The descriptor chosen to describe a particular severity level may vary between analyses. For instance, one analysis might choose the descriptors "hazardous" and "critical" to describe the two highest severity levels, while another analysis might opt for "critical" and "catastrophic." This flexibility allows organisations to adapt the severity levels to their needs and better understand the potential consequences of different loss events.

Frequency categories are typically expressed in units of events per year. Sometimes, it is challenging to comprehend the smaller frequency categories (e.g., Remote: 0.001 events per year to 0.01 events per year, or from 1 event every 1,000 years to 1 event every 100 years). However, these small frequencies can be better understood when considering multiple vessels over extended periods. For example, a loss event that has occurred twice across a fleet of 100 vessels over the last 20 years corresponds to a frequency of 0.001 events per year.

The risk-based decision-making aspect of a risk matrix is observed along the lines of constant risk. Each cell in the risk matrix corresponds to a defined risk level. Cells with similar risk levels are grouped to form a continuous risk line. All cells in a risk matrix should be categorised into relevant lines of constant risk. Risk-based action levels to address the loss event are based on the risk level represented by the line of constant risk. Generally, the actions necessary to address loss events in each risk level are predefined.

Using a risk matrix enables decision-makers to prioritise actions and allocate resources based on the risk levels associated with different loss events. By grouping events with similar risk levels and defining actions to address these events, organisations can manage risks more effectively and make informed decisions on how to best mitigate potential consequences.

Once the risk of the loss event has been identified using the risk matrix, a risk reduction action consistent with the required action level should be chosen. Typically, maintenance tasks such as condition monitoring and planned maintenance will primarily reduce the frequency of occurrence of a loss event. At the same time, equipment redesign actions and one-time changes may decrease both the frequency and consequence. Central to the risk-based task selection process is evaluating the task's impact on the loss event. The ultimate objective should be to select an efficient, feasible task to decrease the risk level of the loss event to an acceptable level of risk (e.g., low risk or medium risk).

By carefully assessing the potential risk reduction actions and their impact on the loss event, decision-makers can prioritise and implement the most suitable measures. This ensures that resources are allocated effectively and efforts are focused on tasks that will provide the most significant risk reduction. As a result, organisations can better manage risks associated with their operations and minimise the likelihood and consequences of loss events, thereby achieving a safer and more reliable operational environment.

FIGURE 2 Example Risk Model

FIGURE 3 Risk Matrix and Example Consequence (Severity) Categories

2.6.3 RCM ANALYSIS – EXECUTION AND DOCUMENTATION

The following procedures guide RCM (Reliability Centered Maintenance) analyses. RCM analyses should be performed systematically, following a step-by-step process. The essential elements of an RCM analysis process include the following:

· Identify operating modes and corresponding operating contexts: Determine the different operating modes and their specific contexts, such as regular operation, standby, or emergency modes, and the related environmental and operational conditions.

· Define vessel systems: Break down the vessel into various systems and subsystems, such as propulsion, power generation, and safety systems, for more detailed analysis.

· Develop system block diagrams and identify functions: Create block diagrams for each system, illustrating the relationships between components and subsystems. Identify each system's primary and secondary functions, considering their intended purposes and goals.

· Identify functional failures: Determine potential failures that can prevent a system from performing its intended functions, such as equipment malfunctions, human errors, or external factors.

· Conduct a Failure Modes, Effects, and Criticality Analysis (FMECA): Analyse each potential failure mode and its effects on the system, considering factors such as frequency, severity, and detectability. In addition, the criticality of each failure mode should be assessed to prioritise risk management efforts.

· Select a failure management strategy: Choose an appropriate strategy for managing each identified failure mode, including preventive maintenance, condition monitoring, or redesigning components. The selected strategy should minimise the overall risk and maximise the system's reliability.

· Determine spare parts holdings: Identify the necessary parts and their required quantities to support the chosen maintenance strategies, considering lead times, costs, and storage constraints.

· Document the analysis: Compile and document the RCM analysis process, including the identified failure modes, chosen maintenance strategies, and spare parts holdings, to support effective implementation and continuous improvement.

2.6.3.1 DEFINING SYSTEMS

Each system must be accurately defined to conduct an RCM analysis effectively and comprehensively. This process involves:

1. Defining the operating characteristics for the ship as a whole and then for each system: Begin by outlining the overall operational characteristics of the vessel, considering factors such as its intended use, operating environment, and mission requirements. Next, define each system's operating characteristics, considering aspects like performance criteria, operating modes, and specific environmental conditions.

2. Partitioning the vessel into functional groups, systems, and equipment items: Divide the vessel into distinct functional groups based on their overall purpose and interdependence. For example, functional groups may include propulsion, navigation, safety, and communication systems. Further partition these functional groups into specific systems, such as engines, generators, or fire suppression systems, and then break them down into individual equipment items or components. This hierarchical structure helps clearly define each system's boundaries and operational intent subject to RCM analysis.

By following these steps, you can ensure a detailed and organised approach to defining each system for RCM analysis. This accurate definition is crucial for identifying potential failure modes, understanding their effects, and determining appropriate maintenance strategies to minimise risks and maximise system reliability.

2.6.3.2 DEFINING SHIP OPERATING CHARACTERISTICS

Defining the operating characteristics of a vessel is crucial for making informed decisions regarding the RCM failure management strategy. Incomplete or inaccurately defined operating characteristics can lead to an inappropriate failure management strategy. To properly define the operating characteristics, follow these steps:

· Identify the various operating modes for the vessel: List the different operating modes the vessel goes through during its service life. These may include transit, loading and unloading, anchoring, maintenance, and emergency response. Each operating mode has specific requirements and challenges that must be considered.

· Define the operating context for each functional group based on the operating modes: After identifying the operating modes, use them to define the operating context for each functional group. The operating context consists of the conditions and requirements the functional group must meet while operating within a specific mode. For example, the propulsion system's operating context in transit mode might include speed, fuel efficiency, and noise reduction requirements. In contrast, its context in maintenance mode could involve accessibility, ease of maintenance, and safety considerations.

FIGURE 1 RCM Analysis Diagram

2.6.3.2.1 OPERATING MODE

An operating mode for a vessel or marine structure represents its current operational state. Each operating mode affects how the ship's systems and machinery should be operated, guiding the formulation of operating contexts for distinct functional groups. The following are some standard operating modes for ships:

• Normal seagoing conditions at full speed

• Maximum allowed operating speed in crowded waters

• Manoeuvring next to a dock or another ship

• Cargo handling operations

The following example operating modes are typical for mobile offshore drilling units and offshore oil and gas production facilities:

• Drilling operations

• Position mooring or station keeping

• Relocation/Towing

• Hydrocarbon production and processing

• Import and export functions

2.6.3.2.2 OPERATING CONTEXT

The operating context of a functional group refers to the conditions under which the system is expected to operate. It should provide a comprehensive description of the following:

• The physical environment where the functional group operates

• A detailed explanation of how the functional group is used

• The specified performance capabilities of the functional group, as well as the required performance of any additional functional groups within which the functional group is integrated

Some essential factors to consider when developing the operating context for a functional group include:

• Serial redundancy applies to configurations with an identical standby system/equipment to support an operating functional group. If the operating system fails, the standby system is activated. The operating contexts for the running and standby systems/equipment are distinct. For instance, a functional failure in the operating system/equipment is likely to be apparent, while one in the standby system/equipment is likely to be hidden.

• Parallel redundancy pertains to systems/equipment operating simultaneously. Each system can meet the total demand. If a functional failure occurs in one system/equipment, the remaining systems/equipment continue operating at a higher capacity. In some cases, standby systems/equipment may also be available.

• Performance and quality standards: Systems/equipment may need to perform at a specific level or provide a service meeting particular quality criteria (e.g., compressed air supplied at a specified quantity, pressure, temperature range, and humidity limit).

• Environmental standards: As required by international, national, and local laws and regulations (e.g., for an engine emission standard, the operating context should consider a functional group's impact or potential impact on the environment).

• Safety standards: The operating context should specify any hazards that might be present and the safeguards required to protect the crew.

• Shift arrangements: It is assumed that the propulsion machinery operates continuously for ocean-going vessels, except when the vessel is docked. However, the ship's service electrical power system operates continuously. Therefore, system configurations and maintenance strategies must be carefully developed to ensure system availability.

2.6.3.2.3 VESSELS OPERATING CONTEXTS - DEVELOPMENT

Operating contexts should be developed with varying degrees of detail at each level. An operating context statement should be written for each level of functional breakdown, expanding upon the context provided at the preceding level. As the focus shifts to the systems and equipment that comprise the functional group at the lower levels of the functional breakdown, more detail is included in the operating context statement.

Specific performance parameters are crucial for defining functions for the functional group and determining what constitutes a failure and the effects that such failures will have on individual equipment performance, overall system operation, and, ultimately, the vessel's roles.

• Vessel level: Operating contexts should first be developed for the vessel as a whole, typically with a focus on the vessel type; this includes a physical description of the vessel, the vessel type, the cargoes to be carried, performance standards (speed, manoeuvrability, fuel capacity and consumption, etc.), and cargo handling capabilities. Statements should address the primary roles (e.g., transporting cargo from point A to point B within a specific time frame, cargo preservation), secondary roles (e.g., crew habitability), and safety and environmental roles of the vessel.

• Functional group level: Using the vessel-level operating context, develop an operating context for each functional group level (e.g., machinery and utilities, propulsion functional group). The operating contexts at a given functional group level must encompass all operational characteristics to define the context for the next highest level. For example, the operating contexts for propulsion, manoeuvring, electrical, vessel service, and navigation and communication functional groups should include all operating characteristics in the machinery and utilities functional groups. Additionally, an operating context must be developed for each vessel's operating mode.

TABLE(s) 1 & 2 Example Operating Modes and Operating Context

2.6.3.3 DIVIDING SYSTEMS

Due to a vessel's complexity, which comprises numerous intricate systems and subsystems, it is beneficial to categorise it into functional groups. These functional groups can then be divided into specific systems, subsystems, equipment items, and individual components. This approach allows for a more structured and manageable understanding of the vessel's overall design and operation.

2.6.3.3.1 DIVIDING A VESSEL INTO FUNCTIONAL GROUPS

Dividing a vessel into functional groups is done using a top-down approach. For most vessels, the top level consists of the following primary functional groups:

• Hull

• Machinery and utilities

• Cargo handling

In most instances, it is necessary to further divide these high-level functional groups to identify major systems for analysis. For example, machinery and utilities can be subdivided into the following functional groups:

• Propulsion functional group

• Manoeuvring functional group

• Electrical functional group

• Ship service functional group (e.g., bilge, ballast, firefighting, steam)

• Navigation and communication functional group

Each functional group should be divided using a top-down approach until a level is reached where functions are associated with distinct physical units, such as individual systems or equipment items; this is sometimes called the level of indenture. The indenture level is crucial as it significantly influences the time and effort needed to conduct a thorough analysis. An analysis performed at too high a level may be overly superficial, while one conducted at too low a level may become excessively complex.

The level of indenture will vary depending on a system's complexity. Highly complex systems with numerous failure modes will typically be analysed at lower levels. The level of indenture should allow for the identification of the following elements within the functional group:

• Physical boundaries

• Functions and functional failures

• Discrete equipment items

2.6.3.3.2 DIVIDING A FUNCTIONAL GROUP INTO EQUIPMENT ITEMS

Upon achieving an appropriate level of partitioning for functional groups, each functional group should be further divided into specific equipment items. One or two levels of indenture might be necessary to adequately separate a functional group into equipment items. The chosen level of indenture for equipment items should meet the following criteria:

• The equipment can be identified for its contribution to the overall functions of the functional group.

• The equipment can be identified for its failure modes.

• The equipment represents the most convenient physical unit for specifying maintenance tasks.

2.6.3.3.3 SELECTION OF FUNCTIONAL GROUPS FOR ANALYSIS

Establishing a priority order to analyse functional groups might be necessary, allowing resources to be utilised more effectively. Generally, one of the following methods is employed to choose groups for analysis:

• Engineering judgment: This method depends on the undocumented experience of subject matter experts in selecting the group. Typically, when choosing a group, a team will subjectively consider factors such as the number of failures that have occurred, the amount of maintenance resources, the potential for performance improvement, and the possibility of reducing costly downtime maintenance (e.g., dry-docking maintenance). After determining the selection and priorities, the team should document the reasoning behind their decision.

• Simple analytical approaches: A more analytical method for selecting functional groups involves using simple tools like Pareto analysis and relative ranking. These tools offer the selection team a structured methodology for ranking the various factors considered during the selection process. When using Pareto analysis, the team collects data on each considered factor. For instance, if the number of failures is essential, the team would gather failure data for each group and then rank each group based on the number of failures. The team develops a scoring system for each factor when using relative ranking. The scores are then tabulated and evaluated to rank the groups.

• Risk assessment: The most comprehensive approach is to perform a risk assessment or use an available risk assessment to select and rank functional groups. Whether the risk assessment is a detailed quantitative analysis or a high-level profiling analysis (used for enterprise risk management), the risk assessment data can identify groups with unacceptable risk and those with the highest risk. The unacceptable risk data can determine if further detailed analysis, such as RCM analysis, is warranted. The group risk ranking can then be used to prioritise groups for analysis (e.g., groups with the highest risk are analysed first). Additionally, the risk assessment should be reviewed to determine if equipment failures can be affected by improved maintenance and if these failures are significant contributors to the risk. For example, when reviewing a group's risk assessment, it might be discovered that the major risk contributor is operational errors. In this case, an RCM analysis might not be the best method to reduce the risk. However, the highest-risk groups where equipment failures are major risk contributors are suitable candidates for RCM analysis.

Regardless of the approach used to select groups, the following considerations should be taken into account:

• The expected cost savings over the projected remaining life of the equipment should be balanced against the cost of the analysis.

• The human resources required to conduct each analysis must be identified, and their availability must be ascertained.

2.6.3.4 DEFINING FUNCTIONS AND FUNCTIONAL FAILURES

After establishing the operating mode for the vessel and the operating context for a functional group, the RCM analysis team uses this information to determine the necessary functions for the functional group to operate successfully, ensuring all relevant vessel functions are maintained. Considering the applicable operating modes when defining functions is crucial, as functions can vary with different operating modes. Furthermore, identifying all functions is essential, as failing can lead to overlooking critical failures (e.g., those affecting system and vessel performance).

Once the functions are defined, functional failures (e.g., various loss functions that can occur due to failures) are identified. Functional failures can represent a total loss of function (e.g., providing no compressed air) or a partial loss of function (e.g., providing compressed air at reduced pressure and flow). The following paragraphs explain more about identifying functions and functional failures.

1. Identifying functions: Begin by listing the required functions for each functional group, considering the operating context and mode. These functions may include primary functions (directly related to the main purpose of the functional group), secondary functions (supporting the primary functions), and safety or environmental functions (relating to safety and environmental requirements).

2. Identifying functional failures: For each identified function, determine the potential functional failures that could occur. These failures can be total (the complete loss of a function) or partial (a reduction in the function's effectiveness or efficiency). It is important to consider different operating modes and contexts and the potential impact of these failures on the overall system and vessel performance.

2.6.3.4.1 IDENTIFYING FUNCTIONS FOR A FUNCTIONAL GROUP

After completing the operating characteristics and partitioning, the subsequent task involves identifying the functions associated with the chosen functional group and its associated equipment. These functions are determined by considering the operating context of the functional group and the equipment included within it. Therefore, it is essential to state what the functional group must provide or do to ensure proper vessel operation in the given operating mode rather than how individual equipment items operate to ensure comprehensive identification and definition of all functions. For instance, a propulsion group function could be "providing X horsepower at Y RPM to the propeller" despite the engine's potential to produce more horsepower and operate at a higher RPM.

Developing a functional block diagram is one approach to identifying functions. This diagram represents the system operation in a graphical format, comprising three main components. Firstly, the inputs (e.g., raw materials, energy sources) that enter the system boundary. Secondly, the functional blocks represent the functions within the system boundary. Thirdly, the outputs (e.g., materials, energy, signals) that leave the boundary.

Additionally, arrows illustrate the flow of materials, energy, signals, etc., between functional blocks and into and out of the system. Each block within the boundary corresponds to a primary or secondary function that must be provided to transform the inputs into outputs. Therefore, every function block and its associated outputs signify a function that must be furnished for the system to operate correctly.

Each function must be documented as a function statement consisting of a verb, an object, and a performance standard To ensure clear identification and definition. The performance standard should specify the minimum acceptable requirement rather than the design capability and must be clearly defined or quantified. It defines failure as the basis for the maintenance decision-making process. Functions can be categorised as primary or secondary. Primary functions represent the fundamental reasons why the system/equipment exists. For instance, the primary functions of a diesel engine are to provide power to drive the propeller from 0 to 91 RPM with output from 0 to 16,860 kW. The minimum acceptable output is 9,000 kW to maintain a minimum vessel speed of 7 knots. Secondary functions are generally less evident but may have worse consequences if they fail. The following functional categories can be used to determine secondary functions:

• Environment integrity

• Safety, structural integrity

• Control, containment, comfort

• Appearance

• Protection

• Economy, efficiency

• Supplementary functions

For example, some of the secondary functions of a diesel engine could include having acceptable engine emissions by some standards and having a vibration level that will not affect structural integrity.

Protection and protective devices are the most critical of the various secondary functions. These devices function in one of the following five ways:

• Alerting the operator to abnormal conditions

• Shutting down the equipment when a failure occurs

• Eliminating or relieving abnormal conditions following a failure that could otherwise cause more severe damage

• Taking over from a failed function

• Preventing a hazardous situation from occurring in the first place

When documenting the functions of any system/equipment, it is essential to list the functions of all associated protective devices. These devices must receive special attention as they safeguard the system/equipment and personnel.

2.6.3.4.2 IDENTIFYING FUNCTIONAL FAILURES FOR A FUNCTIONAL GROUP

Identifying a series of functional failures is crucial for every function within the functional group. Typically, each function will have at least two functional failures, which can be a complete or partial loss of function. Deviations in the performance standard usually represent a partial loss of function. For instance, the function "to provide 16,860 kW at 91 RPM to the propeller" can have the following functional failures:

• Total loss of function

No power to the propeller

• Partial loss of function

Provides less than 16,860 horsepower to the propeller

Provides more than 16,860 horsepower to the propeller

Provides less than 35 RPM to the propeller

Provides more than 91 RPM to the propeller

Functional failures can be identified from functions by using the following guides:

• No or none of the function

• Less of each standard performance parameter

• More of each performance parameter

• Premature operation of the function

• Failure to cease operation of the function (e.g., the function operates too long)

• Intermittent operation of the function

• Other functional failures appropriate for the functional group

Each functional failure must be documented in a statement containing a verb, an object, and the functional deviation.

Function and Functional Failure List - Example

2.6.3.5 FMECA – CONDUCTION

After identifying potential functional failures, the next step in the RCM analysis is to perform a Failure Mode, Effects, and Criticality Analysis (FMECA). This step aims to establish the cause-and-effect relationship between potential equipment failures, functional failures, and the end effect of functional failures. It also evaluates the criticality of the postulated failure mode. This information is essential in determining the following:

• When a failure management strategy is required

• What type of failure management strategy is best suited to manage the failure mode (e.g., one-time change, planned maintenance, or run-to-failure)

• The significance of the failure management strategy

By evaluating the criticality of the failure modes and their potential impact on the system/equipment, it is possible to determine the most appropriate course of action to manage the failure. The failure management strategy can include various approaches, such as redesign, replacement, or modification of equipment, scheduled maintenance, or run-to-failure. The importance of the failure management strategy will depend on the consequences of the failure mode and the criticality of the system/equipment.

2.6.3.5.1 IDENTIFYING FAILURE MODES AND EFFECTS WITH AN FMECA

There are two primary approaches for performing an FMECA: bottom-up and top-down. Both methods can be effectively used in an RCM analysis, and each has advantages and disadvantages. However, the main attribute of both approaches is that they are inductive analysis techniques that help guide the RCM analysis team in establishing the cause-and-effect relationships necessary to identify maintenance requirements and other potential improvements.

The bottom-up approach involves analysing individual equipment components and their potential failure modes and effects, then evaluating their criticality in the overall system/equipment context. This approach can be more time-consuming but provides a more comprehensive understanding of the system/equipment's failure modes.

On the other hand, the top-down approach focuses on analysing the overall system/equipment and its functions to identify potential failure modes and their effects. This approach can be less time-consuming but may overlook some equipment-specific failure modes.

Both approaches can effectively identify potential failure modes and their impacts on the system/equipment. They ultimately help develop a maintenance strategy that maximises system/equipment reliability while minimising maintenance costs.

2.6.3.5.1.1 BOTTOM-UP FMECA APPROACH

The bottom-up approach is a systematic method of explicitly analysing each equipment item. This approach focuses on understanding the effects of different equipment failure modes on the system's operation. The bottom-up approach involves determining whether an equipment failure mode results in a local effect that leads to a functional failure, which ultimately causes an end effect of interest. The following are the steps involved in performing a bottom-up FMECA:

• Select an equipment item for analysis

• Identify potential failure modes for the equipment item

• Choose a failure mode for evaluation

• Determine the failure characteristic (e.g., wear-in, random, wear-out) for the failure mode

• Determine the local, next higher-level, and end effects for the postulated failure mode

• If the end effect leads to a consequence of interest, determine the causes of the failure mode

• Use the risk decision tool to determine the criticality of the failure mode

• Repeat the steps as necessary until all equipment items and associated failure modes have been evaluated

When conducting a bottom-up FMECA, the failure causes are the fundamental equipment failures that result in the failure mode, and the following higher-level effect usually identifies the resulting functional failure.

The bottom-up approach is advantageous because it helps ensure that all equipment items are analysed, and all potential equipment failure modes are considered. Furthermore, a standard list of failure modes can be created for common equipment items, making the analysis more straightforward and promoting consistency between RCM teams. This consistency allows for the efficient transfer of knowledge and experience from one team to another, making it easier for organisations to implement RCM strategies across multiple sites or locations. Additionally, the bottom-up approach helps identify failure modes specific to particular equipment items, which might be missed in a more general top-down analysis. This detailed analysis can lead to the development of more effective maintenance strategies and increase the reliability and safety of the system/equipment.

2.6.3.5.1.2 TOP-DOWN FMECA APPROACH

The top-down approach is a method of analysing each function and its associated functional failures. This approach focuses on understanding the effects of different functional failures on the system's operation and then identifying which equipment failures (e.g., failure mode) can cause the functional failure. The top-down approach aims to determine whether a functional failure results in an end effect of interest and then determine which equipment failures can cause the functional failure. The following are the steps involved in performing a top-down FMECA:

• Select a function for analysis

• Choose a functional failure for evaluation

• Determine the local and end effects of the postulated functional failure

• If the end effect leads to a consequence of interest, identify the equipment failures that can result in the functional failure

• Determine the failure characteristic (e.g., wear-in, random, wear-out) for the failure mode

• Use the risk decision tool to determine the criticality of the failure mode

• Repeat the steps until all functions and functional failures are evaluated

The top-down approach allows for a broader understanding of the system/equipment by analysing the functions and associated functional failures, leading to a better understanding of the system's overall reliability and safety. It is advantageous for complex systems/equipment where many functions and multiple pieces of equipment could fail. However, it is possible to overlook equipment-specific failure modes that might only affect a single component or subsystem, which could be identified more easily through a bottom-up approach.

Bottom-up FMECA Worksheet - Example

Top-down FMECA Worksheet - Example

2.6.3.5.2 CONSIDERATIONS IN IDENTIFYING FAILURE MODES AND FAILURE EFFECTS WITH AN FMECA

Regardless of whether a top-down or bottom-up FMECA approach is used, the FMECA must identify equipment failures that can lead to functional failures, which ultimately result in end effects of interest. To ensure that the FMECA serves its intended purpose, the following issues must be considered:

2.6.3.5.2.1 IDENTIFYING FAILURE MODES

To develop a comprehensive list of failure modes causing each functional failure, the following should be considered:

• Failures that have previously occurred on similar equipment

• Other failure modes that are considered probable, including those being suppressed by the current preventative maintenance program

• Failure modes that are possible but considered unlikely are included to demonstrate that they have been considered

When conducting a bottom-up FMECA, the following guide phrases may help develop a list of failure modes to consider:

• Premature operation

• Failure to operate at a designated time

• Intermittent operation

• Failure to stop operation at a designated time

• Loss of output or failure during operation

• Degraded output or operational capability

• Other unique failure conditions

The causes of failure, such as normal wear and tear, corrosion, abrasion, erosion, fatigue, etc., should be recorded in detail to identify an appropriate failure management strategy. In addition, if there is clear evidence of human error leading to failures or if operator error can cause significant consequences, those failures should also be included. It is essential to ensure that the causes of failure are adequately identified so that the subsequent maintenance recommendations address the root cause rather than just treating its symptoms.

2.6.3.5.2.2 IDENTIFYING FAILURE EFFECTS

When identifying failure effects, it is crucial to consider the following three levels:

• Local effects: These are effects that are localised to the system/equipment being analysed and should include the following:

o Methods of detecting failures (e.g., alarms, test indicators)

o Reduced level of performance

o Whether a standby system/equipment can provide the same function

• Next higher effects: These are effects on the larger system to which the system/equipment belongs and should include the following:

o Potential physical damage to the system/equipment

o Potential secondary damage to other equipment in the system or unrelated equipment in the vicinity

• End effects: These are effects on the vessel and should include the following:

o Threats to safety and the environment

o The operational effectiveness of the vessel

o Downtime required to repair the damage

2.6.3.5.3 END EFFECT CONSIDERATIONS

Upon identifying the end effects of failure, the following information should be included in the FMECA:

• Mitigation strategies to reduce the consequences of failure before implementing maintenance (e.g., activating a standby system, reconfiguring the system), and the estimated time needed for such action

• Repair action for the defective item (e.g., repairing primary and secondary damages, identifying personnel needed, determining if dry-docking or shore support is necessary, estimating the time required for the repair)

• Identification of spare parts needed for the repair

2.6.3.5.3.1 LEVEL OF INDENTURE CONSIDERATIONS

When conducting FMECA for a functional group, it is usually performed at the level of equipment where maintenance is performed. This level is known as the convenient indenture level. However, a lower indenture level should be chosen if more than 20 to 30 failure modes can be identified at this level. The lower level should be selected to ensure comprehensive analysis and to identify all potential failure modes that may affect the system's functionality. This approach helps ensure that all potential failure modes are identified and maintenance strategies can be developed to mitigate the risks associated with those failure modes.

2.6.3.5.3.2 MAINTENANCE CONSIDERATIONS

When conducting FMECA, it is important to assume zero-based maintenance, which means that no proactive maintenance tasks are being performed. This approach is necessary to identify the potential failure modes that could occur without any maintenance and to determine the need for a failure management strategy. If proactive maintenance tasks were included in the analysis, the risk associated with equipment failures could be underestimated.

However, the probability of failure for existing maintenance programs should be based on the current maintenance program; this allows for a more accurate assessment of the equipment's actual risk of failure, considering the maintenance tasks already being performed. This information can be used to optimise the maintenance program and identify any additional maintenance tasks necessary to mitigate the risks associated with equipment failures.

2.6.3.5.4 ASSESSING THE CRITICALITY OF FAILURE MODES AND EFFECTS IN AN FMECA

Criticality analysis is an important step in the RCM process as it involves ranking each potential failure mode identified during the FMECA based on a combination of severity classification and probability of occurrence. This analysis aims to identify the risks associated with each failure mode and prioritise them based on their criticality level. By doing so, maintenance resources can be allocated more effectively to those failure modes that pose the most significant risk to the system's functionality, safety, and environmental impact.

The criticality analysis involves assigning a severity classification to each potential failure mode based on the consequences of the failure and a probability of occurrence based on the likelihood of the failure mode occurring. Then, the severity and probability rankings are combined to calculate a risk priority number (RPN) for each failure mode. The RPN is used to prioritise the failure modes and identify those that require immediate attention.

The criticality analysis highlights the risks associated with each failure mode and helps ensure that maintenance resources are allocated effectively to mitigate those risks. This approach can help prevent costly equipment failures and downtime, improve safety, and reduce environmental impact.

Two approaches may be taken to determine the frequency category (probability of occurrence) during the criticality analysis.

• The first approach is the quantitative approach, which should be used if reliability data is available. This approach involves analysing the reliability data and calculating the probability of occurrence of each potential failure mode. The source of the data and the operating context should be noted to ensure the analysis is accurate and reliable.

• The second approach is the qualitative approach, which is used when quantitative data is unavailable to determine the probability of occurrence. This approach involves applying engineering judgment based on previous experience; this may include expert opinions, historical data, and other sources of information to estimate the probability of occurrence of each potential failure mode. This approach may be less precise than the quantitative approach, but it is still helpful in identifying and ranking potential failure modes.

Regardless of the approach used, it is important to ensure that the probability of occurrence is accurately and consistently determined to ensure that the criticality analysis is reliable and effective in identifying and prioritising potential failure modes.

To determine the qualitative risk associated with a failure mode, the following procedure should be followed:

• Severity classification: Identify the consequence of the end effect resulting from each failure mode and the severity category allocated in applying the example shown here. Suppose the failure mode does not directly result in an end effect. In that case, the criticality analysis will assume that the protected function experiences failure with the protective device in the failed state.

• Probability of occurrence: Derive the probability of occurrence of each failure mode identified in the FMECA. If reliability data are available, use the quantitative approach to determine the probability of occurrence. Otherwise, use engineering judgment based on previous experience.

• Risk matrix: Plot the severity classification and probability of occurrence on the risk matrix shown here to obtain the risk level.

The criticality ranking or risk level for each failure mode/end effect pair is then used in an RCM decision flow chart to determine the appropriate failure management strategy. Examples of criticality ranking can be found here.

2.6.3.6 SELECTING A FAILURE MANAGEMENT STRATEGY

After conducting the FMECA, failure modes assessed to have high, medium or low risks are evaluated using the RCM Task Selection Flow Diagrams at the level of each system/equipment item. The flow diagram helps determine the appropriate failure management strategy for the identified failure modes. An example of suggested failure management tasks for the identified failure modes is provided. It should be noted that the answers to the questions posed in the flow diagram only apply to the operating mode under consideration, and "Normal Operating Conditions" (NOC) should be interpreted as the normal operating condition concerning the relevant operating mode.

2.6.3.6.1 RCM TASK SELECTION FLOW DIAGRAM

The RCM Task Selection Flow Diagram helps determine each failure mode's most effective and efficient failure management strategy based on its assessed risk. The flow diagram considers the failure's consequences, the likelihood of occurrence, and the effectiveness of the failure management strategy to select the appropriate tasks to manage the identified failure modes.

2.6.3.6.1.1 FIRST SELECTION DECISION

To determine whether a failure mode has the highest or lowest risk, the RCM analysis team must consider the consequences and probability of occurrence associated with each failure mode. If a failure mode has the highest risk and cannot be managed through maintenance alone, a fundamental change in how the equipment is designed or operated may be required. In this case, a one-time change is necessary to reduce the risk. Once the one-time change is identified, the FMECA should be updated, and any applicable failure modes should be reevaluated using the RCM Task Selection Flow Diagram.

On the other hand, if a failure mode has the lowest risk, it may not require any failure management strategy. Therefore, the RCM analysis team should have a high confidence level in the risk characterisation, indicating that the team is relatively confident that the risk is characterised correctly and can be used in the RCM flow diagram without further discussion.

However, if the confidence level in the risk characterisation is low, indicating uncertainty, the failure mode is assumed to have a medium/moderate risk characterisation. The entire RCM Task Selection Flow Diagram evaluates the failure management strategy. Additional data about the probability or consequence of the failure may be needed before the risk can be used in the decision-making process.

2.6.3.6.1.2 SECOND SELECTION DECISION

Condition-monitoring tasks are considered first because they are usually the most technically feasible and cost-effective option. To determine if a failure mode can be managed through a condition-monitoring task, the team should select a specific task and determine an appropriate task interval. The following criteria should be considered when selecting the task:

• The task must be practicable to implement

• The task must have a high degree of success in detecting the failure mode

• The task must be cost-effective, meaning that the cost of performing the task should be less than the total cost of the consequences of failure over time.

The team should then evaluate the potential risk reduction resulting from implementing the condition-monitoring task by comparing the reduced risk to the risk acceptance criteria. Suppose the risk reduction achieved by the task is unacceptable. Further analysis is needed to determine if other maintenance tasks or a one-time change is needed to manage the failure.

Proactive maintenance task intervals should ideally be based on actual failure data to determine the task interval, but this may not be feasible for most organisations. Therefore, the task interval can be determined from the following sources in ascending order of priority:

• Generic P-F interval data

• Manufacturer's recommendations

• Current task intervals

• Team experience

For condition-monitoring tasks, the task interval must be set at less than half the anticipated P-F interval to ensure enough warning time to take action and avoid the consequences of failure.

2.6.3.6.1.3 THIRD SELECTION DECISION

For wear-out failure modes, the team must select a planned maintenance task that will restore the equipment item to an acceptable level of performance before it fails. The task and task interval should be determined based on the following criteria:

• The task should effectively reduce the probability of the failure mode occurring.

• The task should be practical to implement and cost-effective.

• The task interval should be based on the anticipated P-F interval and the consequences of failure.

• The task interval should be less than half the anticipated P-F interval.

Suppose the selected planned maintenance task does not achieve an acceptable level of risk reduction. In that case, the team must evaluate the need for additional or alternative maintenance tasks or a one-time change.

a) MAINTENANCE TASK SELECTION CRITERIA

If the risk reduction achieved through the planned maintenance task is unacceptable, the team must evaluate the need for design or operational changes to manage the failure mode. For example, if the failure mode exhibits a wear-out failure characteristic, a one-time change or redesign of the equipment item may be necessary to manage the failure. If the failure mode exhibits a wear-in failure characteristic, a one-time change or redesign may also be necessary, but the team should evaluate whether any other proactive maintenance tasks can be implemented in addition to the one-time change or redesign.

b) MAINTENANCE TASK INTERVAL DETERMINATION.

Ideally, proactive maintenance task intervals are determined using actual failure data, but this is not realistic for most organisations. Therefore, the task frequency must be determined from the following sources listed in ascending order of priority and documented:

• Generic failure data

• Manufacturers' recommendations or failure data

• Current task intervals

• Team experience

For planned maintenance to be effective, the item must have a straightforward life. Most items must survive this life, after which the conditional probability of failure increases significantly. The life can be determined based on information from the equipment manufacturer, expert opinion, published reliability data, actuarial analysis, etc.

Suppose the risk reduction does not achieve an acceptable level of risk. In that case, the failure mode is further analysed to determine if a combination of planned maintenance and condition-monitoring tasks can achieve an acceptable risk. If a combination provides an appropriate failure management strategy, the failure mode is further analysed per the maintenance task selection criteria above.

2.6.3.6.1.4 FOURTH SELECTION DECISION

The failure-finding task is a proactive maintenance task designed to detect hidden failures. It is used when condition monitoring and planned maintenance tasks are insufficient to manage the risk associated with a hidden failure mode. The following criteria should be considered when selecting a failure-finding task:

• The failure-finding task must be practicable to implement.

• The task must have a high degree of success in detecting the hidden failure mode.

• The cost of undertaking the task over some time should be less than the total cost of the consequences of failure.

• The task interval must give enough warning of the failure to ensure action can be taken in time to avoid the consequences.

If a failure-finding task is selected, the task and interval should be recorded in the maintenance program, and the failure mode should be reassessed using the RCM Task Selection Flow Diagram to ensure that the selected task provides an acceptable level of risk.

a) MAINTENANCE TASK SELECTION CRITERIA

If the failure-finding task is an appropriate failure management strategy, the team must select the task and task interval. The task interval for a failure-finding task must give enough time for the hidden failure mode to develop between inspections while providing enough warning time to take appropriate corrective action. The maintenance task interval must be set at less than half the anticipated time between the failure of the protective device and the secondary failure. Suppose the risk reduction achieved by the failure-finding task does not achieve an acceptable level of risk. In that case, the failure mode is further analysed to determine if a one-time change or redesign of the equipment item is needed to manage the failure.

(b)MAINTENANCE TASK INTERVAL DETERMINATION

In an ideal scenario, actual failure data should be used to determine the intervals for proactive maintenance tasks. However, for most organisations, this is not feasible. Hence, the frequency for these tasks must be established based on the following sources, in order of priority, and documented:

• Generic failure data

• Manufacturers' recommendations or failure data

• Current task intervals

• Team experience

Availability and reliability information should be utilised wherever possible to set the failure-finding intervals T. One equation that can be used to determine T is as follows:

The team should consider regulatory requirements and guidelines when setting the unavailability limits for different functions. As a general guide, the unavailability for safety and environmental functions should be kept as low as reasonably practicable, typically not exceeding 0.05%. The unavailability limit should be kept below 2.0% for operational functions, while for non-operational functions, the limit may be higher, up to 10%.

2.6.3.6.1.5 ONE-TIME CHANGES

If the team determines that a one-time change is the best failure management strategy, the team should update the FMECA and RCM Task Selection Flow Diagram to include the change. The team should also consider any impact the change may have on other systems, equipment, or processes and ensure that any necessary modifications are made.

Suppose the team determines that none of the maintenance tasks or one-time changes can manage the failure to an acceptable level of risk. In that case, the team must record the failure mode and its associated risks in the system/equipment documentation and notify the appropriate personnel. The risk may need to be reevaluated periodically or when new information becomes available.

It should be noted that the RCM process is ongoing and requires continuous monitoring and feedback. As new data becomes available or the operating context changes, the FMECA and RCM Task Selection Flow Diagram should be updated accordingly.

2.6.3.6.1.6 ROUNDS AND ROUTINE SERVICING

The team should also review rounds and routine inspection tasks to prevent altering the failure rate curve for the failure mode. These tasks are essential to maintaining the basis for proactive maintenance tasks and risk characterisation, preventing issues such as premature wear-out of a bearing due to insufficient lubrication.

2.6.3.6.2 MAINTENANCE TASK ALLOCATION AND PLANNING

2.6.3.6.2.1 TASK CATEGORIES

The RCM-derived maintenance tasks should be categorised as follows:

• Category A: Can be performed by the vessel's crew while at sea.

• Category B: Must be carried out alongside with the assistance of equipment vendors or using dockside facilities.

• Category C: Must be performed in a dry dock facility.

2.6.3.6.2.2 TASK INTERVAL ADJUSTMENT

The task intervals resulting from the RCM analysis may not align with the current calendar-based maintenance schedule, so the team must integrate them into a common maintenance schedule. However, the RCM task intervals may need to be adjusted to a shorter or longer interval depending on the following criteria:

• Tasks with safety or environmental consequences should only be adjusted to a shorter task interval to ensure that safety and containment are not compromised.

• Tasks with operational consequences may be adjusted to a longer or shorter task interval. However, if the interval is adjusted to a longer one, the responsible person in the shipping company should approve the change.

2.6.3.6.2.3 OVERALL MAINTENANCE SCHEDULE

Category B and C tasks should be organised to derive an overall maintenance schedule by adjusting the RCM task intervals, but only for Category B and C tasks. This adjustment should be made using the criteria specified in Category B so that these tasks can be timed to coincide with the vessel's port calling and dry-docking schedules.

2.6.3.6.3 SPARES HOLDING

The availability of spares to support the identified maintenance tasks is crucial for the proposed maintenance schedules to be viable. The spare holding requirement should be developed based on the following factors:

• The list of parts needed to perform tasks to correct each failure mode identified in the RCM analysis and the parts required as a result of remedial work to correct condition-monitoring, planned maintenance, failure-finding, any applicable and effective, and run-to-failure tasks.

• An assessment of the impact on the operational availability of the functional group or system in an out-of-stock situation.

• An evaluation of the parts that can be preplanned for use. For parts that cannot be preplanned, determine the quantity required to achieve the desired operational availability.

2.6.3.6.3.1 STOCK-OUT EFFECT ON END EFFECTS

Determine whether the stock-out and further failure will result in End Effects, such as degradation or loss of propulsion, fire, etc. When determining the effect, consider the direct and indirect effects of the stock-out under normal circumstances. The following defines direct and indirect effects and normal circumstances.

• Direct effect. Suppose the spare is unavailable, and the associated maintenance tasks cannot be carried out. In that case, the corresponding failure mode will eventually lead to an End Effect(s) if a failure occurs.

• Indirect effect. Suppose the spare is unavailable, and the associated maintenance tasks cannot be carried out. In that case, the corresponding failure mode will not lead to an End Effect(s) unless a different failure occurs.

• Normal circumstances. The item operates within the context without a failure occurring.

• If the stock-out has no effect, then no spare holding is required.

2.6.3.6.3.2 SPARES HOLDING DECISIONS

The spares selection process involves the following decision-making process when a stock-out or a stock-out and further failure may result in End Effects:

• If parts requirements can be anticipated before failure occurs or there is sufficient warning time for parts to be ordered, order parts before demand occurs, provided that ordering parts in advance is acceptable.

• If ordering parts in advance is not acceptable, consider holding parts onboard or in storage depots, provided that:

o The risk of a stock-out is reduced to an acceptable level.

o The cost and storage basis for holding the parts are feasible.

• If neither of the above strategies is feasible, consider the following:

o If the stock-out will result in End Effect(s) (direct or indirect), reviewing the RCM analysis to revise the maintenance task is mandatory.

o If the stock-out will only have a non-operational effect, reviewing the RCM analysis to revise the maintenance task is desirable.

FIGURE 6 Spares Holding Decision Flow Diagram (1)

An example of an operating context and analysis is the fuel oil piping system with two fuel oil supply pumps arranged in parallel redundancy. The system is designed to supply heavy fuel oil to the main propulsion engine and two diesel generator engines operating at their maximum continuous rating. The pumps are operated alternately, with one pump operating for a week while the other is on standby. The anticipated annual service hours for both pumps are the same. This operating context and analysis provide the necessary information to develop a maintenance strategy to ensure the system's reliability and availability.

References & Bibliography:

https://www.researchgate.net/publication/354347868_Determination_of_Maintenance_Task_on_Tanker_Vessel's_Marine_Boiler_Using_Reliability_Centered_Maintenance_RCM_II_Method

https://www.researchgate.net/publication/350303270_The_Combination_of_Reliability_and_Predictive_Tools_to_Determine_Ship_Engine_Performance_based_on_Condition_Monitoring

https://www.researchgate.net/publication/350148670_Application_of_Reliability-Centered_Maintenance_for_Tugboat_Kresna_315_Cooling_Systems

http://www.atpm.co.kr/5.mem.service/6.data.room/data/treatise/5.reliability/5.reliability_01.pdf

https://www.academia.edu/962903/Increasing_ship_operational_reliability_through_the_implementation_of_a_holistic_maintenance_management_strategy

https://ww2.eagle.org/content/dam/eagle/rules-and-guides/current/design_and_analysis/132_reliabilitycenteredmaintenance/rcm-gn-aug18.pdf

Disclaimer:

Out of Box Maritime Thinker © by Naro Consilium Group 2022 and Aleksandar Pudar assumes no responsibility or liability for any errors or omissions in the content of this paper. The information in this paper is provided on an "as is" basis with no guarantees of completeness, accuracy, usefulness, or timeliness or of the results obtained from using this information. The ideas and strategies should never be used without first assessing your company's situation or system or consulting a consultancy professional. The content of this paper is intended to be used and must be used for informational purposes only.

Out of Box Maritime Thinker

LinkedIn

Wednesday, October 23, 2024

2.6. RELIABILITY-CENTERED MAINTENANCE (RCM) – PART II

Followers

Pages