Out of Box Maritime Thinker

Friday, August 2, 2024

2.5. RELIABILITY-CENTERED MAINTENANCE (RCM) – PART I

By Aleksandar Pudar

Technical Superintendent and Planned Maintenance Supervisor Reederei Nord BV

Co-founder of "Out of Box Maritime Thinker Blog" and Founder of Narenta Consilium Group

In marine engineering, a reliability-centred maintenance (RCM) approach systematically identifies the functions, functional failures, and likely causes of failure for the assets on a vessel. It also assesses the effects of potential failure modes and determines the significance of these effects. With this information, RCM selects the most suitable asset management policy to optimise system performance, safety, and reliability.

RCM considers all possible asset management options in the marine engineering context, such as on-condition tasks, scheduled restoration tasks, scheduled discard tasks, failure-finding tasks, and one-time changes. One-time changes encompass modifications to various aspects of the marine engineering asset, including hardware design, operating procedures, personnel training, and other factors beyond maintenance. This comprehensive consideration of asset management options sets RCM apart from other maintenance development processes.

2.5.1 DEFINING RELIABILITY-CENTERED MAINTENANCE (SEVEN QUESTIONS ADDRESSED BY RCM)

RCM is a process of systematically analysing an engineered system to understand the following:

· Its functions

· The failure modes of its equipment that support these functions

· How then to choose an optimal course of maintenance to prevent the failure modes from occurring or to detect the failure mode before a failure occurs

· How to determine spare holding requirements

· How to periodically refine and modify existing maintenance over time

The objective of RCM is to achieve reliability for all of the operating modes of a system.

An RCM analysis, when properly conducted, should answer the following seven questions:

i. What are the system functions and associated performance standards?

ii. How can the system fail to fulfil these functions?

iii. What can cause a functional failure?

iv. What happens when a failure occurs?

v. What might the consequence be when the failure occurs?

vi. What can be done to detect and prevent failure?

vii. What should be done if a suitable maintenance task cannot be found?

Typically, the following tools and expertise are employed to perform RCM analyses:

· Failure modes, effects and criticality analysis (FMECA). This analytical tool helps answer Questions i through v.

· RCM decision flow diagram. This diagram helps answer Questions vi and vii.

· Design, engineering and operational knowledge of the system

· Condition-monitoring techniques

· Risk-based decision-making (e.g., the frequency and the consequence of a failure in terms of its impact on safety, the environment and commercial operations)

The process is formalised by documenting and implementing the following:

· The analyses and the decisions taken

· Progressive improvements based on operational and maintenance experience

· Clear audit trails of maintenance actions taken and improvements made

Once these are documented and implemented, this process will effectively ensure an engineered system's reliable and safe operation.

2.5.1.i FUNCTIONS - What are the system functions and associated performance standards?

Determining the necessary maintenance strategies for an asset within its current operating context requires identifying the functions and their associated desired standards of performance. To effectively identify these functions and standards, the following criteria must be satisfied:

• Define the operating context of the asset: Establish the conditions under which the asset is used, including environmental factors, workload, and any other relevant parameters.

• Identify all functions of the asset or system: Ensure that all primary and secondary functions are recognised, including the roles of all protective devices.

• Create function statements with a verb, object, and performance standard: Each function statement should contain a straightforward action, target, and quantifiable performance standard whenever possible.

• Establish desired performance standards based on the owner or user's expectations: The function statements should incorporate performance standards that reflect the desired level of performance by the asset or system's owner or user in its operating context.

The operating context refers to the circumstances under which a marine engineering asset operates. Identical hardware may not require the same failure management policy across all installations or applications. For instance, a solitary pump within a system may require a different failure management policy than a pump that is one of several redundant units. Similarly, a pump handling corrosive fluids typically needs a distinct policy from one that transports benign fluids. Protective devices, which are often overlooked, must also be considered in the RCM process by identifying their functions.

Ultimately, it is the responsibility of the vessel owner or operator to determine the desired performance level that the maintenance program should maintain. By understanding and accounting for the unique operating context of each marine engineering asset, a tailored and efficient maintenance strategy can be developed, ensuring optimal performance and reliability.
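As a minimal sketch of the function-statement criterion above (verb, object, and performance standard, recorded against an operating context), the record below may help. The class, its field names, and the ballast pump figures are hypothetical illustrations, not values taken from any vessel or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionStatement:
    """One RCM function statement: verb + object + performance standard."""
    verb: str
    obj: str
    performance_standard: str
    operating_context: str
    is_protective: bool = False  # marks the function of a protective device

    def text(self) -> str:
        # Assemble the statement in the usual "To <verb> <object> <standard>" form.
        return f"To {self.verb} {self.obj} {self.performance_standard}"


# Hypothetical example values, for illustration only.
primary = FunctionStatement(
    verb="transfer",
    obj="ballast water",
    performance_standard="at not less than 250 m3/h",
    operating_context="single pump, seawater service",
)
print(primary.text())
```

Keeping the operating context on the same record makes it harder to reuse a statement unchanged for identical hardware installed in a different service.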

2.5.1.ii FUNCTIONAL FAILURES - How can the system fail to fulfil these functions?

The criterion for this question is singular and straightforward: to identify all the failed states related to each function. If functions are well-defined, listing functional failures is relatively easy. Identifying these failures is crucial in understanding potential risks and implementing preventive measures to ensure system reliability and safety. Here are some examples of functional failures in the context of marine engineering:

Propulsion system:

a. Engine failures: Mechanical or electrical issues with the engine lead to loss of propulsion or reduced power output.

b. Fuel system problems: Contamination, leakage, or blockage in the fuel system impacts the engine's performance or causes shutdowns.

c. Cooling system malfunctions: Failures in the cooling system cause overheating, which can damage components and affect the engine performance.

d. Transmission and shaft issues: Problems with gearboxes, shafts, or couplings impact the power transfer from the engine to the propeller.

Electrical system:

a. Generator failures: Inability to generate sufficient electrical power due to equipment malfunction or fuel shortage.

b. Distribution failures: Power distribution issues to onboard systems, such as switchboard malfunctions, damaged cables, or circuit breaker issues.

c. Power quality issues: Voltage fluctuations or frequency deviations that can damage electrical equipment or disrupt operations.

d. Battery issues: Inadequate charging, capacity loss, or failure of onboard batteries, affecting the performance of critical systems.

Navigation and communication system:

a. Equipment failure: Malfunctions in navigation or communication devices, such as GPS, radar, or radios, hinder safe and efficient operations.

b. Software failure: Bugs or errors in system software cause incorrect data display or unexpected behaviour.

c. Signal interference: Electromagnetic or atmospheric conditions disrupt the signal reception or transmission.

Hull and structural system:

a. Corrosion: Deterioration of the hull or structural components due to exposure to seawater, leading to reduced structural integrity.

b. Fatigue: Material failure due to repetitive loading or stress, causing cracks or fractures in structural components.

c. Leakage or flooding: Damage to hull plating, seals, or watertight compartments leads to water ingress, affecting buoyancy and stability.

Auxiliary systems (e.g., HVAC, bilge, and ballast):

a. Equipment failures: Mechanical or electrical issues with pumps, compressors, or valves result in disruptions to the operation of auxiliary systems.

b. Piping failures: Leakage or blockage in the piping system impacts the proper functioning of auxiliary systems.

c. Control system malfunctions: Failures in control systems lead to incorrect operation or reduced efficiency of auxiliary systems.

2.5.1.iii FAILURE MODES - What can cause a functional failure?

Understanding the causes of each functional failure (its failure modes) is crucial for developing effective maintenance strategies in marine engineering. FMECA uses the term "failure mode" much as RCM uses "functional failure", whereas the RCM community defines a failure mode as the event that causes a functional failure. The standard criteria for a process that identifies failure modes include the following:

· Identifying all reasonably probable failure modes that can cause each functional failure.

· Employing a method to determine what constitutes a reasonably probable failure mode, which must be acceptable to the owner or user of the marine asset.

· Identifying failure modes at a level of causation that enables the selection of an appropriate failure management policy.

· Including in the list of failure modes those that have occurred before, those currently being prevented by existing maintenance programs, and those that have not yet happened but are considered reasonably likely (credible) within the operating context.

· Incorporating in the list of failure modes any event or process likely to cause a functional failure, such as deterioration, human error by operators or maintainers, and design defects.

As the most comprehensive analytical process for developing maintenance programs and managing physical assets, RCM is well-suited to identify every reasonably likely failure mode in marine engineering applications. By thoroughly examining these failure modes, vessel operators can optimise maintenance strategies, improve system reliability, and enhance overall performance.

2.5.1.iv FAILURE EFFECTS - What happens when a failure occurs?

Understanding what happens when each failure occurs (the failure effects) is crucial. The criteria for identifying failure effects include the following:

• Describing failure effects as what would happen if no specific task were carried out to anticipate, prevent, or detect the failure. Failure effects should encompass all necessary information to evaluate the consequences of the failure, such as:

o The evidence (if any) that the failure has occurred (for hidden functions, consider the consequences of multiple failures occurring).

o The potential impact on human safety, such as causing injury or death, or adversely affecting the environment.

o The adverse effects (if any) on vessel performance or operations.

o The physical damage (if any) resulting from the failure.

o The actions (if any) required to restore the system's function after the failure.

• FMECA typically describes failure effects in terms of their impact at the local, subsystem, and system levels, and addresses the actions needed to restore the system's functionality following a failure.
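The chain from function to failure effects can be captured as one worksheet row. The sketch below is only an illustration: the class name, field names, and the cooling pump entries are hypothetical, and a real FMECA worksheet would normally also carry criticality rankings.

```python
from dataclasses import dataclass, asdict


@dataclass
class FmecaRow:
    """One worksheet row linking a function to a failure mode and its effects."""
    function: str
    functional_failure: str
    failure_mode: str
    local_effect: str
    subsystem_effect: str
    system_effect: str
    evidence: str               # how the failure becomes apparent, or "none"
    safety_environment: str
    operational_impact: str
    physical_damage: str
    restoration_action: str


def missing_fields(row: FmecaRow) -> list[str]:
    """Return the names of fields left blank, so incomplete rows are easy to spot."""
    return [name for name, value in asdict(row).items() if not value.strip()]


# Hypothetical entries, for illustration only.
row = FmecaRow(
    function="To circulate sea water through the central cooler",
    functional_failure="No sea water flow",
    failure_mode="Impeller worn",
    local_effect="Pump discharge pressure falls",
    subsystem_effect="Central cooler outlet temperature rises",
    system_effect="Main engine load must be reduced",
    evidence="Low-pressure alarm",
    safety_environment="none",
    operational_impact="Reduced speed until standby pump is started",
    physical_damage="",
    restoration_action="Replace impeller",
)
print(missing_fields(row))
```

Writing "none" explicitly, rather than leaving a field blank, distinguishes an effect that was assessed and ruled out from one that was never considered.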

2.5.1.v FAILURE CONSEQUENCES - What might the consequence be when the failure occurs?

Understanding the significance of each failure (failure consequences) is crucial for effective maintenance planning. The standard's criteria for a process that identifies failure consequences are:

· Assessing failure consequences as if no specific task is currently being performed to anticipate, prevent, or detect the failure.

· Formally categorising the consequences of every failure mode:

o Separating hidden failure modes from evident failure modes in the categorisation process.

o Clearly distinguishing events (failure modes and multiple failures) with safety and environmental consequences from those that only have economic consequences, such as operational and non-operational consequences.

RCM evaluates failure consequences, assuming that no preventive measures are in place. As a result, some may be tempted to argue that a failure does not matter because a specific action "always" protects against it. On the contrary, RCM evaluates the assumed protective action's effectiveness and meticulously justifies the effort required. Furthermore, it systematically categorises failure consequences by assigning each failure mode to one of four groups: hidden, evident safety/environmental, evident operational, and evident non-operational.
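The four-way categorisation can be sketched as a short decision function. The order in which the questions are asked, along with the function and parameter names, is an assumption made for illustration. The inputs are meant to be answered as if no preventive task were in place, as the text requires.

```python
from enum import Enum


class Consequence(Enum):
    HIDDEN = "hidden"
    EVIDENT_SAFETY_ENVIRONMENTAL = "evident safety/environmental"
    EVIDENT_OPERATIONAL = "evident operational"
    EVIDENT_NON_OPERATIONAL = "evident non-operational"


def categorise(evident: bool, safety_or_environment: bool,
               affects_operations: bool) -> Consequence:
    """Assign a failure mode to one of the four consequence groups."""
    if not evident:
        # The failure is not apparent to the crew under normal circumstances.
        return Consequence.HIDDEN
    if safety_or_environment:
        return Consequence.EVIDENT_SAFETY_ENVIRONMENTAL
    if affects_operations:
        return Consequence.EVIDENT_OPERATIONAL
    # Only the direct cost of repair remains.
    return Consequence.EVIDENT_NON_OPERATIONAL


print(categorise(evident=True, safety_or_environment=False,
                 affects_operations=True).value)
```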

2.5.1.vi EQUIPMENT FAILURE - What can be done to detect and prevent failure?

A loss of system function in marine engineering systems can result from equipment failures and/or human errors. Equipment failure can typically be attributed to the following factors:

· Design error

· Faulty material

· Improper fabrication and construction

· Improper operation

· Inadequate maintenance

· Maintenance errors

2.5.1.1 EQUIPMENT FAILURE RATE AND PATTERNS

To effectively improve equipment reliability through maintenance, design changes, or operational improvements, it is essential to understand the potential failure mechanisms, their causes, and the associated impacts on the marine system. Equipment failure should be defined as a state or condition in which a component no longer fulfils its design intent (i.e., a functional failure occurs due to equipment failure). RCM focuses on managing equipment failures that result in functional failures.

Developing an effective failure management strategy requires understanding the failure mechanism. Equipment may exhibit various failure modes (i.e., the ways in which the equipment fails). Furthermore, the failure mechanism behind each failure mode might vary over the equipment's lifespan.

Depending on the dominant system failure mechanisms, system operation, operating environment, and maintenance, specific equipment failure modes exhibit diverse failure rates and patterns. Failure rate statistics are expressed in terms of operating time, or another pertinent parameter, before equipment failure. Failure density distributions are often used to predict the probability that an item fails after a given operating time.

A typical failure distribution used to model equipment failures is the Weibull distribution, employed when equipment exhibits a constant failure rate for part of its life, followed by an increasing failure rate due to wear-out. Weibull analysis is also used when there is limited failure data. For example, a Weibull plot can help determine if the failure is due to infant mortality, random failure, early wear-out, or wear-out, which helps determine an appropriate maintenance strategy.
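To make the link between a Weibull fit and the failure pattern concrete, the sketch below evaluates the two-parameter Weibull failure rate and reads the pattern off the shape parameter β. The function names and the tolerance band around β = 1 are assumptions chosen for illustration. The cut-off between early wear-out and wear-out is deliberately left out because it varies between references.

```python
def weibull_hazard(t: float, eta: float, beta: float) -> float:
    """Instantaneous failure rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    if t <= 0 or eta <= 0 or beta <= 0:
        raise ValueError("t, eta and beta must all be positive")
    return (beta / eta) * (t / eta) ** (beta - 1)


def failure_pattern(beta: float, tolerance: float = 0.05) -> str:
    """Interpret the Weibull shape parameter (tolerance band is an assumption)."""
    if beta < 1 - tolerance:
        return "infant mortality (decreasing failure rate)"
    if beta <= 1 + tolerance:
        return "random (roughly constant failure rate)"
    return "wear-out (increasing failure rate)"


# Hypothetical fit results, for illustration only.
for beta in (0.7, 1.0, 1.5):
    early = weibull_hazard(100.0, eta=730.0, beta=beta)
    late = weibull_hazard(600.0, eta=730.0, beta=beta)
    print(f"beta={beta}: {failure_pattern(beta)}; h(100)={early:.5f}, h(600)={late:.5f}")
```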

Mean Time to Failure (MTTF) is another standard statistical measure. MTTF represents the average life to failure for a specific equipment failure mode, helping to determine when to perform specific maintenance tasks. For example, MTTF data can help establish the rebuilding task interval if an equipment item requires rebuilding.
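Where a failure mode has been fitted with a two-parameter Weibull distribution, the MTTF follows from the scale parameter η and shape parameter β as η·Γ(1 + 1/β). The sketch below assumes such a fit is available. The function name and the numbers are hypothetical.

```python
import math


def weibull_mttf(eta: float, beta: float) -> float:
    """Mean time to failure of a two-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    if eta <= 0 or beta <= 0:
        raise ValueError("eta and beta must be positive")
    return eta * math.gamma(1.0 + 1.0 / beta)


# Hypothetical fit: eta = 8000 running hours, beta = 2.0.
print(f"MTTF = {weibull_mttf(8000.0, 2.0):.0f} running hours")
```

For β = 1 the expression reduces to η, which is the constant-failure-rate case.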

Understanding that equipment failure modes can exhibit different failure patterns has important implications for determining appropriate maintenance strategies. For most equipment failure modes, the specific failure pattern may be unknown, but it is not needed to make maintenance decisions. Instead, certain failure characteristic information is needed:

• Wear-in failure – characterised by "weak" members related to manufacturing defects and installation/maintenance/startup errors, also known as "burn-in" or "infant mortality" failures.

• Random failure – dominated by chance failures caused by sudden stresses, extreme conditions, random human errors, etc. (unpredictable by time).

• Wear-out failure – dominated by end-of-useful-life issues for equipment.

Identifying which of the three equipment failure characteristics represents the equipment failure mode helps determine the proper maintenance strategy. For example, rebuilding or replacing the equipment item may be appropriate if an equipment failure mode exhibits a wear-out pattern. However, replacing or rebuilding the equipment item may not be advisable if an equipment failure mode is characterised by wear-in failure.

Lastly, a basic understanding of the failure rate helps determine whether maintenance or equipment redesign is necessary. For example, equipment failure modes with high failure rates (e.g., frequent failures) are often best addressed by redesign rather than more frequent maintenance.

2.5.1.2 FAILURE MANAGEMENT STRATEGY

Understanding failure rates and characteristics is crucial for determining an appropriate strategy to manage failure modes (RCM refers to this as the failure management strategy). Developing and utilising this understanding is fundamental to RCM and vital for enhancing equipment reliability. For instance, it is no longer considered accurate that the more often an item is overhauled, the less likely it is to fail. Unless a dominant age-related failure mode exists, age limits do little to improve the reliability of complex items. Sometimes, scheduled overhauls can increase failure rates by introducing infant mortality and/or human errors into otherwise stable systems. In RCM, the failure management strategy can comprise the following:

• Appropriate proactive maintenance tasks,

• Equipment redesigns or modifications, or

• Other operational improvements.

The proactive maintenance tasks in the failure management strategy aim to (1) prevent failures before they occur or (2) detect the onset of failures in sufficient time so that the failure can be managed before it occurs. Equipment redesigns, modifications, and operational improvements (RCM refers to these as one-time changes) attempt to enhance equipment with high failure rates or for which proactive maintenance is ineffective/inefficient.

The key issues in determining whether a specific failure management strategy is effective are:

• Is the failure management strategy technically feasible?

• Is an acceptable level of risk achieved when the failure management strategy is implemented?

• Is the failure management strategy cost-effective?

In addition to proactive maintenance tasks and one-time changes, servicing tasks and routine inspections may be essential to the failure management strategy. These activities help ensure that the equipment failure rate and failure characteristics are as expected. For example, the failure rate and failure pattern for a bearing drastically change if it is not adequately lubricated.

2.5.1.3 PROACTIVE MAINTENANCE TASKS

Proactive maintenance tasks can be divided into four categories:

I. PLANNED MAINTENANCE TASKS

Planned maintenance tasks (sometimes called preventative maintenance) are performed at specified intervals, regardless of the equipment's condition. The purpose of these tasks is to prevent functional failure before it occurs. They are often applied when no condition-monitoring task is identified or justified, and a wear-out region characterises the failure mode. RCM further divides planned maintenance into two subcategories:

• Restoration task: A scheduled task performed at or before a predetermined interval (age limit) to restore an item's capability, providing an acceptable probability of functioning until the end of another specified interval. For instance, rebuilding fuel injectors in a diesel engine can be a restoration task.

• Discard task: A scheduled task carried out at or before a specified age limit that requires disposing of an item, regardless of its condition. It is important to note that "restoration" and "discard" can apply to the same task. For example, when replacing a diesel engine's cylinder liners with new ones at fixed intervals, the task can be described as a scheduled discard of the cylinder liner or a scheduled restoration of the diesel engine.

II. CONDITION-MONITORING TASKS

A condition-monitoring task is a scheduled task used to detect the onset of a failure so that action can be taken to prevent the functional failure. A potential failure is an identifiable condition indicating that a functional failure is imminent or in progress. Condition-monitoring tasks should only be chosen when a detectable potential failure condition will exist before failure. When choosing maintenance tasks, condition-monitoring tasks should be considered first unless an observable potential failure condition cannot be identified. Condition-monitoring tasks are also referred to as "predictive maintenance." Section 4 provides additional details.

III. COMBINATION OF TASKS

When neither condition-monitoring nor planned maintenance tasks alone seem capable of reducing the risks of the functional failure of the equipment, it may be necessary to select a combination of both maintenance tasks. This approach is usually used when the condition-monitoring or planned maintenance task is insufficient to achieve an acceptable risk.

IV. FAILURE-FINDING TASKS

A failure-finding task is scheduled to detect hidden failures when no condition-monitoring or planned maintenance task is applicable. It is a scheduled function check to determine whether an item will perform its required function if called upon. Most of these items are standby or protective equipment. An example would be checking the safety valve on a boiler.

A failure-finding task is a scheduled task designed to identify whether a specific hidden failure has taken place. These tasks are typically applied to protective devices that may fail without warning. While failure-finding tasks share the scheduling aspect with proactive tasks, they are not proactive: they neither predict nor prevent failures. Instead, they aim to detect failures that have already occurred, minimising the possibility of a multiple failure, that is, the failure of a protected function while its protective device is already in a failed state. These tasks therefore bridge the gap between the sixth question (proactive tasks) and the seventh (default actions taken when no proactive task is available).

2.5.1.4 RUN-TO-FAILURE

Run-to-failure is a failure management strategy that allows an equipment item to operate until failure occurs, at which point a repair is made. This maintenance strategy is acceptable only if the risk of failure is acceptable without any proactive maintenance tasks. An example would be allowing a local pressure gauge to fail on a cooling water line equipped with a remote-reading pressure gauge.

Before accepting a run-to-failure decision for an asset, the following criteria should be considered:

• For hidden failures with no appropriate scheduled task, the associated multiple failures must not have safety or environmental consequences.

• For evident failures with no suitable scheduled task, the related failure mode must not pose safety or environmental risks.

In other words, the process should not allow users to opt for a "run to failure" strategy if the failure mode or (in the case of a hidden failure) the corresponding multiple failures have safety or environmental implications.

These continuous improvement programs encompass valuable modifications that can enhance plant performance. In numerous cases, these adjustments are shared across all methods. However, some alterations are exclusive to a single approach. The primary shortcomings of these strategies include the following:

• Insufficient emphasis on or incorporation of effective culture change, such as change management processes

• Absence of a comprehensive approach, with each method concentrating on a single function or activity within the plant

• The necessity for a permanent organisational structure to oversee the efforts

2.5.1.vii DEFAULT ACTIONS - ONE-TIME CHANGES

What should be done if a suitable maintenance task cannot be found?

What actions should be taken if no appropriate proactive task can be identified (default actions)? This question relates to unscheduled failure management policies, which involve deciding whether to allow an asset to run until failure or to alter some aspect of the asset's operating context, such as its design or operation method.

One-time changes are employed to reduce the failure rate or manage failures when appropriate proactive maintenance tasks are not identified or cannot effectively and efficiently manage the risk. The primary purpose of a one-time change is to modify the failure rate or failure pattern through:

• Equipment redesigns or modifications and/or

• Operational improvements.

One-time changes most effectively address equipment failure modes resulting from the following:

• Faulty design and/or material

• Improper fabrication and/or construction

• Misoperation

• Maintenance errors

These failure mechanisms often lead to a wear-in failure characteristic, thus requiring a one-time change.

When no maintenance strategy can be found that is both applicable and effective in detecting or preventing failure, a one-time change should be considered. A one-time change is mandatory for failure modes with the highest risk. The following briefly describes each type of one-time change:

• Equipment redesign or modifications: Redesign or modifications involve physical changes to the equipment or system. An example would be adding drain valves to appropriate lengths of piping on a tanker's deck cargo piping to prevent freezing and damage to the piping during vessel transits in freezing temperatures.

• Operational improvements: Operational improvements may include modifications to the operation of the equipment and/or modifications to how maintenance is performed on the equipment. Operational improvements typically involve changing the operating context and procedures, providing additional training to the operator or maintainer, or any combination thereof. For example, in the case of a main propulsion engine with a non-continuous rating nameplate, the engine could be operated at a lower output closer to its continuous rating to reduce downtime for maintenance. (However, this action may cause the vessel to be unable to meet its schedules.)
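The task-selection logic described across the preceding subsections can be summarised in a heavily simplified sketch. It is not the RCM decision flow diagram itself: the class, field names, and ordering are assumptions, and the sketch leaves out combinations of tasks as well as the risk and cost-effectiveness checks that a real analysis applies at every step.

```python
from dataclasses import dataclass


@dataclass
class FailureModeFacts:
    hidden: bool                        # failure not evident in normal operation
    safety_or_environmental: bool       # for hidden failures: of the multiple failure
    potential_failure_detectable: bool  # a warning condition exists before failure
    wear_out_pattern: bool              # failure mode is age-related
    failure_finding_feasible: bool = False


def select_policy(f: FailureModeFacts) -> str:
    """Pick a failure management policy (simplified, assumption-laden sketch)."""
    if f.potential_failure_detectable:
        # Condition monitoring is considered first when a warning exists.
        return "condition-monitoring task"
    if f.wear_out_pattern:
        return "planned maintenance task (restoration or discard)"
    if f.hidden and f.failure_finding_feasible:
        return "failure-finding task"
    if f.safety_or_environmental:
        # Run-to-failure is not acceptable with safety/environmental consequences.
        return "one-time change"
    return "run-to-failure"


# The local pressure gauge backed up by a remote-reading gauge, as in the text.
gauge = FailureModeFacts(hidden=False, safety_or_environmental=False,
                         potential_failure_detectable=False, wear_out_pattern=False)
print(select_policy(gauge))
```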

2.5.1.5 SERVICING AND ROUTINE INSPECTION

These tasks are designed to (1) ensure that the failure rate and failure pattern remain as predicted by performing routine servicing (e.g., lubrication) and (2) identify accidental damage and/or issues resulting from ignorance or negligence. In addition, they provide an opportunity to confirm that the overall maintenance standards are satisfactory. These tasks are not based on any explicit potential failure condition. Servicing and routine inspection may also be applied to items with minor failure consequences that should not be overlooked (such as minor leaks, drips, etc.).


Thursday, June 27, 2024

Comprehensive Mathematical Model for Condition-Based Maintenance (CBM): Maximizing Reliability and Minimizing Costs

By Aleksandar Pudar

Technical Superintendent and Planned Maintenance Supervisor Reederei Nord BV

Co-founder of "Out of Box Maritime Thinker Blog" and founder of "Narenta Gestio Consilium Group."

1. Introduction and Objective

Condition-Based Maintenance (CBM) is a maintenance strategy that monitors the actual condition of assets to decide on the appropriate maintenance actions (Jardine and Banjevic, 2006). The primary aim of the mathematical model is to minimise maintenance costs while maximising equipment reliability and availability (Ebeling, 1997).

2. Variables and Parameters

T: Time between maintenance actions.
C(T): Cost function over time, which includes both preventive maintenance costs (Cp) and corrective failure costs (Cf).
R(T): Reliability function representing the probability of equipment functioning without failure over time.
L: Lube oil quality indicator derived from lube oil analysis.
V: Vibration level derived from vibration monitoring data.
θ: Temperature or thermal condition derived from thermal imaging.

3. Assumptions

Equipment failure rate follows a Weibull distribution, commonly used for reliability analysis of mechanical systems.
Maintenance cost is a function of both preventive and corrective actions.
Equipment condition degrades over time and is influenced by operating conditions and maintenance quality.

4. Mathematical Relationships

Reliability Model:

R(T) = exp(−(T/η)^β)

where η and β are the scale and shape parameters of the Weibull [1] distribution, respectively.

Cost Model:

C(T) = Cp⋅T + Cf⋅(1 − R(T))

where Cp is the preventive maintenance cost per unit time, and Cf is the failure cost.

Condition Indicators:

L, V, and θ directly influence the decision to adjust T: when an indicator deteriorates, maintenance is scheduled sooner to avoid further damage.

5. Optimisation Problem

Objective: Find T that minimises 𝐶(𝑇) while ensuring 𝑅(𝑇) remains above a desired threshold, indicating that the equipment will likely operate reliably until the next scheduled maintenance.
Optimisation Problem Statement:

· Minimize 𝐶(𝑇)=400⋅𝑇+8000⋅(1−𝑅(𝑇))

· Subject to 𝑅(𝑇)≥0.90

6. Example: Emergency Fire Pump

Objective: Minimise the emergency fire pump's maintenance and operational costs while maximising its reliability and readiness for emergencies.

Provided Data:

Low-Pressure Suction: -0.2 bar
High-Pressure Discharge: 8.0 bar
Current: 52 Amps
Lube oil: Result 8/10
Vibration level: 2.73 mm/s.
Temperature: 15°C (mechanical seal temperature)
Operating Condition: Ballast (indicating the ship's cargo condition)

Updated Variables and Parameters:

T: Time between maintenance actions, assumed here to be 12 months between in-depth maintenance.
C(T): Cost function over time. Preventive maintenance costs are €400 per month, and failure costs could increase to €8000.
R(T): Reliability function, which we'll keep as is since it is a unitless probability.
L: A score from 0 to 10, with 10 indicating new oil. This will remain the same.
V: Vibration level, the safe threshold is 4 mm/s (RMS).
θ: Temperature or thermal condition. The standard operating range is 15 - 60°C, with deviations indicating potential issues.

Assumptions:

The equipment failure rate follows a Weibull distribution with 𝜂=730 days (2 years) and 𝛽=1.5, suggesting a wear-out failure mode.
Maintenance cost is a function of preventive and corrective actions due to failures.
Equipment condition degrades over time.

Mathematical Relationships:

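Reliability Model:

$$R(T) = \exp\left[-\left(\frac{T}{730}\right)^{1.5}\right]$$

where 𝜂=730 days (2 years) and 𝛽=1.5, with T expressed in the same units as 𝜂 (days).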

Cost Model:

𝐶(𝑇)=400⋅𝑇+8000⋅(1−𝑅(𝑇))

The model assumes that preventive maintenance costs €400 per month and that failure costs can reach €8000 in case of malfunction. The formula weights the failure cost by the probability of failure, which increases as the time since the last maintenance extends.

Condition Indicators:

Lube Oil Quality (L): 8/10, suggesting good lubrication status.
Vibration Level (V): 2.73 mm/s, well below the 4 mm/s threshold, indicating stable mechanical operation.
Temperature (θ): 15°C, at the lower limit of the normal operating range (15-60°C), showing no immediate thermal risks.

These indicators suggest the pump operates efficiently, but ongoing monitoring is crucial to maintaining this status.

Optimisation Problem:

Problem Statement: Determine the optimal maintenance schedule 𝑇 for the Emergency Fire Pump that minimises the overall maintenance and failure costs while ensuring the pump's reliability remains above a critical threshold to guarantee its readiness for emergencies.
Solution Approach:

Objective Function:

Minimise 𝐶(𝑇)=400⋅𝑇+8000⋅(1−𝑅(𝑇))

Constraints:

Maintain 𝑅(𝑇) above a desired threshold, say 0.90, to ensure high reliability.
Maintain operational limitations such as vibration and temperature within safe limits.

Numerical Solution: Use numerical optimisation techniques, possibly incorporating constraint programming, to find the 𝑇 that offers the best balance between maintenance frequency and cost efficiency. Simulation or scenario analysis can be used to evaluate different maintenance intervals and their impact on costs and pump failure probability.
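
As an illustration of the numerical step, the sketch below evaluates the example's cost and reliability functions over candidate intervals. It is a minimal sketch, not a definitive solver. It assumes the standard Weibull reliability function with the example's 𝜂 and 𝛽, and it assumes a conversion of 365/12 days per month to reconcile the monthly cost term with 𝜂 given in days. All function and constant names are hypothetical.

```python
import math

ETA_DAYS = 730.0      # Weibull scale parameter from the example (2 years)
BETA = 1.5            # Weibull shape parameter from the example
CP_PER_MONTH = 400.0  # preventive maintenance cost, EUR per month
CF = 8000.0           # failure cost, EUR
R_MIN = 0.90          # reliability threshold
DAYS_PER_MONTH = 365.0 / 12.0  # assumption: converts months to days


def reliability(t_days: float) -> float:
    """Weibull reliability R(T) = exp(-(T/eta)^beta), with T in days."""
    return math.exp(-((t_days / ETA_DAYS) ** BETA))


def cost(t_months: float) -> float:
    """C(T) = 400*T + 8000*(1 - R(T)), with T in months."""
    t_days = t_months * DAYS_PER_MONTH
    return CP_PER_MONTH * t_months + CF * (1.0 - reliability(t_days))


def max_interval_days(r_min: float = R_MIN) -> float:
    """Longest T in days with R(T) >= r_min, from inverting the Weibull model."""
    return ETA_DAYS * (-math.log(r_min)) ** (1.0 / BETA)


def scan(months=range(1, 13)):
    """Return (T in months, R(T), C(T), feasible) for each candidate interval."""
    rows = []
    for m in months:
        r = reliability(m * DAYS_PER_MONTH)
        rows.append((m, r, cost(m), r >= R_MIN))
    return rows


if __name__ == "__main__":
    print(f"Longest interval meeting R >= {R_MIN}: {max_interval_days():.1f} days")
    for m, r, c, ok in scan():
        print(f"T = {m:2d} months  R = {r:.3f}  C = {c:8.2f} EUR  feasible = {ok}")
```

Under these assumptions the reliability bound works out to roughly 163 days, so five months is the longest whole-month interval that still satisfies 𝑅(𝑇)≥0.90. Because 𝐶(𝑇) as written grows with 𝑇, the scan is mainly useful for comparing candidate intervals against the reliability threshold.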

7. Additional Considerations

7.1 Weibull Distribution

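The probability density function (PDF) of the Weibull distribution for a random variable 𝑋 is given by:

$$f(x;\lambda,k) = \begin{cases} \dfrac{k}{\lambda}\left(\dfrac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^{k}}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Where:

𝑥 is the value taken by the variable 𝑋
𝜆>0 is the scale parameter (𝜂 in the example above)
𝑘>0 is the shape parameter (𝛽 in the example above)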

The shape parameter 𝑘 determines the type of distribution:

If 𝑘=1, the Weibull distribution simplifies to an exponential distribution.
If 𝑘<1, the distribution models a phenomenon with a high failure rate initially, which decreases over time (often used for items that fail early on, such as new products with manufacturing defects).
If 𝑘>1, the distribution models a phenomenon where the failure rate increases (commonly used for products that wear out over time, like mechanical components).

The scale parameter 𝜆 stretches or compresses the distribution along the x-axis, affecting its spread but not its general shape. The Weibull distribution is useful because adjusting 𝑘 and 𝜆 lets it model a wide variety of data distributions, making it versatile for statistical modelling and analysis in multiple fields (National Institute of Standards and Technology, 2021).
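
The effect of the shape parameter can be checked numerically. The sketch below uses the standard Weibull failure-rate (hazard) function h(x) = (k/𝜆)(x/𝜆)^(k−1) and compares its value at an early and a late time. The function names and the two sample times are illustrative assumptions.

```python
import math


def weibull_hazard(x: float, lam: float, k: float) -> float:
    """Failure rate h(x) = (k/lam) * (x/lam)^(k-1), defined here for x > 0."""
    if x <= 0 or lam <= 0 or k <= 0:
        raise ValueError("x, lam and k must be positive")
    return (k / lam) * (x / lam) ** (k - 1)


def weibull_reliability(x: float, lam: float, k: float) -> float:
    """Survival probability R(x) = exp(-(x/lam)^k)."""
    return math.exp(-((x / lam) ** k))


def weibull_pdf(x: float, lam: float, k: float) -> float:
    """Density f(x) = h(x) * R(x) for x > 0."""
    return weibull_hazard(x, lam, k) * weibull_reliability(x, lam, k)


def hazard_trend(k: float, lam: float = 1.0,
                 x_early: float = 0.5, x_late: float = 2.0) -> str:
    """Classify the failure rate by comparing an early and a late time."""
    early = weibull_hazard(x_early, lam, k)
    late = weibull_hazard(x_late, lam, k)
    if math.isclose(early, late):
        return "constant"
    return "decreasing" if late < early else "increasing"


if __name__ == "__main__":
    for k in (0.5, 1.0, 1.5):
        print(f"k = {k}: failure rate is {hazard_trend(k)}")
```

With k = 1.5, the shape value used for the emergency fire pump, the failure rate rises with time, which is consistent with the wear-out interpretation given earlier.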

8. Updated Approach

8.1 Continuous Monitoring Simulation - Overview

Daily Conditions and Maintenance Triggers:

L (Lube Oil Quality): Starts at 10 and degrades by up to 0.01 daily (if working). Maintenance is triggered if it drops below 7.
V (Vibration Level): Starts at 2.0 mm/s and increases by up to 0.02 mm/s daily (if working). Maintenance is triggered if it exceeds 3.5 mm/s.
θ (Temperature): Starts at 25°C and fluctuates daily with a standard deviation of 1.5°C. Maintenance is triggered if it goes below 15°C or above 60°C.

Costs:

Preventive Maintenance Cost: €500 each time maintenance is triggered.
Daily Operational Cost: €10 for each day without maintenance.

This setup is intended to monitor the condition of the equipment continuously and react in real time when any parameter crosses its threshold. By catching problems before they develop into severe equipment failures, it aims to keep downtime and cost to a minimum.
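
The trigger rules listed above can be written down as a small check. This is a minimal sketch: the class and function names are hypothetical, and only the threshold values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triggers:
    """Maintenance trigger limits taken from the overview."""
    lube_oil_min: float = 7.0    # maintenance if L drops below this score
    vibration_max: float = 3.5   # maintenance if V exceeds this value (mm/s)
    temp_min_c: float = 15.0     # maintenance if temperature falls below this
    temp_max_c: float = 60.0     # maintenance if temperature rises above this


def triggered_parameters(lube_oil: float, vibration: float,
                         temperature_c: float,
                         limits: Triggers = Triggers()) -> list:
    """Return the names of the indicators that call for maintenance."""
    hits = []
    if lube_oil < limits.lube_oil_min:
        hits.append("L")
    if vibration > limits.vibration_max:
        hits.append("V")
    if not (limits.temp_min_c <= temperature_c <= limits.temp_max_c):
        hits.append("theta")
    return hits


if __name__ == "__main__":
    # Readings from the emergency fire pump example: no trigger expected.
    print(triggered_parameters(lube_oil=8.0, vibration=2.73, temperature_c=15.0))
```

Keeping the limits in one place makes it straightforward to tighten or relax a threshold without touching the checking logic.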

8.2 Predictive and Prescriptive Maintenance Simulation – Overview

IoT-Based Predictive and Prescriptive Maintenance Setup:

IoT Sensors: Sensors continuously monitor lube oil quality (L), vibration levels (V), and temperature (θ) around the clock.
Data Processing: Real-time data analysis using machine learning models to predict when maintenance thresholds will likely be breached based on current trends and historical data.
Prescriptive Algorithms: The system prescribes specific maintenance actions once a potential issue is identified. These could include adjusting operating parameters, scheduling part replacements, or performing other maintenance tasks.
Automation and Alerts: Automated alerts notify maintenance teams of predicted issues and prescribed actions, allowing for an immediate response before conditions deteriorate.
Cost Adjustments: The costs are slightly adjusted to account for the infrastructure and operational expenses of running IoT sensors and data processing systems.

Simulation Logic with IoT Continuous Monitoring:

Continuous Real-Time Monitoring: Replace daily checks with continuous monitoring. Sensor data is analysed in real time to generate predictive alerts.
Maintenance Trigger: Maintenance is no longer scheduled at regular intervals but is triggered by predictive alerts based on real-time data analysis.
Maintenance Cost: Assume an increased preventive maintenance cost of €600 per session due to the advanced technologies used.
Operational Cost: Given the enhanced monitoring and data analysis capabilities, consider a nominal IoT operational cost of €1.4 per day.

Cost Estimation with IoT-Based System:

Maintenance Frequency: Due to effective early intervention and continuous monitoring, maintenance is assumed to be needed only once a year.
Maintenance Cost: 1 session × €600/session = €600
Operational Cost: 365 days × €1.4/day = €511

Total Annual Cost with IoT-based monitoring:

€600 + €511 = €1111

This refined IoT approach for 24/7 monitoring is intended to minimise downtime and maintenance frequency and to keep maintenance actions targeted and efficient, reducing the likelihood of severe machine failures.

See Appendix 1 & Appendix 2.

Conclusion

The mathematical model for Condition-Based Maintenance (CBM) provided in this document integrates the core aspects of CBM, including data from condition monitoring techniques like vibration analysis, thermal imaging, and lube oil analysis, to drive maintenance decisions. The goal is to optimise the trade-off between maintenance costs and equipment reliability. Adopting continuous and predictive maintenance strategies, especially with IoT-based monitoring, can significantly enhance overall maintenance efficiency and equipment reliability.

This document provides a comprehensive guide to developing and implementing a CBM model, emphasising the importance of accurate data, appropriate mathematical modelling, and continuous monitoring to achieve optimal maintenance outcomes.

References & Bibliography:

1. Books and Textbooks:

Ebeling, C.E., 1997. An Introduction to Reliability and Maintainability Engineering. New York: McGraw-Hill.
Jardine, A.K.S. and Tsang, A.H.C., 2013. Maintenance, Replacement, and Reliability: Theory and Applications. 2nd ed. Boca Raton, FL: CRC Press.
Meeker, W.Q. and Escobar, L.A., 1998. Statistical Methods for Reliability Data. New York: John Wiley & Sons.

2. Articles and Papers:

Mobley, R.K., 2002. An Introduction to Predictive Maintenance. 2nd ed. Boston: Butterworth-Heinemann.
Lu, C.J. and Weng, S., 2008. Condition-Based Maintenance Decision-Making for Equipment Under Variable Working Conditions. Journal of Quality in Maintenance Engineering, 14(1), pp.63-74.
Jardine, A.K.S., Lin, D. and Banjevic, D., 2006. A Review on Machinery Diagnostics and Prognostics Implementing Condition-Based Maintenance. Mechanical Systems and Signal Processing, 20(7), pp.1483-1510.

3. Standards and Guidelines:

International Organization for Standardization (ISO), 2011. ISO 17359:2011 Condition Monitoring and Diagnostics of Machines — General Guidelines. Geneva: ISO.
International Organization for Standardization (ISO), 2014. ISO 55000:2014 Asset Management — Overview, Principles and Terminology. Geneva: ISO.

4. Technical Reports:

NASA Office of Safety and Mission Assurance, 2012. NASA Reliability-Centered Maintenance Guide for Facilities and Collateral Equipment. NASA Technical Report.
U.S. Department of Energy, 2010. Operations & Maintenance Best Practices: A Guide to Achieving Operational Efficiency. Washington, D.C.: U.S. Department of Energy.

5. Web Resources:

National Institute of Standards and Technology (NIST), 2021. Weibull Distribution. Available at: https://www.itl.nist.gov/div898/handbook/eda/section3/eda3668.htm [Accessed 27 June 2024].
Reliability Hotwire Magazine, 2001. The Basics of Weibull Distribution. Available at: http://www.weibull.com/hotwire/issue14/hottopics14.htm [Accessed 27 June 2024].

Appendix 1 - Simulation Logic with IoT Continuous Monitoring for Emergency Fire Pump

Example: Emergency Fire Pump with IoT-Based Continuous Monitoring

Objective: Minimise maintenance and operational costs of the emergency fire pump, maximising its reliability and readiness for emergencies.

Provided Data:

Low-Pressure Suction: -0.2 bar
High-Pressure Discharge: 8.0 bar
Current: 52 Amps
Lube oil: Result 8/10
Vibration level: 2.73 mm/s.
Temperature: 15°C (mechanical seal temperature)
Operating Condition: Ballast (indicating the ship's cargo condition)

Updated Variables and Parameters:

T: Time between maintenance actions. IoT monitoring will trigger maintenance based on real-time data rather than fixed intervals.
C(T): Cost function over time. Due to advanced IoT technologies, preventive maintenance costs are now €600 per session, and failure costs could increase to €8000.
R(T): Reliability function, which we'll keep as is since it is a unitless probability.
L: A score from 0 to 10, with 10 indicating new oil. This will remain the same.
V: Vibration level, the safe threshold is 4 mm/s (RMS).
θ: Temperature or thermal condition. The standard operating range is 15 - 60°C, with deviations indicating potential issues.

Simulation Logic:

Initial Conditions:

Lube Oil Quality (L): 10
Vibration Level (V): 2.0 mm/s
Temperature (θ): 25°C

Daily Monitoring and Degradation Rates:

Lube Oil Quality (L): Degrades by up to 0.01 daily if the pump is operational.
Vibration Level (V): Increases by up to 0.02 mm/s daily if the pump is operational.
Temperature (θ): Fluctuates daily with a standard deviation of 1.5°C.

Maintenance Triggers:

Lube Oil Quality (L): Maintenance is triggered if it drops below 7.
Vibration Level (V): Maintenance is triggered if it exceeds 3.5 mm/s.
Temperature (θ): Maintenance is triggered if it goes below 15°C or above 60°C.

Costs:

Preventive Maintenance Cost: €600 each time maintenance is triggered.
Daily Operational Cost: €1.4 per day due to IoT monitoring.

Simulation Execution:

Step 1: Initialize simulation with given starting values.
Step 2: For each day:

Update L, V, and θ based on their respective degradation rates and variations.
Check if any parameter has crossed its threshold.
If a parameter has crossed its threshold, trigger maintenance, reset the parameter to its initial value, and add the preventive maintenance cost to the total cost.
Add the daily operational cost to the total cost.

Step 3: Repeat step 2 for the entire simulation period (e.g., one year).
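
Steps 1 to 3 can be sketched as a short simulation loop. This is one possible reading of the logic, not a definitive implementation. It assumes that "up to" means a uniform random draw between zero and the stated daily rate, that the temperature is drawn each day from a normal distribution centred on 25°C, and that several parameters tripping on the same day count as one session. The function and class names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulationResult:
    maintenance_sessions: int
    total_cost: float


def simulate(days: int = 365, seed: int = 0,
             maintenance_cost: float = 600.0,
             daily_cost: float = 1.4) -> SimulationResult:
    """Run the daily monitoring loop and return session count and total cost."""
    rng = random.Random(seed)
    l0, v0, theta0 = 10.0, 2.0, 25.0   # Step 1: initial conditions
    lube, vib = l0, v0
    sessions = 0
    total = 0.0
    for _ in range(days):              # Steps 2 and 3: one pass per day
        # Update the indicators (assumed distributions, see the lead-in).
        lube -= rng.uniform(0.0, 0.01)
        vib += rng.uniform(0.0, 0.02)
        theta = rng.gauss(theta0, 1.5)
        # Check thresholds and reset only the parameters that tripped.
        tripped = False
        if lube < 7.0:
            lube, tripped = l0, True
        if vib > 3.5:
            vib, tripped = v0, True
        if theta < 15.0 or theta > 60.0:
            tripped = True  # theta is redrawn daily, so there is nothing to reset
        if tripped:
            sessions += 1
            total += maintenance_cost
        # The daily operational cost accrues every day.
        total += daily_cost
    return SimulationResult(sessions, total)


if __name__ == "__main__":
    result = simulate()
    print(f"Sessions: {result.maintenance_sessions}, "
          f"total cost: EUR {result.total_cost:.2f}")
```

The number of sessions the loop produces depends on the assumed draw distributions, so it need not match the session count assumed in the annual cost calculation.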

Example Calculation for One Year:

Initial Conditions:

𝐿=10
𝑉=2.0 mm/s
𝜃=25°C

Simulation Iterations:

Day 1:

Update: 𝐿=9.99, 𝑉=2.02 mm/s, 𝜃=25.1°C
No maintenance was triggered.
Total Cost: €1.4

Day 2:

Update: 𝐿=9.98, 𝑉=2.04 mm/s, 𝜃=26.0°C
No maintenance was triggered.
Total Cost: €2.8

...
Day X (when maintenance is triggered):

𝐿=6.99 (below threshold)
Maintenance triggered, reset 𝐿 to 10.
Add preventive maintenance cost: €600
Total Cost: Previous Total + €600 + €1.4 (daily operational cost)

Continue the simulation for 365 days.

Annual Cost Calculation:

Assume maintenance is triggered once a year based on the degradation rates and threshold limits.
Preventive Maintenance Cost: 1 session × €600/session = €600
Operational Cost: 365 days × €1.4/day = €511
Total Annual Cost: €600 + €511 = €1111

Summary of Simulation Results

By continuously monitoring the emergency fire pump using IoT sensors, the maintenance actions can be precisely targeted based on real-time data, reducing the likelihood of severe failures and optimising maintenance costs. The total annual cost for maintaining the pump with IoT-based monitoring is estimated at €1111, reflecting the benefits of proactive and data-driven maintenance strategies.

Appendix 2 - Simulation Logic with IoT Continuous Monitoring for Emergency Fire Pump

Provided Data and Parameters

Initial Conditions:

Lube Oil Quality (L): 10
Vibration Level (V): 2.0 mm/s
Temperature (θ): 25°C

Daily Monitoring and Degradation Rates:

Lube Oil Quality (L): Degrades by 0.001 daily if the pump is operational.
Vibration Level (V): Increases by 0.0002 mm/s daily if the pump is operational.
Temperature (θ): Fluctuates daily with a standard deviation of 1.5°C.

Maintenance Triggers:

Lube Oil Quality (L): Maintenance is triggered if it drops below 7.
Vibration Level (V): Maintenance is triggered if it exceeds 3.5 mm/s.
Temperature (θ): Maintenance is triggered if it goes below 15°C or above 60°C.

Costs:

Preventive Maintenance Cost: €600 each time maintenance is triggered.
Daily Operational Cost: €1.4 per day due to IoT monitoring (equipment + DDS).
Major Overhaul Cost (every 5 years): €5000

Simulation Execution with IoT-Based Monitoring

Scenario 1: IoT-Based Monitoring with Triggered Maintenance

Initial Conditions:

𝐿=10
𝑉=2.0 mm/s
𝜃=25°C

Daily Degradation and Monitoring:

Lube Oil Quality (L):

Degrades by 0.001 per day.
Threshold for maintenance: L < 7 (after approximately 3000 days).

Vibration Level (V):

Increases by 0.0002 mm/s per day.
Threshold for maintenance: V > 3.5 mm/s (after approximately 7500 days).

Temperature (θ):

Fluctuates daily with a standard deviation of 1.5°C.
Maintenance is triggered if outside the 15°C to 60°C range.

Maintenance Schedule:

Given the degradation rates, no maintenance is triggered within one year based on Lube Oil Quality (3000 days threshold) and Vibration Level (7500 days threshold).
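
The day counts quoted above follow from a linear drift calculation: the distance to the threshold divided by the daily change. A minimal sketch, with a hypothetical helper name and with the temperature fluctuation left out because it has no steady drift:

```python
def days_to_threshold(start: float, threshold: float, daily_change: float) -> float:
    """Days for a linearly drifting indicator to reach its threshold."""
    if daily_change == 0:
        raise ValueError("daily_change must be non-zero")
    days = (threshold - start) / daily_change
    if days < 0:
        raise ValueError("indicator is drifting away from the threshold")
    return days


# Lube oil quality: starts at 10, trigger below 7, loses 0.001 per day.
lube_days = days_to_threshold(start=10.0, threshold=7.0, daily_change=-0.001)
# Vibration: starts at 2.0 mm/s, trigger above 3.5 mm/s, gains 0.0002 mm/s per day.
vib_days = days_to_threshold(start=2.0, threshold=3.5, daily_change=0.0002)

first_trigger_days = min(lube_days, vib_days)
triggered_within_year = first_trigger_days <= 365

if __name__ == "__main__":
    print(f"Lube oil: {lube_days:.0f} days, vibration: {vib_days:.0f} days")
    print(f"Maintenance triggered within one year: {triggered_within_year}")
```

Both indicators reach their thresholds well beyond 365 days at these rates, which is why no condition-triggered session appears in the first year.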

Costs:

Preventive Maintenance:

Major overhaul: Every 5 years = €5000 (prorated to €1000 per year)

Daily Operational Cost: 365 days × €1.4/day = €511

Total Annual Cost with IoT-Based Monitoring:

€1000 (prorated major overhaul) + €511 (daily operational cost) = €1511

Scenario 2: Traditional Fixed Interval Maintenance

Maintenance Schedule:

Preventive maintenance is performed without condition monitoring.
Assume monthly maintenance at €600 per session (materials, man-hours, spares).
Major overhaul cost of €5000 every 5 years is included (prorated to €1000 per year).

Costs:

Preventive Maintenance:

Monthly maintenance: 12 sessions × €600/session = €7200
Prorated major overhaul: €1000

Total Annual Cost:

· €7200 (monthly maintenance) + €1000 (prorated major overhaul) = €8200

Potential Savings with IoT-Based Monitoring

· Total Annual Cost with IoT-Based Monitoring:

€1511

· Total Annual Cost with Traditional Fixed Interval Maintenance:

€8200

· Potential Annual Savings:

€8200 − €1511 = €6689
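
The comparison of the two scenarios can be reproduced with a few lines of arithmetic. A minimal sketch, with hypothetical function and parameter names; the figures are those given in Scenario 1 and Scenario 2.

```python
def annual_cost(sessions_per_year: int, cost_per_session: float,
                daily_operational_cost: float, overhaul_cost: float,
                overhaul_interval_years: int, days: int = 365) -> float:
    """Annual cost: sessions + daily operations + prorated major overhaul."""
    maintenance = sessions_per_year * cost_per_session
    operations = days * daily_operational_cost
    overhaul = overhaul_cost / overhaul_interval_years
    return round(maintenance + operations + overhaul, 2)


# Scenario 1: IoT-based monitoring, no triggered sessions in the first year.
iot_cost = annual_cost(sessions_per_year=0, cost_per_session=600.0,
                       daily_operational_cost=1.4,
                       overhaul_cost=5000.0, overhaul_interval_years=5)

# Scenario 2: fixed monthly maintenance, no IoT operating cost.
fixed_cost = annual_cost(sessions_per_year=12, cost_per_session=600.0,
                         daily_operational_cost=0.0,
                         overhaul_cost=5000.0, overhaul_interval_years=5)

savings = round(fixed_cost - iot_cost, 2)

if __name__ == "__main__":
    print(f"IoT-based: EUR {iot_cost:.2f}")
    print(f"Fixed interval: EUR {fixed_cost:.2f}")
    print(f"Potential annual savings: EUR {savings:.2f}")
```

Changing `sessions_per_year` in the first scenario shows how quickly the savings shrink if condition-triggered sessions do occur during the year.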

Summary of Potential Savings

Using IoT-based continuous monitoring, the emergency fire pump's maintenance costs can be significantly reduced compared to traditional fixed interval maintenance. The potential savings over one year amount to €6689, demonstrating the financial benefits of adopting proactive and condition-based maintenance strategies.

Disclaimer:

Out of Box Maritime Thinker © by Narenta Gestio Consilium Group 2024 and Aleksandar Pudar assumes no responsibility or liability for any errors or omissions in the content of this paper. The information in this paper is provided on an "as is" basis with no guarantees of completeness, accuracy, usefulness, or timeliness or of the results obtained from using this information. The ideas and strategies should never be used without first assessing your company's situation or system or consulting a consultancy professional. The content of this paper is intended to be used and must be used for informational purposes only.

[1] Weibull distribution is a continuous probability distribution named after the Swedish engineer Waloddi Weibull, who described it in detail in 1951, though it was first identified by Fréchet in 1927. The Weibull distribution is widely used in reliability engineering, life data analysis, weather forecasting, and various other applications due to its flexibility in modelling different data types.
