Monday, April 4, 2011

Types of Maintenance

1. Breakdown maintenance
It means waiting until equipment fails and then repairing it. This approach can be used when an equipment failure does not significantly affect operation or production, or generate any significant loss other than the repair cost.

2. Preventive maintenance (1951)
It is daily maintenance (cleaning, inspection, oiling and re-tightening), designed to retain the healthy condition of equipment and prevent failure through the prevention of deterioration, together with periodic inspection or equipment condition diagnosis to measure deterioration. It is further divided into periodic maintenance and predictive maintenance. Just as human life is extended by preventive medicine, equipment service life can be prolonged by preventive maintenance.

2a. Periodic maintenance (time-based maintenance, TBM)
Time-based maintenance consists of periodically inspecting, servicing and cleaning equipment and replacing parts to prevent sudden failure and process problems.

2b. Predictive maintenance
This is a method in which the service life of important parts is predicted based on inspection or diagnosis, in order to use the parts to the limit of their service life. In contrast to periodic maintenance, predictive maintenance is condition-based maintenance. It manages trend values by measuring and analyzing data about deterioration, and employs a surveillance system designed to monitor conditions through an on-line system.

3. Corrective maintenance (1957)
It improves equipment and its components so that preventive maintenance can be carried out reliably. Equipment with design weaknesses must be redesigned to improve reliability or maintainability.

4. Maintenance prevention (1960)
It concerns the design of new equipment. Weaknesses of current machines are studied thoroughly (on-site information leading to failure prevention, easier maintenance, prevention of defects, safety and ease of manufacturing) and the findings are incorporated before commissioning new equipment.

Sunday, April 3, 2011

Failure Modes and Effects Analysis (FMEA)

FMEA / FMECA Overview

In general, Failure Modes, Effects and Criticality Analysis (FMEA / FMECA) requires the identification of the following basic information:

  • Item(s)
  • Function(s)
  • Failure(s)
  • Effect(s) of Failure
  • Cause(s) of Failure
  • Current Control(s)
  • Recommended Action(s)
  • Plus other relevant details

Most analyses of this type also include some method to assess the risk associated with the issues identified during the analysis and to prioritize corrective actions. Two common methods include:

  • Risk Priority Numbers (RPNs)
  • Criticality Analysis (FMEA with Criticality Analysis = FMECA)

Basic Analysis Procedure for FMEA or FMECA

The basic steps for performing a Failure Mode and Effects Analysis (FMEA) or Failure Modes, Effects and Criticality Analysis (FMECA) include:

  • Assemble the team.
  • Establish the ground rules.
  • Gather and review relevant information.
  • Identify the item(s) or process(es) to be analyzed.
  • Identify the function(s), failure(s), effect(s), cause(s) and control(s) for each item or process to be analyzed.
  • Evaluate the risk associated with the issues identified by the analysis.
  • Prioritize and assign corrective actions.
  • Perform corrective actions and re-evaluate risk.
  • Distribute, review and update the analysis, as appropriate.

Risk Evaluation Methods

A typical failure modes and effects analysis incorporates some method to evaluate the risk associated with the potential problems identified through the analysis. The two most common methods, Risk Priority Numbers and Criticality Analysis, are described next.

Risk Priority Numbers

To use the Risk Priority Number (RPN) method to assess risk, the analysis team must:

  • Rate the severity of each effect of failure.
  • Rate the likelihood of occurrence for each cause of failure.
  • Rate the likelihood of prior detection for each cause of failure (i.e. the likelihood of detecting the problem before it reaches the end user or customer).
  • Calculate the RPN by obtaining the product of the three ratings:

RPN = Severity x Occurrence x Detection

The RPN can then be used to compare issues within the analysis and to prioritize problems for corrective action. This risk assessment method is commonly associated with Failure Mode and Effects Analysis (FMEA).
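As a minimal sketch of the RPN calculation above, the Python snippet below multiplies the three ratings for each cause and sorts the causes from highest to lowest RPN. The 1 to 10 rating scale, the convention that a higher detection rating means the problem is harder to detect, the `FailureCause` class and the sample causes are all illustrative assumptions, not values from a real analysis.

```python
from dataclasses import dataclass


@dataclass
class FailureCause:
    """One cause of failure with its three ratings (hypothetical 1-10 scales)."""
    description: str
    severity: int    # 1 = negligible effect, 10 = most severe effect
    occurrence: int  # 1 = very unlikely to occur, 10 = almost certain to occur
    detection: int   # 1 = almost certain to be detected, 10 = very unlikely to be detected

    @property
    def rpn(self) -> int:
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(causes):
    """Return the causes ordered from highest to lowest RPN."""
    return sorted(causes, key=lambda cause: cause.rpn, reverse=True)


# Made-up example data for illustration only.
causes = [
    FailureCause("Bearing seizure from loss of lubrication", 8, 4, 3),
    FailureCause("Seal leak from wear", 5, 6, 2),
    FailureCause("Sensor drift", 3, 5, 7),
]

ranked = rank_by_rpn(causes)
for cause in ranked:
    print(f"{cause.rpn:4d}  {cause.description}")
```

With these made-up ratings, the low-severity but hard-to-detect cause ranks first, which shows how the detection rating can change the priority order.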

Criticality Analysis

The MIL-STD-1629A document describes two types of criticality analysis: quantitative and qualitative. To use the quantitative criticality analysis method, the analysis team must:

  • Define the reliability/unreliability for each item and use it to estimate the expected number of failures at a given operating time.
  • Identify the portion of the item’s unreliability that can be attributed to each potential failure mode.
  • Rate the probability of loss (or severity) that will result from each failure mode that may occur.
  • Calculate the criticality for each potential failure mode by obtaining the product of the three factors:

    Mode Criticality = Expected Failures x Mode Ratio of Unreliability x Probability of Loss

  • Calculate the criticality for each item by obtaining the sum of the criticalities for each failure mode that has been identified for the item.

    Item Criticality = SUM of Mode Criticalities
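The quantitative calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item is assumed to have a constant failure rate, so that the expected number of failures is taken as failure rate times operating time, and the failure modes, ratios and probabilities are invented sample values. The names `FailureMode`, `mode_criticality` and `item_criticality` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    mode_ratio: float           # portion of the item's unreliability due to this mode
    probability_of_loss: float  # probability (0..1) that the mode causes the rated loss


def mode_criticality(expected_failures, mode):
    # Mode Criticality = Expected Failures x Mode Ratio of Unreliability x Probability of Loss
    return expected_failures * mode.mode_ratio * mode.probability_of_loss


def item_criticality(expected_failures, modes):
    # Item Criticality = SUM of Mode Criticalities
    return sum(mode_criticality(expected_failures, m) for m in modes)


# Assumption: constant failure rate, so expected failures = rate x time.
failure_rate = 2e-4      # failures per hour (made-up value)
operating_time = 1000.0  # hours (made-up value)
expected_failures = failure_rate * operating_time

# Made-up failure modes; the mode ratios sum to 1.0.
modes = [
    FailureMode("Fails open", mode_ratio=0.6, probability_of_loss=0.5),
    FailureMode("Fails closed", mode_ratio=0.3, probability_of_loss=1.0),
    FailureMode("Leaks", mode_ratio=0.1, probability_of_loss=0.1),
]

for m in modes:
    print(f"{m.name:14s} criticality = {mode_criticality(expected_failures, m):.4f}")
print(f"Item criticality = {item_criticality(expected_failures, modes):.4f}")
```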

To use the qualitative criticality analysis method to evaluate risk and prioritize corrective actions, the analysis team must:

  • Rate the severity of the potential effects of failure.
  • Rate the likelihood of occurrence for each potential failure mode.
  • Compare failure modes via a Criticality Matrix, which identifies severity on the horizontal axis and occurrence on the vertical axis.

These risk assessment methods are commonly associated with Failure Modes, Effects and Criticality Analysis (FMECA).
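For the qualitative method, a Criticality Matrix can be sketched as a simple grid of cells keyed by occurrence level and severity category. The snippet below is a rough illustration: the severity labels (I to IV) and occurrence labels (A to E) are assumed here, and the failure mode identifiers and ratings are invented.

```python
from collections import defaultdict

# Assumed labels: severity on the horizontal axis, occurrence on the vertical axis.
SEVERITY = ["IV", "III", "II", "I"]     # left to right: least to most severe
OCCURRENCE = ["A", "B", "C", "D", "E"]  # top to bottom: most to least likely


def build_matrix(modes):
    """Group failure mode identifiers by (occurrence, severity) cell."""
    cells = defaultdict(list)
    for mode_id, severity, occurrence in modes:
        cells[(occurrence, severity)].append(mode_id)
    return cells


def render(cells):
    """Return a plain-text grid with one row per occurrence level."""
    lines = ["     " + "".join(f"{s:<12}" for s in SEVERITY)]
    for occ in OCCURRENCE:
        row = ""
        for sev in SEVERITY:
            label = ",".join(cells.get((occ, sev), []))
            row += f"{label:<12}"
        lines.append((f"{occ:<5}" + row).rstrip())
    return "\n".join(lines)


# Made-up failure modes: (identifier, severity category, occurrence level).
modes = [
    ("FM-1", "III", "C"),
    ("FM-2", "I", "B"),
    ("FM-3", "III", "C"),
]

cells = build_matrix(modes)
print(render(cells))
```

Failure modes that land toward the most severe and most likely corner of the grid would be the first candidates for corrective action.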

Applications and Benefits for FMEA and FMECA

The Failure Modes, Effects and Criticality Analysis (FMEA / FMECA) procedure is a tool that has been adapted in many different ways for many different purposes. It can contribute to improved designs for products and processes, resulting in higher reliability, better quality, increased safety, enhanced customer satisfaction and reduced costs. The tool can also be used to establish and optimize maintenance plans for repairable systems and/or contribute to control plans and other quality assurance procedures. It provides a knowledge base of failure mode and corrective action information that can be used as a resource in future troubleshooting efforts and as a training tool for new engineers. In addition, an FMEA or FMECA is often required to comply with safety and quality requirements, such as ISO 9001, QS 9000, ISO/TS 16949, Six Sigma, FDA Good Manufacturing Practices (GMPs), Process Safety Management Act (PSM), etc.

You can use something as simple as a paper form or an Excel spreadsheet to record your FMEA / FMECA analyses. However, if you want to establish consistency among your organization's FMEAs, build a "knowledge base" of lessons learned from past FMEAs, generate other types of reports for FMEA data (e.g. Top 10 Failure Modes by RPN, Actions by Due Date, etc.) and/or track the progress and completion of recommended actions, you may want to use a software tool, such as ReliaSoft's Xfmea, to facilitate analysis, data management and reporting for your failure modes and effects analyses.


Preventive Maintenance

Preventive maintenance is a schedule of planned maintenance actions aimed at the prevention of breakdowns and failures. The primary goal of preventive maintenance is to prevent the failure of equipment before it actually occurs. It is designed to preserve and enhance equipment reliability by replacing worn components before they actually fail. Preventive maintenance activities include equipment checks, partial or complete overhauls at specified periods, oil changes, lubrication and so on. In addition, workers can record equipment deterioration so they know to replace or repair worn parts before they cause system failure. Recent technological advances in tools for inspection and diagnosis have enabled even more accurate and effective equipment maintenance. The ideal preventive maintenance program would prevent all equipment failure before it occurs.

Value of Preventive Maintenance

There are multiple misconceptions about preventive maintenance. One such misconception is that PM is unduly costly. This logic dictates that it would cost more for regularly scheduled downtime and maintenance than it would normally cost to operate equipment until repair is absolutely necessary. This may be true for some components; however, one should compare not only the costs but the long-term benefits and savings associated with preventive maintenance. Without preventive maintenance, for example, costs for lost production time from unscheduled equipment breakdown will be incurred. Also, preventive maintenance will result in savings due to an increase of effective system service life.

Long-term benefits of preventive maintenance include:

  • Improved system reliability.

  • Decreased cost of replacement.

  • Decreased system downtime.

  • Better spares inventory management.

Long-term effects and cost comparisons usually favor preventive maintenance over performing maintenance actions only when the system fails.


When Does Preventive Maintenance Make Sense

Preventive maintenance is a logical choice if, and only if, the following two conditions are met:

  • Condition #1: The component in question has an increasing failure rate. In other words, the failure rate of the component increases with time, thus implying wear-out. Preventive maintenance of a component that is assumed to have an exponential distribution (which implies a constant failure rate) does not make sense!

  • Condition #2: The overall cost of the preventive maintenance action must be less than the overall cost of a corrective action. (Note: In the overall cost for a corrective action, one should include ancillary tangible and/or intangible costs, such as downtime costs, loss of production costs, lawsuits over the failure of a safety-critical item, loss of goodwill, etc.)

If both of these conditions are met, then preventive maintenance makes sense. Additionally, based on the ratio of corrective to preventive costs, an optimum time for such action can be easily computed for a single component. This is detailed in later sections.
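The two conditions can be written as a simple check. The sketch below assumes, purely for illustration, that the component's life follows a 2-parameter Weibull distribution with shape `beta` and scale `eta`, for which the failure rate increases with time when beta > 1, is constant when beta = 1 (the exponential case) and decreases when beta < 1. The function names and sample values are hypothetical.

```python
def weibull_failure_rate(t, beta, eta):
    """Failure rate of a 2-parameter Weibull distribution at time t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


def failure_rate_trend(beta):
    """Classify the failure rate behavior implied by the Weibull shape parameter."""
    if beta > 1.0:
        return "increasing"
    if beta < 1.0:
        return "decreasing"
    return "constant"


def pm_makes_sense(beta, cost_preventive, cost_corrective):
    """Condition #1: increasing failure rate. Condition #2: preventive action costs less."""
    return failure_rate_trend(beta) == "increasing" and cost_preventive < cost_corrective


# Made-up sample values: three shapes, the same costs.
for beta in (0.8, 1.0, 2.5):
    early = weibull_failure_rate(100.0, beta, eta=1000.0)
    late = weibull_failure_rate(500.0, beta, eta=1000.0)
    verdict = pm_makes_sense(beta, cost_preventive=1.0, cost_corrective=5.0)
    print(f"beta={beta}: rate at 100 h = {early:.6f}, at 500 h = {late:.6f}, "
          f"trend = {failure_rate_trend(beta)}, PM makes sense = {verdict}")
```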

The Fallacy of "Constant Failure Rate" and "Preventive Replacement"

Even though we alluded to the fact in the last section of this on-line reference, Availability, it is important to make it explicitly clear that if a component has a constant failure rate (i.e. defined by an exponential distribution), then preventive maintenance of the component will have no effect on the component's failure occurrences. To illustrate this, consider a component with an MTTF = 100 hours, or λ = 0.01, and with preventive replacement every 50 hours. The reliability vs. time graph for this case is illustrated in Figure 7.3. In Figure 7.3, the component is replaced every 50 hours, thus the component's reliability is reset to one. At first glance, it may seem that the preventive maintenance action is actually maintaining the component at a higher reliability.

Figure 7.3: Reliability vs. time for a single component with an MTTF = 100 hours, or λ = 0.01, and with preventive replacement every 50 hours.

However, consider the following cases for a single component:

Case 1: The component's reliability from 0 to 60 hours:

  • With preventive maintenance, the component was replaced with a new one at 50 hours, so the overall reliability is the reliability of the new component for 10 hours, R(t = 10) = 90.48%, times the reliability of the previous component for 50 hours, R(t = 50) = 60.65%. The result is R(t = 60) = 54.88%.

  • Without preventive maintenance, the reliability would be the reliability of the same component operating to 60 hours, or R(t = 60) = 54.88%.

Case 2: The component's reliability from 50 to 60 hours:

  • With preventive maintenance, the component was replaced at 50 hours, so this is based solely on the reliability of the new component for a mission of 10 hours, or R(t = 10) = 90.48%.

  • Without preventive maintenance, the reliability would be the conditional reliability of the same component operating to 60 hours, having already survived to 50 hours, or R(t = 60) / R(t = 50) = 54.88% / 60.65% = 90.48%.

As can be seen, in both cases the reliability is the same with and without preventive maintenance.
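The two cases can be checked numerically. The short Python sketch below uses the exponential reliability function R(t) = exp(-λt) with λ = 0.01 per hour, as in the example; the variable names are illustrative only.

```python
import math

FAILURE_RATE = 0.01  # failures per hour, i.e. MTTF = 100 hours


def reliability(t, lam=FAILURE_RATE):
    """Reliability of a component with a constant failure rate for a mission of t hours."""
    return math.exp(-lam * t)


# Case 1: reliability from 0 to 60 hours.
with_pm_0_to_60 = reliability(50) * reliability(10)  # old unit for 50 h, new unit for 10 h
without_pm_0_to_60 = reliability(60)                 # same unit for 60 h

# Case 2: reliability from 50 to 60 hours.
with_pm_50_to_60 = reliability(10)                       # new unit for 10 h
without_pm_50_to_60 = reliability(60) / reliability(50)  # conditional reliability

print(f"Case 1: with PM = {with_pm_0_to_60:.2%}, without PM = {without_pm_0_to_60:.2%}")
print(f"Case 2: with PM = {with_pm_50_to_60:.2%}, without PM = {without_pm_50_to_60:.2%}")
```

The products and ratios match because exp(-λa) · exp(-λb) = exp(-λ(a + b)): with a constant failure rate, a unit that has survived is statistically as good as a new one.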

Determining Preventive Replacement Time

As mentioned earlier, if the component has an increasing failure rate, then a carefully designed preventive maintenance program is beneficial to system availability. Otherwise, the costs of preventive maintenance might actually outweigh the benefits. The objective of a good preventive maintenance program is to either minimize the overall costs (or downtime, etc.) or meet a reliability objective. In order to achieve this, an appropriate interval (time) for scheduled maintenance must be determined. One way to do that is to use the optimum age replacement model, as presented next. The model adheres to the conditions discussed previously, namely:

  • The component is exhibiting behavior associated with a wear-out mode. That is, the failure rate of the component is increasing with time.

  • The cost for planned replacements is significantly less than the cost for unplanned replacements.

Figure 7.4: Cost curve for preventive and corrective replacement.

Figure 7.4 shows the Cost Per Unit Time vs. Time plot. In this figure, it can be seen that the corrective replacement costs increase as the replacement interval increases. In other words, the less often you perform a PM action, the higher your corrective costs will be: the longer a component is left to operate, the higher its failure rate becomes, so it is more likely to fail and require a corrective action. The opposite is true for the preventive replacement costs. The longer you wait between PM actions, the lower these costs are; if you perform PM too often, they are higher. Combining both costs shows that there is an optimum point that minimizes the total cost. In other words, one must strike a balance between minimizing the risk (costs) associated with a failure and maximizing the time between PM actions.

Optimum Age Replacement Policy

To determine the optimum time for such a preventive maintenance action (replacement), we need to mathematically formulate a model that describes the associated costs and risks. In developing the model, it is assumed that if the unit fails before time t, a corrective action will occur and if it does not fail by time t, a preventive action will occur. In other words, the unit is replaced upon failure or after a time of operation, t, whichever occurs first.

Thus, the optimum replacement time can be found by minimizing the cost per unit time, CPUT(t). CPUT(t) is given by:

$$\mathrm{CPUT}(t) = \frac{C_P \cdot R(t) + C_U \cdot [1 - R(t)]}{\int_0^t R(s)\,ds} \tag{5}$$

Where:

  • R(t) = reliability at time t.

  • C_P = cost of planned replacement.

  • C_U = cost of unplanned replacement.

The optimum replacement time interval, t, is the time that minimizes CPUT(t). This can be found by solving for t such that:

$$\frac{\partial\,\mathrm{CPUT}(t)}{\partial t} = 0 \tag{6}$$

Or by solving for a t that satisfies Eqn. (7):

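$$\frac{\frac{dR(t)}{dt}\,(C_P - C_U)}{\int_0^t R(s)\,ds} - \frac{\left[C_P \cdot R(t) + C_U \cdot [1 - R(t)]\right] R(t)}{\left(\int_0^t R(s)\,ds\right)^2} = 0 \tag{7}$$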

Interested readers can refer to Barlow and Hunter [2] for more details on this model.

Introduction to Repairable Systems Example 2

The failure distribution of a component is described by a 2-parameter Weibull distribution, with β = 2.5 and η = 1000 hours.

  • The cost for a corrective replacement is $5.

  • The cost for a preventive replacement is $1.

Estimate the optimum replacement age in order to minimize these costs.

Solution to Introduction to Repairable Systems Example 2

Prior to obtaining an optimum replacement interval for this component, the assumptions of Eqn. (5) must be checked. The component has an increasing failure rate, since it follows a Weibull distribution with β greater than one. Note that if β = 1, then the component has a constant failure rate and if β < 1, it has a decreasing failure rate. If either of these cases exists, then preventive replacement is unwise. Furthermore, the cost for preventive replacement is less than the corrective replacement cost. Thus, the conditions for the optimum age replacement policy have been met.

Using BlockSim, the failure parameters can be entered in the component's Block Properties window. Select "Optimum Replacement" from the Block menu, enter the costs and compute the optimum time, 493.0470 hours. Figure 7.5 illustrates this.

Figure 7.5: Using BlockSim's Optimum Replacement utility to obtain the results in Example 2.

Figure 7.6 shows a plot illustrating the cost per unit time.

Figure 7.6: Graph of cost vs. replacement time for Example 2.
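The result can also be cross-checked without BlockSim. The Python sketch below is a rough numerical approach, not BlockSim's method: it assumes the usual age replacement form of the cost per unit time, CPUT(t) = [C_P · R(t) + C_U · (1 - R(t))] / ∫ R(s) ds over 0 to t, evaluates the integral with Simpson's rule and minimizes CPUT(t) with a golden-section search. The function names, the search bracket and the step counts are assumptions chosen for illustration.

```python
import math


def weibull_reliability(t, beta, eta):
    """Reliability of a 2-parameter Weibull distribution at time t."""
    return math.exp(-((t / eta) ** beta))


def integrate_reliability(t, beta, eta, steps=2000):
    """Composite Simpson's rule for the integral of R(s) from 0 to t (steps must be even)."""
    h = t / steps
    total = weibull_reliability(0.0, beta, eta) + weibull_reliability(t, beta, eta)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * weibull_reliability(i * h, beta, eta)
    return total * h / 3.0


def cput(t, beta, eta, cost_planned, cost_unplanned):
    """Cost per unit time for replacement at age t or at failure, whichever comes first."""
    r = weibull_reliability(t, beta, eta)
    expected_cost = cost_planned * r + cost_unplanned * (1.0 - r)
    return expected_cost / integrate_reliability(t, beta, eta)


def minimize_golden(f, lo, hi, tol=1e-4):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d
        else:
            a = c
    return (a + b) / 2.0


# Example 2 inputs: beta = 2.5, eta = 1000 hours, C_P = $1, C_U = $5.
def example_cost(t):
    return cput(t, beta=2.5, eta=1000.0, cost_planned=1.0, cost_unplanned=5.0)


t_opt = minimize_golden(example_cost, lo=1.0, hi=2000.0)
print(f"Optimum replacement age: about {t_opt:.1f} hours")
print(f"Cost per unit time at the optimum: about {example_cost(t_opt):.5f} $/hour")
```

Under these assumptions the search lands close to the 493-hour figure quoted above.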

Discussion on Introduction to Repairable Systems Example 2

The effect of the corrective/preventive cost ratio on the optimum replacement interval is plotted in Figure 7.7. It can be seen that as the cost ratio increases, the optimum replacement interval decreases. This is expected: as corrective replacement becomes more expensive relative to preventive replacement, it becomes more cost-effective to replace the component more frequently, before it fails.

Figure 7.7: Replacement interval as a function of the corrective/preventive cost ratio.
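The trend in Figure 7.7 can be reproduced approximately with a short Python sketch. It assumes the usual age replacement cost model, CPUT(t) = [C_P · R(t) + C_U · (1 - R(t))] / ∫ R(s) ds. Setting the derivative of that expression to zero and dividing by R(t) gives the condition λ(t) · ∫ R(s) ds - [1 - R(t)] = C_P / (C_U - C_P), where λ(t) is the failure rate. The code solves this condition by bisection for the Weibull parameters of Example 2. The cost ratios, function names and search bounds are illustrative assumptions.

```python
import math

BETA = 2.5    # Weibull shape from Example 2
ETA = 1000.0  # Weibull scale in hours from Example 2


def reliability(t):
    return math.exp(-((t / ETA) ** BETA))


def failure_rate(t):
    return (BETA / ETA) * (t / ETA) ** (BETA - 1.0)


def integral_of_reliability(t, steps=2000):
    """Composite Simpson's rule for the integral of R(s) from 0 to t (steps must be even)."""
    h = t / steps
    total = reliability(0.0) + reliability(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * reliability(i * h)
    return total * h / 3.0


def optimality_gap(t, cost_ratio):
    """Zero at the age where the derivative of the assumed CPUT(t) vanishes.

    cost_ratio is C_U / C_P, so C_P / (C_U - C_P) = 1 / (cost_ratio - 1).
    """
    return (failure_rate(t) * integral_of_reliability(t)
            - (1.0 - reliability(t))
            - 1.0 / (cost_ratio - 1.0))


def optimum_age(cost_ratio, lo=1e-6, hi=5.0 * ETA, tol=1e-4):
    """Bisection: the gap is negative near zero and grows with t when BETA > 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if optimality_gap(mid, cost_ratio) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


ratios = [2.0, 5.0, 10.0, 20.0]  # made-up corrective/preventive cost ratios
ages = [optimum_age(r) for r in ratios]
for ratio, age in zip(ratios, ages):
    print(f"C_U / C_P = {ratio:5.1f}  ->  optimum age of about {age:7.1f} hours")
```

The computed ages shrink as the ratio grows, and a ratio of 5 corresponds to the costs used in Example 2.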