Failure Mode Effect Analysis - FMEA

What is FMEA?
“A systematic process for identifying potential design and process failures (both existing and potential) before they occur, with the intent to eliminate them or minimize the risk associated with them.”


FMEA is a proactive analysis tool that allows engineers to define, identify, and eliminate known and/or potential failures, problems, and errors from a system, design, process, and/or service.

One of the first steps when completing an FMEA is to determine the participants. The right people with the right experience, such as process owners and designers, should be involved in order to catch potential failure modes.

FMEA is a qualitative and systematic tool, usually created within a spreadsheet, to help practitioners anticipate what might go wrong with a product or process. In addition to identifying how a product or process might fail and the effects of that failure, FMEA also helps find the possible causes of failures and the likelihood of failures being detected before occurrence.

Define the ways that the system can fail (failure modes)
The objective is to list all of the ways that the function of the system can fail. For example, the conveyor belt may fail by being unable to transport the goods from one end to the other, or perhaps it does not transport the goods quickly enough.
FMEA is a tool to proactively discover, and potentially eliminate, the root cause of a failure before the failure has even occurred.
It is important to understand that FMEA looks ahead in order to be prepared for anything that might come up in the future, whereas e.g. Pareto analysis focuses on data from the past. Of course, the equipment history as well as the conclusions drawn from the Pareto analysis should be used as input when developing FMEAs.


Risk can be defined as the probability of occurrence of an incident multiplied by the severity of that incident: R = P * S, where R is the risk, P the probability, and S the severity.
The probability factor can be expressed in terms of frequency:
A: Occurs once per three months
B: Occurs once per six months
C: Occurs once per year
D: Occurs once per two to three years
E: Occurs once per five years
F: Occurs once per 20 years
The severity of an incident can be expressed in terms of production loss and excessive maintenance cost:
I: Catastrophic
II: High
III: Moderate
IV: Negligible
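To make R = P * S concrete, here is a minimal Python sketch. The events-per-year values follow the frequency classes listed above; the 2.5-year midpoint for class D and all severity weights are assumptions made for illustration only, and a plant would substitute its own cost figures.

```python
# Approximate events per year for each probability class.
# Class D covers "once per two to three years"; 2.5 years is an assumed midpoint.
EVENTS_PER_YEAR = {
    "A": 4.0,      # once per three months
    "B": 2.0,      # once per six months
    "C": 1.0,      # once per year
    "D": 1 / 2.5,  # once per two to three years (assumed midpoint)
    "E": 1 / 5,    # once per five years
    "F": 1 / 20,   # once per 20 years
}

# Hypothetical severity weights (not from the article); replace with
# production-loss and maintenance-cost figures for a real assessment.
SEVERITY_WEIGHT = {"I": 1000, "II": 100, "III": 10, "IV": 1}


def risk(probability_class: str, severity_class: str) -> float:
    """Return R = P * S for one probability class and one severity class."""
    return EVENTS_PER_YEAR[probability_class] * SEVERITY_WEIGHT[severity_class]


if __name__ == "__main__":
    print(risk("A", "IV"))  # frequent but negligible incident
    print(risk("F", "I"))   # rare but catastrophic incident
```

With these assumed weights, a rare catastrophic incident still scores higher than a frequent negligible one, which is why both factors have to be considered together.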


The line that separates the red area from the white area is the "protection level". The plant has to set the protection level field by field, for each combination of severity and probability, by balancing the severity of an incident against its probability of occurrence.

The white area lies inside the protection level; no actions need to be considered there. On the opposite side, in the upper right, is the area with an unacceptable level of risk. For this area, actions to reduce the probability and/or the severity of an incident need to be defined. How exactly this can be done will be described once the FMEA has been discussed.
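The protection level can be pictured as a lookup per severity class. The sketch below is only an illustration: the thresholds in MOST_FREQUENT_ACCEPTABLE are hypothetical, since each plant sets its own protection level, and the function names are invented for this example.

```python
PROBABILITY_CLASSES = ["F", "E", "D", "C", "B", "A"]  # least to most frequent
SEVERITY_CLASSES = ["I", "II", "III", "IV"]           # most to least severe

# Hypothetical protection level: for each severity class, the most frequent
# probability class that is still considered acceptable.
MOST_FREQUENT_ACCEPTABLE = {"I": "F", "II": "E", "III": "C", "IV": "A"}


def needs_action(probability_class: str, severity_class: str) -> bool:
    """True if the combination lies outside the protection level."""
    limit = MOST_FREQUENT_ACCEPTABLE[severity_class]
    return PROBABILITY_CLASSES.index(probability_class) > PROBABILITY_CLASSES.index(limit)


def render_matrix() -> str:
    """Draw the matrix as text: 'X' needs action, '.' is acceptable."""
    lines = ["     " + " ".join(PROBABILITY_CLASSES)]
    for s in SEVERITY_CLASSES:
        cells = ["X" if needs_action(p, s) else "." for p in PROBABILITY_CLASSES]
        lines.append(f"{s:<4} " + " ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_matrix())
```

With severity decreasing from top to bottom and frequency increasing from left to right, the cells marked X collect in the upper right, matching the unacceptable area described above.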
FMEA is a review technique that focuses on defining the failure modes of a piece of equipment and the actions to reduce the risk of a failure.
When developing an FMEA, at least the elements listed below have to be taken into consideration:
Sub-equipment: To make the whole approach as systematic and structured as possible, each piece of equipment for which an FMEA is being developed is divided into sub-equipment.
Failure mode: This column defines which part of the equipment fails, e.g. a bearing, the gearbox, or any other component being considered.
Cause: This is where the cause of the hypothetical failure is documented, so that specific measures can later be defined as precisely as possible. In the case of a bearing failure, the cause can be e.g. “lack of lubrication” or “improper installation”.
Effect: This column documents the effect of each failure mode on a larger scale. For example, if the feed belt of the raw mill fails due to a bearing failure in the gearbox, the raw mill itself also has to be shut down.
Initial risk: Once all the above information has been recorded for a piece of equipment, the initial risk of each failure can be defined by determining its probability of occurrence and its severity. It is important to understand that, when defining the initial risk, all failures are assessed on the assumption that no action is taken. To support the risk definition, the equipment history (probability of occurrence) as well as lead times should be consulted. After the severity and the probability have been defined, a crosscheck with the corresponding risk matrix shows whether the specific failure mode is inside or outside the protection level.
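Since an FMEA is usually kept in a spreadsheet, the elements above map naturally onto columns. The following Python sketch shows one possible layout exported as CSV. The class name FmeaRow, the column names, and the probability and severity ratings in the sample rows are assumptions for illustration; only the bearing and raw mill scenario comes from the text.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class FmeaRow:
    sub_equipment: str
    failure_mode: str
    cause: str
    effect: str
    initial_probability: str  # class A to F, assessed with no action taken
    initial_severity: str     # class I to IV, assessed with no action taken


def to_csv(rows: list[FmeaRow]) -> str:
    """Serialize FMEA rows to CSV text with one header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(FmeaRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Sample rows; the ratings "C"/"D" and "II" are invented for this example.
EXAMPLE_ROWS = [
    FmeaRow("Feed-belt gearbox", "Bearing", "Lack of lubrication",
            "Feed belt stops; raw mill has to be shut down", "C", "II"),
    FmeaRow("Feed-belt gearbox", "Bearing", "Improper installation",
            "Feed belt stops; raw mill has to be shut down", "D", "II"),
]

if __name__ == "__main__":
    print(to_csv(EXAMPLE_ROWS))
```

Keeping one row per cause, rather than one per failure mode, lets each cause carry its own risk rating and, later, its own action.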

