[VisualPro Tech Insight] FMEA: Catching Semiconductor Fab Safety Equipment Failures Before They Happen

In our previous VisualPro Tech Insight, we used STPA (System-Theoretic Process Analysis) to identify hidden threats in semiconductor fabs from a control perspective. This time, we present a use case that combines that STPA system safety analysis with DFMEA design optimization, using the VisualPro FMEA AI feature, to proactively eliminate potential risks in semiconductor manufacturing environments.



💡 Why is this analysis necessary?

Semiconductor fabs use dozens of toxic and flammable specialty gases. A single leak of a gas such as silane (SiH₄), ammonia (NH₃), or chlorine (Cl₂) can lead to worker casualties, environmental pollution, and line stoppages costing tens of millions of dollars. The problem is that such accidents do not stem from a single component failure, but from a cascading failure of the entire safety loop: Sensor → Controller → Valve. These "systemic blind spots" are difficult to discover through traditional component-level inspections alone.

| Category | STPA (Top-Down) | DFMEA (Bottom-Up) |
| --- | --- | --- |
| Perspective | Entire system control structure | Physical failure of individual components |
| Strengths | Discovers loopholes in control logic | Derives specific failure causes and quantified risks |
| Weaknesses | Does not reflect physical failure mechanisms | Prone to blind spots in system interactions |

Core Idea: A two-track strategy using STPA to draw the big picture of "where the risks are," and DFMEA to precisely measure "why and how dangerous they are".


🔍 Step 1. Dissecting the Target System

The analysis target is the On-site Integrated Safety System of a semiconductor manufacturing line. This system consists of three core components. If even one of these components fails to perform its role, the entire safety loop collapses.


🚨 Step 2. "What if it fails?" — Building a Failure Network 

[Figures: failure network diagrams]

  • Caution: the top-level Failure Effect has a Severity of 9 out of 10.
  • Toxic gas leaks threaten workers' lives without prior warning and result in environmental pollution and legal sanctions.
  • In effect, this is an event that "must never happen".


📊 Step 3. Risk Quantification: How dangerous is it?

The core of DFMEA is evaluating risk with numbers, not intuition. We scored each failure cause on three axes:

  • S (Severity): The size of the impact when a failure occurs → 9 points (Fixed, inherited from top-level Failure Effect).
  • O (Occurrence): The likelihood of the failure cause actually happening.
  • D (Detection): The current system's ability to detect the failure (A higher score indicates poorer detection capability).
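The scoring scheme above can be sketched as a small data model. This is only an illustration, not the VisualPro data format: the class names `FailureEffect` and `FailureCause` and the helper `check_score` are hypothetical, and the 1 to 10 range is assumed from the "9 / 10" severity rating quoted in Step 2. The point it shows is that S is stored once on the failure effect and read through by every cause, while O and D are scored per cause.

```python
from dataclasses import dataclass


def check_score(name: str, value: int) -> int:
    """Reject scores outside the assumed 1-10 scale used for S, O and D."""
    if not 1 <= value <= 10:
        raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return value


@dataclass(frozen=True)
class FailureEffect:
    description: str
    severity: int  # S: size of the impact when the failure occurs

    def __post_init__(self):
        check_score("severity", self.severity)


@dataclass(frozen=True)
class FailureCause:
    component: str
    effect: FailureEffect  # top-level effect this cause leads to
    occurrence: int        # O: likelihood of the cause actually happening
    detection: int         # D: higher score = poorer detection capability

    def __post_init__(self):
        check_score("occurrence", self.occurrence)
        check_score("detection", self.detection)

    @property
    def severity(self) -> int:
        # S is not scored per cause; it is inherited from the failure effect.
        return self.effect.severity


leak = FailureEffect("Toxic gas leak", severity=9)
gas_sensor = FailureCause("Gas Detection Sensor", leak, occurrence=3, detection=2)
print(gas_sensor.severity, gas_sensor.occurrence, gas_sensor.detection)  # prints: 9 3 2
```

Because the cause has no severity field of its own, an improvement measure in this model can only change O or D, which mirrors the "Fixed" note on the S axis.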


🟢 Gas Detection Sensor

| Item | Score | Rationale |
| --- | --- | --- |
| Severity (S) | 9 | Severity of the Failure Effect (worker casualties and environmental pollution) |
| Occurrence (O) | 3 | Sensor degradation due to corrosive gases occurs intermittently. Historical data shows a certain level of failure frequency. |
| Detection (D) | 2 | Equipped with self-diagnosis; sends immediate alarms to the upper system upon sensor abnormality. Low probability of non-detection. |
| Current Prevention | - | Periodic sensor calibration |
| Current Detection | - | Built-in sensor self-diagnosis function |
| AP (Action Priority) | L | Risk is controllable with the current management level, but there is room for further improvement. |


🔴 Interlock/EMO Valve

| Item | Score | Rationale |
| --- | --- | --- |
| Severity (S) | 9 | Severity of the Failure Effect (worker casualties and environmental pollution) |
| Occurrence (O) | 2 | High-quality chemical-resistant sealing applied. Frequency of mechanical failures (blockage/leakage) is managed at a very low level. |
| Detection (D) | 3 | Periodic operational tests are conducted, but it is difficult to detect potential mechanical sticking in real time with 100% certainty until emergency activation. |
| Current Prevention | - | Use of high-quality sealing material |
| Current Detection | - | Periodic operational testing |
| AP (Action Priority) | L | Requires improvement in terms of detection. |


🔵 Pressure Sensor

| Item | Score | Rationale |
| --- | --- | --- |
| Severity (S) | 9 | Severity of the Failure Effect (worker casualties and environmental pollution) |
| Occurrence (O) | 2 | Thanks to a double-pipe structure and advanced pressure-resistant design, the probability of physical damage or sudden errors is very low. |
| Detection (D) | 2 | Continuous monitoring of signs of minor leaks or pressure fluctuations via a pressure trend algorithm. Early detection is possible. |
| Current Prevention | - | Pressure-resistant design and double-piping |
| Current Detection | - | Pressure trend anomaly detection algorithm |
| AP (Action Priority) | L | Current management level is good, but further improvement is possible by introducing predictive maintenance. |
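To make the "pressure trend anomaly detection" entry more concrete, here is a minimal sketch of one way such a check could work. It is an assumption for illustration, not the algorithm used in the analyzed system: it fits a least-squares slope to a window of evenly spaced pressure readings and flags a sustained downward drift. The function names, the sample values, and the `max_drop_per_sample` threshold are all hypothetical.

```python
def trend_slope(samples: list[float]) -> float:
    """Least-squares slope of evenly spaced samples (units per sample)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def slow_leak_suspected(samples: list[float], max_drop_per_sample: float = 0.05) -> bool:
    """Flag a sustained downward pressure trend (threshold is hypothetical)."""
    return trend_slope(samples) < -max_drop_per_sample


# Hypothetical readings in arbitrary pressure units.
steady = [100.0, 100.1, 99.9, 100.0, 100.1, 99.9]     # noise around a flat level
drifting = [100.0, 99.9, 99.8, 99.7, 99.6, 99.5]      # small but consistent drop
print(slow_leak_suspected(steady), slow_leak_suspected(drifting))  # prints: False True
```

A slope over a window reacts to a slow, consistent drop that no single reading would trip a fixed alarm limit on, which is the kind of minor-leak sign the rationale for D = 2 refers to.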




🛡️ Step 4. Design Optimization — "Making it even safer"

Even if the current AP is Low (L) across the board, items with high severity must undergo review for further design improvements. This is the philosophy of DFMEA.


🟢 Gas Sensor Optimization S:9 / O:3→2 / D:2→1

| Measure Type | Content | Improvement Effect |
| --- | --- | --- |
| Design/Prevention | Design change to high-reliability Infrared (IR) gas sensor | Greatly reduces degradation from corrosion → Occurrence 3→2 |
| Detection Enhancement | Add sensor response delay monitoring logic to the upper controller | Self-diagnosis + response pattern cross-validation → Detection 2→1 |
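The cross-validation idea in the detection measure above can be sketched as follows. This is a simplified assumption of how a controller might combine the two signals, not the logic of any specific product: the `SensorCheck` type, the `assess` function, and the `baseline_s` and `tolerance` values are hypothetical. The sensor's own self-diagnosis result is checked first, and a measured response delay that is far above the baseline is treated as degradation even when self-diagnosis reports healthy.

```python
from dataclasses import dataclass


@dataclass
class SensorCheck:
    self_diag_ok: bool       # result reported by the sensor's self-diagnosis
    response_time_s: float   # measured time to respond to a test stimulus


def assess(check: SensorCheck, baseline_s: float = 10.0, tolerance: float = 1.5) -> str:
    """Cross-validate self-diagnosis against measured response delay.

    baseline_s and tolerance are hypothetical values for illustration.
    """
    if not check.self_diag_ok:
        return "FAULT"
    if check.response_time_s > baseline_s * tolerance:
        # Sensor reports healthy but reacts too slowly: possible degradation.
        return "DEGRADED"
    return "OK"


for check in (SensorCheck(True, 9.0), SensorCheck(True, 20.0), SensorCheck(False, 9.0)):
    print(assess(check))
# prints: OK, DEGRADED, FAULT (one per line)
```

The second case is the one self-diagnosis alone would miss, which is why adding this check is credited with moving Detection from 2 to 1.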


🔴 Valve Optimization S:9 / O:2→1 / D:3→1

| Measure Type | Content | Improvement Effect |
| --- | --- | --- |
| Design/Prevention | Design change to dual-diaphragm structure + special coated sealing | Physically blocks mechanical sticking → Occurrence 2→1 |
| Detection Enhancement | Stem micro-movement position sensor + alarm system integration | Early detection of micro-sticking signs → Detection 3→1 |
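As a rough illustration of how a stem position sensor could expose sticking before an emergency activation, the sketch below compares the stem travel that was commanded in a small test stroke with the travel the position sensor actually measured. This is an assumed approach with hypothetical names and numbers (`stem_sticking_suspected`, `min_ratio`, the travel values in millimetres), not a description of the valve hardware in the analysis.

```python
def stem_sticking_suspected(commanded_mm: float, measured_mm: float,
                            min_ratio: float = 0.8) -> bool:
    """Return True when the stem moved much less than commanded.

    min_ratio is a hypothetical acceptance limit for illustration.
    """
    if commanded_mm <= 0:
        raise ValueError("commanded travel must be positive")
    return measured_mm / commanded_mm < min_ratio


# Hypothetical micro-stroke test results: (commanded travel, measured travel) in mm.
tests = [(0.50, 0.49), (0.50, 0.47), (0.50, 0.31)]
alarms = [stem_sticking_suspected(c, m) for c, m in tests]
print(alarms)  # prints: [False, False, True]
```

In an integrated setup, a True result would be forwarded to the alarm system so that the valve is serviced while the line is still running normally.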


🔵 Pressure Sensor Optimization S:9 / O:2→1 / D:2→1

| Measure Type | Content | Improvement Effect |
| --- | --- | --- |
| Design/Prevention | Reinforced protective housing + upgraded heat/pressure resistance specifications | Stronger protection against physical impacts → Occurrence 2→1 |
| Detection Enhancement | Introduction of AI-based early anomaly sign analysis model (Machine Learning) | Proactive capture of micro-anomaly patterns → Detection 2→1 |

📈 Before vs After — Visualizing the Improvement (AIAG-VDA Standard)

The latest AIAG-VDA FMEA standard evaluates risk using the AP (Action Priority) criteria rather than the outdated RPN (S×O×D) multiplication method. Due to rigorous initial design, our target system achieved an AP: Low (L) rating even in its initial evaluation.

However, settling for the status quo just because the rating is "Low" runs against the zero-defect principle of semiconductor fabs. For critical items with extreme severity (S=9), we implemented fundamental prevention and detection measures to drive both Occurrence (O) and Detection (D) as close as possible to 1, the lowest score on the scale, so that the safety design carries a wide margin.


| Component | Initial State (S / O / D) | Post-Optimization (S / O / D) | AP (Action Priority) | Significance of Improvement |
| --- | --- | --- | --- | --- |
| Gas Sensor | 9 / 3 / 2 | 9 / 2 / 1 | L ➔ L (Maintained) | Lowers occurrence and brings detection to the best score (D = 1) |
| Valve | 9 / 2 / 3 | 9 / 1 / 1 | L ➔ L (Maintained) | Structurally prevents mechanical failure (O = 1) |
| Pressure Sensor | 9 / 2 / 2 | 9 / 1 / 1 | L ➔ L (Maintained) | Anomalies are caught before physical signs appear, through AI predictive maintenance |

※IMPORTANT

  • Optimization Philosophy based on AP (Action Priority): Severity (S) represents the intrinsic risk of a failure caused by the system and therefore cannot be lowered by prevention or detection measures.
  • The latest standards discourage the traditional approach of merely lowering RPN scores.
  • Instead, they recommend implementing practical enhancements in prevention and detection to bring Occurrence (O) and Detection (D) down to 1 for items with high severity (S = 9 to 10), even if their AP is already at a Low level.
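For readers used to the older method, the arithmetic below works out the RPN (S×O×D) for the three components before and after optimization, using only the S / O / D values from the comparison table. It is a plain calculation for comparison and does not derive the AP rating, which comes from the AIAG-VDA lookup table and is not reproduced here. The `rpn` helper and the `scores` dictionary are names chosen for this sketch.

```python
# (S, O, D) before and after optimization, taken from the comparison table.
scores = {
    "Gas Sensor": ((9, 3, 2), (9, 2, 1)),
    "Valve": ((9, 2, 3), (9, 1, 1)),
    "Pressure Sensor": ((9, 2, 2), (9, 1, 1)),
}


def rpn(s: int, o: int, d: int) -> int:
    """Risk Priority Number: the plain product S x O x D."""
    return s * o * d


for name, (before, after) in scores.items():
    print(f"{name}: RPN {rpn(*before)} -> {rpn(*after)}")
# prints:
# Gas Sensor: RPN 54 -> 18
# Valve: RPN 54 -> 9
# Pressure Sensor: RPN 36 -> 9
```

Note that the gas sensor (O=3, D=2) and the valve (O=2, D=3) start with the same RPN of 54 even though one is limited by occurrence and the other by detection. A single product hides which factor needs attention, which is one reason to look at O and D separately.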


🔮 Conclusion: The Era of Catching Failures Before They Occur

The greatest insight gained from this project is that "the combination of analysis tools determines the depth of the analysis". STPA captured risk scenarios from the control flow of the entire system, DFMEA linked those scenarios to physical failure mechanisms and quantified them, and the AI predictive maintenance model completed a structure that proactively prevents even future failures. Safety in a semiconductor fab must be in the realm of "pre-accident prevention," not "post-accident response".

※Note: This analysis was conducted through the AI model connection feature via MCP within the VisualPro DFMEA environment and was written in conjunction with the existing STPA analysis.



Roh Kyung Hyun
04559, 5F Pyeonggwang Building, 243 Toegye-ro, Jung-gu, Seoul (Chungmuro 5-ga 19-19)
+82-10-8337-9837
631-81-00287
www.vwaycorp.com
vway@vwaycorp.com

© VWAY All rights reserved

