Modern systems such as smart factories, autonomous vehicles, and aerospace and aviation platforms are far more complex than their predecessors. Major accidents now often stem not from simple component failures but from 'incorrect interactions between system components'.
To manage this complexity, safety engineers use STPA (System-Theoretic Process Analysis), a safety analysis technique developed at MIT. Today, we will walk through an STPA analysis of 'Semiconductor Fab Safety' using our VisualPro software.
Note: The risk-related scenarios and analysis content in this material are fictional, created with the assistance of AI, and are not related to any specific real-world organization or individual.

Step 1: What Must Be Prevented? (Defining Losses and Hazards)
The first step is to identify situations that must not occur in the system—Losses—and the system states that lead to those losses—Hazards.
In this analysis, the following losses were identified:
ID | Loss | Traceability |
Loss-1 | Loss of life and injury (toxic gas leaks, explosions, occupational diseases, physical safety accidents) | H-1, H-2, H-3, H-5 |
Loss-2 | Environmental pollution (atmospheric dispersion of harmful gases, unauthorized discharge of toxic wastewater, soil contamination) | H-1, H-4 |
Loss-3 | Financial loss and equipment damage (damage to expensive equipment due to fire/explosion, mass product disposal and recalls) | H-1, H-2, H-3, H-4, H-5 |
Loss-4 | Mission (production) failure (full factory shutdown due to accidents, failure to secure yield, supply chain disruption) | H-1, H-2, H-4, H-5 |
As the causes of these losses, the following hazards were defined:
ID | Hazard | Traceability |
H-1 | Leakage of highly toxic and corrosive chemicals | Loss-1, Loss-2, Loss-3, Loss-4 |
H-2 | Accumulation of flammable/explosive gases in confined spaces and exposure to ignition sources | Loss-1, Loss-3, Loss-4 |
H-3 | Neutralization of shielding and interlocks for high-energy sources (radiation, high temperature, etc.) | Loss-1, Loss-3 |
H-4 | Bypassing of hazardous pollutant purification facilities and untreated discharge | Loss-2, Loss-3, Loss-4 |
H-5 | Non-compliance with Lockout/Tagout (LOTO) procedures during hazardous work and neglect of defects | Loss-1, Loss-3, Loss-4 |
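The two Traceability columns above should mirror each other: if Loss-1 lists H-3, then H-3 should list Loss-1. The following is a minimal Python sketch of that cross-check. The dictionaries are transcribed from the two tables, and the function name `find_traceability_mismatches` is a hypothetical helper for illustration, not part of VisualPro.

```python
# Loss -> hazards, transcribed from the loss table above.
LOSS_TO_HAZARDS = {
    "Loss-1": {"H-1", "H-2", "H-3", "H-5"},
    "Loss-2": {"H-1", "H-4"},
    "Loss-3": {"H-1", "H-2", "H-3", "H-4", "H-5"},
    "Loss-4": {"H-1", "H-2", "H-4", "H-5"},
}

# Hazard -> losses, transcribed from the hazard table above.
HAZARD_TO_LOSSES = {
    "H-1": {"Loss-1", "Loss-2", "Loss-3", "Loss-4"},
    "H-2": {"Loss-1", "Loss-3", "Loss-4"},
    "H-3": {"Loss-1", "Loss-3"},
    "H-4": {"Loss-2", "Loss-3", "Loss-4"},
    "H-5": {"Loss-1", "Loss-3", "Loss-4"},
}


def find_traceability_mismatches(loss_to_hazards, hazard_to_losses):
    """Return (loss, hazard) pairs that appear in only one of the two tables."""
    forward = {(loss, hz) for loss, hzs in loss_to_hazards.items() for hz in hzs}
    backward = {(loss, hz) for hz, losses in hazard_to_losses.items() for loss in losses}
    # Symmetric difference: links recorded in one direction but not the other.
    return sorted(forward ^ backward)


if __name__ == "__main__":
    mismatches = find_traceability_mismatches(LOSS_TO_HAZARDS, HAZARD_TO_LOSSES)
    print(mismatches or "Loss and hazard traceability columns agree")
```

For the tables in this example the check finds no mismatches. A one-sided link would show up as a `(loss, hazard)` pair in the returned list.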
Step 2: How Is the System Controlled? (Control Structure Modeling)
This step visualizes the commands and feedback exchanged between system components. Using VisualPro, even complex diagrams can be drawn with ease.
In the semiconductor fab control structure, there are broadly four Controllers managing the subordinate Semiconductor Manufacturing Process & Pollution Prevention Facilities (Controlled Process):
- Government & Regulatory/Standards Bodies: Enforce legal compliance and issue corrective orders.
- Top Management & Corporate EHS Organization: Issue Standard Operating Procedures (SOPs) and operate the work permit system.
- On-site Integrated System & Safety Managers: Execute automatic equipment interlocks and Emergency Off (EMO) mechanisms.
- Field Operators & Maintenance Personnel: Manually operate chemical equipment and implement LOTO.

Diagramming this Master structure results in the following:
Name | Field Operator and Maintenance Personnel |
Type | Human |
Level | 1 |
Responsibilities | [None] |
Process Model | [None] |
Control Action | Target of Control Action |
Manual operation of chemical equipment & Implementation of LOTO (Lockout/Tagout) | Semiconductor Manufacturing Process and Pollution Prevention Facilities |
Feedback | Target of Feedback |
Reporting of safety blind spots and hazards (Near-misses) | Top Management and Corporate EHS Organization |
Name | On-site Integrated System and Safety Manager |
Type | Controller |
Level | 1 |
Responsibilities | [None] |
Process Model | [None] |
Control Action | Target of Control Action |
Automated equipment Interlock & Emergency Off (EMO) | Semiconductor Manufacturing Process and Pollution Prevention Facilities |
Feedback | Target of Feedback |
Reporting of on-site diagnostic data & Safety expenditure status | Top Management and Corporate EHS Organization |
Name | Semiconductor Manufacturing Process and Pollution Prevention Facilities |
Type | Controlled Process |
Level | 1 |
Control Action | Target of Control Action |
External discharge of treated wastewater and exhaust gas | Surrounding Ecosystem (External Factors) and Local Community |
Feedback | Target of Feedback |
Physical pressure gauge indication & Emergency siren alarm | Field Operator and Maintenance Personnel |
Real-time data streaming (Gas concentration, vibration, etc.) | On-site Integrated System and Safety Manager |
Name | Government and Regulatory/Standards Bodies |
Type | External Controller |
Level | 1 |
Control Action | Target of Control Action |
Enforcement of legal compliance & Corrective orders | Top Management and Corporate EHS Organization |
Feedback | Target of Feedback |
[None] | [None] |
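One way to reason about the structure above is to treat each control action and feedback entry as a directed link and look for control links that have no feedback returning to their source. The sketch below is a rough illustration under that assumption. The `Link` type and the `open_loops` helper are hypothetical names, and only the links listed in the tables above are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    kind: str    # "control" or "feedback"
    source: str
    target: str
    label: str


FIELD = "Field Operator and Maintenance Personnel"
ONSITE = "On-site Integrated System and Safety Manager"
PROCESS = "Semiconductor Manufacturing Process and Pollution Prevention Facilities"
GOV = "Government and Regulatory/Standards Bodies"
MGMT = "Top Management and Corporate EHS Organization"
ECO = "Surrounding Ecosystem (External Factors) and Local Community"

# Transcribed from the component tables above.
LINKS = [
    Link("control", FIELD, PROCESS, "Manual operation of chemical equipment & LOTO"),
    Link("feedback", FIELD, MGMT, "Reporting of safety blind spots and hazards"),
    Link("control", ONSITE, PROCESS, "Automated equipment interlock & EMO"),
    Link("feedback", ONSITE, MGMT, "Reporting of on-site diagnostic data"),
    Link("control", PROCESS, ECO, "External discharge of treated wastewater and exhaust gas"),
    Link("feedback", PROCESS, FIELD, "Physical pressure gauge indication & siren alarm"),
    Link("feedback", PROCESS, ONSITE, "Real-time data streaming"),
    Link("control", GOV, MGMT, "Enforcement of legal compliance & corrective orders"),
]


def open_loops(links):
    """Return control links whose target sends no feedback back to the source."""
    feedback_pairs = {(l.source, l.target) for l in links if l.kind == "feedback"}
    return [
        l for l in links
        if l.kind == "control" and (l.target, l.source) not in feedback_pairs
    ]


if __name__ == "__main__":
    for link in open_loops(LINKS):
        print(f"No feedback listed for: {link.source} -> {link.target}")
```

With the links as transcribed, the check flags the discharge to the surrounding ecosystem and the government enforcement link, which matches the [None] feedback entry for the regulatory bodies. An open loop here only means that no feedback is listed in this example, not that none exists in a real fab.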
Step 3: Which Control Actions Create Risks? (UCA Identification)
Now it is time to identify Unsafe Control Actions (UCAs). We analyze the hazards that arise when a control action is not provided, is provided when it should not be, is provided too early, too late, or out of order, or is stopped too soon or applied too long.
Through VisualPro, we were able to identify critical UCAs such as the following:
- (UCA-N-1): The Top Management/EHS Organization fails to issue a mandatory work permit requiring a preliminary risk assessment, leaving workers to perform inspections exposed to danger (Not providing).
- (UCA-P-2): The permit is arbitrarily and incorrectly approved due to production schedule pressure even though gas shut-off is incomplete (Providing).
- (UCA-N-9): Field operators proceed with opening equipment without physically locking out hazardous energy valves (LOTO) (Not providing).
The full list of derived UCAs is shown in the table below.
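Control Action (Source -> Target) | Level | UCA Flag | UCA ID | Description | Assumption |
SOP Issuance & Work Permit System Operation (Top Management & Corporate EHS Organization -> Field Operators & Maintenance Personnel) | 1 | Not providing causes hazard | (UCA-N-1) | Failure to issue a mandatory work permit requiring a preliminary risk assessment, leaving the operator to conduct inspections exposed to danger [H-1, H-2] | |
SOP Issuance & Work Permit System Operation (Top Management & Corporate EHS Organization -> Field Operators & Maintenance Personnel) | 1 | Providing causes hazard | (UCA-P-2) | Arbitrary incorrect permit approval due to production schedule pressure even though gas shut-off is incomplete [H-1, H-2] | |
SOP Issuance & Work Permit System Operation (Top Management & Corporate EHS Organization -> Field Operators & Maintenance Personnel) | 1 | Too early, too late, out of order | (UCA-T-3) | Allowing entry too early before the residual toxic gas purge is complete [H-1] | |
SOP Issuance & Work Permit System Operation (Top Management & Corporate EHS Organization -> Field Operators & Maintenance Personnel) | 1 | Stopped too soon, applied too long | (UCA-S-4) | Leaving the existing permit approval state active for too long without re-evaluation despite changes in the hazardous environment [H-1] | |
Enforcement of Legal Compliance & Corrective Orders (Government & Regulatory/Standards Bodies -> Top Management & Corporate EHS Organization) | 1 | Not providing causes hazard | | | |
Enforcement of Legal Compliance & Corrective Orders (Government & Regulatory/Standards Bodies -> Top Management & Corporate EHS Organization) | 1 | Providing causes hazard | | | |
Enforcement of Legal Compliance & Corrective Orders (Government & Regulatory/Standards Bodies -> Top Management & Corporate EHS Organization) | 1 | Too early, too late, out of order | | | |
Enforcement of Legal Compliance & Corrective Orders (Government & Regulatory/Standards Bodies -> Top Management & Corporate EHS Organization) | 1 | Stopped too soon, applied too long | | | |
Automated Equipment Interlock & Emergency Off (EMO) (On-site Integrated System & Safety Manager -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Not providing causes hazard | (UCA-N-5) | Failure to provide Interlock/EMO emergency control even after detecting signs of a gas leak, leaving the leak unattended [H-1, H-2] | |
Automated Equipment Interlock & Emergency Off (EMO) (On-site Integrated System & Safety Manager -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Providing causes hazard | (UCA-P-6) | Forcible EMO activation due to sensor malfunction during the process, shutting off even essential cooling systems [H-1] | |
Automated Equipment Interlock & Emergency Off (EMO) (On-site Integrated System & Safety Manager -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Too early, too late, out of order | (UCA-T-7) | Emergency activation is too late due to leak notification relay delays, resulting in fatal concentration levels being exceeded [H-1] | |
Automated Equipment Interlock & Emergency Off (EMO) (On-site Integrated System & Safety Manager -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Stopped too soon, applied too long | (UCA-S-8) | Releasing the safety shut-off too early while risk factors are still unresolved, causing a secondary leak [H-1] | |
Issuance of Safety Goals & Direction for Field Supervision (Top Management & Corporate EHS Organization -> On-site Integrated System & Safety Manager) | 1 | Not providing causes hazard | | | |
Issuance of Safety Goals & Direction for Field Supervision (Top Management & Corporate EHS Organization -> On-site Integrated System & Safety Manager) | 1 | Providing causes hazard | | | |
Issuance of Safety Goals & Direction for Field Supervision (Top Management & Corporate EHS Organization -> On-site Integrated System & Safety Manager) | 1 | Too early, too late, out of order | | | |
Issuance of Safety Goals & Direction for Field Supervision (Top Management & Corporate EHS Organization -> On-site Integrated System & Safety Manager) | 1 | Stopped too soon, applied too long | | | |
External Discharge of Treated Wastewater & Exhaust Gas (Semiconductor Manufacturing Process & Pollution Prevention Facilities -> Surrounding Ecosystem & Local Community) | 1 | Not providing causes hazard | | | |
External Discharge of Treated Wastewater & Exhaust Gas (Semiconductor Manufacturing Process & Pollution Prevention Facilities -> Surrounding Ecosystem & Local Community) | 1 | Providing causes hazard | | | |
External Discharge of Treated Wastewater & Exhaust Gas (Semiconductor Manufacturing Process & Pollution Prevention Facilities -> Surrounding Ecosystem & Local Community) | 1 | Too early, too late, out of order | | | |
External Discharge of Treated Wastewater & Exhaust Gas (Semiconductor Manufacturing Process & Pollution Prevention Facilities -> Surrounding Ecosystem & Local Community) | 1 | Stopped too soon, applied too long | | | |
Manual Operation of Chemical Equipment & LOTO Implementation (Field Operators & Maintenance Personnel -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Not providing causes hazard | (UCA-N-9) | Proceeding with equipment opening without physically locking out (LOTO) the hazardous energy valves [H-3, H-5] | |
Manual Operation of Chemical Equipment & LOTO Implementation (Field Operators & Maintenance Personnel -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Providing causes hazard | (UCA-P-10) | Applying LOTO to the wrong valve or mistakenly locking the scrubber exhaust valve where ventilation is essential [H-3, H-5] | |
Manual Operation of Chemical Equipment & LOTO Implementation (Field Operators & Maintenance Personnel -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Too early, too late, out of order | (UCA-T-11) | Implementing the LOTO lock too late, long after entering the work environment, leading to exposure from initially emitted gases [H-5] | |
Manual Operation of Chemical Equipment & LOTO Implementation (Field Operators & Maintenance Personnel -> Semiconductor Manufacturing Process & Pollution Prevention Facilities) | 1 | Stopped too soon, applied too long | (UCA-S-12) | Releasing the LOTO lock too early before checking the surroundings while a colleague is still inside, causing a gas/power leak [H-5] | |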
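The UCA table is essentially a grid: every control action crossed with the four UCA types. A short Python sketch, assuming the control actions and UCA IDs listed above, can generate that grid and report which cells have no UCA recorded yet. The names `worksheet` and `empty_cells` are illustrative only and do not reflect how VisualPro builds its table.

```python
from itertools import product

UCA_TYPES = [
    "Not providing causes hazard",
    "Providing causes hazard",
    "Too early, too late, out of order",
    "Stopped too soon, applied too long",
]

CONTROL_ACTIONS = [
    "SOP Issuance & Work Permit System Operation",
    "Enforcement of Legal Compliance & Corrective Orders",
    "Automated Equipment Interlock & Emergency Off (EMO)",
    "Issuance of Safety Goals & Direction for Field Supervision",
    "External Discharge of Treated Wastewater & Exhaust Gas",
    "Manual Operation of Chemical Equipment & LOTO Implementation",
]

# UCA IDs per control action, in the same order as UCA_TYPES.
IDENTIFIED = {
    CONTROL_ACTIONS[0]: ["UCA-N-1", "UCA-P-2", "UCA-T-3", "UCA-S-4"],
    CONTROL_ACTIONS[2]: ["UCA-N-5", "UCA-P-6", "UCA-T-7", "UCA-S-8"],
    CONTROL_ACTIONS[5]: ["UCA-N-9", "UCA-P-10", "UCA-T-11", "UCA-S-12"],
}


def worksheet(control_actions, identified):
    """Build one row per (control action, UCA type); UCA ID is None if absent."""
    rows = []
    for action, (i, uca_type) in product(control_actions, enumerate(UCA_TYPES)):
        ids = identified.get(action)
        rows.append((action, uca_type, ids[i] if ids else None))
    return rows


def empty_cells(rows):
    """Return (control action, UCA type) pairs with no UCA recorded."""
    return [(action, uca_type) for action, uca_type, uca_id in rows if uca_id is None]


if __name__ == "__main__":
    rows = worksheet(CONTROL_ACTIONS, IDENTIFIED)
    for action, uca_type in empty_cells(rows):
        print(f"No UCA recorded: {action} / {uca_type}")
```

An empty cell is not necessarily a defect, since a control action may have no unsafe variant of a given type, but listing the empty cells makes it harder to skip one by accident.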
Step 4: Why Did Such UCAs Occur? (Loss Scenario Discovery)
We trace the causes (scenarios) that explain how each Unsafe Control Action (UCA) could occur. Candidate causes include system defects, sensor errors, and operator misunderstandings.
- System Misjudgment (LS-1): Even though there were signs of a minor hydrofluoric acid gas leak the previous day, the EHS system fails to recognize it, misclassifies the task as 'simple lighting/filter replacement,' and omits the risk assessment permit.
- Psychological Pressure (LS-4): As exhaust system failure alarms and urgent maintenance requests from the production team arrive simultaneously, a manager feeling psychological pressure prioritizes the production team and bypasses the permit procedure.
- Model Inconsistency (LS-8): Despite receiving operator feedback, the referenced drawings are outdated, leading to the failure to recognize and address the gas leak risks in newly bypassed piping.
The complete list of Loss Scenarios for UCA-N-1 is shown below.
Control Action | SOP Issuance & Work Permit System Operation |
Level | 1 |
UCA-N-1 | Failure to issue a mandatory work permit requiring a preliminary risk assessment, leaving the operator to conduct inspections exposed to danger |
ID | Loss Scenario |
LS-1 | [Class 1/Case 6] Permit omission due to blind spots in rule-based logic that ignores exceptional situations |
LS-1-1 | The EHS system determines 'simple lighting/filter replacement' as a general task not requiring a preliminary risk assessment. Even though there were signs of a minor hydrofluoric acid gas leak the previous day, the system fails to recognize it and omits the permit. |
LS-2 | [Class 1/Case 20] Permit omission due to blind faith in past default safety states in the absence of feedback |
LS-2-1 | When the gas concentration sensor fails to operate due to a communication disconnection, the manager blindly trusts the past state, assuming it is safe because a purge was done yesterday, and verbally proceeds without a permit. |
LS-3 | [Class 1/Case 23] Failure to send the actual on-site permit because the server was left in simulation mode |
LS-3-1 | The EHS server remains in 'diagnostic mode' after a weekend inspection, mistaking a silane pipe repair permit request for test data and omitting issuance to the actual site. |
LS-4 | [Class 1/Case 30] Ignoring the risk assessment by prioritizing the production team's urgent request during multiple input conflicts |
LS-4-1 | As exhaust system failure alarms and urgent maintenance requests from the production team arrive simultaneously, a manager feeling psychological pressure prioritizes the production team and bypasses the permit procedure. |
LS-5 | [Class 1/Case 18] Exclusion from risk assessment target by misinterpreting pressure drop feedback as an empty gas cylinder |
LS-5-1 | A sudden drop in gas cabinet pressure is misinterpreted as 'the operator has already closed the valve' instead of a gas leak, thereby invalidating the review logic. |
LS-6 | [Class 2/Case 8] Failure to recognize fatal shielded area opening feedback due to alarm flood overload |
LS-6-1 | Due to a flood of tens of thousands of sensor alarms during a factory maintenance period, critical alarms such as the opening of a shielded door are missed, and the permit issuance procedure is not initiated. |
LS-7 | [Class 2/Case 11] Ignoring ambiguous field anomaly reports due to excessive blind faith in machine readings |
LS-7-1 | When a field operator's radio report of an ammonia smell conflicts with system data showing 0%, the instrument readings are trusted absolutely, the field judgment is ignored, and personnel are dispatched. |
LS-8 | [Class 2/Case 7] Overlooking the risk of gas leaks in bypass piping by applying an outdated process model (old drawings) |
LS-8-1 | Despite receiving operator feedback, the referenced drawing is an outdated process model, leading to the failure to recognize and address the gas leak risks in newly bypassed piping. |
LS-9 | [Class 3/Case 9] Both commands are rejected as control actions from different sources collide at the field terminal |
LS-9-1 | An 'access control' command from a smart pad and a 'maintenance permit' message from the center collide at the door terminal, and both are rejected, creating a control gap. |
LS-10 | [Class 3/Case 5] Permit packet is lost due to being overwritten by a broadcast message on the wireless communication network |
LS-10-1 | At the moment of transmitting a specific work instruction packet, a firmware update broadcast distributed throughout the factory consumes the communication network, causing data loss. |
LS-11 | [Class 3/Case 7] Reversal due to a past 'cancel' command arriving after the latest 'permit' due to a communication buffer delay |
LS-11-1 | Due to a buffer delay, an old 'permit cancellation' packet from yesterday is processed 1 second after the newly approved signal, resulting in the permit ultimately being canceled on-site. |
LS-12 | [Class 4/Case 7] A screen door control chip that should be blocked allows the door to open due to being stuck in standby mode |
LS-12-1 | Zone lockdown should be the default upon error, but the door control chipset gets stuck in standby mode, ignoring the system's access block command and allowing a free pass. |
LS-13 | [Class 4/Case 6] Physical unlocking outside the central control logic due to latch (component) damage caused by corrosive gas |
LS-13-1 | The central server assumes the door is locked because no permit was issued, but the actual on-site locking latch is completely rusted by chemical fumes and clatters open. |
LS-14 | [Class 4/Case 13] Corruption of received permit data due to High Frequency (RF) electromagnetic noise interference |
LS-14-1 | The permit data packet on the field pad is corrupted by strong noise generated during the operation of high-power plasma etching equipment. The operator mistakes it for a standby delay and begins arbitrary disassembly. |
LS-15 | [Class 4/Case 4] Standby team exposure due to self-triggered emergency exhaust caused by errors such as internal sensor failure |
LS-15-1 | While waiting for central review, a secondary sensor inside the cabinet malfunctions and autonomously triggers emergency exhaust, causing backflowing gas to strike the work team waiting outside the door for the permit. |
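Each scenario title above carries a `[Class N/Case M]` tag. As a small illustration, the tags can be parsed to see how the fifteen scenarios are distributed across classes. The sketch below takes the class and case numbers exactly as they appear in the table and does not assume what each class means; the helper name `count_by_class` is hypothetical.

```python
import re
from collections import Counter

# Tags transcribed from the loss scenario table above.
SCENARIO_TAGS = {
    "LS-1": "[Class 1/Case 6]",
    "LS-2": "[Class 1/Case 20]",
    "LS-3": "[Class 1/Case 23]",
    "LS-4": "[Class 1/Case 30]",
    "LS-5": "[Class 1/Case 18]",
    "LS-6": "[Class 2/Case 8]",
    "LS-7": "[Class 2/Case 11]",
    "LS-8": "[Class 2/Case 7]",
    "LS-9": "[Class 3/Case 9]",
    "LS-10": "[Class 3/Case 5]",
    "LS-11": "[Class 3/Case 7]",
    "LS-12": "[Class 4/Case 7]",
    "LS-13": "[Class 4/Case 6]",
    "LS-14": "[Class 4/Case 13]",
    "LS-15": "[Class 4/Case 4]",
}

TAG_PATTERN = re.compile(r"\[Class (\d+)/Case (\d+)\]")


def count_by_class(tags):
    """Count scenarios per class number; raise if a tag is malformed."""
    counts = Counter()
    for scenario_id, tag in tags.items():
        match = TAG_PATTERN.fullmatch(tag)
        if match is None:
            raise ValueError(f"Malformed tag for {scenario_id}: {tag!r}")
        counts[int(match.group(1))] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for cls, n in count_by_class(SCENARIO_TAGS).items():
        print(f"Class {cls}: {n} scenario(s)")
```

A class with no scenarios at all would be a prompt to revisit that part of the analysis for the UCA in question.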
Step 5: How Do We Make It Safe? (Establishing Countermeasures)
Finally, countermeasures that can fundamentally block the identified scenarios are established as system requirements.
- [CM-1] Cross-validation Interlock Logic: Integrates with gas leak anomaly symptom monitoring to completely block automatic entry approval upon detecting errors.
- [CM-4] Hard-coded Hierarchical Safety Interrupt: Grants the highest authority when life-threatening alarms (e.g., exhaust system failure) occur, indefinitely suspending operations.
- [CM-8] Digital Twin Piping Recognition Forced Lock: Triggers a system-level permit lock if the RFID information of newly added valves does not match the central drawings.
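To make the idea behind [CM-4] concrete, here is a minimal Python sketch of a fixed-priority decision rule in which a life-threatening alarm always preempts a production request. The priority levels, the `PendingInput` type, and the returned action strings are all assumptions made for illustration. The countermeasure itself calls for a hard-coded, hardware-backed interrupt, which application code like this only approximates.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical levels; a real system would define its own alarm classes.
    PRODUCTION_REQUEST = 1
    LIFE_THREATENING_ALARM = 2


@dataclass(frozen=True)
class PendingInput:
    description: str
    priority: Priority


def arbitrate(pending):
    """Decide what to do with the currently pending inputs.

    Any life-threatening alarm preempts everything else and suspends
    operations; production urgency can never outrank it.
    """
    if any(p.priority == Priority.LIFE_THREATENING_ALARM for p in pending):
        return "SUSPEND_OPERATIONS"
    return "CONTINUE_PERMIT_REVIEW"


if __name__ == "__main__":
    # The LS-4 situation: an alarm and an urgent request arrive together.
    inputs = [
        PendingInput("urgent maintenance request", Priority.PRODUCTION_REQUEST),
        PendingInput("exhaust system failure alarm", Priority.LIFE_THREATENING_ALARM),
    ]
    print(arbitrate(inputs))
```

The point of the design is that the ordering is fixed in the logic rather than left to a manager's judgment under schedule pressure.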
The full countermeasure list is mapped to the corresponding Loss Scenarios so that each scenario traces to a mitigation.
ID | Priority | Countermeasure | Description | Traceability |
CM-1 | 1 | Cross-validation Interlock Logic | Integrates with gas leak anomaly symptom monitoring to completely block automatic entry approval upon detecting errors. | LS-1 |
CM-2 | 1 | Hardware-level Fail-Safe Default Design | Upon loss of sensor feedback, ignores the previous state and immediately defaults to a 'Danger (Access Prohibited)' state. | LS-2 |
CM-3 | 1 | Forced Network Isolation and Auto-Reboot in Diagnostic Mode | Applies a test mode timer; unconditionally forces a return to real-time operation mode upon timeout. | LS-3 |
CM-4 | 1 | Hard-coded Hierarchical Safety Interrupt | Grants the highest scheduling authority to life-threatening alarms (e.g., exhaust system failure) and suspends operations indefinitely. | LS-4 |
CM-5 | 1 | Multi-variable Sensor Fusion (AND Gate) | Approves 'Safe' logic only when complex conditions (purge/pump, etc.) are met, rather than relying on a single pressure sensor. | LS-5 |
CM-6 | 1 | Alarm Triage and Independent Relay Indicator Network | Assigns critical hazard signals to hardware warning lights isolated from general monitors and prevents operators from muting them. | LS-6 |
CM-7 | 1 | Manual Override Lockdown | Overrides existing 0% digital sensor readings upon manual field reports (e.g., emergency buttons), enforcing the highest-priority forced lockdown control. | LS-7 |
CM-8 | 1 | Digital Twin Piping Recognition Forced Lock | Triggers a system-level permit lock if the RFID information of newly added valves does not match the central drawings. | LS-8 |
CM-9 | 1 | Deterministic Arbiter Gate | Outputs an unconditionally conservative measure (e.g., total lockdown) via hardware logic combinations when conflicting commands are received at the device. | LS-9 |
CM-10 | 1 | Dedicated Independent Fieldbus Network Allocation for Safety | Physically separates the safety permit packet network to prevent interference from high-volume traffic such as general firmware updates. | LS-10 |
CM-11 | 1 | Timestamp Sequencing Discard (TTL) | Forces fail-safe discarding of arriving packets that have exceeded their timeout to prevent command reversal caused by communication buffer delays. | LS-11 |
CM-12 | 1 | Normally Closed (NC) Relay Integrated Watchdog | Treats chipset freezing in standby mode as a missing survival heartbeat, cutting power and forcibly maintaining a door-closed state. | LS-12 |
CM-13 | 1 | Physical Tension-based Position Feedback Sensor | Measures actual latch corrosion/friction in addition to door closure sensors; triggers immediate lockdown if the probability of loosening/separation increases. | LS-13 |
CM-14 | 1 | Noise-Shielded Optical Line Conversion & Enhanced CRC | Designs optical cables for high-frequency interference process zones and installs structures to instantly block corrupted received data packets. | LS-14 |
CM-15 | 1 | Hardware-Isolated Check Valve Forced Vent Configuration | Designed to completely block gas backflow using a mechanical damper (Check Valve) even if the internal S/W sensor malfunctions and reverses rotation. | LS-15 |
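As one worked illustration, the logic behind CM-11 (Timestamp Sequencing Discard) can be sketched in a few lines of Python. The receiver drops any command that is older than a time-to-live window or older than the last command it applied, which is the situation described in LS-11. The class and field names, the 5-second TTL, and the assumption of synchronized clocks are all hypothetical choices for this sketch, not details of an actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PermitCommand:
    action: str       # "permit" or "cancel"
    issued_at: float  # seconds on the sender's clock (assumed synchronized)


class PermitReceiver:
    """Field terminal that ignores expired or out-of-order commands."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.state = "locked"  # assumed default until a valid permit arrives
        self.last_issued_at = float("-inf")

    def receive(self, cmd, now):
        """Apply cmd and return True, or discard it and return False."""
        if now - cmd.issued_at > self.ttl_seconds:
            return False  # expired: older than the TTL window
        if cmd.issued_at <= self.last_issued_at:
            return False  # issued before the command already applied
        self.last_issued_at = cmd.issued_at
        self.state = "permitted" if cmd.action == "permit" else "locked"
        return True


if __name__ == "__main__":
    receiver = PermitReceiver(ttl_seconds=5.0)
    now = 100_000.0
    # A fresh permit arrives first.
    receiver.receive(PermitCommand("permit", issued_at=now - 0.5), now=now)
    # Yesterday's delayed 'cancel' arrives one second later and is discarded.
    stale = PermitCommand("cancel", issued_at=now - 86_400.0)
    applied = receiver.receive(stale, now=now + 1.0)
    print(receiver.state, applied)
```

In this run the stale cancellation is rejected and the freshly approved permit stays in effect. A cancellation issued after the permit and within the TTL window would still be applied.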
In this way, STPA maps out the big picture of the system, systematically probes the blind spots that can arise in interactions between components, and derives safety measures that trace back to specific scenarios.

<VisualPro: The Official MIT-Certified STPA Analysis Tool>
By systematically conducting analyses with VisualPro, you can easily follow along and perform seemingly complex STPA analyses without omissions. Through its intuitive UI and automated traceability management, anyone can perform expert-level STPA analysis. Recently, an AI Chat feature was added, assisting in STPA analysis to make it much easier and more accurate.
Experience easy-to-understand STPA analysis with VisualPro.
Modern systems, such as smart factories, autonomous driving, aerospace, and aviation, have become incomparably more complex than in the past. Rather than simple component failures, there are now many cases where 'incorrect interactions between system components' lead to major accidents.
To control the complexity of these modern systems, an innovative safety analysis technique developed at MIT, known as STPA (Systems Theoretic Process Analysis), is utilized. Today, we will explore how to conduct an STPA analysis focusing on 'Semiconductor Fab Safety' using our VisualPro software.
Note: The risk-related scenarios and analysis content in this material are fictional, created with the assistance of AI, and are not related to any specific real-world organization or individual.
Step 1: What Must Be Prevented? (Defining Losses and Hazards)
The first step is to identify situations that must not occur in the system—Losses—and the system states that lead to those losses—Hazards.
In this analysis, the following losses were identified:
ID
Loss
Traceability
Loss-1
Loss of life and injury (toxic gas leaks, explosions, occupational diseases, physical safety accidents)
H-1, H-2, H-3, H-5
Loss-2
Environmental pollution (atmospheric dispersion of harmful gases, unauthorized discharge of toxic wastewater, soil contamination)
H-1, H-4
Loss-3
Financial loss and equipment damage (damage to expensive equipment due to fire/explosion, mass product disposal and recalls)
H-1, H-2, H-3, H-4, H-5
Loss-4
Mission (production) failure (full factory shutdown due to accidents, failure to secure yield, supply chain disruption)
H-1, H-2, H-4, H-5
As the causes of these losses, the following hazards were defined:
ID
Hazard
Traceability
H-1
Leakage of highly toxic and corrosive chemicals
Loss-1, Loss-2, Loss-3, Loss-4
H-2
Accumulation of flammable/explosive gases in confined spaces and exposure to ignition sources
Loss-1, Loss-3, Loss-4
H-3
Neutralization of shielding and interlocks for high-energy sources (radiation, high temperature, etc.)
Loss-1, Loss-3
H-4
Bypassing of hazardous pollutant purification facilities and untreated discharge
Loss-2, Loss-3, Loss-4
H-5
Non-compliance with Lockout/Tagout (LOTO) procedures during hazardous work and neglect of defects
Loss-1, Loss-3, Loss-4
Step 2: How Is the System Controlled? (Control Structure Modeling)
This step visualizes the commands and feedback exchanged between system components. Using VisualPro, even complex diagrams can be drawn with ease.
In the semiconductor fab control structure, there are broadly four Controllers managing the subordinate Semiconductor Manufacturing Process & Pollution Prevention Facilities (Controlled Process):
Diagramming this Master structure results in the following:
Name
Field Operator and Maintenance Personnel
Type
Human
Level
1
Responsibilities
[None]
Process Model
[None]
Control Action
Target of Control Action
Manual operation of chemical equipment & Implementation of LOTO (Lockout/Tagout)
Semiconductor Manufacturing Process and Pollution Prevention Facilities
Feedback
Target of Feedback
Reporting of safety blind spots and hazards (Near-misses)
Top Management and Corporate EHS Organization
Name
On-site Integrated System and Safety Manager
Type
Controller
Level
1
Responsibilities
[None]
Process Model
[None]
Control Action
Target of Control Action
Automated equipment Interlock & Emergency Off (EMO)
Semiconductor Manufacturing Process and Pollution Prevention Facilities
Feedback
Target of Feedback
Reporting of on-site diagnostic data & Safety expenditure status
Top Management and Corporate EHS Organization
Name
Semiconductor Manufacturing Process and Pollution Prevention Facilities
Type
Controlled Process
Level
1
Control Action
Target of Control Action
External discharge of treated wastewater and exhaust gas
Surrounding Ecosystem (External Factors) and Local Community
Feedback
Target of Feedback
Physical pressure gauge indication & Emergency siren alarm
Field Operator and Maintenance Personnel
Real-time data streaming (Gas concentration, vibration, etc.)
On-site Integrated System and Safety Manager
Name
Government and Regulatory/Standards Bodies
Type
External Controller
Level
1
Control Action
Target of Control Action
Enforcement of legal compliance & Corrective orders
Top Management and Corporate EHS Organization
Feedback
Target of Feedback
[None]
[None]
Step 3: Which Control Actions Create Risks? (UCA Identification)
Now, it is time to identify Unsafe Control Actions (UCAs). We analyze the risks that occur when a control action is not provided, is provided incorrectly, or has incorrect timing.
Through VisualPro, we were able to identify critical UCAs such as the following:
The full list of derived UCAs is mapped out accordingly.
Control Action (Source -> Target)
Level
UCA Flag
UCA ID
Description
Assumption
SOP Issuance & Work Permit System Operation
(Top Management & Corporate EHS Organization -> Field Operators & Maintenance Personnel)
1
Not providing causes hazard
(UCA-N-1)
Failure to issue a mandatory work permit requiring a preliminary risk assessment, leaving the operator to conduct inspections exposed to danger [H-1, H-2]
Providing causes hazard
(UCA-P-2)
Arbitrary incorrect permit approval due to production schedule pressure even though gas shut-off is incomplete [H-1, H-2]
Too early, too late, out of order
(UCA-T-3)
Allowing entry too early before the residual toxic gas purge is complete [H-1]
Stopped too soon, applied too long
(UCA-S-4)
Leaving the existing permit approval state active for too long without re-evaluation despite changes in the hazardous environment [H-1]
Enforcement of Legal Compliance & Corrective Orders
(Government & Regulatory/Standards Bodies -> Top Management & Corporate EHS Organization)
1
Not providing causes hazard
Providing causes hazard
Too early, too late, out of order
Stopped too soon, applied too long
Automated Equipment Interlock & Emergency Off (EMO)
(On-site Integrated System & Safety Manager -> Semiconductor Manufacturing Process & Pollution Prevention Facilities)
1
Not providing causes hazard
(UCA-N-5)
Failure to provide Interlock/EMO emergency control even after detecting signs of a gas leak, leaving the leak unattended [H-1, H-2]
Providing causes hazard
(UCA-P-6)
Forcible EMO activation due to sensor malfunction during the process, shutting off even essential cooling systems [H-1]
Too early, too late, out of order
(UCA-T-7)
Emergency activation is too late due to leak notification relay delays, resulting in fatal concentration levels being exceeded [H-1]
Stopped too soon, applied too long
(UCA-S-8)
Releasing the safety shut-off too early while risk factors are still unresolved, causing a secondary leak [H-1]
Issuance of Safety Goals & Direction for Field Supervision
(Top Management & Corporate EHS Organization -> On-site Integrated System & Safety Manager)
1
Not providing causes hazard
Providing causes hazard
Too early, too late, out of order
Stopped too soon, applied too long
External Discharge of Treated Wastewater & Exhaust Gas
(Semiconductor Manufacturing Process & Pollution Prevention Facilities -> Surrounding Ecosystem & Local Community)
1
Not providing causes hazard
Providing causes hazard
Too early, too late, out of order
Stopped too soon, applied too long
Manual Operation of Chemical Equipment & LOTO Implementation
(Field Operators & Maintenance Personnel -> Semiconductor Manufacturing Process & Pollution Prevention Facilities)
1
Not providing causes hazard
(UCA-N-9)
Proceeding with equipment opening without physically locking out (LOTO) the hazardous energy valves [H-3, H-5]
Providing causes hazard
(UCA-P-10)
Applying LOTO to the wrong valve or mistakenly locking the scrubber exhaust valve where ventilation is essential [H-3, H-5]
Too early, too late, out of order
(UCA-T-11)
Implementing the LOTO lock too late, long after entering the work environment, leading to exposure from initially emitted gases [H-5]
Stopped too soon, applied too long
(UCA-S-12)
Releasing the LOTO lock too early before checking the surroundings while a colleague is still inside, causing a gas/power leak [H-5]
Step 4: Why Did Such UCAs Occur? (Loss Scenario Discovery)
We trace the causes (scenarios) explaining why the Unsafe Control Actions (UCAs) inevitably occurred. This includes all aspects such as system defects, sensor errors, and operator misunderstandings.
The complete list of Loss Scenarios related to the UCAs is generated for thorough analysis.
Control Action
SOP Issuance & Work Permit System Operation
Level
1
UCA-N-1
Failure to issue a mandatory work permit requiring a preliminary risk assessment, leaving the operator to conduct inspections exposed to danger
ID
Loss Scenario
LS-1
[Class 1/Case 6] Permit omission due to blind spots in rule-based logic that ignores exceptional situations
LS-1-1
The EHS system determines 'simple lighting/filter replacement' as a general task not requiring a preliminary risk assessment. Even though there were signs of a minor hydrofluoric acid gas leak the previous day, the system fails to recognize it and omits the permit.
LS-2
[Class 1/Case 20] Permit omission due to blind faith in past default safety states in the absence of feedback
LS-2-1
When the gas concentration sensor fails to operate due to a communication disconnection, the manager blindly trusts the past state, assuming it is safe because a purge was done yesterday, and verbally proceeds without a permit.
LS-3
[Class 1/Case 23] Failure to send the actual on-site permit because the server was left in simulation mode
LS-3-1
The EHS server remains in 'diagnostic mode' after a weekend inspection, mistaking a silane pipe repair permit request for test data and omitting issuance to the actual site.
LS-4
[Class 1/Case 30] Ignoring the risk assessment by prioritizing the production team's urgent request during multiple input conflicts
LS-4-1
As exhaust system failure alarms and urgent maintenance requests from the production team arrive simultaneously, a manager feeling psychological pressure prioritizes the production team and bypasses the permit procedure.
LS-5
[Class 1/Case 18] Exclusion from risk assessment target by misinterpreting pressure drop feedback as an empty gas cylinder
LS-5-1
A sudden drop in gas cabinet pressure is misinterpreted as 'the operator has already closed the valve' instead of a gas leak, thereby invalidating the review logic.
LS-6
[Class 2/Case 8] Failure to recognize fatal shielded area opening feedback due to alarm flood overload
LS-6-1
Due to a flood of tens of thousands of sensor alarms during a factory maintenance period, critical alarms such as the opening of a shielded door are missed, and the permit issuance procedure is not initiated.
LS-7
[Class 2/Case 11] Ignoring ambiguous field anomaly reports due to excessive blind faith in machine readings
LS-7-1
When a field operator's radio report of an ammonia smell conflicts with the system data showing 0%, the mechanical readings are absolutely trusted, ignoring the field judgment, and personnel are dispatched.
LS-8
[Class 2/Case 7] Overlooking the risk of gas leaks in bypass piping by applying an outdated process model (old drawings)
LS-8-1
Despite receiving operator feedback, the referenced drawing is an outdated process model, leading to the failure to recognize and address the gas leak risks in newly bypassed piping.
LS-9
[Class 3/Case 9] Both commands are rejected as control actions from different sources collide at the field terminal
LS-9-1
An 'access control' command from a smart pad and a 'maintenance permit' message from the center collide at the door terminal, and both are rejected, creating a control gap.
LS-10
[Class 3/Case 5] Permit packet is lost due to being overwritten by a broadcast message on the wireless communication network
LS-10-1
At the moment a specific work instruction packet is transmitted, a factory-wide firmware update broadcast saturates the communication network, and the packet is lost.
LS-11
[Class 3/Case 7] Reversal due to a past 'cancel' command arriving after the latest 'permit' due to a communication buffer delay
LS-11-1
Due to a buffer delay, an old 'permit cancellation' packet from yesterday is processed 1 second after the new approval signal, so the permit ends up canceled on-site.
LS-12
[Class 4/Case 7] A screen door control chip stuck in standby mode lets a door open that should be blocked
LS-12-1
Zone lockdown should be the default upon error, but the door control chipset gets stuck in standby mode, ignores the system's access block command, and allows unrestricted entry.
LS-13
[Class 4/Case 6] Physical unlocking outside the central control logic due to latch (component) damage caused by corrosive gas
LS-13-1
The central server assumes the door is locked because no permit was issued, but the actual on-site locking latch has been corroded by chemical fumes and the door comes open.
LS-14
[Class 4/Case 13] Corruption of received permit data due to radio-frequency (RF) electromagnetic noise interference
LS-14-1
The permit data packet on the field pad is corrupted by strong noise generated during the operation of high-power plasma etching equipment. The operator mistakes it for a standby delay and begins disassembly without waiting for the permit.
LS-15
[Class 4/Case 4] Standby team exposure due to self-triggered emergency exhaust caused by errors such as internal sensor failure
LS-15-1
While the central review is pending, a secondary sensor inside the cabinet malfunctions and autonomously triggers emergency exhaust, causing gas to flow back onto the work team waiting outside the door for the permit.
Step 5: How Do We Make It Safe? (Establishing Countermeasures)
Finally, countermeasures that can fundamentally block the identified scenarios are established as system requirements.
Each example countermeasure below is traced to its corresponding Loss Scenario, so that no scenario is left without a countermeasure.
ID | Priority | Countermeasure | Description | Traceability |
CM-1 | 1 | Cross-validation Interlock Logic | Integrates with gas leak anomaly symptom monitoring to completely block automatic entry approval upon detecting errors. | LS-1 |
CM-2 | 1 | Hardware-level Fail-Safe Default Design | Upon loss of sensor feedback, ignores the previous state and immediately defaults to a 'Danger (Access Prohibited)' state. | LS-2 |
CM-3 | 1 | Forced Network Isolation and Auto-Reboot in Diagnostic Mode | Applies a test mode timer; unconditionally forces a return to real-time operation mode upon timeout. | LS-3 |
CM-4 | 1 | Hard-coded Hierarchical Safety Interrupt | Grants the highest scheduling authority to life-threatening alarms (e.g., exhaust system failure) and suspends operations indefinitely. | LS-4 |
CM-5 | 1 | Multi-variable Sensor Fusion (AND Gate) | Reports 'Safe' only when multiple conditions (purge, pump, etc.) are all met, rather than relying on a single pressure sensor. | LS-5 |
CM-6 | 1 | Alarm Triage and Independent Relay Indicator Network | Assigns critical hazard signals to hardware warning lights isolated from general monitors and prevents operators from muting them. | LS-6 |
CM-7 | 1 | Manual Override Lockdown | Overrides existing 0% digital sensor readings upon manual field reports (e.g., emergency buttons), enforcing the highest-priority forced lockdown control. | LS-7 |
CM-8 | 1 | Digital Twin Piping Recognition Forced Lock | Triggers a system-level permit lock if the RFID information of newly added valves does not match the central drawings. | LS-8 |
CM-9 | 1 | Deterministic Arbiter Gate | Outputs an unconditionally conservative measure (e.g., total lockdown) via hardware logic combinations when conflicting commands are received at the device. | LS-9 |
CM-10 | 1 | Dedicated Independent Fieldbus Network Allocation for Safety | Physically separates the safety permit packet network to prevent interference from high-volume traffic such as general firmware updates. | LS-10 |
CM-11 | 1 | Timestamp Sequencing Discard (TTL) | Forces fail-safe discarding of arriving packets that have exceeded their timeout to prevent command reversal caused by communication buffer delays. | LS-11 |
CM-12 | 1 | Normally Closed (NC) Relay Integrated Watchdog | Treats chipset freezing in standby mode as a missing heartbeat, cutting power and forcibly maintaining a door-closed state. | LS-12 |
CM-13 | 1 | Physical Tension-based Position Feedback Sensor | Measures actual latch corrosion/friction in addition to door closure sensors; triggers immediate lockdown if the probability of loosening/separation increases. | LS-13 |
CM-14 | 1 | Noise-Shielded Optical Line Conversion & Enhanced CRC | Uses optical cabling in process zones with high-frequency interference and immediately rejects received data packets that fail the CRC check. | LS-14 |
CM-15 | 1 | Hardware-Isolated Check Valve Forced Vent Configuration | Uses a mechanical damper (check valve) to completely block gas backflow, even if an internal software or sensor malfunction causes reverse rotation. | LS-15 |
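To make one of these countermeasures concrete, here is a minimal Python sketch of the idea behind CM-11, applied to the LS-11 situation where yesterday's 'cancel' arrives after today's 'permit'. It is an illustration under stated assumptions, not VisualPro output or a production design: the names `Command`, `DoorTerminal`, and `MAX_AGE_S` are hypothetical, the 5-second timeout is an arbitrary placeholder, and the sketch assumes the sender stamps each command with an increasing sequence number and that sender and terminal clocks are synchronized.

```python
from dataclasses import dataclass

MAX_AGE_S = 5.0  # assumed timeout (TTL); a real value would come from the analysis


@dataclass(frozen=True)
class Command:
    seq: int          # increasing sequence number assigned by the sender (assumption)
    issued_at: float  # sender timestamp in seconds
    action: str       # "permit" or "cancel"


class DoorTerminal:
    """Hypothetical field terminal that applies only fresh, in-order commands."""

    def __init__(self) -> None:
        self.last_seq = -1
        self.state = "locked"  # default state: no permit in effect

    def receive(self, cmd: Command, now: float) -> bool:
        """Apply cmd and return True, or discard it and return False."""
        if now - cmd.issued_at > MAX_AGE_S:
            return False  # expired: the packet is older than the timeout
        if cmd.seq <= self.last_seq:
            return False  # out of order: a newer command was already applied
        self.last_seq = cmd.seq
        self.state = "permitted" if cmd.action == "permit" else "locked"
        return True


# LS-11 replayed: a day-old cancel is processed 1 second after a fresh permit.
terminal = DoorTerminal()
now = 100_000.0
terminal.receive(Command(seq=42, issued_at=now - 0.2, action="permit"), now)
terminal.receive(Command(seq=17, issued_at=now - 86_400.0, action="cancel"), now + 1.0)
```

In this sketch the stale cancel is rejected by both checks, so the terminal keeps the newer permit. The sequence check alone would catch the reordering; the timeout additionally bounds how long any delayed packet can remain valid.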
In this way, STPA maps out the big picture of the system, systematically examines the blind spots that can arise in the interactions between components, and derives safety measures that trace back to specific loss scenarios.
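The traceability between Loss Scenarios and countermeasures in Step 5 can also be checked mechanically. The following Python sketch shows one way to do this; it is not a description of how VisualPro works internally. The function names `uncovered` and `dangling` are hypothetical, and the data simply mirrors the one-to-one CM-to-LS mapping listed in Step 5.

```python
# Scenario and countermeasure IDs as listed in Step 5 (CM-n traces to LS-n).
scenarios = [f"LS-{i}" for i in range(1, 16)]
countermeasures = {f"CM-{i}": [f"LS-{i}"] for i in range(1, 16)}


def uncovered(scenarios, countermeasures):
    """Return scenarios that no countermeasure traces to, in input order."""
    covered = {ls for traced in countermeasures.values() for ls in traced}
    return [ls for ls in scenarios if ls not in covered]


def dangling(scenarios, countermeasures):
    """Return (CM, LS) pairs whose traced scenario ID does not exist."""
    known = set(scenarios)
    return [
        (cm, ls)
        for cm, traced in countermeasures.items()
        for ls in traced
        if ls not in known
    ]


missing = uncovered(scenarios, countermeasures)
broken = dangling(scenarios, countermeasures)
```

With the Step 5 data both lists come back empty. Removing a countermeasure, or mistyping a scenario ID in its Traceability field, would make the corresponding entry show up in `missing` or `broken`.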
<VisualPro: The Official MIT-Certified STPA Analysis Tool>
By conducting analyses systematically with VisualPro, you can work through seemingly complex STPA analyses step by step without omissions. Its intuitive UI and automated traceability management let anyone perform expert-level STPA analysis. A recently added AI Chat feature assists with STPA analysis, making it easier and more accurate.
Experience easy-to-understand STPA analysis with VisualPro.