Process Safety Incident Investigations

Process incidents can involve a loss of containment, a loss of control, a challenge to a safety system or a failing in the application of a management system. Investigating these incidents is not the same as investigating an occupational incident, although there is some overlap. To get the most out of a process safety incident investigation, a known, structured approach is essential. Many organisations have a system that is “tuned” for occupational safety, but not for process safety. Here are some important points for each phase of the investigation process.

An investigation process can be broken down into four main phases:

  1. Incident categorisation
  2. Investigation of root causes
  3. Implementing actions
  4. Follow-up

Incident Categorisation - If a process safety event is not recorded and categorised, it will not be investigated. Rather than having a single category of process safety event, it is recommended to split events into meaningful categories that include releases (of hazardous substances), activation of critical safeguards, loss of control and violations of key procedures/systems. Front line staff must be able to understand how to recognise and categorise process safety events, as they are the ones responsible for logging them.
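
As a rough illustration, the four categories above could be captured in a logging tool as a handful of yes/no questions that front line staff can answer. This is a minimal sketch only: the names `ProcessSafetyCategory` and `categorise_event` are hypothetical, and the order in which the questions are checked is an assumption, not part of any standard.

```python
from enum import Enum
from typing import Optional


class ProcessSafetyCategory(Enum):
    """The four event categories named in the text."""
    RELEASE = "release of hazardous substance"
    SAFEGUARD_ACTIVATION = "activation of critical safeguard"
    LOSS_OF_CONTROL = "loss of control"
    PROCEDURE_VIOLATION = "violation of key procedure/system"


def categorise_event(
    hazardous_release: bool = False,
    safeguard_activated: bool = False,
    loss_of_control: bool = False,
    procedure_violated: bool = False,
) -> Optional[ProcessSafetyCategory]:
    """Return the first matching category, or None if no question is answered yes.

    Assumption: when several answers are yes, a release takes priority,
    then safeguard activation, then loss of control, then procedure violation.
    """
    if hazardous_release:
        return ProcessSafetyCategory.RELEASE
    if safeguard_activated:
        return ProcessSafetyCategory.SAFEGUARD_ACTIVATION
    if loss_of_control:
        return ProcessSafetyCategory.LOSS_OF_CONTROL
    if procedure_violated:
        return ProcessSafetyCategory.PROCEDURE_VIOLATION
    return None


# Example: a high-level trip that activated with no release
category = categorise_event(safeguard_activated=True)
print(category.value if category else "not a process safety event")
```

Keeping the questions this simple is deliberate: the person logging the event should not need to interpret a standard to pick the right category.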

Investigation of Root Causes - Once an event is categorised as a process safety incident, the correct investigation team needs to be formed (not just the HSE advisor and/or operator). If you have the luxury of a process safety engineer, this person should be part of the team by default. The investigation technique needs to be appropriate for the event, and there are many available, such as ICAM or the 5 Whys. Pick your method and ensure the root cause analysis drives down beyond the technical causes and identifies the systemic causes. For example, the activation of a trip (which is logged as an event) may reveal weaknesses in maintenance systems or SOPs.

Implementing Actions - To prevent a recurrence, actions are needed to address the direct cause (e.g. the level control failed), similar potential causes (e.g. other level controls) and the systemic causes (e.g. the maintenance procedure for level control systems). The actions chosen should be verified by both the investigation team and an independent person outside of line management. I have come across several cases where action was taken only on the direct cause, to make the incident "easy" to close out.
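
One simple way to catch a "direct cause only" close-out is to check that the action list covers all three cause levels before the incident can be closed. The sketch below assumes each action is tagged with the cause level it addresses; `CauseLevel`, `Action` and `uncovered_cause_levels` are illustrative names, not taken from any particular incident management system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CauseLevel(Enum):
    DIRECT = "direct cause"
    SIMILAR = "similar potential causes"
    SYSTEMIC = "systemic causes"


@dataclass(frozen=True)
class Action:
    description: str
    addresses: CauseLevel  # the cause level this action is meant to fix


def uncovered_cause_levels(actions: List[Action]) -> List[CauseLevel]:
    """Return the cause levels that no action addresses, in enum order."""
    covered = {action.addresses for action in actions}
    return [level for level in CauseLevel if level not in covered]


# Example: only the failed level control has been dealt with
plan = [Action("Repair the failed level control", CauseLevel.DIRECT)]
for level in uncovered_cause_levels(plan):
    print(f"No action yet for: {level.value}")
```

A check like this does not judge whether an action is any good; that still needs the investigation team and the independent reviewer. It only flags a plan that stops at the direct cause.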

Follow-up - It is important that front line staff are kept in the loop with the investigation and actions taken. This will encourage reporting and engagement on process safety matters.

Other points to note for good investigation practice:

  1. Incidents are reported within 24 hours of the event
  2. Near misses are reported and categorised
  3. Incident severity is ranked for both actual and potential
  4. All substances that could lead to serious harm should be reported - not just those included in the regulations.
  5. API RP 754 gives good guidance on what can be considered a process safety incident:
    • Tier 1 and 2 represent a loss of containment of hazardous materials
    • Tier 3 represents an activation of critical controls (e.g. PSVs or SIL rated systems)
    • Tier 4 represents failure to follow key management systems (e.g. MoC, PtW etc.)
  6. All incidents with the potential of serious harm or greater are to follow a formal root cause process – led by a trained person. The decision to use root cause analysis is NOT to be based on perceived risk.
  7. Action completion dates are to be monitored by senior management as a KPI and the action implementers are held to account to meet these dates.
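
Points 3, 6 and 7 can be sketched in a few lines. The example below assumes a hypothetical four-step severity scale (substitute your own risk matrix) and uses illustrative names throughout. It records both actual and potential severity, triggers a formal root cause analysis on potential severity alone, and lists open actions that have passed their due date so they can be reported as a KPI.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum
from typing import List, Optional


class Severity(IntEnum):
    # Hypothetical scale, for illustration only
    MINOR = 1
    MODERATE = 2
    SERIOUS = 3
    MAJOR = 4


@dataclass
class Incident:
    title: str
    actual: Severity     # what did happen
    potential: Severity  # what could credibly have happened


def requires_formal_rca(incident: Incident) -> bool:
    """Formal RCA is triggered by potential severity, not by the actual outcome."""
    return incident.potential >= Severity.SERIOUS


@dataclass
class TrackedAction:
    description: str
    owner: str
    due: date
    completed_on: Optional[date] = None


def overdue_actions(actions: List[TrackedAction], today: date) -> List[TrackedAction]:
    """Return open actions whose due date has passed."""
    return [a for a in actions if a.completed_on is None and a.due < today]


# Example: a near miss with no actual harm still requires a formal RCA
near_miss = Incident("PSV lifted during start-up", Severity.MINOR, Severity.SERIOUS)
print(requires_formal_rca(near_miss))
```

Basing the trigger on the `potential` field rather than `actual` is the point of the sketch: an event that caused no harm but could have caused serious harm still gets the formal process.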
