Process Safety and OT Security – A Symbiotic Relationship

Sinclair

In my previous article about the security triad, I briefly mentioned process safety. Some readers asked that process safety be included as one of the elements of the triad. In this follow-up article, I explain in more detail why I believe that the traditional triad’s components of availability and integrity effectively fulfill the process safety prerequisites of an Industrial Control System (ICS). I will also explore an entirely different triad. To do this, I need to adopt a process engineering standpoint and highlight the fundamental elements of process design and operation.

These fundamental elements are operability, observability, and controllability. They are key concepts in process engineering, particularly in the design, operation, and control of complex processes such as those in a petrochemical plant, and they are essential for the efficient and safe operation of chemical plants, manufacturing facilities, and other industrial processes.

It was Marina Krotofil and Jason Larsen who recognized, over 10 years ago, how important these concepts were for securing a chemical plant. In several key papers and presentations on the topic, they explained the importance of these three elements and even created their own security triad, COO. Their papers had a significant impact on my perspective regarding OT security because I share the belief that these three concepts are the core elements of a cyber-physical system that require protection against a cyber attack. Let’s begin by discussing each of these elements.

  1. Controllability:
    • Definition: Controllability refers to the degree to which a process can be controlled or manipulated to achieve desired outcomes. A controllable process can be adjusted and regulated effectively to maintain process variables within specified ranges. These specified ranges are part of what is known as the operating window, which is an element of the next concept, operability.
    • Importance: Controllability is essential for maintaining process stability and meeting product quality and production targets. Processes with good controllability are easier to optimize, respond well to disturbances, and can adapt to changing conditions. Loss of controllability results in processes going out of control, which can lead to hazardous conditions and possible accidents. For example, a chemical plant could experience runaway reactions or explosions if control is compromised.
  2. Operability:
    • Definition: Operability refers to the ability of a system or process to function effectively and efficiently under normal operating conditions. It involves ensuring that the process can be started, operated, and shut down safely and smoothly without any major issues.
    • Importance: Operability is crucial because it directly affects the reliability and productivity of a process. A well-designed and operable process minimizes downtime, reduces the risk of accidents, and maximizes resource utilization. Loss of operability can compromise safety systems, including critical functions such as emergency shutdown systems, potentially increasing the risk of safety incidents. Operability and physical constraints are interconnected in the context of production processes, requiring process and automation design to stay within the equipment, dynamics, resource, and technological limitations.
  3. Observability:
    • Definition: Observability is the capability to monitor and gather data about the state and performance of a process or system in real time. It involves the use of sensors, instruments, data collection, and presentation techniques to track variables and parameters relevant to the process.
    • Importance: Observability is vital for process operation and troubleshooting. It allows operators and engineers to detect deviations from the desired operating conditions, identify potential problems, and make informed decisions to maintain or improve process performance. The loss of observability in an industrial or process control system can result in the inability to monitor and gather real-time data about the system’s state and performance. This can delay the detection of process deviations, equipment failures, or abnormalities, including those caused by a cyber attack, making it challenging to respond promptly and effectively. The consequences are reduced process efficiency, increased downtime, compromised safety, and potential financial losses because issues cannot be identified and addressed in a timely manner.

Process safety is fully covered by these three concepts, and so are reliability and productivity. Maintaining operability, controllability, and observability is the first task in protecting a cyber-physical system. These three concepts are also part of the process automation function. Each is essential to the other two, but they have a good deal of autonomy. They are implemented in various ways in the ICS.

Operability

Operability in an ICS is reflected in various ways. It is the control configuration, which sets the ranges, rates, algorithms, and dynamics of control loops. It’s also seen in the instrumentation configuration, signal conditioners, and the selected control valves. Additionally, operability includes aspects like setting appropriate alarm priorities and limits to avoid excessive or misleading alarms.
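To make the point about alarm limits concrete, here is a minimal sketch of a configuration check that verifies that alarm limits are ordered consistently and fall inside the instrument range. The names `AlarmLimits` and `validate_limits` and the four-level limit scheme are assumptions for illustration; they do not come from any particular DCS or vendor tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AlarmLimits:
    """Hypothetical alarm configuration for one process variable."""
    low_low: float
    low: float
    high: float
    high_high: float


def validate_limits(limits: AlarmLimits, range_min: float, range_max: float) -> List[str]:
    """Return a list of configuration problems (empty if the limits are consistent)."""
    names = ["range_min", "low_low", "low", "high", "high_high", "range_max"]
    values = [range_min, limits.low_low, limits.low,
              limits.high, limits.high_high, range_max]
    problems = []
    # Each value must be strictly below the next one in the chain.
    for i in range(len(values) - 1):
        if not values[i] < values[i + 1]:
            problems.append(
                f"{names[i]} ({values[i]}) must be below {names[i + 1]} ({values[i + 1]})"
            )
    return problems
```

A check like this could run whenever the control configuration changes, so that a limit moved outside the operating window, whether by mistake or by an attacker, is flagged before it misleads an operator.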

The design of the Human-Machine Interface (HMI) in the automation system is crucial for operability. The HMI should be user-friendly, intuitive, and provide operators with the necessary information and controls for effective process monitoring and management.

Additionally, ensuring safety is a fundamental part of operability. Emergency Shutdown (ESD) functions configured within the Safety Instrumented Systems (SIS) play a critical role in maintaining safe conditions.

Redundancy in critical ICS components and failover mechanisms is essential to guarantee uninterrupted operation in the event of automation component failures, and where feasible, ensure a seamless transfer of control between the redundant components. This contributes to both operability and system reliability.

Operability includes continuous process optimization functions and tuning of the control functions. These activities help maintain efficient and stable operations, ensuring that the system operates at its best, realizing the quality goals needed.

Controllability

Controllability manifests itself in several ways within an ICS. It begins with the design, configuration, and tuning of control loops. This includes the selection and implementation of various control strategies, such as feedback, feedforward, and cascade control, aimed at regulating process variables within their predefined ranges. It also means ensuring that actuators (valves, motors, etc.) respond quickly and accurately to control signals, which is achieved through proper sizing, the use of positioners, and selecting the correct type of valve.
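As an illustration of feedback control, the simplest of the strategies mentioned above, the following sketch regulates a toy tank level with a PI controller whose output is clamped to the valve range. The process model, the gains, and all names (`PIController`, `simulate`) are assumptions chosen for the example, not a real plant or vendor algorithm.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Minimal PI controller with output clamping (illustrative only)."""
    kp: float
    ki: float
    out_min: float = 0.0    # valve fully closed (%)
    out_max: float = 100.0  # valve fully open (%)
    integral: float = 0.0

    def update(self, setpoint: float, measurement: float, dt: float) -> float:
        error = setpoint - measurement
        candidate = self.integral + error * dt
        output = self.kp * error + self.ki * candidate
        if self.out_min <= output <= self.out_max:
            # Only accumulate the integral while the actuator is not saturated.
            self.integral = candidate
            return output
        return min(max(output, self.out_min), self.out_max)


def simulate(steps: int = 200, dt: float = 1.0) -> float:
    """Toy process: level rises with valve opening and drains in proportion to level."""
    controller = PIController(kp=2.0, ki=0.1)
    level = 20.0
    for _ in range(steps):
        valve = controller.update(setpoint=50.0, measurement=level, dt=dt)
        level += dt * (0.05 * valve - 0.04 * level)
    return level
```

With these assumed parameters the level settles at the setpoint. If the controller's gains, the setpoint, or the clamping limits were altered, the same loop would no longer hold the variable in range, which is what loss of controllability means in practice.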

Observability

Observability within an ICS takes various forms, including HMI stations for process operators and engineers, alarm annunciator panels for operators, fire and gas alarm panels, visual and audible alarm functions, historical data and trend displays, graphical interfaces, detail, group, and overview displays that logically cluster control loops for a better overview of a control strategy, system status and alarm displays, dashboards, status indicators, wall panels, and more.

Each of these functions is critical for the production process and must be safeguarded through cybersecurity measures to prevent deviations from the intended design and operation. Availability is a key requirement, as controllability, operability, and observability rely on the availability of these elements. For instance, the absence of HMI functionality diminishes observability, while a failed controller or sensor affects operability. Likewise, a malfunctioning valve positioner leads to a loss of controllability. Therefore, availability is a fundamental requirement for COO.

The same can be said for integrity, or more precisely, system integrity. The main difference between system integrity and data integrity is their scope. System integrity addresses the overall reliability and correctness of the entire control system and network, including its hardware, software, and configuration. It ensures that the system operates as intended and has not been compromised in any way.

Data integrity, on the other hand, is a narrower concept that focuses specifically on the accuracy and reliability of the data itself, regardless of the system’s overall integrity. It ensures that data remains unchanged, accurate, and uncorrupted. One of the seven foundational security requirements of IEC/ISA 62443 is system integrity. In the IT world, the focus is more often on data integrity. While system integrity and data integrity are distinct concepts, they are closely related because the integrity of the control system and network (system integrity) ultimately affects the integrity of the data it stores and processes (data integrity).

Operability depends on system integrity; without it, the operating window in the control system might no longer reflect the physical dimensions of the installation. Observability also relies on system integrity; if system integrity is compromised, the operator’s display may no longer accurately represent the physical conditions in the process. For example, a pump that appears as ‘running’ in a graphic display may actually be ‘stopped.’ Equally, controllability is affected by system integrity. Control loops, which are essential for controllability, depend on reliable system components, including sensors, positioners, actuators, and communication systems. System integrity ensures these components are functioning correctly. If these components are compromised, control loops may become unreliable, affecting the ability to control the process effectively.
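The pump example suggests one simple way to notice a loss of observability: compare what the display claims with an independent measurement. The sketch below does this for a pump status and a discharge flow reading. The function name, the field names, and the 5.0 threshold are assumptions for illustration. A real plausibility check would depend on the specific process and instrumentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PumpSnapshot:
    """Hypothetical snapshot of one pump as seen by the operator."""
    displayed_status: str   # "running" or "stopped", as shown on the HMI
    discharge_flow: float   # reading from an independent flow transmitter


def check_pump_plausibility(snapshot: PumpSnapshot,
                            min_running_flow: float = 5.0) -> Optional[str]:
    """Return a description of the mismatch, or None if display and flow agree."""
    flowing = snapshot.discharge_flow >= min_running_flow
    if snapshot.displayed_status == "running" and not flowing:
        return "display shows running but flow is below threshold"
    if snapshot.displayed_status == "stopped" and flowing:
        return "display shows stopped but flow is above threshold"
    return None
```

Such a cross-check only helps if the second measurement travels a path the attacker has not also compromised, which is why it complements, rather than replaces, protecting system integrity.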

So, COO requires IA, where ‘I’ represents system integrity, which includes data integrity. However, we still need to address the ‘C’ (Confidentiality) aspect of the security triad. Loss of confidentiality does not directly impact COO; it is more of a business requirement. Nevertheless, if the confidentiality of login credentials is compromised, it poses an indirect threat to COO. Additionally, if the industrial property of the plant is breached, it can have a direct business impact.

As I discussed in my previous blog, the sequence of the elements in CIA (Confidentiality, Integrity, and Availability) is not fixed; it is determined by risk criteria and can vary among different plants, even if they have the same production process.

So, what are the most important aspects of OT security? Well, for me, it’s maintaining Controllability, Operability, and Observability, just as Jason Larsen and Marina Krotofil defined over 10 years ago. Is one more important than the other? No, you need all three to run a plant. If one fails, the production process fails. Each of these three elements is defined during the ICS implementation, and each element is essential to the other two, but they also operate with a good deal of autonomy.

If my goal were to damage the production or physical installation of a plant using a cyber attack, these are the three elements I would target. However, if I needed to protect the production process and process equipment of the plant, I would prioritize maintaining system integrity and availability of the ICS functions. Were I to protect the business in general, I would also need to consider the protection of confidentiality.

Nevertheless, by maintaining COO, the aspects of process safety, productivity, and reliability are comprehensively addressed. Therefore, focusing on OT security and conducting risk analysis for cyber-physical systems with respect to the COO and CIA triads encompasses the entire scope of the protection task. When building a defense, we must begin by identifying the target and then surround it with layers of protection. The inner layers typically depend on the outer layers. In a cyber-physical system, the ultimate target is the process installation and its associated processes, making this our starting point. In my book on what I’ve termed ‘Deep Defense,’ there are nine layers to consider. It begins with the ultimate target, the process installation, and concludes with an outer layer focused on ‘Personnel,’ which encompasses human factors, with digital and organizational layers in between. Such a layered concept of security demands that risk be based on conditional probability, because these rings of protection have interdependencies. Conditional probability differs from “normal” probability: it follows different rules and calls for a different approach. It focuses on defense strength, expressed as the probability of (defense) failure if attacked (PFA). PFA bears some relationship to the Probability of Failure on Demand (PFD) used in process safety, though PFD is based on reliability and is therefore a traditional, non-conditional probability. For more on this topic, see my S4 presentation.
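To show why interdependent layers call for conditional probability, the general chain rule can be written out for a layered defense. This is a generic illustration, not the method from the book or the S4 presentation. The symbol $F_i$ (the event that layer $i$ fails when attacked) and the numbers in the example are assumptions.

```latex
% Probability that an attack defeats all n layers, outermost (1) to innermost (n):
P(F_1 \cap F_2 \cap \dots \cap F_n)
  = \prod_{i=1}^{n} P\left(F_i \mid F_1 \cap \dots \cap F_{i-1}\right)

% Only if the layers were independent would this reduce to
P(F_1 \cap \dots \cap F_n) = \prod_{i=1}^{n} P(F_i)

% Hypothetical two-layer example with P(F_1) = 0.1:
%   treated as independent, P(F_2) = 0.05:      0.1 \times 0.05 = 0.005
%   dependent, P(F_2 \mid F_1) = 0.5:            0.1 \times 0.5  = 0.05
```

In the hypothetical example, an inner layer that relies on the outer one is far more likely to fail once the outer layer has fallen, so treating the layers as independent would understate the risk by a factor of ten.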
