The OT security skills gap

Sinclair

We often talk about an OT security skills gap, but what skills are required to secure an industrial control system? To answer that question we first need to define what an industrial control system (ICS) is. I have seen many different definitions; for me, an ICS is first of all a system of systems. That is easily said, but what are those systems in the ICS?

To start with, I must say that I am not so happy with the term industrial control system. My problem is the word control, because an ICS has more functions than just a control function. I agree that control is the primary task of an ICS, but an ICS also has a process safety function and various diagnostic functions, such as machine monitoring systems (MMS) that monitor the health and condition of, for example, pumps, turbines, and compressors; the most common of these is the vibration monitoring system. We also have compressor control systems, burner management systems, continuous emission monitoring systems, metering systems, advanced process control systems, asset management systems, analyzer management systems, and so on. Some do control, but the majority do not.

I therefore prefer the term process automation system (PAS), because that is what this bundle of systems does: with all these specialized functions it automates the production process. In today’s PAS many of these functions exchange data and often depend on each other. Main automation contractors (MACs) have several of these systems in their own product portfolio, but seldom all of them, so they also integrate systems of other suppliers into the PAS.

But it is not just the production process that needs to be automated. Supporting functions also interface with this PAS: systems that track plant personnel, smart camera surveillance functions that zoom in on process control equipment when a process alarm occurs, systems that manage a range of smart wearables, and telecom functions. Different security zones, different network segments, wired and wireless networks, sometimes satellite connections and in other cases laser connections: in today’s larger process automation systems it is all there. And it is all computer-based equipment, so in principle vulnerable to cyber-attacks.

Operators no longer have a simple console with a few screens to monitor the process, the way we started in the mid-1970s. Today’s consoles have multiple large screens, each with many windows showing the data of the various functions. Some control rooms look quite spectacular, with tens of meters of display walls; for example, Aramco’s OSPAS operation has a 67-by-3 meter display wall showing every aspect of their oil operations. So process automation today, and its cyber security, is a lot more than a few SCADA monitors and PLCs. Many different automation functions from different suppliers are combined and presented to the various process operation roles. Some of the larger projects require thousands of equipment cabinets to cover all functions, creating very complex systems where interconnectivity and interdependencies play a critical role in securing them.

If I look at technical skills to build and secure today’s automation systems I see three major technical disciplines:

  • A process engineering design discipline;
  • A process automation design discipline;
  • An OT infrastructure design discipline.

The next diagram shows this in an abstract way.

Figure 1- The process automation design roles

The process design engineer understands all the details of the production process: its chemical reactions, its integrity operating window, its process safety risks, how to respond to the various process events, and much more. Often there are different specialisms for control, safety, and optimization, each with its own specialists. The process design engineer typically also has a high-level understanding of the automation functions, but would usually not be the person who configures these systems, though in smaller plants some engineers wear many hats.

The automation design engineer understands how the various automation functions work, how to configure them, which functions exchange data, and what data is needed to execute the control and safety actions. These engineers typically follow a series of training courses on how to configure the different automation functions. They are capable of converting the process design requirements into an automation function that executes the process actions necessary to produce the plant’s products, and into a process safety function that intervenes if the process gets into an unsafe state. Some employers, however, see training courses as a luxury, and learning on the job is the practice of the day. The risk is that engineers miss hidden configuration settings that are not immediately relevant to functional performance but are potentially crucial for cyber security performance.

And we have the OT infrastructure design engineer. This engineer understands how the network functions work, the capabilities of the various operating system functions of the servers and stations, and how to connect the OT devices such as process controllers, PLCs, and RTUs. These engineers are not involved in creating the application-level functions, but they do understand what the functions do and what data exchange is required. It is the world of certificates that seem to suggest knowledge, but in reality this knowledge is very confined and often doesn’t cover hands-on capabilities.

In large plants these roles are generally separated; in smaller organizations some of them are combined and either partially or fully overlap. But how about the OT security engineer? Where does this role fit?

Often the OT infrastructure design engineer also gets assigned the responsibility for cyber security. This is because in many organizations cyber security is limited to what I call network security: network segmentation, hardening servers and stations, and configuring access control devices, protocol converters, code integrity protection solutions, and detection functions. Their knowledge of the production process is limited if they are employed by the asset owner, and almost fully absent if they are non-site-resident employees of an external service provider. So cyber security is reduced primarily to network security, often overlooking the automation function-specific settings that control, for example, what a process operator can do, and overlooking dependencies and intrinsic trust relationships.

Authorizations are primarily domain user authorizations; they do not cover whether an operator can disable a process alarm, modify an alarm limit, or change the travel rate of a valve. This type of security authorization needs the knowledge of an automation design engineer.
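To make the distinction concrete, here is a minimal Python sketch of the two authorization layers. Everything in it is hypothetical: the permission names, the `FunctionRole` class, the domain group names, and the role definitions are illustrations, not the configuration model of any real automation product. It only shows that domain group membership alone says nothing about what a user may do inside the automation function.

```python
from dataclasses import dataclass

# Hypothetical function-level permissions. Real automation products define
# their own names and granularity for these settings.
DISABLE_ALARM = "disable_alarm"
MODIFY_ALARM_LIMIT = "modify_alarm_limit"
CHANGE_VALVE_TRAVEL_RATE = "change_valve_travel_rate"


@dataclass(frozen=True)
class FunctionRole:
    """A role as configured inside one automation function (illustrative)."""
    name: str
    domain_group: str                      # membership is managed by the domain
    permissions: frozenset = frozenset()   # managed inside the automation function


def is_allowed(role: FunctionRole, user_groups: set, action: str) -> bool:
    """Allow an action only if both layers agree."""
    in_domain_group = role.domain_group in user_groups
    has_permission = action in role.permissions
    return in_domain_group and has_permission


# Hypothetical role definitions for illustration only.
OPERATOR = FunctionRole("operator", "OT-Operators")
SUPERVISOR = FunctionRole(
    "supervisor", "OT-Supervisors", frozenset({MODIFY_ALARM_LIMIT})
)
```

In this sketch an operator who is a valid domain user still cannot disable an alarm, because that permission has to be granted inside the automation function itself. Deciding which of these permissions each role needs is the part that requires automation design knowledge.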

Figure 2- OT infrastructure design role (Left – asset owner-employee, Right – external service provider)

The above diagram shows the limitations of the OT infrastructure role. Of course, it can be worse: not every service provider has knowledge of all the automation functions, and the depth of this knowledge can vary significantly. Understanding, for example, the trust relations that exist between system components requires in-depth knowledge of the functions and hands-on experience implementing them. These are details that are mostly known only by the suppliers of these products and by engineers who followed the training courses for these products (that is, if they paid attention to the security aspects).

Figure 3- Insufficient skills to create a cyber-resilient process automation system

I already mentioned that the automation design engineer also has a set of security responsibilities/skills that are of importance for the overall level of security resilience of the process automation system.

I mentioned as an example the authorizations of the different roles embedded in the automation function. For example, there are roles that operate the production process, supervisor roles, automation engineer roles, maintenance roles, process safety roles, and application engineering roles.

The more functions, the more roles. At a high level the authentication and authorization rights are controlled by a central Active Directory function, but at the detail level this is often not the case. For example, access to a safety control function might require authentication and authorization specific to that function and not under the control of the domain. Similar examples exist for machine monitoring applications, process historian applications, asset management applications, etc. The security capabilities of these authentication and authorization functions differ widely, and if the engineer is not familiar with them they often remain at their factory default settings, leading to insecure situations. Typically the only role with sufficient knowledge of the automation function to oversee these details is the automation design engineer.

During the realization of greenfield (built from scratch) projects, the automation design engineer is typically a role filled by the main automation contractor, for example one of the major suppliers and integrators of automation functions. This role works closely together with the process design role (typically an engineering, procurement, and construction (EPC) company or the asset owner) and the OT infrastructure design engineer.

The skill profile of the automation design engineer is shown below. An automation design engineer needs process design knowledge, must be capable of converting a piping and instrumentation diagram (P&ID) into a working automation solution, must be capable of writing a safety instrumented function (SIF) that shuts down the production process if an unsafe situation occurs, and must be knowledgeable about the configuration settings that control read/write access to process parameters and devices. This role needs to understand the security hazards these functions can pose to the production and automation process if they were compromised.

Figure 4- Skill profile automation design engineer

Though the responsibility for OT security is often combined with the OT infrastructure design role, in the OT world this is in my opinion less logical, because it is the automation design engineer who has the wider overview of the overall business functions in the system. If OT were like IT, so primarily data manipulation, it would make sense to put the lead with OT infrastructure design. But because OT is not only data manipulation but also initiates various control actions that need to operate within a restricted operating window, it makes sense to give automation design this coordinating role. Automation design oversees all three skill elements and has more detailed knowledge of the production process than the OT infrastructure design role. It is very comparable to cyber security in a bank, where the lead role is linked to the overall business process and infrastructure security is in a more supportive role.

Finally, there is the process design role. What are the cyber security responsibilities for this role?

First of all, the process design role understands (and developed, in the HAZOP and LOPA exercises) all the process deviations that can lead to trouble. They know what that trouble is, they know how to handle it, and they have set criteria for limiting the risk that this trouble occurs.

All of this is very important input for setting requirements for cyber security resilience, because this is basically the source of all risk, and risk is the source for all security requirements.

I say for all security requirements because even regulatory requirements for worker, societal, and environmental hazards are expressed as event frequencies, which ultimately translate into probability, so risk. That the plant meets these criteria is the responsibility of the process design engineer, so these regulatory requirements are embedded in all design decisions.

Figure 5 - Skill profile of the process design engineer

The whole cyber security design process is very tightly connected with the process safety life cycle. The process of security design has many similarities with the process of safety design, and the results of process safety design are essential input for security design. But the reverse is also true: if a process safety engineer decides that a SIL 2 or SIL 3 safety control (SIL: Safety Integrity Level) is needed to reduce the risk to a level acceptable for the company (and often for the law, if fatalities or environmental damage may occur), security had better guarantee that the failure rate assumed for that SIL 2 or SIL 3 control is not increased by a cyber-attack.

A SIL 3 control basically allows us to reduce the event frequency of a specific failure by a factor of at least 1000; this factor is based on the failure rate and test frequency of the components of the SIL 3 control. Where component failures occur randomly, cyber-attacks don’t. Their frequency of occurrence is far higher than once in a thousand years, so we need to protect the control so that it still meets the failure rate derived from the MTBF (Mean Time Between Failures) of its components.
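A small worked example may help to show how quickly this risk reduction erodes. The Python sketch below is a simplification under stated assumptions: it treats a successful attack as a fixed probability that the safety function is disabled at the moment of demand, independent of random failure, which is not how real attacks behave. The numbers (a random probability of failure on demand of 0.0005 and a 1% chance of compromise) are hypothetical. The SIL bands are the usual low-demand ranges for the average probability of failure on demand (PFD).

```python
def effective_pfd(pfd_random: float, p_compromised: float) -> float:
    """Probability that the safety function fails on demand when it can fail
    either randomly or because an attacker has disabled it (assumed independent)."""
    return 1.0 - (1.0 - pfd_random) * (1.0 - p_compromised)


def sil_band(pfd: float) -> int:
    """Low-demand SIL band for an average PFD (0 means worse than SIL 1)."""
    if pfd >= 1e-1:
        return 0
    if pfd >= 1e-2:
        return 1
    if pfd >= 1e-3:
        return 2
    if pfd >= 1e-4:
        return 3
    return 4


if __name__ == "__main__":
    pfd_random = 5e-4      # hypothetical SIL 3 design value (risk reduction 2000)
    p_compromised = 0.01   # hypothetical chance the function is disabled by an attack
    pfd = effective_pfd(pfd_random, p_compromised)
    print(f"design band: SIL {sil_band(pfd_random)}")
    print(f"effective PFD: {pfd:.6f}, risk reduction factor: {1 / pfd:.0f}")
    print(f"effective band: SIL {sil_band(pfd)}")
```

With these assumed numbers the effective PFD is roughly 0.0105, a risk reduction factor of about 95, so a control designed as SIL 3 performs in the SIL 1 band. The exact figures matter less than the direction: even a small probability of compromise dominates the random failure rate.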

In the end, the cause of an incident is not the criterion. A fatality due to a random failure in the process installation is weighed on the same legal scale, against the same criteria, as a fatality caused by a security attack. In the courtroom, the judge would weigh the risk reduction measures the same way as for process safety, using the same ALARP (As Low As Reasonably Practicable) / ALARA (As Low As Reasonably Achievable) criteria. Liability statements can protect companies and their engineers against financial damage, but never against gross failures causing fatalities, irreversible injuries, or environmental damage.

Figure 6 - Risk analyst profile

That leads me to the final role, the risk analyst. A risk analyst of course needs to understand how to measure risk, and needs some mathematical tools and methods to do this. But most of all the role needs to get all the necessary information from the three disciplines and construct scenarios to analyze, so this role needs to understand a bit about all disciplines. Risk scenarios that ignore automation design and process design address cyber security risk only partially, and are a potential source of the gross failures I mentioned above.

Such a partial analysis is a step forward in securing the system, but it is a useless exercise when building a plant. The cyber-attack scenarios need to extend into the process safety scenarios and business scenarios to be of value for understanding the risk and aligning the cyber security resilience with the viable threats.

Cyber-attack scenarios therefore need to specify in detail the functional deviation they cause. A loss of view cannot be linked directly to a specific process deviation and process consequence; a deviation from the integrity operating window can. So impacts that merely specify loss of control or loss of view are useless for a risk assessment. They cannot be linked to a specific cyber-attack on, for example, an industrial furnace or reactor. You don’t have to develop Industroyer or Trisis software to cause a plant trip; there are easier ways. Attack software such as Industroyer and Trisis has the capability to cause physical damage to the process equipment and as such the indirect capability to cause fatalities, irreversible injuries, environmental damage, and more.
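As an illustration of what "specific enough" could mean, the following Python sketch records a scenario as a chain from attacked function to functional deviation, process deviation, and consequence, and rejects scenarios that stop at a generic impact. The `AttackScenario` structure, the `is_analyzable` check, and the furnace example are all hypothetical; they are one possible way to encode the idea, not a method taken from a standard.

```python
from dataclasses import dataclass, fields

# Generic impact labels that cannot be tied to a process consequence.
GENERIC_IMPACTS = frozenset({"loss of view", "loss of control"})


@dataclass(frozen=True)
class AttackScenario:
    """Hypothetical record linking an attack to its process consequence."""
    target_function: str        # the automation function that is attacked
    functional_deviation: str   # what the function does wrong as a result
    process_deviation: str      # resulting deviation in the production process
    consequence: str            # safety, environmental, or business consequence


def is_analyzable(scenario: AttackScenario) -> bool:
    """A scenario is usable for risk analysis only if every link in the chain
    is filled in and the functional deviation is not a generic impact label."""
    values = [getattr(scenario, f.name).strip() for f in fields(scenario)]
    if not all(values):
        return False
    return scenario.functional_deviation.strip().lower() not in GENERIC_IMPACTS


# Illustrative examples only.
specific = AttackScenario(
    target_function="burner management system",
    functional_deviation="flame-failure trip bypassed",
    process_deviation="unburned fuel accumulates in the furnace",
    consequence="furnace explosion",
)
generic = AttackScenario("operator console", "Loss of view", "", "")
```

Here `is_analyzable(specific)` holds while `is_analyzable(generic)` does not, because the second scenario never reaches a process deviation or consequence that a process design engineer could assess.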

If we don’t protect the process automation systems against these risks, the engineering community will be accountable for this failure: asset owners, MACs, and EPCs alike. The only category that might escape is the consultants on the sideline, who tell us how to do things but seldom have hands-on experience of cyber security engineering in a plant.
