In a post that I wrote last week I described the need for detailed knowledge about a vulnerability so that an owner could make a reasonable decision as to whether or not to apply a patch to a vulnerable control system component. I have received a couple of comments and questions about what should be included in that patch decision-making process. In the end it comes down to a business decision, but here are some of the things that I think should enter into that process.
Security or Non-Security
The first thing that must be determined is whether the firmware or software patch or upgrade (and I am lumping all of that together for this discussion) is being offered to fix a security problem, to fix a non-security program bug, or just to provide new features. In this discussion I will be ignoring the last case; the addition of new features requires an entirely different set of evaluations.
For non-security related bugs, the question that has to be asked is whether the bug has had, or reasonably could be expected to have, an adverse effect on the manufacturing process at the facility. If the answer is no, management should probably decide to implement that patch during the next scheduled turnaround as part of normal facility maintenance activity.
Security patches are, of course, a different matter and that forms the discussion to follow.
All patches need to be tested to ensure that the revised device remains compatible with the remainder of the automation system. This is less of a problem if the system was bought as a whole piece from a single vendor and no changes have been made since installation. Unfortunately, there are very few of that type of installation and perhaps none that are more than a year or two old. Changes, tweaks and additions are just too common in industrial control systems.
A facility (or its contract integrator) should have a test bed available that duplicates the control system environment so that changed components can be tested for compatibility with the remainder of the system. If this is not available, it is probably advisable to patch only during turnarounds; otherwise you risk shutting down a functioning control system due to incompatibility issues.
Many security vulnerabilities only affect certain operations of a device/program or may only be accessed via a single route/port. If those particular operations are not used at the facility, putting off the patch until turnaround may be a reasonable option. If the affected port is not used and can be turned off so that it cannot be accessed, again postponing the update is probably an acceptable option.
In either case the automation files must document that the particular patch has not been applied. Further, special attention should be paid to the affected functionality or port during system monitoring to ensure that no one has modified the system in a way that makes the vulnerability exploitable.
The patch should still be added to the turnaround maintenance list as a priority item. The fact that a vulnerability does not apply to the current system implementation does not mean it will stay that way; there is no way of telling when the vulnerable functionality or port may be needed in changes to the facility automation system. Having forgotten the necessity of a patch will not be an acceptable excuse when an attacker successfully uses a known vulnerability in an attack on the system.
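The record keeping described above can be sketched in a few lines of Python. This is only an illustration of what such an entry might hold, not a description of any particular facility's system; the class name, field names and both helper functions are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class DeferredPatch:
    """One entry in the automation files for a patch that was postponed.

    All field names are hypothetical; adapt them to local record keeping.
    """
    component: str      # device or program the patch applies to
    patch_id: str       # vendor's patch or advisory identifier
    reason: str         # why postponement was judged acceptable
    decided_on: date    # when the decision to postpone was made
    # Unused functions or disabled ports that monitoring must keep an eye on.
    watch_items: List[str] = field(default_factory=list)
    # Deferred security patches stay priority items for the next turnaround.
    turnaround_priority: bool = True


def turnaround_list(records):
    """Order deferred patches for turnaround: priority items first, oldest first."""
    return sorted(records, key=lambda r: (not r.turnaround_priority, r.decided_on))


def monitoring_watch_list(records):
    """Collect every function or port that needs special monitoring attention."""
    return sorted({item for r in records for item in r.watch_items})
```

Keeping the watch items in the same record as the deferral reason means that the monitoring list and the turnaround list come from one source, which makes it harder for a postponed patch to be forgotten.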
Many times a vulnerability notice will provide specific mitigation measures that can be applied to protect the device against exploitation of the vulnerability. Where these mitigation measures can be employed without making changes to the automation system (changes that would have to be vetted on the test bed the same way that a patch would), they may be used to justify avoiding immediate application of the patch.
Conventional wisdom would seem to indicate that a full risk analysis should be conducted when deciding when (or even if) a patch should be applied. I think that this will lead to overthinking the situation. Most facility owners will not have any knowledge of specific threats against their facility. For those that are aware of a specific threat, a full risk analysis may be indicated.
For the remainder of facilities, the risk analysis is fairly simple. In the current environment, if a vulnerability has been publicized, then the facility must assume that there is a risk that someone could employ the vulnerability in an attack on the system. The only thing that matters then is the potential consequences of a successful exploit. This calls for a detailed systems analysis of how the device interacts with the remainder of the system. If the results of a successful attack on the device and its affected systems are acceptable to organization management in the short run, then putting off the implementation of the patch until the next turnaround could be acceptable.
There are some vulnerability effects, however, that cannot be tolerated in any control system. Postponing the implementation of a patch for these types of vulnerabilities should almost never be done. These include (short list) vulnerabilities that:
• Allow admin level access to the device or system;
• Allow access to other system components;
• Affect safety systems; or
• Affect systems involved in regulatory reporting.
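Taken together, the short list above and the earlier conditions for postponement amount to a simple triage. The following Python sketch shows one way to write that down. It is a deliberate simplification: the class, its field names and the two result strings are hypothetical, the "almost never" in the text is treated as "never", and the final fall-through to prompt patching is an assumption rather than a rule stated above.

```python
from dataclasses import dataclass


@dataclass
class VulnerabilityAssessment:
    """Answers gathered during the systems analysis (hypothetical field names)."""
    # Effects that cannot be tolerated in any control system.
    grants_admin_access: bool = False
    reaches_other_components: bool = False
    affects_safety_systems: bool = False
    affects_regulatory_reporting: bool = False
    # Conditions that may justify waiting for the next turnaround.
    affected_function_unused: bool = False
    affected_port_disabled: bool = False
    mitigation_without_system_change: bool = False
    consequences_acceptable_short_term: bool = False


def patch_timing(v):
    """Return 'patch promptly' or 'defer to turnaround' for one vulnerability."""
    intolerable = (
        v.grants_admin_access
        or v.reaches_other_components
        or v.affects_safety_systems
        or v.affects_regulatory_reporting
    )
    if intolerable:
        # Simplification: the text says postponement should "almost never" happen.
        return "patch promptly"
    may_defer = (
        v.affected_function_unused
        or v.affected_port_disabled
        or v.mitigation_without_system_change
        or v.consequences_acceptable_short_term
    )
    if may_defer:
        # Still belongs on the turnaround list as a priority item.
        return "defer to turnaround"
    # Assumption: with no justification for waiting, patch after test bed vetting.
    return "patch promptly"
```

For example, a vulnerability that allows admin level access returns "patch promptly" even when the affected port is disabled, because the intolerable effects are checked before any reason for postponement is considered.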
Most facilities in this country only have risks that affect the owner and employees of the facility. A short-term outage, or even a shutdown of the facility, is only going to have at most local economic effects. Managers of these types of facilities may have a much higher degree of risk tolerance in making the risk decisions that I mentioned above.
Other facility managers, because of their product or processes, may also take societal risk into account when they make those risk decisions. A hospital administrator must take into account the possible effects on patient health and even lives. An electrical power producer must take into account the effect on the grid of a facility shutdown or even variations in production. A chemical manufacturing facility must take into account the effects of a possible chemical discharge on the surrounding community. An airplane or automobile owner must take into account the potential effects of an accident on both passengers and impacted personnel.
In the cases where societal risk is involved it is hard to argue that postponing patch implementation is ever acceptable. Only weighing the risks associated with the application of the patch (usually loss of service) against those of not applying the patch can justifiably be used to delay patch application.
It could be argued that where societal risk is involved there is a duty to inform the affected portions of society so that they are fully aware of the decision that was made on their behalf. At the very least regulatory agencies should require that they be notified of any such decisions. They should also be given authority (with appropriate legal safeguards) to override such decisions that present an unreasonable risk to the public.