Tuesday, November 23, 2010

Stuxnet Man in the Middle Attack

Ralph Langner has provided some of the most detailed looks at the internal workings of the Stuxnet worm. The Symantec people have provided a great deal of data on the worm, but their expertise has been focused on the Windows side of the worm’s operation. That, after all, reflects the nature of their typical cybersecurity work. Ralph has been providing insights into the operations on the Siemens side of the process.

Ralph’s blog has been active on this topic since August, but he has been posting blog entries daily (sometimes twice a day) since November 11th. His postings have been vocal in their criticism of both Siemens and ICS-CERT for the lack of information that they have shared with Siemens users, suggesting that the reason is that neither organization actually knows much about the workings of Stuxnet.

In addition to these complaints, he has been providing a detailed look at the Stuxnet 315 and 417 attack codes. These are the two separate sections of the Stuxnet code that actually cause changes in PLC programming. These two attack modes are so different in their operation that Ralph suggests that they are actually targeted at two completely separate targets in the Iranian nuclear industry.

Cybersecurity professionals should be concerned about the method of operation of both of these attack codes, but as a process chemist I am much more concerned with the methodology used in the 417 attack code. So much so that I have to admit that this attack mode professionally scares the hell out of me. This is because it compromises the chemical process professional’s trust in, and reliance on, data provided by the process control system.

Description of Stuxnet 417 Operations

Ralph takes a number of blog posts (11-15-10, 11-16-10a, 11-16-10b, and 11-18-10) to properly describe the mode of operation for the man in the middle attack. To truly understand the operation you should read all four of these posts by Ralph. I’ll provide a brief summary here (a summary that Ralph is in no way responsible for).

An automated control system takes inputs from various process measurements, compares them to the rules provided by the process professional, and automatically directs process equipment to respond appropriately. For instance, a typical process parameter is temperature. Inputs come from various temperature measurement devices in the process. If the measured temperatures are too low, valves are automatically opened to allow heat (via steam, hot water, etc.) to flow into the process; when the process reaches the appropriate temperature range, the valves are automatically closed to stop the application of heat. At the same time, the human machine interface (HMI) visually displays the temperature and the process response for that temperature to the human overseer of the process.
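The temperature example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real controller is programmed; the limit value and the names `LOW_LIMIT_C`, `steam_valve_command`, `hmi_line` and `scan` are all made up for this post.

```python
LOW_LIMIT_C = 78.0  # hypothetical bottom of the acceptable temperature range


def steam_valve_command(measured_temp_c: float) -> bool:
    """Rule from the process professional: heat only while the process is too cold."""
    return measured_temp_c < LOW_LIMIT_C


def hmi_line(measured_temp_c: float, valve_open: bool) -> str:
    """Text the human overseer would see for one reading."""
    state = "OPEN" if valve_open else "CLOSED"
    return f"temp={measured_temp_c:.1f} C  steam valve={state}"


def scan(measured_temp_c: float) -> tuple[bool, str]:
    """One pass of the loop: read the input, apply the rule, update the display."""
    valve_open = steam_valve_command(measured_temp_c)
    return valve_open, hmi_line(measured_temp_c, valve_open)


for reading in [75.0, 77.9, 80.0]:
    print(scan(reading)[1])
```

The point to notice is that the control decision and the HMI display both depend on the same measured value. Everything that follows turns on that shared dependence.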

As Ralph describes the Stuxnet 417 operation, the worm passively records process measurement inputs and outputs. When it executes the process changes programmed into the attack, it replays the normal process data for display in the HMI, so that the operator and the process historian see only the data expected in normal operation. This prevents effective human oversight of the process during the attack and proper diagnosis of the process upset after the attack is over.
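To make the record-and-replay idea concrete, here is a toy model in Python. It is my own simplification of the behavior summarized above, not a reconstruction of the Stuxnet code; the class `ReplayingMiddle`, its attributes and the temperature values are invented for illustration.

```python
from collections import deque


class ReplayingMiddle:
    """Toy layer sitting between the real measurement and the HMI/historian."""

    def __init__(self, history_size: int = 3) -> None:
        self.recorded = deque(maxlen=history_size)  # most recent normal readings
        self.replaying = False

    def displayed_value(self, true_value: float) -> float:
        """Return what the operator and the historian are shown."""
        if not self.replaying or not self.recorded:
            # Passive phase: pass the real value through and remember it.
            self.recorded.append(true_value)
            return true_value
        # Replay phase: ignore the real value and cycle through old readings.
        value = self.recorded[0]
        self.recorded.rotate(-1)
        return value


middle = ReplayingMiddle()
shown = [middle.displayed_value(t) for t in [80.0, 80.5, 79.8]]

middle.replaying = True  # the process is now being changed behind the display
true_temps = [85.0, 92.0, 101.0]
shown += [middle.displayed_value(t) for t in true_temps]

print("actual during attack:", true_temps)
print("shown on HMI:        ", shown[3:])
```

In this sketch the actual temperature climbs to 101 while the display keeps cycling through readings near 80. A historian fed from the same stream would store those same stale values.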

Process Implications

One of the biggest benefits of automated process controls in chemical manufacturing is that product quality is much easier to maintain because process variables are held within a narrower range than is possible with active human monitoring of those variables. Destroying that control capability via a man-in-the-middle attack like this could financially ruin many chemical manufacturers; increased raw material costs to remanufacture good product to fulfill orders and waste disposal costs are not easily recovered, especially in today’s economic environment.

From a security perspective the process safety implications are even more important. For many chemical processes there are safety limits for process variables as well as quality limits. One very typical limit is process temperature. There are a large number of processes in which, once a critical temperature limit is exceeded, it is no longer possible to control the chemical reaction that takes place. These runaway reactions can, in some instances, produce process temperatures and pressures that lead to an overpressure situation that the press commonly calls an explosion.

While these catastrophic failures of process vessels are not technically explosions, the effect of such a failure would be the equivalent of a very large explosive device, potentially larger than even a VBIED. I have worked with military grade explosives and I have seen the results of a catastrophic process-vessel failure. I am much more concerned about a large vessel failure than I am with a typical terrorist bomb.

Process professionals typically understand the risks of these overpressure events and take appropriate precautions. Process controls are designed to prevent temperatures from approaching the runaway initiation temperature. Separate automated safety controls are designed to shut down the heat-producing reactions that are typically the cause of these events. Unfortunately, as a cost control measure, it is becoming more and more common for these safety systems to utilize the same process input devices, computer systems, and often the same software as the process control system. Thus the compromise of process data by the Stuxnet 417 attack code could compromise these safety systems as well as the process control system.
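A small Python sketch shows why the shared input matters. The trip limit, the temperatures and the function names are hypothetical, and the sketch assumes the independent sensor path is not itself reached by the compromise.

```python
TRIP_LIMIT_C = 95.0  # hypothetical safety shutdown temperature


def safety_trip(reading_c: float) -> bool:
    """Safety rule: shut the reaction down at or above the trip limit."""
    return reading_c >= TRIP_LIMIT_C


def readings_disagree(a_c: float, b_c: float, tolerance_c: float = 2.0) -> bool:
    """Flag two measurements of the same vessel that differ by too much."""
    return abs(a_c - b_c) > tolerance_c


true_temp_c = 101.0     # what is actually happening in the vessel
reported_temp_c = 80.0  # replayed value handed out by the compromised control system

# Safety system sharing the control system's input: it never sees the problem.
shared_input_trips = safety_trip(reported_temp_c)

# Safety system with its own untouched sensor: it sees the real temperature.
independent_input_trips = safety_trip(true_temp_c)

print("shared input trips:     ", shared_input_trips)
print("independent input trips:", independent_input_trips)
print("readings disagree:      ", readings_disagree(reported_temp_c, true_temp_c))
```

With a shared input the safety logic evaluates the same falsified number as the operator's display, so it has nothing to react to.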

Less than Catastrophic Attacks

Most industrial chemical processes that can result in catastrophic consequences only do so during specific, well-defined portions of the manufacturing process. This means that a Stuxnet-type attack leading to that catastrophic failure will require quite a bit of insider process knowledge. The attacker will need to know the process conditions that would lead to that event, when those conditions could lead to that event, and how to identify that point in the process from information available through the process controller.

Terrorists would be expected to be best served by these high-visibility process failures, but there are other potential attackers that could benefit from less than catastrophic process failures. State actors wanting to disrupt opponents’ critical industrial processes, criminals wanting to extort money, or even other companies trying to obtain a competitive advantage could benefit from a less targeted process upset event. Given recent Al Qaeda propaganda to the effect that they intend to conduct economic warfare in their future attacks, even they could be potential beneficiaries of process control attacks that don’t necessitate insider knowledge.

Langner identifies how this would be done. He notes that all “that needs to be done is to blind the legitimate program along with operators by re-playing normal input signals and manipulate outputs randomly”. Such random process changes are unlikely to cause catastrophic events in any but the most dangerous processes. They would almost certainly cause product quality issues. And Langner points out that it is entirely possible that such an attack “can be packaged into an exploit tool that lets attackers assemble an attack by point-and-click”.

To understand how thoroughly this can disrupt a chemical manufacturer (and to a lesser extent any manufacturer that uses process control software) you have to understand how companies that rely on product quality to market their wares deal with off-spec product. First the off-spec material must be isolated so that it cannot be inadvertently shipped to customers. Next an investigation must determine the cause of the quality issue and identify process changes that will prevent a recurrence of the problem. All of this will have an economic cost associated with the effort.

Since the advent of modern process control one of the key tools that the industrial chemist or engineer uses to diagnose process upsets is the Process Historian. This is a data file that collects information from the control system. If the data in the control system is corrupted, then the data in the Process Historian will be similarly worthless for the diagnosis of the problem.

If the cause of the quality issue cannot be determined by a detailed process data review (and it can’t be if the data is corrupted), most companies will resume production, keeping a closer eye on the process. If subsequent random and undocumented process changes are made, there will be additional quality issues (usually different issues) that will cause the isolate-and-investigate cycle to be repeated with its associated costs. With the actual cause of those problems equally impossible to identify, the manufacturer will eventually have to shut down the process; continued manufacture will be just too expensive.

Prevention is the Key

So, you can see why I am very concerned about the potential for this man-in-the-middle type of attack being used against chemical manufacturing facilities. Similar types of issues are possible in a wide range of industries utilizing process control software. Prevention of these types of attack is key to preventing the occurrence of these quality or safety issues. That means keeping the malware out of the control system is very important. Unfortunately, it seems that this may be very difficult to achieve. I’ll address that in a future post.
