The idea of identifying malware by running it in a controlled execution environment (a “sandbox”) and analyzing its runtime behavior has been around a long time, and this technique is still an important tool in a malware analyst’s toolkit. In recent years there has been renewed interest in sandboxes and, in particular, the idea of detecting customized malware automatically by running executable content in an array of highly instrumented, typically virtualized sandbox environments, observing its runtime behavior programmatically, and trying to decide, without human intervention, whether it is malicious or not. Some such systems connect directly to the network and attempt to extract executable objects from the network data stream and run them in the sandbox environment in “near-real time,” as the objects are flowing over the network.
The allure of this approach is that it has the potential, at least theoretically, to identify fully customized, semi-customized and/or dynamically polymorphic malcode that cannot be detected using traditional, signature-based technologies like anti-virus and intrusion prevention systems. And, frankly, who wouldn’t want to have a magical box that plugs into the network and finds all the malcode, and only the malcode, automatically? I mean, who doesn’t like an easy solution to a difficult problem?
Unfortunately, in the real world, easy solutions to difficult problems are rare as hen’s teeth, and the problem of automatically identifying custom malware, particularly as it is flowing over a network, is no exception. The unfortunate reality is that there are many ways to defeat automated malware analysis systems.
For one thing, most modern threats, particularly those of the “advanced” and/or “persistent” persuasion, are absolutely paranoid about running in a sandbox, a virtual machine, or any kind of non-realistic execution environment, and will simply “play dead” (not exhibit any malicious execution behavior) if they have even the slightest inkling that they’re running in an automated analysis environment.
While some malware will use sophisticated techniques to detect when it’s running in a sandbox environment, some will take a simpler, more passive approach and just “sleep” for a period of time before doing anything bad. This approach can be particularly effective against network-connected sandboxes that are trying to “keep up” with the flow of executable content on the network and can’t afford to wait even a few seconds, let alone minutes or hours, for the malware to exhibit bad behavior.
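To make the timing problem concrete, here is a minimal toy model in Python. It is only a sketch: nothing is actually executed or delayed, and the names `SimulatedSample`, `dormant_seconds`, and `sandbox_verdict`, along with the 5-second budget, are made up for illustration rather than taken from any real product.

```python
from dataclasses import dataclass


@dataclass
class SimulatedSample:
    """A stand-in for an executable object; purely hypothetical."""
    name: str
    dormant_seconds: float  # assumed delay before any bad behavior appears


def sandbox_verdict(sample: SimulatedSample, budget_seconds: float) -> str:
    """Return the verdict a time-limited sandbox would reach in this toy model.

    The sandbox only sees behavior that occurs within its observation budget.
    """
    if sample.dormant_seconds <= budget_seconds:
        return "flagged"
    return "no malicious behavior observed"


if __name__ == "__main__":
    samples = [
        SimulatedSample("immediate", dormant_seconds=0),
        SimulatedSample("sleeper", dormant_seconds=600),
    ]
    for sample in samples:
        # A near-real-time sandbox is assumed to observe for only 5 seconds.
        print(sample.name, "->", sandbox_verdict(sample, budget_seconds=5))
```

In this model the "sleeper" sample goes unflagged under any budget shorter than its dormant period, which is exactly the squeeze a network-connected sandbox faces when it has to keep up with traffic.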
And then there is the problem of replicating the malware’s target client environment. Many modern threats target a particular version of a particular application running on a particular release of a particular operating system at a particular patch level, etc. There are a near-infinite number of combinations, and it would take a near-infinite number of sandboxes to replicate them all. In extreme cases, some malware will actually target a specific user’s client environment (e.g. an environment with a specific Windows login name) and will not execute in any other environment.
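A quick back-of-the-envelope calculation shows how fast the number of environments grows. The counts below are invented purely for illustration (they are not measurements of any real software ecosystem), and the model assumes, simplistically, that every dimension varies independently.

```python
from math import prod

# Hypothetical counts for each dimension of a client environment.
dimensions = {
    "operating system releases": 6,
    "patch levels per release": 20,
    "targeted applications": 10,
    "versions per application": 15,
}

# If the dimensions vary independently, the number of distinct
# environments is the product of the individual counts.
total_environments = prod(dimensions.values())

if __name__ == "__main__":
    for dimension, count in dimensions.items():
        print(f"{dimension}: {count}")
    print("distinct environments to replicate:", total_environments)
```

Even with these modest made-up numbers the product is 18,000 distinct environments, and adding one more dimension (say, browser plug-in versions) multiplies it again.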
But, beyond all the techniques mentioned above, there is a whole class of attacks that implement what I call an “implicit reverse Turing test,” and will not exhibit malicious behavior until they are sure that they are interacting with an actual human being and not another machine. There are many ways to do this - some are active, some are passive - but they all require non-trivial human interaction (i.e. something more than clicking on a URL or opening an email attachment) before they do their dirty work. Take a look at this web page for a few simple examples of simulated malware that illustrate this concept.
Now, don’t get me wrong. I think automated malware analysis has a place in the pantheon of network security tools. But if you’re expecting it to be the panacea that’s going to deliver you from all the evils of advanced, targeted malware, you’re going to be deeply disappointed.






Excellent post! Bulk Malware Analyzers are part of a larger requirement and can't address what was stolen, only how the actor entered the wire. Yes, they are valuable for threat ID, but as an operational best practice they should only be used in a near-line capacity. Putting one in-line would dramatically limit throughput for the network and its users. More importantly, most have a secret hash list used for comparative threat analysis to determine what to look at, which, given the polymorphic nature of current malware, is a severe limiting factor.
Posted by: tellin it like it is | Tuesday, September 27, 2011 at 05:00 PM