Last week I broke the vendors’ code of silence by divulging the dirty little secret that some network security vendors exaggerate the degree to which their products are “content aware.” I also put together a web site you can use to test the “Content IQ” of your favorite network security system yourself.
In this blog post I want to talk about content type. By “content type” I mean the type of file that’s being transported over the network. There are many situations in which a security analyst, incident responder or forensic investigator needs to identify network sessions that contain a certain type of content. For example, you might need to find network sessions that contain Microsoft executable files or Shockwave Flash files embedded in Microsoft Office documents.
Many network security systems, such as firewalls, IPSs, and network forensics systems, determine the type of a file being transferred over the network by looking at its file extension or, for protocols such as HTTP and SMTP that have fields for describing content type, its “MIME type,” which is usually derived directly from the file extension.
In other words, many network security systems “believe” the file type that was asserted by the person who named the file. Not to put too fine a point on it, but in any kind of attack scenario that would be “the bad guy”.
There are a couple of problems with this approach.
Problem 1 is that even the most unsophisticated attacker can very easily “fake” a file type simply by changing the file’s extension (e.g. from “.exe” to “.jpg”).
Problem 2 is that this approach can’t determine the type of a file that’s embedded or contained within another type of file (for example an executable file that’s contained in a zip file, attached to a PDF file, or embedded in a Microsoft Office document) – it will only see the file type of the outermost containing file.
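To make both problems concrete, here is a minimal Python sketch of a classifier that trusts the file name. The `EXTENSION_TO_MIME` table and the `type_from_name` function are hypothetical stand-ins for illustration, not any particular product's logic.

```python
import os

# Hypothetical extension-to-MIME table, standing in for whatever lookup a
# name-trusting system performs.
EXTENSION_TO_MIME = {
    ".exe": "application/x-msdownload",
    ".jpg": "image/jpeg",
    ".zip": "application/zip",
}

def type_from_name(filename: str, data: bytes) -> str:
    """Classify a transfer by file extension alone; `data` is deliberately never inspected."""
    _, ext = os.path.splitext(filename.lower())
    return EXTENSION_TO_MIME.get(ext, "application/octet-stream")

# The same bytes (starting with the DOS "MZ" header) under different names.
payload = b"MZ\x90\x00\x03\x00\x00\x00"

print(type_from_name("update.exe", payload))   # reported as an executable
print(type_from_name("holiday.jpg", payload))  # Problem 1: renamed, reported as an image
print(type_from_name("photos.zip", payload))   # Problem 2: only the outer name is ever seen
```

The classifier's answer changes with the name even though the bytes never do, which is exactly what an attacker relies on.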
To avoid Problem 1, a network security system must be able to identify file types independently by performing binary file type recognition using known file-type signatures (i.e. by looking for certain byte sequences at particular offsets within the transferred files).
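As a rough sketch of what signature-based recognition looks like, the Python below checks a handful of well-known "magic" byte sequences at fixed offsets. The table is a tiny illustrative sample, and the `SIGNATURES` and `sniff_type` names are my own. A production system would use a far larger signature set and extra validation, since short signatures such as the two-byte `MZ` can match by accident.

```python
# Illustrative sample of file-type signatures: (offset, magic bytes, label).
SIGNATURES = [
    (0, b"MZ", "windows-executable"),                            # DOS/PE header
    (0, b"%PDF-", "pdf"),
    (0, b"PK\x03\x04", "zip"),                                   # also OOXML Office files
    (0, b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2-compound"),   # legacy Office documents
    (0, b"\xff\xd8\xff", "jpeg"),
    (0, b"\x89PNG\r\n\x1a\n", "png"),
    (0, b"FWS", "flash"),                                        # uncompressed SWF
    (0, b"CWS", "flash-compressed"),                             # zlib-compressed SWF
    (257, b"ustar", "tar"),                                      # signature not at offset 0
]

def sniff_type(data: bytes) -> str:
    """Return a type label based on the bytes themselves, ignoring any file name."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return "unknown"

# An executable renamed to "holiday.jpg" still starts with "MZ".
print(sniff_type(b"MZ\x90\x00\x03\x00\x00\x00"))
```

Because the file name never enters the function, renaming a file has no effect on the result.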
To avoid Problem 2, a network security system needs to “open up” content containers (such as compressed archives, Microsoft Office documents and PDF files), extract the embedded files, and perform binary file type recognition on each of them. This has to be done recursively so that the system can identify the types of files that are contained within containers within containers within containers…
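The recursive idea can be sketched in a few lines of Python. To stay short, this example only unpacks zip archives (using the standard `zipfile` module) and uses a reduced signature check. The `sniff`, `walk`, and `make_zip` names are hypothetical, and a real system would also need decoders for Office documents, PDF attachments and other container formats, plus size limits to guard against decompression bombs.

```python
import io
import zipfile

def sniff(data: bytes) -> str:
    """Reduced signature check, just enough for this sketch."""
    if data[:2] == b"MZ":
        return "windows-executable"
    if data[:4] == b"PK\x03\x04":
        return "zip"
    if data[:5] == b"%PDF-":
        return "pdf"
    return "unknown"

def walk(data: bytes, name: str = "<stream>", depth: int = 0, max_depth: int = 8):
    """Return (depth, name, type) for this object and everything nested inside it."""
    kind = sniff(data)
    found = [(depth, name, kind)]
    if kind == "zip" and depth < max_depth:  # depth cap: an assumed safety limit
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                for info in zf.infolist():
                    if info.is_dir():
                        continue
                    found.extend(walk(zf.read(info), info.filename, depth + 1, max_depth))
        except (zipfile.BadZipFile, RuntimeError):
            pass  # corrupt or encrypted archive: report the container only
    return found

def make_zip(members: dict) -> bytes:
    """Helper for the demo: build an in-memory zip from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for member_name, content in members.items():
            zf.writestr(member_name, content)
    return buf.getvalue()

# An executable disguised as a JPEG, inside a zip, inside another zip.
inner = make_zip({"holiday.jpg": b"MZ\x90\x00\x03\x00\x00\x00"})
outer = make_zip({"inner.zip": inner})
for depth, member, kind in walk(outer):
    print("  " * depth + f"{member}: {kind}")
```

In the demo, the walk reports the outer zip, the nested zip, and finally the "holiday.jpg" member as a Windows executable two levels down, none of which an extension-based check on the outermost file would reveal.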
In the spirit of last week’s post, I created a test page that you can use to test your network security system’s “Content-Type IQ” and a video (below) showing a Fidelis XPS system taking the Content-Type IQ test.