2 comments on “Detecting Malicious Microsoft Office Macro Documents”

  1. Man, you haven’t even begun scratching the surface of the horrendous crap that is the world of Office macros…

    While macros are most often used in Word and Excel, just about every Office application supports them. PowerPoint, Access, Visio, you name it…

    Stuff like Document_Open MUST be in the ThisDocument module – it won’t work outside of it. Regular stuff like AutoOpen can be anywhere.

    While AutoOpen and Document_Open have the advantage that they are automatically executed when the document is opened, there is a HUGE number of “auto” macro names that can be executed in various “convenient” circumstances. Names like AutoClose and Document_Close execute when the document is closed, that’s obvious. But there is much, much more. “Auto” macros can be executed when a new document is created (AutoNew, Document_New). Macros can be executed when specific actions are performed – e.g., a macro named FileClose will execute when the document is closed via File/Close from the menu – but not when the document is closed by clicking on the [x] button. You can “intercept” ANY action invoked via the menus like this – and many of the actions invoked via any of the buttons.
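
    To make the name-based triggers concrete, here is a rough Python sketch that scans a module's source text for the handful of names mentioned above. The function name `find_auto_macros`, the three category labels and the regular expression are my own illustrative choices, not part of any Office API, and the name lists are deliberately limited to the examples given here; the real list is far longer.

```python
import re

# Only the names mentioned in this comment; real-world lists are much longer.
EVENT_HANDLERS = ["Document_Open", "Document_Close", "Document_New"]
AUTO_MACROS = ["AutoOpen", "AutoClose", "AutoNew"]
COMMAND_MACROS = ["FileClose"]  # shadows the File/Close menu command

# Lower-cased lookup table, because VBA identifiers are case-insensitive.
KNOWN = {}
for kind, names in (("event", EVENT_HANDLERS),
                    ("auto", AUTO_MACROS),
                    ("command", COMMAND_MACROS)):
    for name in names:
        KNOWN[name.lower()] = (name, kind)

# Matches "Sub Name", optionally preceded by Public/Private and Static.
SUB_RE = re.compile(
    r"^[ \t]*(?:(?:Public|Private)[ \t]+)?(?:Static[ \t]+)?Sub[ \t]+(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def find_auto_macros(module_name, source):
    """Return (name, kind) pairs for trigger macros found in one module."""
    hits = []
    for match in SUB_RE.finditer(source):
        entry = KNOWN.get(match.group(1).lower())
        if entry is None:
            continue
        name, kind = entry
        # Document_* handlers only fire from the ThisDocument module.
        if kind == "event" and module_name.lower() != "thisdocument":
            continue
        hits.append((name, kind))
    return hits
```

    Calling `find_auto_macros("Module1", text)` ignores any `Document_Open` it finds, while the same text passed with `"ThisDocument"` reports it, which mirrors the placement rule described earlier.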

    And don’t even let me get started on the awful mess that is the internal representation of the macros…

    You see, an Office macro module has THREE different “code” areas, each containing the complete functionality of the macro and each being “executable” under different conditions.

    If a macro has ever been executed (or even started to be executed by single-stepping into it with the debugger), its code is compiled into “execodes” streams. If you look at the OLE2 stream tree, these streams have the name “__SRP_”, followed by digits (e.g., “__SRP_0”, “__SRP_1”, etc.). This is a kind of “compiled and linked p-code” (the exact format is not known to me) that can be executed directly by the Office application. This part of the code is run ONLY if these streams exist AND were created by the exact same version of the Office application as the one that is opening them. In practice this happens rarely (e.g., only on your own machine and only with macros you’ve already used).

    Normally, the macro text you enter into the VBA Editor is compiled into some sort of p-code (like Java bytecode, but different, of course). Unlike the execodes, it resides in the module’s own stream in the OLE2 file (e.g., “ThisDocument”). Interestingly, what the VBA Editor displays is the de-compiled p-code. There is a one-to-one relationship – modifying a line of code with the VBA Editor results in it being immediately compiled into p-code, and modifying the p-code would result in different source being displayed by the VBA Editor.

    Usually, the p-code is what is being executed (unless proper execodes are present, as explained above, but this is rare) UNLESS you open the document with an Office application that has a different major version of VBA than the one that has created the document (e.g., VBA6 vs VBA5). Then some different shit happens.

    You see, the OLE2 stream containing the macro module also contains the source of that module – compressed trivially and attached to the end of the stream. Normally, this source code is not used – not even by the VBA Editor. But, if you open the document with an Office application that sports a different major VBA version from the one that has created the document, the source code is used to re-compile the p-code and then the new p-code is executed.
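
    The “trivial” compression is, as far as I can tell, the run-length scheme that Microsoft documents as a “compressed container” in the MS-OVBA specification. A minimal Python sketch of a decompressor follows, assuming you already have the bytes of the compressed source in hand (finding where it starts inside the module stream is a separate job and is not shown). The function name `decompress_vba_source` is hypothetical and error handling is minimal.

```python
def decompress_vba_source(container: bytes) -> bytes:
    """Decompress an MS-OVBA style compressed container (sketch only)."""
    if not container or container[0] != 0x01:
        raise ValueError("container must start with signature byte 0x01")
    out = bytearray()
    pos = 1
    while pos + 2 <= len(container):
        header = int.from_bytes(container[pos:pos + 2], "little")
        if (header >> 12) & 0x7 != 0b011:
            raise ValueError("bad chunk signature")
        chunk_size = (header & 0x0FFF) + 3  # includes the 2-byte header
        is_compressed = bool(header & 0x8000)
        chunk_end = min(pos + chunk_size, len(container))
        pos += 2
        if not is_compressed:
            # Raw chunk: the payload is copied through unchanged.
            out += container[pos:chunk_end]
            pos = chunk_end
            continue
        chunk_start = len(out)  # copy-token offsets are relative to this chunk
        while pos < chunk_end:
            flags = container[pos]  # one flag bit per following token
            pos += 1
            for bit in range(8):
                if pos >= chunk_end:
                    break
                if not (flags >> bit) & 1:
                    out.append(container[pos])  # literal byte
                    pos += 1
                    continue
                token = int.from_bytes(container[pos:pos + 2], "little")
                pos += 2
                # The offset/length bit split depends on how much of the
                # current chunk has been decompressed so far.
                difference = len(out) - chunk_start
                bit_count = max((difference - 1).bit_length(), 4)
                length = (token & (0xFFFF >> bit_count)) + 3
                offset = (token >> (16 - bit_count)) + 1
                for _ in range(length):
                    out.append(out[-offset])  # may overlap: byte-by-byte copy
    return bytes(out)
```

    For instance, the seven bytes `01 03 B0 02 61 04 00` hold one literal `a` followed by a copy token (offset 1, length 7), so they expand to eight `a` characters.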

    This is because, when changing VBA versions, Microsoft, in their infinite wisdom, inserted a bunch of new p-code opcodes in the middle of the existing ones (instead of at the end), so the opcode numbers of a large number of p-code instructions suddenly changed. The re-compilation is necessary in order to achieve VBA portability across major VBA versions.
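
    A toy Python illustration of why a mid-table insertion breaks old p-code. Everything in it is invented for the example: the opcode names, their numbering and the one-byte encoding have nothing to do with the real VBA instruction set.

```python
# Hypothetical opcode tables; an opcode's number is its index in the list.
OLD_OPCODES = ["Add", "Sub", "Mul", "Print"]
NEW_OPCODES = ["Add", "Sub", "Xor", "Mul", "Print"]  # "Xor" inserted mid-table


def assemble(mnemonics, table):
    """Encode a list of mnemonics as one byte per instruction."""
    return bytes(table.index(m) for m in mnemonics)


def disassemble(code, table):
    """Decode one byte per instruction back into mnemonics."""
    return [table[b] for b in code]


program = ["Add", "Mul", "Print"]  # stands in for the stored source

# P-code produced by the "old" version of the compiler.
old_pcode = assemble(program, OLD_OPCODES)

# The "new" version reads the same bytes with shifted numbering.
misread = disassemble(old_pcode, NEW_OPCODES)

# Re-compiling from source with the new table restores the meaning.
recompiled = assemble(program, NEW_OPCODES)
```

    Here `misread` comes out as `["Add", "Xor", "Mul"]` instead of the original program, whereas disassembling `recompiled` with the new table gives back `program`. Appending the new opcode at the end of the table would have left the old bytes valid.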

    So, in practice, you can have 3 different programs in the same macro, each being executed under different circumstances. (You can’t achieve this from the Office application; you’d have to doctor the OLE2 image containing the macros.)
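
    The selection rules described above can be summed up in a few lines of Python. This is only a model of the behaviour as I have described it, not of anything Office exposes: the function name `code_area_executed`, its parameters and the version values are all hypothetical.

```python
def code_area_executed(doc_vba_major, host_vba_major,
                       execodes_app_version=None, host_app_version=None):
    """Return which of the three code areas would run: a simplified model.

    doc_vba_major        -- major VBA version that created the document
    host_vba_major       -- major VBA version of the application opening it
    execodes_app_version -- exact application version that wrote the
                            execodes streams, or None if there are none
    host_app_version     -- exact version of the application opening it
    """
    # Different major VBA version: the stored p-code is unusable, so the
    # compressed source is re-compiled and the fresh p-code is run.
    if doc_vba_major != host_vba_major:
        return "source"
    # Execodes are used only when present AND written by the exact same
    # application version as the one opening the document.
    if (execodes_app_version is not None
            and execodes_app_version == host_app_version):
        return "execodes"
    # The usual case.
    return "p-code"
```

    For example, `code_area_executed(6, 6)` gives `"p-code"`, while `code_area_executed(5, 6)` gives `"source"`. A scanner that inspects only one of the three areas can therefore miss what actually runs on a given machine.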

    Oh, and speaking of Excel, not sure about the latest versions, but the previous ones also support a completely different sort of macros, called Excel Formula macros. Support for running them continued even after support for creating them was dropped (in Excel 97).

    For some other macro stuff (somewhat trivial and somewhat biased toward viruses rather than malware in general), take a look at some of my papers:

    http://bontchev.nlcv.net/papers/macidpro.html
    http://bontchev.nlcv.net/papers/upconv.html
    http://bontchev.nlcv.net/papers/paix.html
