On the Semantics of Self-Unpacking Malware Code
Department of Computer Science
University of Arizona
Tucson, AZ 85721, U.S.A.
The rapid increase in attacks on software systems via malware such as viruses, worms, trojans, etc., has made it imperative to develop effective techniques for detecting and analyzing malware binaries. Such binaries are usually transmitted in packed or encrypted form, with the executable payload decrypted dynamically and then executed. In order to reason formally about their execution behavior, therefore, we need semantic descriptions that can capture this self-modifying aspect of their code. However, current approaches to the semantics of programs usually assume that the program code is immutable, which makes them inapplicable to self-unpacking malware code. This paper takes a step towards addressing this problem by describing a formal semantics for self-modifying code. We use our semantics to show how the execution of self-unpacking code can be divided naturally into a sequence of phases, and uses this to show how the behavior of a program can be characterized statically in terms of a program evolution graph. We discuss several applications of our work, including static unpacking and deobfuscation of encrypted malware and static cross-phase code analysis.