<html> <head> <title>SandMark</title> </head> <body> <center> <h1> SandMark </h1> <h3>Reviewed by: Ricardo Carlos</h3> <h3>Date: Sept. 6, 2008</h3> </center> <h2>Abstract</h2> <p>SandMark is a Java-based tool for implementing and evaluating the effectiveness of software protection algorithms. It provides a wide variety of algorithms for dynamic and static watermarking and for obfuscation, as well as optimization to ensure that performance requirements are met while the code is secured. The tool's developers aim to determine which software protection algorithms have the best resilience against attacks (i.e. watermark or tamper-proofing extraction) and the smallest impact on performance (i.e. overhead).</p> <h2>Introduction</h2> <p>SandMark was developed by Christian Collberg, Ginger Myles, and Andrew Huntwork. Their goal is to enable users to fairly evaluate available software protection methods and to help them determine the impact these methods will have on their software's performance. SandMark implements a set of software protection algorithms for dynamic and static watermarking, obfuscation, static code analysis (e.g. Java bytecode differences between original and watermarked/obfuscated methods), optimization, and decompilation.</p> <p>SandMark is an open-source tool designed so that users can develop their own software protection algorithms as plug-ins that extend the tool's functionality. It was originally developed in 2000, and version 3.4 was released in August 2004. SandMark's development is supported by the National Science Foundation (NSF), the Air Force Research Laboratory (AFRL), and the New Economy Research Fund of New Zealand.</p> <h2>Installation</h2> <p>SandMark is available for download at <a href="http://sandmark.cs.arizona.edu">http://sandmark.cs.arizona.edu</a> as a Java JAR file (sandmark.jar), as well as a ZIP package of source code files (sandmark-src.zip) for users who wish to build the package themselves. 
The tool also requires an additional set of JAR files that implement some of the algorithms used; these are available for download from the same webpage.</p> <p>SandMark is compatible with Linux, Windows, and Mac OS, and requires Java 1.4 or greater. No installation process is required, so the tool is ready to run as soon as the required JAR files are downloaded to the user's computer.</p> <h2>Usage</h2> <p>SandMark provides a straightforward GUI and is invoked in the same way as other Java JAR packages (e.g. from the command line, or by double-clicking the file in a file system browser). It may, however, require additional arguments so that the classpath includes both Java's <em>tools.jar</em> package and <em>sandmark.jar</em>. The sandmark.jar file must be placed in a directory named smark3, and the five additional JAR files must be placed in a subdirectory of smark3 named smextern3.</p> <p>Here is a sample SandMark command line call, executed from the smark3 directory: <em>java -classpath /usr/local/jdk/lib/tools.jar:sandmark.jar sandmark.gui.SandMarkFrame</em><br>Screenshots from this SandMark call are shown later on this page. SandMark can also be run from an automated script containing a series of commands that execute a test suite.</p> <h2>Internals</h2> <p>SandMark uses watermarking, obfuscation, and code analysis algorithms to give users a variety of tools for evaluating different methods of software protection. One of the watermarking algorithms implemented in the tool is the Collberg-Thomborson (CT) algorithm, the first dynamic software watermarking algorithm. The CT algorithm embeds the watermark within data structures built at runtime; the key is a sequence of inputs supplied to the program during execution. The watermark is then extracted by running the program on the same sequence of inputs while the recognizer is active. 
There are also 11 static watermarking algorithms available for use in SandMark.</p> <p>SandMark implements over 25 obfuscation algorithms. These include reordering obfuscations, which reorder method parameters, local variables, and symbol table entries without interfering with the program's overloading structure; splitting obfuscations, in which basic blocks, methods, and variables (arrays and scalars) are split into multiple structures; and merging obfuscations.</p> <p>SandMark also provides a number of tools for evaluating the performance characteristics of protection algorithms, as well as their resilience to attacks. (This is where the <em>REAL</em> fun begins.) Users can invoke various benchmark programs, such as <em>specjvm</em> and <em>caffeinemark</em>, and a viewer that displays statistics for individual classes and methods, as well as the Java bytecode for specified code. Users can also invoke attack methods to determine whether certain protection methods provide an appropriate level of protection. For example, there are code optimization algorithms that may destroy software watermarks by reordering instructions that are part of the watermark. This empowers software developers to put their "black hat" on and expose vulnerabilities in their software, so that they can mitigate security risks before releasing it to the public (or beta partners). It also lets those of us who wish to play the "black hat" hacker do so without getting locked up or having a guilty conscience :-)</p> <p>Analysis algorithms are also available for viewing various statistics about a software implementation, such as the frequency of opcodes, the number of API calls, and the complexity of methods (e.g. inheritance hierarchy depth, loop nesting depth). These statistics give software developers insight into the information that attackers may use to locate watermarks or tamper-proofing methods. 
Users can also view details on specific classes and methods, compute desired statistics, and compare Java bytecode between two different watermarked or obfuscated versions of the same code to identify watermarks.</p> <p>SandMark's developers highlight the fact that there is no widely accepted set of metrics for evaluating software protection algorithms, but they have proposed the following evaluation criteria:</p> <ul> <li><em>Data Rate</em>: ratio of the size of the watermark to the size of the program</li> <li><em>Embedding Overhead</em>: increase in size or decrease in performance of the protected application compared to the original version</li> <li><em>False Positive Rate</em>: probability that a random value is recognized as a valid watermark; amount of confusion by obfuscation methods</li> <li><em>Resilience against Manual Attacks (Stealth)</em>: statistical difference between a protected application and typical applications; probability of adversaries using differences to identify watermarks or obfuscation</li> <li><em>Resilience against Semantics-preserving Transformations</em>: watermark survival across code optimization and obfuscation; impact on the application after watermarks or obfuscation have been eliminated (i.e. is the program usable after adversaries eliminate protection?)</li> <li><em>Resilience against Collusive Attacks</em>: watermark determination by comparing multiple versions of the protected application</li> </ul> <h2>Evaluation</h2> <p>First and foremost, I'd like to clarify that I'm not an excellent Java programmer, so I had a hard time using SandMark due to issues with the classpath not being set correctly. Once I got it working (with the help of Prof. Collberg), SandMark proved to be a very useful tool for performing software protection operations on a variety of Java programs. 
SandMark seems to require a strict procedure for dynamic watermarking, because the way the program is executed affects the tracing, embedding, and recognition steps, and it takes a user some time to get used to following that procedure.</p> <p>The tool works well for small, simple programs, and some of the protection algorithms do not require much time or many resources (overhead). However, overhead increases for the most intensive operations, such as decompiling and deriving the application's control flow graph (CFG). SandMark does a good job of warning the user that it may take HOURS to build the application's CFG, which helps to prevent undue frustration.</p> <p>A drawback of SandMark is that it's not very user-friendly. For example, most of the documentation for the tool is outdated and refers to panels that have since been modified. I attempted to find examples of a simple execution of each of the options (e.g. watermarking, obfuscation, optimizing), but those were only provided in the source package as TEX files, so users need some experience with converting TEX files to PS or PDF files.</p> <p>Overall, SandMark works really well as an introduction to software protection algorithms for software developers, and provides an effective way of evaluating the impact those measures may have on security and performance. 
It could use some work on becoming more user-friendly (the GUI does help, but the documentation needs improvement), and updates are probably needed, since the tool's last release was 4 years ago.</p> <h2>Screenshots</h2> <em>Dynamic Watermark Embed Panel: Specifying input and output JAR file, and watermark</em><br> <img src="dynamic_wm1.JPG" border=2></img><br><br> <em>Dynamic Watermark Recognize Panel: Specifying input JAR file, and recreating sequence of inputs</em><br> <img src="dynamic_wm2.JPG" border=2></img><br><br> <em>Dynamic Watermark Recognize Panel: Recognizing embedded watermark</em><br> <img src="dynamic_wm3.JPG" border=2></img><br><br> <em>Static Watermark Embed Panel: Specifying JAR file to be watermarked, watermark and key</em><br> <img src="static_wm1.JPG" border=2></img><br><br> <em>Static Watermark Recognize Panel: Specifying key to recover embedded watermark</em><br> <img src="static_wm2.JPG" border=2></img><br><br> <em>Obfuscate Panel: Specifying input and output JAR files</em><br> <img src="obfuscate.JPG" border=2></img><br><br> <em>Bytecode Diff Panel: Showing Java bytecode differences between original and obfuscated JAR files</em><br> <img src="bytecode_diff.JPG" border=2></img><br><br> <em>Decompile Panel</em><br> <img src="decompile.JPG" border=2></img><br><br> <h2>References</h2> <ul> <li><em>Christian Collberg, Ginger Myles, Andrew Huntwork. "SandMark - A Tool for Software Protection Research." IEEE Security and Privacy, Vol. 1, Num. 4, July/August 2003.</em> <li><em>Christian Collberg, Ginger Myles, and Mike Stepp. "An Empirical Study of Java Bytecode Programs." Technical Report TR04-11, 2004.</em> <li><em>Christian Collberg, Clark Thomborson. "Watermarking, Tamper-Proofing, and Obfuscation - Tools for Software Protection." IEEE Transactions on Software Engineering 28:8, 735-746, August 2002.</em> <li><em>Christian Collberg, Clark Thomborson. "Software Watermarking - Models and Dynamic Embeddings." 
ACM POPL'99.</em> </ul> <p>Download SandMark from <a href="http://sandmark.cs.arizona.edu">http://sandmark.cs.arizona.edu</a> </p> </body> </html>