SandMark

SandMark is a tool developed at the University of Arizona for software watermarking, tamperproofing, birthmarking, and code obfuscation of Java bytecode. The tool incorporates several dynamic and static watermarking algorithms, a large collection of obfuscation algorithms, a code optimizer, and tools for viewing and analyzing Java bytecode.

The SandMark website is here. The latest version of the SandMark tool is here: sandmark.jar.


Publications

  1. Christian Collberg, Clark Thomborson, Gregg M. Townsend, Dynamic Graph-Based Software Fingerprinting, ACM Transactions on Programming Languages and Systems, Volume 29, Number 6, October 2007. pdf
  2. Ginger Myles, Christian Collberg, Software Watermarking via Opaque Predicates: Implementation, Analysis, and Attacks, Electronic Commerce Research Journal, Volume 6, Number 2, pp. 155-171, 2006. pdf
  3. Ginger Myles, Christian Collberg, k-gram Based Software Birthmarks, Proceedings of the 2005 ACM Symposium on Applied Computing, Computer Security Track, pp. 314-318, 2005. pdf
  4. Christian Collberg, Tapas Sahoo, Software Watermarking in the Frequency Domain: Implementation, Analysis, and Attacks, Journal of Computer Security, Volume 13, Number 5, pp. 721--755, 2005. pdf
  5. Ginger Myles, Christian Collberg, Zachary Heidepriem, Armand Navabi, The evaluation of two software watermarking algorithms, Software - Practice and Experience, Volume 35, Number 10, pp. 923-938, 2005. pdf
  6. Christian Collberg, Edward Carter, Saumya Debray, Andrew Huntwork, John Kececioglu, Cullen Linn, Michael Stepp, Dynamic Path-Based Software Watermarking, ACM Programming Languages Design and Implementation (PLDI), 2004. pdf
  7. Christian Collberg, Andrew Huntwork, Edward Carter, Gregg Townsend, Graph Theoretic Software Watermarks: Implementation, Analysis, and Attacks, 6th Information Hiding Workshop, 2004. pdf
  8. Ginger Myles, Christian Collberg, Software Watermarking via Opaque Predicates: Implementation, Analysis, and Attack, The Seventh International Conference on Electronic Commerce Research (ICECR-7), June 2004. pdf
  9. Kelly Heffner, Christian Collberg, The Obfuscation Executive, 7th Information Security Conference (ISC'04), September 2004. pdf
  10. Ginger Myles, Christian Collberg, Detecting Software Theft via Whole Program Path Birthmarks, 7th Information Security Conference (ISC'04), September 2004. pdf
  11. Christian Collberg, Ginger Myles, Andrew Huntwork, Sandmark--A Tool for Software Protection Research, IEEE Security & Privacy, Volume 1, Number 4, pp. 40--49, 2003. pdf
  12. Christian Collberg, Edward Carter, Stephen Kobourov, Clark Thomborson, Error-Correcting Graphs for Software Watermarking, 29th Workshop on Graph Theoretic Concepts in Computer Science (WG'2003), June 2003. pdf
  13. Ginger Myles, Christian Collberg, Software Watermarking Through Register Allocation: Implementation, Analysis, and Attacks, 6th Annual International Conference on Information Security and Cryptology (ICISC), November 2003. springer
  14. Christian Collberg, Clark Thomborson, Douglas Low, Obfuscation techniques for enhancing software security, United States Patent 6,668,325, Assignee: InterTrust Technologies (Santa Clara, CA), Filed June 9, 1998, Issued December 23, 2003. pdf
  15. Christian Collberg, Clark Thomborson, Watermarking, Tamper-Proofing, and Obfuscation -- Tools for Software Protection, IEEE Transactions on Software Engineering, Volume 28, Number 8, pp. 735--746, August 2002. This paper was among the most cited journal articles in software engineering from 2002, based on a citation study conducted by Prof. Claes Wohlin. pdf
  16. Christian Collberg, Clark Thomborson, Software Watermarking --- Models and Dynamic Embeddings, ACM Principles of Programming Languages (POPL'99), January 1999. pdf
  17. Christian Collberg, Clark Thomborson, and Douglas Low, Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs, ACM Principles of Programming Languages (POPL'98), January 1998. pdf (scanned), pdf (clean)
  18. Christian Collberg, Clark Thomborson, Douglas Low, Breaking Abstractions and Unstructuring Data Structures, IEEE International Conference on Computer Languages (ICCL'98), May 1998. pdf

Supporting Grants and Contracts

  1. September 1, 2000--August 1, 2004, $265,000 from the NSF: Software Watermarking, Obfuscation, and Tamper-Proofing for Software Protection, grant CCR-0073483.
  2. June 2002, $417,000 (Option/Year 1) + $417,000 (Option/Year 2), Air Force Research Lab (AFRL): Protecting Software Against Tampering and Reverse Engineering, contract F33615-02-1146.

Splat

Self-plagiarism occurs when an author reuses portions of their previous writings in subsequent research papers. Occasionally, the derived paper is simply a re-titled and reformatted version of the original one, but more frequently it is assembled from bits and pieces of previous work.

We believe that self-plagiarism is detrimental to scientific progress and bad for our academic community. Flooding conferences and journals with near-identical papers makes searching for information relevant to a particular topic harder than it has to be. It also rewards authors who break their results down into overlapping least-publishable units over those who publish each result only once. Finally, whenever a self-plagiarized paper is published, another, more deserving paper is not.

You can read more about Splat here.


Publications

  1. Christian Collberg, Stephen Kobourov, Self-Plagiarism in Computer Science, Communications of the ACM, April 2005. pdf
  2. Christian Collberg, Stephen Kobourov, Joshua Louie, Thomas Slattery, SPLAT: A System for Self-Plagiarism Detection, IADIS International Conference WWW/Internet (ICWI 2003), pp. 508-514, November 2003. pdf

Automatic Retargeting

There are three popular methods for constructing highly retargetable compilers: (1) the compiler emits abstract machine code which is interpreted at run-time, (2) the compiler emits C code which is subsequently compiled to machine code by the native C compiler, or (3) the compiler's code-generator is generated by a back-end generator from a formal machine description produced by the compiler writer. These methods incur high costs at run-time, compile-time, or compiler-construction time, respectively.

We're interested in a fourth method, which combines the fast retargeting of compilers that generate C code with the efficiency of specification-driven code generators.

The basic idea is to use the native C compiler at compiler construction time to discover architectural features of the new architecture. From this information a formal machine description is produced. Given this machine description, a native code-generator can be generated by a back-end generator such as BEG or burg.
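The discovery step can be illustrated with a minimal Python sketch. It is not the tool's implementation: the helper names (`make_sample`, `compile_to_asm`, `distinguishing_lines`) are hypothetical, and the sketch assumes a native compiler reachable as `cc` that accepts the common `-S` and `-O0` flags. It generates tiny C samples that differ in a single operator, asks the native compiler for assembly, and isolates the lines that changed, which hint at how the target implements that operator.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def make_sample(op: str) -> str:
    """Return a tiny C translation unit that exercises one binary operator."""
    return (
        "int a, b, c;\n"
        f"void sample(void) {{ a = b {op} c; }}\n"
    )


def compile_to_asm(source: str, cc: str = "cc") -> Optional[List[str]]:
    """Compile C source to assembly with the native compiler.

    Returns the assembly as a list of lines, or None when no compiler
    named `cc` is found on PATH (assumption: it accepts -S and -O0).
    """
    if shutil.which(cc) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "sample.c"
        out = Path(tmp) / "sample.s"
        src.write_text(source)
        subprocess.run([cc, "-S", "-O0", "-o", str(out), str(src)], check=True)
        return out.read_text().splitlines()


def distinguishing_lines(asm_a: List[str], asm_b: List[str]) -> List[str]:
    """Return the non-empty lines of asm_a that do not occur in asm_b."""
    other = {line.strip() for line in asm_b}
    return [
        line.strip()
        for line in asm_a
        if line.strip() and line.strip() not in other
    ]
```

Calling `compile_to_asm(make_sample("+"))` and `compile_to_asm(make_sample("-"))` and passing both results to `distinguishing_lines` leaves the instructions that differ between addition and subtraction on the host machine. Turning such observations into a formal machine description takes considerably more analysis than this sketch shows.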

You can download the tool here.


Publications

  1. Christian Collberg, Automatic Derivation of Compiler Machine Descriptions, ACM Transactions on Programming Languages and Systems, Volume 24, Number 4, July 2002, pp. 369--408. pdf
  2. Christian Collberg, Reverse Interpretation + Mutation Analysis = Automatic Retargeting, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'97), June 1997. pdf
  3. Christian Collberg, Automatic Derivation of Machine Descriptions, Proceedings of the Twentieth Australasian Computer Science Conference, February 1997. pdf

Code Rendering

ART is a language-independent and specification-driven program rendering tool that is able to produce high-quality code renderings of arbitrary complexity. The tool can incorporate arbitrary types of information together with the program code, allowing it to be used for debugging and profiling as well as for producing beautiful renderings of programs for publication.

You can download the tool here and the README file here.


Publications

  1. Christian Collberg, Sean Davey, Todd Proebsting, Language-Agnostic Program Rendering for Presentation, Debugging and Visualization, IEEE Symposium on Visual Languages (VL'2000), September 2000. pdf

AlgoVista

AlgoVista is a web-based search engine designed to allow applied computer scientists to classify problems and find algorithms and implementations that solve these problems. Unlike other search engines, AlgoVista is not keyword based. Rather, users provide a set of input=>output samples that describe the behavior of the problem they wish to classify. This type of query-by-example requires no knowledge of specialized terminology, only an ability to formalize the problem. The search mechanism of AlgoVista is based on a novel application of program checking, a technique developed as an alternative to program verification and testing.
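Query-by-example with checkers can be sketched in a few lines of Python. This is only an illustration of the idea, not AlgoVista's actual search mechanism: the checker functions, the `CHECKERS` registry, and `classify` are hypothetical names. Each checker verifies whether an output is a valid answer for an input without solving the problem itself, and a query matches every problem whose checker accepts all of the user's input=>output samples.

```python
from collections import Counter
from typing import Any, Callable, Dict, List, Tuple

Sample = Tuple[Any, Any]            # one (input, output) pair
Checker = Callable[[Any, Any], bool]


def check_sorting(inp: Any, out: Any) -> bool:
    """Accept when out is ordered and is a permutation of inp."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)


def check_reversal(inp: Any, out: Any) -> bool:
    """Accept when out is inp read back to front."""
    return len(inp) == len(out) and all(
        a == b for a, b in zip(inp, reversed(out))
    )


def check_maximum(inp: Any, out: Any) -> bool:
    """Accept when out is an element of inp and no element exceeds it."""
    return out in inp and all(x <= out for x in inp)


# Hypothetical registry mapping problem names to their checkers.
CHECKERS: Dict[str, Checker] = {
    "sorting": check_sorting,
    "reversal": check_reversal,
    "maximum": check_maximum,
}


def classify(samples: List[Sample]) -> List[str]:
    """Return the names of all problems whose checker accepts every sample."""
    matches = []
    for name, checker in CHECKERS.items():
        try:
            if all(checker(inp, out) for inp, out in samples):
                matches.append(name)
        except TypeError:
            # The samples do not have the shape this checker expects.
            continue
    return matches


print(classify([([3, 1, 2], [1, 2, 3]), ([5, 4], [4, 5])]))  # prints ['sorting']
```

The user only supplies samples such as `[3, 1, 2] => [1, 2, 3]`; no keyword like "sorting" is needed. Several samples help narrow the result, since a single pair such as `[5, 4] => [4, 5]` is accepted by both the sorting and the reversal checker.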

You can download the tool here.


Publications

  1. Christian Collberg, Stephen Kobourov, Suzanne Westbrook, AlgoVista: an algorithmic search tool in an educational setting, Technical Symposium on Computer Science Education (SIGCSE), pp. 462-466, March 2004. acm
  2. Christian Collberg, Todd A. Proebsting, Problem Classification using Program Checking, Fun with Algorithms (FUN '01), May 29--31, 2001. pdf

Flexible Encapsulation

Most modular programming languages provide an encapsulation concept. Such concepts are used to protect the representational details of the implementation of an abstraction from abuse by its clients. Unfortunately, strict encapsulation is hindered by the separate compilation facilities provided by modern languages. The goal of the work presented here is to introduce techniques which allow modular languages to support both separate compilation and strict encapsulation without undue translation-time or execution-time cost.

You can download the tool here.


Publications

  1. Christian Collberg, Distributed High-Level Module Binding for Flexible Encapsulation and Fast Inter-Modular Optimization, International Conference on Programming Languages and Systems Architectures, LNCS 782, March 1994. pdf
  2. Christian Collberg, Flexible Encapsulation, Ph.D. Thesis, Lund University, December 1992.
  3. Christian Collberg, Magnus Krampell, Design and Implementation of Modular Languages Supporting Information Hiding, 6th International Phoenix Conference on Computers and Communications, February 1987.