CSc 453: Some Thoughts on Testing
Coming up with a good set of test cases is an important problem in
software development; it has been studied extensively in the software
engineering literature, and a wide variety of supporting tools is
available (see, e.g.,
http://www.testingfaqs.org).
There is far too much on this topic "out there" for me to give any
sort of reasonably thorough introduction here.
This page aims, instead, simply to list a few guidelines that may be
helpful in suggesting how to structure test cases. It makes no pretense
of completeness. Caveat emptor.
Zero, One, Many
A very useful way to construct test cases is to consider a
Zero-One-Many
rule. The essential idea here is to have, for each feature X
of interest, test cases that have: (a) zero occurrences of X;
(b) one occurrence of X; and (c) many occurrences of
X.
For example, when testing the parsing of function prototypes, we might
consider the following cases:
- no prototypes;
- a single prototype:
- no arguments (void);
- a single argument;
- several arguments;
- a single declaration with several prototypes listed, i.e., of the form
int f1(...), f2(...), ...
- several declarations, each with a single prototype (but with the
various combinations of argument types, as described above);
- several declarations, each with several prototypes (again with the
various combinations of argument types, as described above);
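As a sketch, the zero/one/many prototype cases above might be captured in a handful of small test inputs, generated mechanically. (The file names and file contents here are my own illustrations, not prescribed by the assignment.)

```shell
# Illustrative zero/one/many test inputs for prototype parsing.
# File names are hypothetical.

# zero prototypes
cat > proto00.c <<'EOF'
int x;
EOF

# one prototype per declaration, with zero/one/many arguments
cat > proto01.c <<'EOF'
int f(void);
int g(int a);
int h(int a, int b, int c);
EOF

# many prototypes: one declaration listing several, plus several declarations
cat > proto02.c <<'EOF'
int f1(void), f2(int a), f3(int a, int b);
int p(int n);
int q(int a, int b);
EOF

ls proto0*.c
```

Each file can then be fed to the parser by the test script described below.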
Similarly, to test the parsing of if statements, we might consider
test cases with no ifs, a single if, several ifs
one after another, and several ifs nested one within another.
Boundary Cases
Construct inputs to test any limiting values of your system. For example,
what happens if the input is empty? If your implementation uses a
buffer size of 1K, what happens if you get an identifier whose length
exceeds that value? (Remember that implementation-specific limits on
such things as the length of an identifier or the value of an integer
are permissible, as long as the limits are "reasonable", but the
implementation should behave gracefully on inputs that exceed those
limits. Dumping core, or halting and catching fire, is not acceptable;
buffers that overflow on big inputs will be a source of delight to hackers.)
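One cheap way to build such a boundary test is to generate the oversized input mechanically rather than typing it by hand. (The 1K figure, the 2048-character length, and the file name below are purely illustrative.)

```shell
# Generate a test input whose identifier is far longer than a
# hypothetical 1K internal buffer (sizes and names are illustrative).
LONGID=$(printf 'x%.0s' $(seq 1 2048))   # a 2048-character identifier
printf 'int %s;\n' "$LONGID" > longid.c
wc -c longid.c                           # confirm the input really is oversized
```

A well-behaved implementation should report a sensible error (or truncate gracefully) on such an input, not crash.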
Testing Mechanics
Consider starting with a (large) number of test inputs, each of which
tests one feature. Then construct a further (even larger) number of test
inputs that explore any possible interactions between different features
(such as the nesting of if and while statements, or
an assignment whose left hand side contains intermingled array references
and function calls).
Potentially, this means that you may be looking at dozens or even hundreds
of test inputs. How do you manage them all? The simplest approach is
to automate the testing, e.g., using a script. For example, my own
test script looks something like this:
#!/bin/csh -fx

set TESTDIR = ...
set TESTS = (test01.c test02.c test03.c ...)
set BINARY = "./parse"

foreach t ( ${TESTS} )
    echo "TESTING: ${t}"
    if ( -e ${t}.out ) /bin/rm -f ${t}.out
    if ( -e ${t}.diff ) /bin/rm -f ${t}.diff
    ${BINARY} < ${TESTDIR}/${t} > ${t}.out
    if ( -e ${t}.out ) diff ${t}-expected.out ${t}.out > ${t}.diff
    if ( -e ${t}.diff && -z ${t}.diff ) then
        echo "TEST ${t} : PASSED"
    else
        echo "TEST ${t} : FAILED"
    endif
end
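If you prefer a POSIX shell, the same idea looks roughly like the sketch below. To keep this example self-contained and runnable, it uses cat as a stand-in "compiler" and manufactures its own one-file test suite; in practice you would substitute your own binary, test directory, and expected-output files.

```shell
#!/bin/sh
# Sketch of a POSIX-sh test driver. For demonstration only:
# cat stands in for the real binary, and the test suite is fabricated here.
TESTDIR=tests
BINARY=cat

mkdir -p "$TESTDIR"
echo 'int main(void) { return 0; }' > "$TESTDIR/test01.c"
cp "$TESTDIR/test01.c" test01.c-expected.out   # for cat, expected output = input

for t in test01.c; do
    echo "TESTING: $t"
    rm -f "$t.out" "$t.diff"
    "$BINARY" < "$TESTDIR/$t" > "$t.out"
    if diff "$t-expected.out" "$t.out" > "$t.diff"; then
        echo "TEST $t : PASSED"
    else
        echo "TEST $t : FAILED"
    fi
done
```

Here diff's exit status decides pass/fail directly, so there is no need to test whether the .diff file is empty.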
The Relationship of C-- to C
The programming language for which you are building a compiler,
C--, is a subset of C.
This means that every legal C-- program is also a legal C program, but some
legal C programs are not legal C-- programs. You can sometimes take
advantage of this by using GCC to do some of your work for you:
- if GCC gives a syntax or type error for a program, then your compiler
should also give an error (the converse does not hold);
- for code generation, if you want to know what output a C-- program
should produce when executed, you can compile it with GCC and
execute the resulting code.