Håkan Söderström (firstname.lastname@example.org)
Söderström Programvaruverkstad AB
S-122 42 Enskede, Sweden
Copyright © 1994-1996 Hakan Soderstrom and Soderstrom Programvaruverkstad AB, Sweden. Permission to use, copy, modify, distribute, and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice and this permission notice appear in all copies of the software and related documentation.
itweak is an Icon interactive debugging utility. The idea is that you compile your Icon program to ucode files (.u1, .u2). itweak then tweaks the ucode, inserting potential breakpoints. The resulting ucode files are linked with a debugging run-time and off you go.
The itweak system provides you with many of the facilities you would expect from an interactive debugger, including the ability to evaluate a wide range of Icon expressions. Personally I wouldn't like to be without this tool, but I may be biased. It can be used both for finding bugs and to convince oneself that an Icon program indeed works the intended way.
itweak owes a lot to the pioneering debugify system by Charles A. Shartsis. This heritage is gratefully acknowledged. What itweak offers over debugify is radically improved performance (in time as well as space) and a more fully-fledged run-time system.
The author believes the software is useful but wouldn't imagine it is free from bugs. The software is provided "as-is" and without warranty of any kind. Please send bug reports, change requests, and other comments to the address above.
itweak has been tested with Icon 8.10 and 9.0 under Unix (SunOS 4.1.4) and DOS. The software is completely written in Icon, and should be as portable as Icon itself.
Installation is straightforward. For Unix there is a makefile that does most of the job.
Under Unix, type make in the installation directory. The following files are generated.
itweak comes with two Icon source files, itweak.icn and dbg_run.icn. Run the following command to produce the itweak program,
Put itweak (the resulting file) in a commonly accessible directory and include it in your PATH. (If you can, you should of course use the Icon compiler to produce itweak.) Now run the following command,
icont -c dbg_run.icn
The resulting files (dbg_run.u1, dbg_run.u2) constitute the debugging run-time system which will be linked with your tweaked programs.
Make the debugging run-time available to the Icon linker by including its directory in the IPATH environment variable. Alternatively, make sure that the dbg_run.u files are present in the same directory as the program you are going to debug.
There are at least two ways you may examine itweak without committing yourself too heavily to it.
The itweak distribution comes with a demo. Under Unix, type make demo to make it happen.
On other platforms, or on platforms without make, run the following commands.
icont -c ipxref.icn
icont -c options.icn
itweak -o samp_ini.icn ipxref options
icont -c samp_ini.icn
icont -o sample ipxref.u1 options.u1
setenv DBG_INPUT demo.cmd
The commands compile and tweak a sample program. The source files are ipxref.icn and options.icn. The resulting 'executable' is called sample. The last command runs a canned debugging session.
Debugging commands for the demo are taken from the file demo.cmd. To make the demo more meaningful you should open an editor on demo.cmd and compare it to the output of the debugging session. The commands are annotated.
Read this to get a first impression of what kinds of debugging commands itweak offers. For reading convenience all commands are spelled out fully. (Commands may be abbreviated as long as the abbreviation is unambiguous.)
Set a breakpoint on a source code line and then let the program run to its first break.
In the following examples we omit the goon command which makes the program continue until the next break (or until it exits).
Print the current value of a simple variable (word).
Attach a macro which automatically prints word every time we hit this breakpoint.
Attach a condition to the breakpoint which causes a break only if word contains the string buffer.
cond . word == "buffer"
The dot means the current breakpoint.
Now some more advanced printing: Print every value generated by an expression. This is useful if the variable contains a list, for example.
You may use subscripting and record field references when printing an expression:
The printing commands actually accept almost all Icon expressions. You may invoke procedures or Icon functions, for instance.
You may use the info command to get information about a breakpoint, source files, local or global variables, among other things:
info break .
These are not all of the commands. Please refer to the special section on debugging commands. The itweak on-line help contains details about all available commands.
In order to debug an Icon program you will need to go through the following major steps. These steps assume you have installed itweak as described above.
The demo described in the previous section provides an example. The next few sections go more into detail.
Let us assume you have a program built from source files named alpha.icn, beta.icn, and gamma.icn. Compile all source files, but do not link them yet. A suitable command is
icont -c alpha.icn beta.icn gamma.icn
This will produce .u1 and .u2 (i.e. ucode) files for each of the source files.
It is not necessary to tweak all files. However, you will be able to set breakpoints only in tweaked files. In order to illuminate this point, let us assume you decide to tweak only files alpha and gamma. Do this the following way. Note that the itweak command takes base file names, omitting the file name extension (.u1, for example).
itweak alpha gamma
The above command will tweak alpha.u1 and gamma.u1 and one of the .u2 files. It is important to tweak the files in a single itweak command. For reasons described in the quirks section the general recommendation is that you include the file containing the main procedure in the set of tweaked files.
Whenever a ucode file is tweaked the original file is saved under a different name. A .u1 file will have its extension changed to .u1~. A tweaked .u2 file will have its extension changed to .u2~.
Later, when running the program, reference will only be made to source files, not to ucode files.
The itweak command produces an additional Icon file. Its default name is dbg_init.icn. You may change the name of this file by using the -o command line option. For instance, the following is a possible command,
itweak -o proginit.icn alpha gamma
This command will generate a file named proginit.icn, but otherwise perform the same function as the itweak command above. You must compile the generated Icon file. The following command does this (now assuming the default name has been used).
icont -c dbg_init.icn
Finally link the program as you would normally do it. Like this, for instance,
icont alpha.u beta.u gamma.u
The itweak command tweaks one of the .u2 files involved. It inserts the equivalent of link statements. This will, in effect, add dbg_init.u and dbg_run.u to the link list. The dbg_init.u files will usually be present in the current directory. Of course the dbg_run.u files may also reside in the current directory. However, it is often more useful to have the run-time files in a separate directory which is included in the IPATH environment variable. If the linkage is successful, the result is an executable program alpha (under Unix).
Usually you would develop a program in an edit-compile-debug cycle. itweak notices if a file is already tweaked and does not tweak it a second time. Thus you may run the same itweak command after you have modified and compiled just one of the source files. This means the itweak command is suited for inclusion in a Makefile.
itweak and the debugging run-time introduce numerous global names for their own use. A common prefix is used on all such names to minimize the risk of name clashes with your program. The prefix is '__dbg_' (beginning with a double underscore). It is, of course, still possible for the target program to interfere with the debugging run-time, possibly causing it to crash.
itweak detects the main Icon procedure of your program. It inserts code for executing a parameterless procedure named __dbg_init before anything else. This procedure initializes the run-time environment. (The procedure is generated by itweak as part of the dbg_init.icn file.)
If you omit the file containing main from the set of tweaked files you must modify your program to invoke __dbg_init before execution reaches a tweaked file. Otherwise the program will terminate with a run-time error.
This is one reason why tweaked ucode files are not suited for shared libraries. Tweaking a file, in a way, marks it for a particular program. You (or somebody else) may attempt to tweak the same file in order to use it in a different program, but itweak will not touch it, because it has been tweaked already. There will probably be a conflict at linkage time, however: __dbg_init: inconsistent redeclaration. What you have to do in this case is erase the ucode files, then recompile and tweak from scratch.
For each tweaked file itweak creates a global variable holding a set of active breakpoints. The name of this variable contains the base name of the file. This limits file names to the syntax accepted as Icon identifiers.
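As a rough illustration of this restriction, the following Python sketch checks whether a base file name would also be acceptable as an Icon identifier. The function name is hypothetical and not part of itweak. The rule assumed is the usual Icon identifier syntax: a letter or underscore followed by letters, digits, and underscores.

```python
import re

# Assumed rule: Icon identifiers start with a letter or underscore and
# continue with letters, digits, or underscores.
_ICON_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_tweakable_name(base_name):
    """Return True if base_name (no extension) is a valid Icon identifier.

    Hypothetical helper for illustration; itweak itself has no such function.
    """
    return _ICON_IDENT.fullmatch(base_name) is not None


for name in ("alpha", "my_module", "my-module", "2pass"):
    print(name, is_tweakable_name(name))
```

A name such as my-module.icn would fail this check, so renaming the file before tweaking is the safe choice.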
This section describes what a debugging session looks like.
After having tweaked and linked your program according to the description above you should be able to start it as usual. It will behave slightly differently, however. After starting up, a '$' prompt will appear (on standard error). The prompt means you are expected to enter a debugging command (on standard input).
Detailed command descriptions are available on-line through the help command. Type help to see a list of available commands. Type help followed by a command name to get a description of that particular command.
Environment variables may be used to re-direct debugging input and output.
The debugging commands will enable you to control and monitor the execution of your program. This section contains general information and some examples. Detailed descriptions are available on-line through the help command.
All debugging command keywords may be abbreviated as long as the abbreviation is unambiguous. For instance, goon nobreak may usually be written g no.
The reason we say usually is that you may define new commands by means of the macro command. Macro names are subject to the same abbreviation rules as built-in commands.
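The abbreviation rule can be modeled as unique-prefix matching. The Python sketch below is only an illustration of that idea, not the code itweak uses. The command set is a subset chosen for the example, the macro name gather is hypothetical, and letting an exact match win over longer names is an assumption.

```python
def resolve(word, commands):
    """Resolve a possibly abbreviated keyword against known command names.

    Returns the full name, or raises ValueError if the abbreviation is
    unknown or ambiguous. Assumption: an exact match always wins.
    """
    if word in commands:
        return word
    matches = [c for c in commands if c.startswith(word)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown command: {word}")
    raise ValueError(f"ambiguous abbreviation {word!r}: {sorted(matches)}")


# Illustrative subset of built-in command names.
builtin = {"break", "clear", "goon", "help", "info", "next", "print"}
print(resolve("g", builtin))  # goon

# Defining a macro named "gather" (hypothetical) makes "g" ambiguous.
with_macro = builtin | {"gather"}
try:
    resolve("g", with_macro)
except ValueError as err:
    print(err)
```

This shows why an abbreviation that worked at the start of a session may stop working once new macros have been defined.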
The break command defines a breakpoint on a source line or on a number of consecutive source lines. The break will take effect after the expression on the source line has been evaluated. (This is a difference from most other debuggers where breaks occur before the source line is executed.)
In some cases the break occurs in a slightly different place from where you would expect it. This is the reason the break command optionally covers more than one source line. By setting breakpoints on a few lines around the interesting spot you may make sure that there really is a break.
A source line cannot have more than one breakpoint. Each break command silently supersedes any previous breakpoints it happens to overlap. The clear breakpoint command removes a breakpoint.
A breakpoint is identified by a small integer, the breakpoint number. The break command prints the breakpoint number of the breakpoint it creates. The breakpoint number can be used in other debugging commands.
You may identify a breakpoint by its literal breakpoint number, or by the special symbols '.' (dot) and '$' (dollar). Dot means the current breakpoint, i.e. the breakpoint that caused the current break. Dollar means the last breakpoint defined by a break command.
Use the info breakpoint command to see the definition of a breakpoint (or all breakpoints).
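The bookkeeping described above can be pictured with a small Python model. It is a sketch under several assumptions and does not reflect how the run-time is actually written: breakpoint numbers are assumed to start at 1 and count upward, and an overlapped breakpoint is assumed to be dropped as a whole. The class and method names are hypothetical.

```python
class BreakpointTable:
    """Toy model of breakpoint numbering and the '.' and '$' symbols."""

    def __init__(self):
        self.ranges = {}          # breakpoint number -> (file, first, last)
        self.next_number = 1      # assumption: 0 is reserved, numbering starts at 1
        self.current = None       # what '.' refers to (set when a break occurs)
        self.last_defined = None  # what '$' refers to

    def define(self, file, first, last=None):
        """Model of the break command: cover lines first..last of file."""
        last = first if last is None else last
        # Silently supersede older breakpoints that overlap the new range
        # (assumption: the whole older breakpoint is removed).
        for number, (f, lo, hi) in list(self.ranges.items()):
            if f == file and lo <= last and first <= hi:
                del self.ranges[number]
        number = self.next_number
        self.next_number += 1
        self.ranges[number] = (file, first, last)
        self.last_defined = number
        return number

    def resolve(self, ref):
        """Turn '.', '$', or a literal number into a breakpoint number."""
        if ref == ".":
            number = self.current
        elif ref == "$":
            number = self.last_defined
        else:
            number = int(ref)
        if number is None or number not in self.ranges:
            raise KeyError(f"no such breakpoint: {ref}")
        return number


table = BreakpointTable()
first = table.define("alpha.icn", 10, 12)
second = table.define("alpha.icn", 12)   # overlaps line 12, supersedes the first
print(first, second)                     # 1 2
print(table.resolve("$"))                # 2
print(sorted(table.ranges))              # [2]
```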
A plain breakpoint as created by break is unconditional. There are several ways you may modify its behavior to suit your needs.
When a plain break occurs a special macro called the prelude is executed. The standard prelude prints the breakpoint number and the location of the breakpoint. In a similar way a special macro called the postlude is executed just before execution is resumed after a break. The standard postlude is empty.
The prelude and postlude are ordinary macros which you may redefine by means of the set command.
Note that the prelude is not executed if a break is caused by a breakpoint with a do macro.
Breakpoint zero is special. The next debugging command causes a break to occur after the next source line has been executed (or after a specified number of lines). A break caused by a next command is treated as if defined by breakpoint number zero. (This is the case even if there is an ordinary breakpoint on the same source line.) Breakpoint number zero may be assigned a condition, a do macro, or an ignore count, just like other breakpoints. It may not be cleared, however.
Expressions may be included in the various print commands and in breakpoint conditions. Expressions may be formed from
A few keywords have been added or altered:
Expression evaluation is guarded by error conversion. An Icon error during evaluation should cause a conflict message, but not terminate the program.
There are several debugging commands for evaluating and printing expressions.
The print command takes any number of expressions separated by semicolons. The command evaluates and prints the image of the first value returned by each expression. This is a common way to inspect variables, for instance.
The eprint command (e as in every) takes a single expression and prints the image of every value it generates. The following example shows a simple way of printing the contents of a list,
The fprint command (f as in format) expects a format string followed by any number of expressions. The format string can be any expression returning a string-convertible value. The expressions must be separated by semicolons. The format string may contain placeholders. The remaining expressions are expected to return values to insert into the format string, replacing the placeholders. In this case the actual value is used, not the image. A conflict is generated if any of the values is not string-convertible, so you may have to use the image function, or some other explicit conversion.
The fprint command is useful when you care about the appearance of the output.
The fprint command does not print a newline unless it is explicitly included in the output. Usually it can be inserted at the end of the format string.
A format string placeholder is basically a percent (%) character followed by a digit 1-9. Thus there can be up to nine different placeholders. A particular placeholder ('%1' for example) may occur any number of times. Each occurrence of '%1' will be replaced by the value of the first expression after the format string. Each occurrence of '%2' will be replaced by the value of the second expression after the format string, and so on.
A plain placeholder represents a variable-length field. It is possible to specify a fixed-length field. Add '<' for a left-justified, or '>' for a right-justified field. Also add the length of the field. For instance, '%1<20' defines a left-justified field with a fixed length of 20 characters.
To print a percent character, double the character in the format string (%%). Backslash (\) can also be used to quote other characters.
A placeholder for which there is no value is silently replaced by its placeholder number.
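To make the placeholder rules concrete, here is a Python sketch that expands a format string in the way just described. It is a model written for this document, not the itweak implementation. One assumption goes beyond the text: a value longer than a fixed-length field is truncated, in the manner of Icon's left and right functions. The sketch also does not model the conflict raised for values that are not string-convertible.

```python
import re

# Matches, in order of priority: a backslash-quoted character, a doubled
# percent sign, or a placeholder %N with an optional <len or >len suffix.
_TOKEN = re.compile(r"\\(.)|%%|%([1-9])(?:([<>])(\d+))?", re.DOTALL)


def _left(text, width):
    # Left-justify in a fixed-length field (assumed to truncate on the right).
    return text[:width].ljust(width)


def _right(text, width):
    # Right-justify in a fixed-length field (assumed to truncate on the left).
    return text[max(len(text) - width, 0):].rjust(width)


def fprint_format(fmt, *values):
    """Expand placeholders in fmt with the given values (illustrative model)."""

    def expand(match):
        if match.group(1) is not None:   # backslash-quoted character
            return match.group(1)
        if match.group(0) == "%%":       # doubled percent
            return "%"
        index = int(match.group(2))
        if index <= len(values):
            text = str(values[index - 1])
        else:
            text = str(index)            # no value: use the placeholder number
        if match.group(3) == "<":
            text = _left(text, int(match.group(4)))
        elif match.group(3) == ">":
            text = _right(text, int(match.group(4)))
        return text

    return _TOKEN.sub(expand, fmt)


print(repr(fprint_format("%1<6|%2>4|%%|%3\n", "word", 42)))
```

The example prints 'word  |  42|%|3\n': a left-justified field of six characters, a right-justified field of four, a literal percent sign, and the number 3 standing in for the missing third value.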
The itweak algorithm for deciding source line limits is rather simple-minded. This is the reason breaks do not always occur exactly where you expect.
The implementation of the alternation (|) control structure is naive and works only in simple cases. (See The Icon Analyst, Number 23, April 1994.)
It is currently not possible to list macro definitions (including do macros).
A few commands use the display file: frame, info globals, where. The display file is simply the output from the display Icon function. Writing the display file requires write permission in the current directory.
It should be possible to negate a breakpoint condition, but this is not implemented yet.
It is possible to invoke a target program procedure in an expression. This can be useful for side effects. The run-time is not fully re-entrant, however, so if there is a breakpoint in the procedure the run-time may get confused when it returns. (No fatal error should occur.)
Escaping characters in fprint format strings do not always work.
Beware of the following format string.
It generates a long, long output.
My main dissatisfaction with the debugify package was performance. Thus a lot of effort has gone into finding ways to minimize the debugging overhead. The following performance measurements were made on a Sun SPARCstation IPC under SunOS 4.1.3 with 24 MB of memory.
A tweaked ucode file will be less than 2 times the size of the untweaked file (debugify: 5 times). A tweaked program without any breakpoints (goon nobreak) runs approximately 4 times slower than an untweaked program (debugify: 200 times; this easily becomes unbearable). The itweak program itself runs at over 3 times the speed of debugify.
The increased performance carries a certain cost: Only a single potential breakpoint is created per source line. No provision is made for setting variables. The code is not executable unless certain global variables (created by itweak) have been initialized.
Debugging commands are compiled to an internal representation as they are entered. This is especially important for expressions. Expressions are parsed with simple string matching, backtracking and all. They are immediately unwound and converted to a postfix notation. This means that breakpoint conditions and macros can be evaluated efficiently.
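The benefit of compiling once and evaluating many times can be shown with a small Python sketch. It handles only integers, variable names, parentheses, and the operators +, -, and *, and it uses a plain precedence-based conversion rather than the string-matching parser described above, so it should be read as an illustration of the postfix idea only. All names in it are hypothetical.

```python
import operator
import re

# Operator table: symbol -> (precedence, function). Deliberately tiny.
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
}


def compile_expr(text):
    """Convert an infix expression to a postfix token list (done once)."""
    output, stack = [], []
    for tok in re.findall(r"\d+|[A-Za-z_]\w*|[-+*()]", text):
        if tok in OPS:
            # Pop operators of equal or higher precedence (left associative).
            while stack and stack[-1] in OPS and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the "("
        else:
            output.append(tok)  # integer literal or variable name
    while stack:
        output.append(stack.pop())
    return output


def evaluate(postfix, variables):
    """Evaluate a compiled expression against current variable values."""
    stack = []
    for tok in postfix:
        if tok in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[tok][1](left, right))
        elif tok.isdigit():
            stack.append(int(tok))
        else:
            stack.append(variables[tok])
    return stack.pop()


code = compile_expr("(i + 1) * n")
print(code)                              # ['i', '1', '+', 'n', '*']
print(evaluate(code, {"i": 2, "n": 5}))  # 15
print(evaluate(code, {"i": 0, "n": 7}))  # 7
```

A breakpoint condition compiled this way needs no parsing when the breakpoint is hit; only the short stack evaluation runs each time.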
The Icon source code generated by itweak mainly creates and initializes a number of global variables. An Icon set is created for each tweaked source file. The sets are used to hold breakpoint line numbers.
itweak creates a potential breakpoint on every source line it finds in the ucode file. A potential breakpoint consists of code testing the current line number against the set of breakpoint line numbers for the current source file.
If the test says 'yes' then a jump is made to code added at the end of the current procedure. This code collects the values and names of all locals and calls the debugging run-time. The same code is used for all potential breakpoints in one procedure. This means that besides potential breakpoints a chunk of code is added at the end of every procedure.
A global variable named __dbg_test is used to test for breakpoints. It may be set to different Icon functions to achieve various effects. The function will be called with two parameters: a set of breakpoint line numbers and an integer line number. The following values are currently used,
The debugging run-time is a procedure. It must fail in order not to disturb the logic of the current procedure.
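The mechanism can be pictured with the following Python model. It is a sketch of the idea only: the real tests are Icon functions inserted into ucode, and the three test functions below are hypothetical stand-ins chosen for illustration, not the values itweak actually assigns to __dbg_test.

```python
def member_test(breakpoints, line):
    # Break only if the line is in the file's breakpoint set.
    return line in breakpoints


def never_break(breakpoints, line):
    # Hypothetical stand-in: suppress all breaks.
    return False


def always_break(breakpoints, line):
    # Hypothetical stand-in: break on every line.
    return True


class TweakedProgram:
    """Toy model of a tweaked program with one potential breakpoint per line."""

    def __init__(self, files):
        self.test = member_test                   # plays the role of __dbg_test
        self.breaks = {f: set() for f in files}   # one set per tweaked file
        self.hits = []                            # lines where a break occurred

    def line_executed(self, file, line):
        # The potential breakpoint: one cheap call per source line.
        if self.test(self.breaks[file], line):
            self.hits.append((file, line))        # here the run-time would be called


prog = TweakedProgram(["alpha"])
prog.breaks["alpha"].add(3)
for n in range(1, 6):
    prog.line_executed("alpha", n)
print(prog.hits)   # [('alpha', 3)]

prog.hits.clear()
prog.test = never_break
for n in range(1, 6):
    prog.line_executed("alpha", n)
print(prog.hits)   # []
```

Swapping the test function changes the behavior of every potential breakpoint at once, without touching the per-file sets.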
It surprises me that it is possible to do this amount of tweaking to an Icon program. I have debugged fairly complex programs without noticing any unexpected weirdness (like tweaked program logic). However, itweak as a whole is a case of reverse engineering. Someone with greater theoretical insight may be able to detect cracks in the tweaking scheme. If you do, please tell me.