Version 9.3 of the Icon Programming Language

Ralph E. Griswold, Clinton L. Jeffery, and Gregg M. Townsend

Department of Computer Science
The University of Arizona
Tucson, Arizona

IPD278b
July 2, 1998
http://www.cs.arizona.edu/icon/docs/ipd278.htm


1. Introduction

The current version of Icon is Version 9.3 and is described in the third edition of the Icon book [1]. This document supplements the second edition of the Icon book [2], which describes Version 8.0.

The current minor version of Icon is Version 9.3.1. It contains one new feature. See Section 2.4.

Most of the language extensions in Version 9.3 are upward-compatible with previous versions of Icon, and most programs written for earlier versions work properly under Version 9.3. The language additions to Version 9.3, described in Section 2, are a preprocessor, graphics facilities, new functions and keywords, and several smaller enhancements and changes.

There are also several improvements to the implementation. See Section 3.

2. Language Features

2.1 Preprocessing

All Icon source code passes through a preprocessor before translation. The effects of preprocessing can be seen by running icont or iconc with the -E flag.

Preprocessor directives control the actions of the preprocessor and are not passed to the Icon translator or compiler. If no preprocessor directives are present, the source code passes through the preprocessor unaltered.

A source line is a preprocessor directive if its first non-whitespace character is a $ and if that $ is not followed by another punctuation character. The general form of a preprocessor directive is
$ directive arguments # comment
Whitespace separates tokens when needed, and case is significant, as in Icon proper. The entire preprocessor directive must appear on a single line which cannot be continued. The comment portion is optional. An invalid preprocessor directive produces an error except when skipped by conditional compilation.

Preprocessor directives can appear anywhere in an Icon source file without regard to procedure, declaration, or expression boundaries.

Include Directives

An include directive has the form
$include filename
An include directive causes the contents of another file to be interpolated in the source file. The file name must be quoted if it is not in the form of an Icon identifier. #line comments are inserted before and after the included file to allow proper identification of errors.

Included files may be nested to arbitrary depth, but a file may not include itself either directly or indirectly. File names are looked for first in the current directory and then in the directories listed in the environment variable LPATH. Relative paths are interpreted in the preprocessor's context and not in relation to the including file's location.

Line Directives

A line directive has the form
$line n [filename]
The line containing the preprocessing directive is considered to be line n of the given file (or the current file, if unspecified) for diagnostic and other purposes. The line number is a simple unsigned integer. The file name must be quoted if it is not in the form of an Icon identifier.

Note that the interpretation of n differs from that of the C preprocessor, which interprets it as the number of the next line. $line is an alternative form of the older, special comment form #line. The preprocessor recognizes both forms and produces the fully specified older form for the lexical analyzer.
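
The following is a small sketch of the $line rule described above, not taken from the Icon distribution. The file name generated.icn is hypothetical. Because the directive's own line is treated as line 500, the write() call on the next line should be attributed to line 501 of that file.

```icon
procedure main()
$line 500 "generated.icn"
   # &file and &line report the position the translator recorded
   # for this expression, which the directive above has reset.
   write(&file, ":", &line)
end
```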

Define Directives

A define directive has the form
$define name text
The define directive defines the text to be substituted for later occurrences of the identifier name in the source code. text is any sequence of characters except that any string or cset literals must be properly terminated within the definition. Leading and trailing whitespace are not part of the definition. The text can be empty.

Redefinition of a name is allowed only if the new text is exactly the same as the old text. For example, 3.0 is not the same as 3.000.

Redefinition of Icon's reserved words and keywords is allowed but not advised.

Definitions remain in effect through the end of the current original source file, crossing include boundaries, but they do not persist from file to file when multiple names are given on the command line.

If the text of a definition is an expression, it is wise to parenthesize it so that precedence causes no problems when it is substituted. If the text begins with a left parenthesis, it must be separated from the name by at least one space. Note that the Icon preprocessor, unlike the C preprocessor, does not provide parameterized definitions.
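
As an illustration of the advice on parentheses, the sketch below uses two made-up names, BASE and LIMIT. The parentheses in the definition of LIMIT keep the substituted text together when it appears inside a larger expression.

```icon
$define BASE 5
$define LIMIT (BASE + 10)

procedure main()
   # Expands to 2 * (5 + 10). Without the parentheses in the
   # definition the expression would read 2 * 5 + 10 instead.
   write(2 * LIMIT)
end
```

With the parentheses the program writes 30; without them it would write 20.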

Undefine Directives

An undefine directive has the form
$undef name
The current definition of name is removed, allowing its redefinition if desired. It is not an error to undefine a non-existent name.
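
A minimal sketch of changing a definition, using the hypothetical name MODE. Redefining MODE directly with different text would be an error, so the old definition is removed first.

```icon
$define MODE "fast"
$undef MODE
$define MODE "safe"

procedure main()
   write(MODE)      # the second definition is the one in effect
end
```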

Predefined Symbols

At the start of each source file, several symbols are automatically defined to indicate the Icon system configuration. Each potential predefined symbol corresponds to one of the values produced by the keyword &features. If a feature is present, the symbol is defined with a value of 1. If a feature is absent, the symbol is not defined. See Appendix A for a list of predefined symbols.

Predefined symbols have no special status: like other symbols, they can be undefined and redefined.

Substitution

As input is read, each identifier is checked to see if it matches a previous definition. If it does, the value replaces the identifier in the input stream.

No whitespace is added or deleted when a definition is inserted. The replacement text is scanned for defined identifiers, possibly causing further substitution, but recognition of the original identifier name is disabled to prevent infinite recursion.

Occurrences of defined names within comments, literals, or preprocessor directives are not altered.

The preprocessor is ignorant of multi-line literals and can potentially be fooled this way into making a substitution inside a string constant.

The preprocessor works hard to get line numbers right, but column numbers are likely to be rendered incorrect by substitutions.

Substitution cannot produce a preprocessor directive. By then it is too late.
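
The rules above can be seen in a short sketch. The name greeting is chosen only for illustration.

```icon
$define greeting "hello, world"

procedure main()
   write(greeting)       # identifier: replaced by the definition
   write("greeting")     # inside a string literal: left alone
end
```

The first call writes hello, world and the second writes greeting.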

Conditional Compilation

Conditional compilation directives have the form
$ifdef name
and
$ifndef name
$ifdef or $ifndef cause subsequent code to be accepted or skipped depending on whether name has been previously defined. $ifdef succeeds if a definition exists; $ifndef succeeds if a definition does not exist. The value of the definition does not matter.

A conditional block has this general form:
$ifdef name   or   $ifndef name
... code to use if test succeeds ...
$else
... code to use if test fails ...
$endif
The $else section is optional. Conditional blocks can be nested provided that all of the $if/$else/$endif directives for a particular block are in the same source file. This does not prevent the conditional inclusion of other files via $include as long as any included conditional blocks are similarly self-contained.
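
As a sketch of a conditional block, the following program tests the predefined symbol _UNIX from Appendix A. The messages are illustrative only.

```icon
procedure main()
$ifdef _UNIX
   write("translated on a UNIX system")
$else
   write("translated on some other system")
$endif
end
```

Only one of the two write() calls reaches the translator, which can be confirmed by running icont with the -E flag.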

Error Directives

An error directive has the form
$error text
An $error directive forces a fatal compilation error displaying the given text. This is typically used with conditional compilation to indicate an improper set of definitions.
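
A sketch of the typical use, assuming a program that depends on large-integer arithmetic. The message text is arbitrary.

```icon
$ifndef _LARGE_INTEGERS
$error this program requires large-integer support
$endif

procedure main()
   write(2 ^ 100)
end
```

On a system without large integers, translation stops at the $error directive; elsewhere the directive is skipped and the program is translated normally.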

Subtle Points

Because substitution occurs on replacement text but not on preprocessor directives, either of the following sequences is valid:
$define x 1    $define y x
$define y x    $define x 1
write(y)       write(y)
It is possible to construct pathological examples of definitions that combine with the input text to form a single Icon token, as in
$define X e3   $define Y 456e
write(123X)    write(Y+3)

2.2 Graphics Facilities

Version 9.3 provides graphics facilities through a combination of high-level support and a repertoire of functions. Not all platforms support graphics. Consult the current reference manual [3].

2.3 New Functions and Keywords

The new functions and keywords are described briefly here. Appendix B contains more complete descriptions in the style of the second edition of the Icon book.

There are six new functions:
chdir(s)         Changes the current directory to s but
                 fails if there is no such directory or
                 if the change cannot be made.

delay(i)         Delays execution i milliseconds.
                 Delaying execution is not supported on
                 all platforms; if it is not, there is no
                 delay and delay() fails.

function()       Generates the names of the Icon
                 (built-in) functions.

loadfunc(s1, s2) Dynamically loads a C function. This
                 function presently is supported on a few
                 UNIX systems. See [4] for details.

serial(X)        Produces the serial number of X if it
                 is a type that has one.

sortf(X, i)      Produces a sorted list of the elements
                 of X. The results are similar to those
                 of sort(X), except that among lists
                 and among records, structure values are
                 ordered by comparing their ith fields.
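
The following sketch exercises several of these functions. The directory /tmp and the 250-millisecond delay are arbitrary choices for illustration, and the results depend on the platform.

```icon
procedure main()
   L := [1, 2, 3]
   write("serial number of L: ", serial(L))

   # Count the built-in functions by exhausting the generator.
   n := 0
   every function() do n +:= 1
   write(n, " built-in functions")

   # chdir() fails rather than causing an error if the change
   # cannot be made.
   if chdir("/tmp") then write("changed directory")
   else write("could not change directory")

   # delay() fails on platforms that do not support it.
   if not delay(250) then write("delay is not supported here")
end
```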
There are six new keywords:
&allocated       Generates the number of bytes allocated
                 since the beginning of program
                 execution. The first result is the
                 total number of bytes in all regions,
                 followed by the number of bytes in the
                 static, string, and block regions.

&dump            If the value of &dump is nonzero at
                 program termination, a dump in the style
                 of display() is provided.

&e               The base of the natural logarithms,
                 2.71828 ...

&phi             The golden ratio, 1.61803 ...

&pi              The ratio of the circumference of a
                 circle to its diameter, 3.14159 ...

&progname        The file name of the executing program.
                 &progname is a variable and a string
                 value can be assigned to it to replace 
                 its initial value.
The graphics facilities add additional new keywords [3].
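
A brief sketch of the non-graphics keywords in use follows. The radius and the replacement program name are made up for the example.

```icon
procedure main()
   r := 2.0
   write("area of circle: ", &pi * r ^ 2)
   write("e = ", &e, ", phi = ", &phi)

   write("running as: ", &progname)
   &progname := "demo"          # &progname is a variable
   write("now named: ", &progname)

   # &allocated is a generator: total, static, string, block.
   every write(&allocated)

   &dump := 1                   # request a dump at termination
end
```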

Some UNIX platforms now support the keyboard functions getch(), getche(), and kbhit(). Whether or not these functions are supported can be determined from the values generated by &features.

Note: On UNIX platforms, "keyboard" input comes from standard input, which may not necessarily be the keyboard.

Warning: The keyboard functions under UNIX may not work reliably in all situations and may leave the console in a strange mode if interrupted at an unfortunate time. These potential problems should be kept in mind when using these functions.

2.4 Other Language Enhancements

Lists

The functions push() and put() now can be called with multiple arguments to add several values to a list at one time. For example,
put(L, x1, x2, x3)
appends the values of x1, x2, and x3 to L. In the case of push(), values are prepended in the order in which they appear, from left to right. Consequently, as a result of
push(L, x1, x2, x3)
the first (leftmost) item on L is the value of x3.
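
A short sketch of both functions, with arbitrary values:

```icon
procedure main()
   L := []
   put(L, 1, 2, 3)            # L is now [1, 2, 3]
   push(L, "a", "b", "c")     # L is now ["c", "b", "a", 1, 2, 3]
   every writes(!L, " ")
   write()
end
```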

Records

Records can now be sorted by sort() and sortf() to produce sorted lists of their field values.

A record can now be subscripted by the string name of one of its fields, as in
z["r"]
which is equivalent to
z.r
If the named field does not exist for the record, the subscripting expression fails.

Records can now be used to supply arguments in procedure invocation, as in
p ! R
which invokes p with arguments from the fields of R.
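
The record features above can be sketched together. The record type point and the procedure show() are hypothetical names used only for this example.

```icon
record point(x, y)

procedure show(a, b)
   write("a=", a, " b=", b)
end

procedure main()
   z := point(9, 2)
   write(z["x"])                      # same as z.x
   if not z["w"] then write("no field named w")
   show ! z                           # invokes show(9, 2)
   every write(!sort(z))              # field values in sorted order
end
```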

Multiple Subscripts

Multiple subscripts are now allowed in subscripting expressions. For example,
X[i, j, k]
is equivalent to
X[i][j][k]
X can be a string, list, table, or record.
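
As a sketch, a list of lists can be indexed either way. The name grid is illustrative.

```icon
procedure main()
   grid := [[1, 2, 3], [4, 5, 6]]
   write(grid[2, 3])          # same element as grid[2][3]
   grid[1, 2] := 20           # assigns to grid[1][2]
   write(grid[1][2])
end
```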

Integers

The sign of an integer is now preserved when it is shifted right with ishift().
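
For example, a negative second argument to ishift() shifts right, so the following sketch shows the effect on a negative value:

```icon
procedure main()
   write(ishift(8, -1))       # 4
   write(ishift(-8, -1))      # -4: the sign is preserved
end
```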

The form of approximation for large integers that appear in diagnostic messages now indicates a power of ten, as in 10^57. The approximation is now accurate to the nearest power of 10.

Named Functions

The function proc(x, i) has been extended so that proc(x, 0) produces the built-in function named x even if the global identifier having that name has been assigned another value. proc(x, 0) fails if x is not the name of a function.
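
A minimal sketch of recovering a built-in function after its global identifier has been reassigned. The variable name out is arbitrary.

```icon
procedure main()
   write := "no longer a function"    # global identifier reassigned
   out := proc("write", 0)            # the built-in is still reachable
   out("recovered the built-in write()")
end
```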

String Invocation

String invocation can now be used for assignment operations, as in
":="(x, 3)
which assigns 3 to x.

Line Terminators

In Version 9.3.1, the function read() recognizes any of the three kinds of line terminators in translated mode: MS-DOS, Macintosh, or UNIX. This means that text files created on one platform can be read by an Icon program running on a different platform.

2.5 Other Changes

3. Implementation Changes

Linker Changes

By default, unreferenced globals (including procedures and record constructors) are now omitted from the code generated by icont. This may substantially reduce the size of icode files, especially when a package of procedures is linked but not all the procedures are used.

The invocable declaration and the command-line options -f s and -v n are now honored by icont as well as iconc [5]. The invocable declaration can be used to prevent the removal of specific unreferenced procedures and record constructors that are invoked by string invocation. The -f s option prevents the removal of all unreferenced declarations and is equivalent to invocable all.

The command line option -v n to icont controls the verbosity of its output:
-v 0 is the same as icont -s

-v 1 is the default

-v 2 reports the sizes of the icode sections (procedures, strings, and so forth)

-v 3 also lists discarded globals
Note: Programs that use string invocation may malfunction if the default removal of declarations is used. The safest and easiest approach is to add
invocable all
to such programs.
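
The sketch below shows why the declaration matters. The procedure hello() is never referenced by name in the program text, only through a string, so without the invocable declaration the linker could discard it.

```icon
invocable all

procedure hello()
   write("hello")
end

procedure main()
   name := "hello"
   name()                 # string invocation of hello()
end
```

A narrower alternative, when only a few procedures are invoked this way, is to name them in the invocable declaration instead of using all.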

Other Changes

4. Limitations, Bugs, and Problems

References

1. R. E. Griswold and M. T. Griswold, The Icon Programming Language, Peer-to-Peer Communications, Inc., San Jose, CA, third edition, 1996.

2. R. E. Griswold and M. T. Griswold, The Icon Programming Language, Prentice-Hall, Inc., Englewood Cliffs, NJ, second edition, 1990.

3. G. M. Townsend, R. E. Griswold and C. L. Jeffery, Graphics Facilities for the Icon Programming Language; Version 9.3, The Univ. of Arizona Icon Project Document IPD280, 1996.

4. R. E. Griswold and G. M. Townsend, Calling C Functions from Version 9 of Icon, The Univ. of Arizona Icon Project Document IPD240, 1995.

5. R. E. Griswold, Version 9 of the Icon Compiler, The Univ. of Arizona Icon Project Document IPD237, 1995.


Appendix A -- Predefined Symbols

predefined symbol              &features value

 _AMIGA                        Amiga
 _ACORN                        Acorn Archimedes
 _ATARI                        Atari ST
 _CMS                          CMS
 _MACINTOSH                    Macintosh
 _MSDOS_386                    MS-DOS/386
 _MS_WINDOWS_NT                Windows NT
 _MSDOS                        MS-DOS
 _MVS                          MVS
 _OS2                          OS/2
 _PORT                         PORT
 _UNIX                         UNIX
 _VMS                          VMS

 _ASCII                        ASCII
 _EBCDIC                       EBCDIC

 _CO_EXPRESSIONS               co-expressions
 _DYNAMIC_LOADING              dynamic loading
 _EVENT_MONITOR                event monitoring
 _EXTERNAL_FUNCTIONS           external functions
 _GRAPHICS                     graphics
 _KEYBOARD_FUNCTIONS           keyboard functions
 _LARGE_INTEGERS               large integers
 _MULTITASKING                 multiple programs
 _PIPES                        pipes
 _RECORD_IO                    record I/O
 _SYSTEM_FUNCTION              system function

 _ARM_FUNCTIONS                Archimedes extensions
 _DOS_FUNCTIONS                MS-DOS extensions
 _MS_WINDOWS                   MS Windows
 _PRESENTATION_MGR             Presentation Manager     
 _X_WINDOW_SYSTEM              X Windows
 _WIN32                        Win32
 _WIN16                        Win16
In addition, the symbol _V9 is defined in Version 9.


Appendix B -- New Functions and Keywords


chdir(s) : n -- change directory

chdir(s) changes the directory to s but fails if there is no such directory or if the change cannot be made. Whether the change in the directory persists after program termination depends on the operating system on which the program runs.
Error:   103    s not string


delay(i) : n -- delay execution

delay(i) delays execution i milliseconds. This function is not supported on all platforms; if it is not, there is no delay and delay() fails.
Error:   101    i not integer

flush(f) : n -- flush buffer

flush(f) flushes the output buffers for f.
Error:   105    f not file

function() : s1, s2, ..., sn -- generate function names

function() generates the names of the Icon (built-in) functions.


loadfunc(s1, s2) : p -- load external function

loadfunc(s1, s2) loads the function named s2 from the library file s1. s2 must be a C or compatible function that provides a particular interface expected by loadfunc(). loadfunc() is not available on all systems.


proc(x, i) : p -- convert to procedure

proc(x, i) produces a procedure corresponding to the value of x, but fails if x does not correspond to a procedure. If x is the string name of an operator, i specifies the number of arguments: 1 for unary (prefix), 2 for binary (infix) and 3 for ternary. proc(x, 0) produces the built-in function named x even if the global identifier having that name has been assigned another value. proc(x, 0) fails if x is not the name of a function.
Default: i      1

Errors:  101    i not integer
         205    i not 0, 1, 2, or 3

push(L, x1, x2, ..., xn) : L -- push onto list

push(L, x1, x2, ..., xn) pushes x1, x2, ..., xn onto the left end of L. Values are pushed in order from left to right, so xn becomes the first (leftmost) value on L. push(L) with no second argument pushes a null value onto L.
Errors:  108    L not list
         307    inadequate space in block region

See also: get(), pop(), pull(), and put()

put(L, x1, x2, ..., xn) : L -- put onto list

put(L, x1, x2, ..., xn) puts x1, x2, ..., xn onto the right end of L. Values are put in order from left to right, so xn becomes the last (rightmost) value on L. put(L) with no second argument puts a null value onto L.
Errors:  108    L not list
         307    inadequate space in block region

See also: get(), pop(), pull(), and push()

serial(x) : s -- produce serial number

serial(x) produces the serial number of x if it is a type that has one but fails otherwise.


sort(X, i) : L -- sort structure

sort(X, i) produces a list containing values from X. If X is a list, record, or set, sort(X, i) produces the values of X in sorted order. If X is a table, sort(X, i) produces a list obtained by sorting the elements of X, depending on the value of i. For i = 1 or 2, the list elements are two-element lists of key/value pairs. For i = 3 or 4, the list elements are alternating keys and values. Sorting is by keys for i odd, by values for i even.
Default: i      1

Errors:  101    i not integer
         115    X not structure
         205    i not 1, 2, 3, or 4
         307    inadequate space in block storage region

See also: sortf()
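
A sketch of the table cases, using a small made-up table:

```icon
procedure main()
   t := table()
   t["b"] := 2
   t["a"] := 3

   # i = 1: key/value pairs ordered by key.
   every pair := !sort(t, 1) do write(pair[1], " ", pair[2])

   # i = 2: key/value pairs ordered by value.
   every pair := !sort(t, 2) do write(pair[1], " ", pair[2])

   # i = 3: one flat list of keys and values, ordered by key.
   every writes(!sort(t, 3), " ")
   write()
end
```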

sortf(X, i) : L -- sort structure by field

sortf(X, i) produces a sorted list of the values in X. Sorting is primarily by type and in most respects is the same as with sort(X). However, among lists and among records, two structures are ordered by comparing their ith fields. i can be negative but not zero. Two structures having equal ith fields are ordered as they would be in regular sorting, but structures lacking an ith field appear before structures having one. Tables cannot be sorted by sortf().
Default: i      1

Errors:  101    i not integer
         126    X not list, record, or set
         205    i =  0
         307    inadequate space in block region

See also: sort()
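
As a sketch, the following sorts a list of records by their second field. The record type employee and its contents are invented for the example.

```icon
record employee(name, age)

procedure main()
   staff := [employee("Lee", 41), employee("Ann", 29), employee("Bo", 35)]

   # Order the records by their second field, age.
   every e := !sortf(staff, 2) do
      write(e.name, " ", e.age)
end
```

The records are written in order of increasing age: Ann, Bo, then Lee.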

&allocated : i1, i2, i3, i4 -- cumulative allocation

&allocated generates the total amount of space, in bytes, allocated since the beginning of program execution. The first value is the total for all regions, followed by the totals for the static, string, and block regions, respectively. The space allocated in the static region is always given as zero. Note: &allocated gives the cumulative allocation; &storage gives the current allocation; that is, the amount that has not been freed by garbage collection.


&dump : i -- termination dump

If the value of &dump is nonzero when program execution terminates, a dump in the style of display() is provided.


&e : r -- base of natural logarithms

The value of &e is the base of the natural logarithms, 2.71828 ... .


&phi : r -- golden ratio

The value of &phi is the golden ratio, 1.61803 ... .


&pi : r -- ratio of circumference to diameter of a circle

The value of &pi is the ratio of the circumference of a circle to its diameter, 3.14159 ... .


&progname : s -- program name

The value of &progname is the file name of the executing program. A string value can be assigned to &progname to replace its initial value.

