Supplementary Information for the Implementation of Version 8 of Icon
Ralph E. Griswold
| Department of Computer
Science
The University of Arizona
Tucson, Arizona
IPD112b
April 7, 1996
http://www.cs.arizona.edu/icon/docs/ipd112.htm |
1. Introduction
The Icon programming language [1] is fairly stable now, although refinements
and new features are added occasionally [2]. The implementation of Icon,
on the other hand, is still changing constantly. Besides changes made to
support new features of the language, changes are made to correct errors,
improve performance, increase portability, accommodate the requirements
of new compilers and operating systems, improve readability of the source
code, and so on.
The book on the implementation of Icon [3] is based on Version 6 of Icon.
When the book was written, Version 6 was fairly new. The book is a snapshot
of the implementation at a particular time during the final phases of the
development of Version 6.2. In fact, this snapshot itself is slightly out
of focus, since the implementation was changing even as the book was completed.
While the fundamental aspects of the implementation have not changed, there
have been many minor changes, and in some respects the source code now differs
noticeably from what appears in some places in the book.
This report is designed to be used in conjunction with the implementation
book and source-code listings of Icon. Section 2 describes the most significant
differences between what appears in the book and Version 8 of Icon. A list
of corrections to known errors in the book appears in Section 3.
2. Recent Changes to the Implementation
2.1 Maintenance of Scanning Environments [Section 9.6, pp. 158-162]
String scanning saves, sets, and restores the scanning environment, which
consists of the values of &subject
and &pos
.
Two problems with nested string scanning have been fixed. The first problem
occurs when a suspended scanning operation is resumed after the scanning
environment has been changed in the outer scan. Consider the following case:
"abcde" ? {
("xyz" ? &null) & (&pos := 3) & &fail
write(&pos)
}
The change to &pos
occurs in the outer scan, so 3
should be printed. Formerly, 1
was printed. This was because
the inner scanning expression restored the outer scanning environment it
found when it was invoked and not the one it found when it was resumed.
The second problem occurs when scanning is prematurely terminated by break
,
next
, return
, fail
, or suspend
.
Consider the following case:
"abcde" ? {
p()
write(&subject)
}
procedure p()
"xyz" ? return
end
Logically this code should write "abcde"
. Formerly
it wrote "xyz"
. This was because return
did not restore the outer scanning environment when it caused the scanning
expression to be exited.
Implementation
Consider the example on pages 159-162 of the implementation book. The first
two diagrams are the same. On the third, the bscan
instruction
is executed, pushing the current values of &subject
and
&pos
. It sets &subject
to "coconuts"
and &pos
to 1. The bscan
instruction needs
to return a pointer to the saved values of &subject
and
&pos
, so it constructs a variable on the stack and suspends.
Since bscan
suspends, the saved values of &subject
and &pos
are preserved in a generator frame until bscan
is resumed or something causes removal of the current expression frame.
This removal can be caused by break
, next
, fail
,
return
, or the end of a bounded expression. In any of these
cases, the interpreter returns a signal to bscan
, which then
restores the values of &subject
and &pos
before passing the signal on to the interpreter invocation below it.
Continuing the example, move(4)
is evaluated, it suspends,
and the top of the stack is
The escan
instruction is executed next. It reverses the top
two elements on the stack so that the result of scanning becomes the result
of move()
. It then exchanges the current values of &subject
and &pos
with the saved values and suspends:
Since escan
suspends, the variable referencing the saved values
of &subject
and &pos
is preserved in a
generator frame on the stack until the interpreter returns a signal to escan
.
When this happens, escan
again switches the values of &subject
and &pos
with those saved on the stack. It then passes
the signal on to the interpreter invocation below it.
The changes described so far handle the termination of string scanning by
producing a result, expression failure, break
, next
,
return
, fail
, and the removal of a suspended scanning
expression on reaching the end of a bounded expression. The one case left
is exiting string scanning by suspending out of a procedure. A field has
been added to the procedure frame to handle this. It points to the descriptors
holding the saved values of &subject
and &pos
that were in effect when the procedure was invoked:
struct descrip *pf scan; /* ptr. to saved scanning environment */
When the procedure frame is constructed, this field is set to null because
the scanning environment associated with the invocation is still active.
If bscan
determines that it is saving this environment (by
seeing the null field), it fills in the field. When this environment is
made active again, the field must be reset to null. Both bscan
and escan
participate in maintaining the field.
When a procedure suspends, the field is checked to see if the scanning environment
of the invocation is active. If this environment is in a saved state, the
currently active environment is exchanged with the saved environment. If
the procedure is resumed, the environments must be switched back.
Having a procedure suspended in the middle of string scanning adds one additional
complication. The bounded expression containing the procedure call may be
removed. This results in an unwinding signal being propagated through bscan.
However, bscan
must not exchange the scanning environments
again. This problem is solved by having bscan
and escan
switch environments only when a signal is received from the procedure in
which they were called. This is detected by comparing procedure frame pointers.
2.2 Co-Expression Activation [Section 10.4, pp. 180-182]
Previously, the implementation of co-expressions did not properly support
an arbitrary sequence of activators for a given co-expression. Co-expression
blocks had a descriptor that pointed to the most recent activator of the
co-expression. Suppose co-expression A
is activated by co-expression
B
. The activator descriptor of A
references B
and when A
fails or returns, the activator descriptor of A
directs it to pass control to B
and this works properly. Consider,
however:
A := create @B
B := create @C
C := create @B
@A
&main
activates A
, A
activates
B
, and B
activates C
. At this point,
the activator of B
is A
, but when C
activates B
, the activator of B
is changed to
C
. At this same time, the activator of C
is B
.
Thus, when B
fails or returns, it passes control to C
,
which in turn passes control back to B
-- an endless loop.
Implementation Rationale and Overview
While a single activator works for the case of two co-expressions calling
each other in a coroutine fashion, it seems to reasonable to support a more
general case and allow co-expressions to have an arbitrary sequence of activators.
This goal was achieved by replacing the single activator descriptor in a
co-expression with a stack of activator descriptors. This stack allows for
an arbitrary sequence of activators to be recorded and reversed with subsequent
coret
and cofail
operations.
Theoretically, the sequence of activators can be of an arbitrary length
and be comprised of an arbitrary number of distinct activators. Thus, the
size of the stack used to record the sequence of activators should only
be limited by available memory. To provide expandability, the activator
stack is maintained as a linked list of blocks, each of which contain a
fixed number of stack elements.
In many applications, a co-expression's activator is often the same, time
after time. The most obvious example of this is two co-expressions A
and B
calling each other in a coroutine fashion: after the
start-up sequence, the activator of A
is always B
,
and conversely. In such a case, after n changes of control, each
co-expression would have an activator stack containing n elements,
all of which refer to the other co-expression. To accommodate this common
use of co-expressions, each stack element has a count field that indicates
the number of consecutive activations by the associated activator. In the
example, instead of having a stack of n identical elements, the stack
consists of one element that has a count field of n. Although this
heuristic can reduce memory throughput substantially, it should be noted
that the underlying problem is one of data compression.
With a stack data structure such as this, it is only necessary to associate
a stack with each co-expression, pushing activators in response to coact
operations, and popping activators in response to coret
and
cofail
operations.
Implementation Details
The activator field of the b_coexpr
structure has been replaced
with es_actstk
, a pointer to an activator stack block, known
as an astkblk
. The astkblk
structure is declared
as:
struct astkblk {
int nactivators;
struct astkblk *astk_nxt;
struct actrec {
word acount;
struct b_coexpr *activator;
} arec[ActStkBlkEnts];
};
The array arec
, composed of actrec
entries, is
a segment of the stack. nactivators
indicates the number of
valid activator entries in this segment of the stack. arec[nactivators-1]
is the most recent activator. astk_nxt
is a pointer to the
next stack block in the list. ActStkBlkEnts
is a compile-time
constant whose current value is 100 for systems with a large amount of memory
and is 10 for systems with a small amount of memory or fixed-size memory
regions.
When a co-expression is created, the es_actstk
field is set
to zero. Upon the first activation of the co-expression, memory for an astkblk
is allocated using malloc()
, and es_actstk
is
pointed to this block.
Four routines manipulate co-expression activator stacks:
pushact(C1,C2) The co-expression C1 is added to the
activator stack of co-expression C2.
popact(C) The activator stack of C is popped and
the top-most activator is returned.
topact(C) The most recent activator of C is
returned.
dumpact(C) The activator stack of C is dumped
(used for debugging only).
pushact(C1,C2)
first determines if the most recent activator
of C1
was C2
and if so, it merely increments acount
in the appropriate element of arec
and returns. If C2
was not the last activator of C2,
a new arec
element
is required. If arec
is full, a new astkblk
is
allocated and added at the front of the list based at es_actstk
.
The appropriate slot in arec
is located, filled in with the
new activator, acount is set to one, and nactivators
is incremented.
pushact()
is called in the Op_Coact
case of interp
,
and it is also called in init()
to make &main
its own activator.
popact(C)
locates the arec entry that is associated with the
most recent activator and holds the co-expression pointer for later return.
If the acount field is one, nactivators
is decremented to reflect
the removal of an arec
entry. If acount
is greater
than one, acount
is merely decremented. When all the arec
entries in an astkblk
have been popped, the astkblk
needs to be freed. An astkblk
is freed when nactivators
is zero and a pop is required.
popact()
is called in the Op_Coret
and Op_Cofail
cases of interp()
.
topact(C)
is only required for &source
and
merely returns the most recent activator. The only complication is to skip
the first astkblk
on the chain if its nactivators
field is zero.
Garbage Collection Issues
A co-expression is live with respect to activators if it is the activator
of a live co-expression, since the co-expression it activated may coret
or cofail
. Thus, all the activators in the activator stack
of a live co-expression must be marked. This marking is done with special-case
code in markblock()
. markblock()
requires the
address of a descriptor, but the activators are stored as a list of addresses
of b_coexpr
blocks rather than as a list of descriptors. Because
back-chaining is not done with co-expressions and the descriptors that reference
them, it is acceptable to use a dummy descriptor, filling in its v-word
with each activator address in turn and calling markblock()
with the address of this descriptor.
When a co-expression is freed in cofree()
, it has a still-allocated
astkblk
associated with it that must be freed. It seems that
there is no way to produce a dead co-expression that has a non-empty activator
stack, but nonetheless, code is present to handle multiple astkblk
s.
When pushact()
requires a new astkblk
, the ensuing
malloc()
may cause a garbage collection. This potential allocation
is handled by doing the pushact()
in the Op_Coact
case prior to establishing any pointers to relocatable data objects. This
ensures that no C pointers are invalidated, and that there are no reachability
problems, so this solution seems satisfactory. Having predictive need for
the static memory region would avoid this special case, but the nature of
the memory management used in the static region precludes a reasonably quick
way of determining if a given amount of memory is available. This problem
merits further thought.
2.3 Storage Management [Section 11.3.5, pp. 213-214]
The original implementation of storage management for Icon was predicated
on the assumption that the user's memory region can be expanded using sbrk()
if additional space is needed during program execution. This approach works
well on some systems, but is awkward on others and some systems do not support
sbrk()
at all. The trend is away from the support of sbrk()
.
While the matter still is not completely settled (storage management remains
the biggest single problem in the implementation of Icon), an alternative
method of handling Icon's memory regions has been added. This method, called
"fixed regions" uses the system's malloc()
to provide
separate string and block regions at the beginning of execution, as opposed
to the usual "expandable regions", in which one large block for
all regions is obtained by brk()
.
With fixed regions, the string and block regions are not necessarily contiguous,
and neither can be expanded (although their initial sizes can be set by
environment variables). Co-expressions are allocated as needed using the
system's malloc(
), instead of being allocated by Icon from
its static region. Fixed regions do not affect the method used for garbage
collection, but if there is not enough space after collection in either
the string or block region, execution is terminated with an error message
instead of expanding the region.
Portions of the source code that are affected by the fixed-regions option
are identified by the defined symbol FixedRegions
. The main
module related to storage management, rmemmgt.c, now includes either rmemfix.c
or rmemexp.c
depending on whether or not FixedRegions
is defined.
Some other changes have been made to the expandable-regions version of storage
management to adjust region sizes in the event there is not enough memory
to expand a region to the desired size.
Close examination of the present source code will reveal that the fixed-
and expandable-regions options co-exist somewhat uncomfortably; the situation
in Version 8 is a work-around to allow Icon to run on systems that do not
allow expansion of the user's memory region, but it is not a complete solution
to the problem. It probably would be a good idea to start from scratch and
completely redesign Icon's storage-management system. This would be a major
project, however, and it is not even clear that a single method, however
clever and sophisticated, can do a good job on the wide range of computer
architectures presently available.
2.4 Large Integers [Section 4.1, p. 44]
Version 8 of Icon supports large-integer arithmetic in which the magnitude
of integers is not limited by machine architecture. Large integers do not
come into existence until integer overflow would occur. See h/rt.h
and iconx/rlargint.c
for details.
Overflow checking in C is now provided. See iconx/rmisc.c
.
2.5 Dynamic Hashing [Chapter 7, pp. 96-109]
Earlier versions of Icon implemented sets and tables using hash tables with
a fixed number of buckets (37). This worked well for small tables, but performance
degraded for large tables due to the long hash chains in each bucket. Now,
each hash table starts out with a fixed number buckets (typically 8), but
the number of buckets doubles repeatedly as the number of elements grows.
Small tables benefit from reduced memory requirements, while large tables
show dramatic performance gains.
Data Structures
Sets and tables use very similar data structures, and now use common code
in many places. Tables differ from sets only by having an additional descriptor
in the header, for the default value, and an additional descriptor in each
element, for the referenced value. The discussion that follows refers to
sets, but the implementation of tables is similar.
Hash buckets are grouped in segments; each segment is a separate block in
the heap region. A minimal set has a single segment containing 8 hash buckets.
Successive segments hold 8, 16, 32, 64,... buckets, with each additional
segment doubling the total bucket count. Segments are located via an array
of pointers in the set header block. Because this array is fixed in size,
there is a limit to the number of times the hash table can expand.
The set header also contains a mask that is combined with a hash value to
produce its bucket number. The mask is always one less than the number of
buckets, which is in turn a power of two.
Structure Growth
A set's load factor is its size divided by the number of hash buckets. When
the addition of an element causes the load factor to exceed 5, an additional
bucket segment is allocated. One additional bit is added to the set's hash
mask, and each element having that bit set in its hash value is moved to
one of the new buckets.
Deletion of elements does not reduce the number of hash buckets because
of the complicating effect on element generation. However, if a sparse set
is copied, the copy has a smaller and more appropriate number of buckets.
Element Generation
Element generation is usually a straightforward process: the hash chains
of each bucket of each segment are traversed, generating each element in
turn. Complications arise if hash buckets are added while the generator
is suspended; in general, the new buckets contain some elements already
generated and some not yet generated.
Each time a suspended generator regains control, it checks to see if the
set has split its buckets. If so, it records the set's old mask value and
the suspended element's hash value for later reference. These are saved
in two arrays indexed by the number(s) of the newly created hash bucket
segment(s).
When a hash bucket is about to be traversed, the arrays are checked. From
the saved masks and hash values it is possible to determine whether or not
the contents of the hash bucket were generated earlier. If so, the bucket
is skipped. But the contents may have been partially generated, in which
case those elements are skipped whose hash values do not exceed the saved
value.
One additional complication comes from sequences of elements in a chain
having identical hash values: the mask and hash value are not sufficient
to record the position within such a sequence. The problem is avoided by
deferring the detection of any bucket splits until reaching the end of the
sequence. This works because the sequence can never be split into multiple
buckets.
Improved Hashing
New hash functions were introduced for strings, integers, and reals.
The former string hash function did not produce enough distinct values for
use with a large number of hash buckets, and was insensitive to permutations.
A new function fixes both of these problems.
Integers were formerly used as their own hash values without further manipulation.
That worked well with 37 hash bins but would have been unacceptable in the
new scheme; for example, a set composed of even numbers would have used
only half the hash bins. Integers are now hashed by multiplying by a constant
based on the golden ratio.
Real numbers were formerly "hashed" by simple truncation to integer.
Besides the even-number problem, this discarded all information from the
fractional part. Now, real numbers are multiplied by a constant that scales
them up and scrambles the bits.
Configuration Parameters
Four configuration parameters control the functioning of tables and sets:
HSlots
Sets the initial number of hash buckets, which must be a power
of 2. The default is 4 on machines with small address spaces and 8 otherwise.
HSegs
- Sets the maximum number of hash bucket segments. Consequently, he maximum
number of hash buckets is
HSlots*2HSegs-1
. The default is 6
on machines with small address spaces and 10 otherwise.
MaxHLoad
Sets the maximum allowable loading factor before additional buckets
are created. The default is 5.
MinHLoad
Sets the minimum loading factor for new structures. The default
is 1. Because splitting or combining buckets halves or doubles the load
factor, MinHLoad
should be no more than half MaxHLoad
.
Visible Impacts
The main effects of these changes are reduced CPU requirements, better memory
utilization, different results from ?X
, and generator behavior
that is both different and "more random".
Set and table element generation has always produced results in an undefined
order. Generation is now "more random" in the sense that the sequences
of results are even more unpredictable than before. First, the generated
sequences are dependent on configuration parameters. Second, adding new
elements can change the relative order in which existing elements are generated;
this quirk even could be used to determine the configuration parameters.
Finally, it is possible to construct a structure X
such that
!X
and !copy(X)
produce results in different sequence.
2.6 C Data Sizes [Sections 4.1 and 4.4, pp. 48-51, 57]
The C programming language provides no guarantees that ints and pointers
are the same size, although that was true for most computers on which C
was originally implemented. In fact, Icon was written on such a computer
and with the assumption that ints and pointers were the same size
and there was no computer available for which the assumption did not hold.
While the implementation subsequently was modified to support "mixed
sizes", and this subject is covered in the implementation book, many
changes have been made since the book was written to accommodate the idiosyncrasies
of various C compilers.
To get an idea of the problem, consider the implication of the original
specification in the C language that the difference of two pointers is an
integer (but explicitly not whether this is an int or a long).
There generally is no problem if ints and pointers are the same size, but
for the so-called "large memory model" where ints are 16 bits
and pointers are 32 bits, there are serious practical problems. Some large-memory-model
C compilers take the difference of two 32-bit pointers to be a 16-bit int.
It doesn't take a lot of imagination to see what happens if the difference
of two pointers is larger than the largest int.
Such problems can be handled by casting the pointers to longs wherever
such problems might arise. The catch is finding all the places and doing
it properly. In fact, the source code for Icon presently is not completely
correct in this respect. We have, however, added many casts to the
source code since the book was written. This changes the appearance of the
code substantially in some places (and not for the better). The point is,
if you are reading the book and the source code side-by-side, you will see
many differences for this reason alone.
In a related matter, the interpreter program counter, ipc
(page
128 of the implementation book) is now a union. This change was made so
that space could be saved in icode files by mixing ints (for opcodes)
with pointers to data (for arguments). References to the icode now specify
the appropriate member of the union.
In order to provide more flexibility in configuring the Icon source code
for computers with different C data sizes, the following values now are
used:
ByteBits
number of bits in a byte (normally 8)
IntBits
number of bits in a C int
(nominally 32;
may also be 16 or 64)
WordBits
number of bits in an Icon "word" (nominally 32; may
be 64)
It's also worth noting that Icon no longer can be built as a "small-memory-model"
program: pointers must be at least 32 bits long.
2.7 Changes in Handling of Data Objects
Integers [Section 4.1 and Section A.2.1, p. 51, 247]
Icon source-language integers are 32 (or 64) bits, regardless of the size
of the C int. Consequently, on some systems Icon integers are ints,
while on others, they are longs. In Version 6, there were two kinds
of integers on 16-bit computers: those that fit into a C int and
those that required a C long. Blocks were allocated for the latter.
Now all Icon integers are kept in "words" which are at least 32
bits long. This is handled by a typedef to either int or long,
depending on the size of an int. Descriptors are simply two words
and blocks are no longer allocated for "long integers". Thus,
the discussion and diagrams on pages 51 and 247 of the book no longer apply.
Csets [Section 5.2, p. 78]
When the implementation book was written, the size of a cset was computed
when it was created. This computation is fairly expensive, but few programs
ever use the sizes of csets. The size of a cset is now set to -1 when it
is created and the actual size is only computed and reset if it is needed.
Pointers in Blocks [Chapter 6 and Section 11.3.2, pp. 80-109, 199, 291]
In Version 6, all pointers in blocks to other blocks were contained in descriptors.
Now most of these are just pointers. The layout of blocks now is first all
non-pointer and nondescriptor data, then pointers, and finally descriptors.
The garbage collector uses two new tables, firstp
and ptrno
,
to locate and process pointers.
Serialized Structures
Blocks for structures are now serialized, which adds an additional word
for those types. The serial numbers are used in hashing and also appear
in string images of structures.
Pointers to Variables [Section 4.3, pp. 53-54, 200-203]
In order to handle pointers in place of descriptors within blocks during
garbage collection, a variable descriptor that points to a block now points
to the title of the block and the offset to the corresponding value is in
the d-word of the variable, rather than pointing to the value with the offset
being to the title.
Lists [Section 6.1, pp. 81-82]
Formerly all newly created lists contained space for a minimum of 8 elements.
Now, only empty lists contain space for extra elements.
Previously, list-element blocks that are allocated when a list is extended
as the result of a push()
or put()
contained space
for 8 elements. If large lists are created in this way, the total amount
of space overhead is excessive and the time to access elements increases
with the position. To avoid this, when a list-element block is allocated
for extending a list, it now contains one-half the total number of elements
in the entire list prior to extension or 8, which ever is larger. However,
if there is not enough memory to allocate a list-element block of the desired
size, the size is reduced to fit into available memory. While this may waste
space in the last list-element block, the total amount of space for the
list is always less than for the previous allocation strategy, and the time
to access list elements toward the end of the list increases less rapidly
than formerly.
2.8 Run-Time Errors [Section D.2.8, p. 288]
Most run-time errors now can be converted to expression failure under the
control of a keyword. Consequently, the function runerr()
may
now return, whereas it formerly did not. To avoid accidentally continuing
unexpectedly in code that formerly followed calls to runerr()
,
such calls have been replaced by instances of the macro RunErr()
which is defined as
#define RunErr(i,dp) {\
runerr(i,dp);\
Fail;\
}
In those cases where a run-time error cannot be converted to failure, a
new function, fatalerr()
, is used.
To distinguish those cases where a run-time error cannot be attributed to
a specific offending value, the negative of the error number is used in
calls to runerr()
and fatalerr()
. The actual error
number reported is always positive.
2.9 Function Declaration and Definition [Section D.2, pp. 280-284, 289]
There are now two additional forms of function declaration that specify
arguments are not to be dereferenced prior to function invocation:
FncNDcl(name,n) /* n arguments */
FncNDclV(name) /* variable number of arguments */
In addition, the function-definition macro FncDef()
now has
a second argument that specifies the number of arguments for the function:
FncDef(name,n)
.
2.10 Reorganization of the Translator and Linker [Section D.1, p. 279]
At the time the implementation book was written, there were four components
to the Icon source code: a command processor (icont
), a translator
(itran
), a linker (ilink
), and an executor (iconx
).
The first three of these have been combined into a single component so that
the command processor (icont
) now calls the translator and
linker as functions.
2.11 Changes for the ANSI C Draft Standard [Section 4.4, p. 57]
Several changes have been made to conform to the ANSI C draft standard.
For example, there is now a typedef for pointer that is void
*
for C compilers supporting the draft standard but char *
for those that do not.
2.12 Cosmetic Changes
Most of the cosmetic changes that have been made to the source code of Icon
since the implementation book was written and are obvious when comparing
source code in the book to the current source code.
Two names have changed: MkInt
is now MakeInt
and
mkreal
is now makereal
.
One change that should be specifically noted is the introduction of
typedef struct descrip *dptr;
3. Corrections to the Implementation Book
The following errors appear in the first printing of the implementation
book. The line numbers given include captions and titles, but not running
heads. Negative line numbers are relative to the bottom of the page.
Page 19, line 7: Replace "string of length i
positioned"
by "string of length i
with s1
positioned"
Page 20, line 19, last word: Replace "second" by "rightmost".
Pages 53-55: The label s
next to the variable descriptor should
instead be next to the descriptor it points to (three places).
Page 63, line 4: Replace "NULL
" with "CvtFail
".
Page 71, line 1: Replace "s1[1:2]
" by "s1[1:3]"
.
Page 72, line 10: Replace "Sec. 2.2" by "Sec. 4.3.2".
Pages 72-73: The label s
next to the variable descriptor should
instead be next to the descriptor it points to (two places).
Page 74, line 14: Replace "information must information must"
by "information must".
Page 77, line 14: Replace "C code cset" by C code for cset".
Page 78. The indentation in the code is inconsistent and there are unnecessary
braces. Since cset sizes are no longer computed when csets are created,
the latter portion of the code no longer exists. The present code is:
/*
* Allocate a new cset and then copy each cset word from Arg1
* into the new cset words, complementing each bit.
*/
bp = (union block *)alccset();
for (i = 0; i < CsetSize; i++)
bp->cset.bits[i] = ~cs[i];
Arg0.dword = D Cset;
BlkLoc(Arg0) = bp;
Return;
}
Note that alccset
no longer has the argument that previously
provided the cset size.
Page 87, line -2: "put
" should be "pop
".
Page 106, line -8: Replace "or operation, and the result is cast as
an integer" by "or operation." .
Page 140, line -11 (in the second sentence in the paragraph that starts
"Generator Frames"): "begins with generator frame"
should be "begins with a generator frame".
Page 141, line -10: There should be a pnull
instruction between
local i
and int 1
.
Page 151, line 4: There should be a pnull
instruction between
local i
and int 1
.
Page 155, line 11: "if expr produces a result" should be
"if expr does not produce a result".
Page 179, lines 13-14: The transmitted value is shown as a descriptor. It
should be a single word that points to a descriptor. In addition, co-expression
blocks now contain fields for an identifying number and for a pointer to
an activator stack as described in Section 2.2.
Page 198, near end of first paragraph: Replace the sentence that starts
"The back chain is established" by "The back chain is established
by setting the title word of the block to point to the descriptor, and setting
the v-word of the descriptor to the previous contents of the title word."
.
Page 201, line 12: There is an extraneous "y
" at
the left of the last descriptor in the diagram.
Page 210, line 10: There is another extraneous "y
"
as on page 201.
Page 215, lines 24-25 (at the beginning of the second paragraph of text):
It is stated that the context for evaluation is switched to the co-expression
for &main
, so that a larger C stack is available. This
is not done in the current implementation; it got lost somewhere in an update.
(It should be done.)
Page 226, line 11 (third line after list of routines): Replace "NULL
"
by "CvtFail
".
Page 228, line -6: Replace "NULL
" by "CvtFail
".
Page 256: The remarks about page 179 apply here also.
Page 276, last sentence in first paragraph: Replace the sentence that begins
"The instruction escan
" by "The instruction
bscan
saves the current values of &subject
and &pos
and establishes their new values before expr2
is evaluated. The instruction escan
restores their values."
.
Page 290, line -7: Replace "int title
" by "word
title
"
Page 291, line 2: Replace "int title
" by "word
title
"
Page 291, line 3: Replace "int blksize
" by "word
blksize
"
page 284, line 20: Replace "type code" by "d-word".
Page 294, line 18: Replace "iconx/rt.h
" by "h/rt.h
and iconx/gc.h
".
Pages 296-297: Replace all occurrences of "NULL
"
by "CvtFail
".
Page 327, line -1: Replace "Yyngve" by "Yngve".
Acknowledgements
Ken Walker made the changes to the maintenance of scanning environments
and provided the documentation included here. Bill Mitchell made the changes
to the handling of co-expression activation and provided the documentation
included here. Cheyenne Wills and Kelvin Nilsen made most of the changes
to support "mixed sizes". Bob Alexander suggested deferring the
computation of cset sizes until they are needed. Dave Gudeman provided the
new code for the handling of variable-sized list-element blocks. Sandy Miller
changed descriptors in blocks to pointers and made the associated changes
to garbage collection. Gregg Townsend implemented dynamic hashing and provided
the documentation included here.
Many other persons, too numerous to list here, provided changes to the source
code related to porting it to new computers.
Bob Alexander, Rick Fonorow, Dave Hanson, Robert Henry, and Janalee O'Bagy
found the errors in the implementation book that are listed in Section 3.
References
1. R. E. Griswold and M. T. Griswold,
The Icon Programming
Language, Prentice-Hall, Inc., Englewood Cliffs, NJ, second edition,
1990.
2. R. E. Griswold,Version 8 of Icon, The Univ. of Arizona Tech. Rep.
90-1, 1990.
3. R. E. Griswold and M. T. Griswold,
The Implementation
of the Icon Programming Language, Princeton University Press, 1986.
Icon home page