Programming Corner from Icon Newsletter 14
January 17, 1984; Icon Version 5
Answer to a Previous Query
In Newsletter 13, the expression
suspend |writes("\^[Y",col[?80],row[?24],s)
was used to generate a sequence of displays of s
at randomly
selected positions on a terminal screen. The question that was posed was
whether the repeated evaluation in the expression could be placed at any
place other than in front of writes
.
In fact, the repeated evaluation can be placed anywhere in the expression
as long as both random selection operations are evaluated subsequently to
the evaluation of the repeated alternation. One possibility is
suspend writes(|"\^[Y",col[?80],row[?24],s)
When the argument of the suspend
expression is resumed to produce
another result, the arguments of writes
are resumed from right
to left. None produce a result until |"\^[Y"
is resumed,
which produces "\^[Y"
again. The arguments to its
right are then evaluated again, giving new positions, after which writes
is called again to write s
at another position. The repeated
alternation in this case serves as an endless generator of a constant value
for an argument, serving as a kind of barrier for the left-to-right resumption.
Other possible locations for the repeated alternation are
suspend writes("\^[Y",|col[?80],row[?24],s)
suspend writes("\^[Y",col[|?80],row[?24],s)
suspend writes("\^[Y",col[?|80],row[?24],s)
The results are the same in all cases. The last expression is the most efficient.
Why?
Note that the following expression does not produce the desired effect:
suspend writes("\^[Y",col[?80],|row[?24],s)
Although a new row position is produced, the previous column position is
used again and s
is always written in the same column. The
barrier occurs too far to the right.
On the other hand,
|suspend writes("\^[Y",col[?80],row[?24],s)
does not produce the desired effect at all. In fact, it generates only one
display. Why?
Returning More than One Value from a Procedure
Persons have asked about ways to return multiple values from a procedure.
Icon argument transmission is strictly by-value and there is no way, per
se, to return multiple values from a procedure. One way around this problem
is to return a structure that contains several values as in
procedure p()
. . .
return [expr1,expr2,expr3, ... exprn]
end
In some cases, a record may be more appropriate than a list.
The values then may be obtained from the structure that is returned, as
in
a := p()
x := a[1]
y := a[2]
z := a[3]
. . .
Note that this approach allows a procedure not only to produce multiple
values, but also to produce an arbitrary number of values, possibly varying
from call to call. In this case, a more general method must be used to access
the values that are returned.
Since structures in Icon are pointers to data objects, something akin to
call-by-reference can be obtained by passing a structure as an argument
to a procedure, as in
a := list(n)
p(a)
with
procedure p(a)
. . .
a[1] := expr1
a[2] := expr2
a[3] := expr3
. . .
a[n] := exprn
return
end
When p
returns, the values are in the list a
and
can be accessed as before.
An entirely different approach to returning multiple values is to generate
them in sequence:
procedure p()
. . .
suspend expr1 | expr2 | expr3 | ... | exprn
end
This method fits naturally into Icon's expression evaluation mechanism.
The values can be put into a list, if desired, by
a := []
every put(a,p())
The following expression does the same thing:
every put(a := [],p())
Note that the number of values that p
produces need not be
known. On the other hand, there is a problem if the generated values are
to be assigned to separate variables. This can be done by making explicit
assignments as above, using
x := a[1]
y := a[2]
z := a[3]
. . .
but it is tempting to avoid the list and to use iteration, as in
every (x | y | z | ...) := p()
This does not work as intended, since evaluation is left-to-right and resumption
is right-to-left. Thus, the alternation expression first produces x
and p()
is called, producing the value of expr1
,
which is assigned to x
. However, p()
is resumed
next, producing the value of expr2
, which is also assigned
to x
, and so on. (What values are assigned to y
and z
?)
The lack of "parallel" evaluation in Icon can be circumvented
by using a co-expression:
e := create p()
every (x | y | z | ...) := @e
The resumption of a co-expression activation does not produce another value.
Consequently, the first value is assigned to x
, the alternation
is resumed, y
is produced, and the second value produced by
p()
is assigned to it by the second activation of e
,
and so on.
While this approach is a bit oblique for this simple situation, it illustrates
the control that co-expressions provide over the production of results.
Initial Assigned Values in Tables
When a table is created, as in the expression
t := table(x)
the value of x
is the initial assigned value for new entries
in t
. This initial assigned value is used for references to
entries that are not already in the table. For example, if a count is being
kept of strings, the initial assigned value might be 0
, as
in
count := table(0)
Now suppose the following expression is evaluated:
count[s] +:= 1
This expression produces the same result as
count[s] := count[s] + 1
Suppose s
is not in the table. Then the reference to
count[s]
in the addition operation produces 0
, the initial assigned
value. (The augmented assignment operation is preferable to addition and
assignment, since the latter expression requires that s be looked up twice
in the table.)
But suppose that the initial assigned value is not known. How can it be
determined? The problem is that there is not, in general, any way of knowing
what entries there are in a table, short of converting it to a list by sorting
it.
One approach is to pick some unlikely entry value (perhaps a value that
is not a string, assuming all the entries in the table are strings). This
may work in practice, but for an arbitrary table with arbitrary entries,
what value can be guaranteed not to be in the table?
The answer is easy -- use an entry value that cannot have existed before
the test. Any newly created structure will do, but the following expression
is particularly simple:
x := t[[]]
Since a list creation expression creates a new structure, the entry value
[]
cannot be in t
and the expression above assigns
the initial entry value of t
to x
-- absolutely
guaranteed!
Matching Expressions
Matching expressions, which are analogous to patterns in SNOBOL4, provide
a way to elevate string scanning in Icon to a higher conceptual level. By
definition, a matching expression is an expression that may change &pos
and always returns the substring of &subject
between the
new and old values of &pos
. A matching expression that
fails also must restore &pos
to its old value. For example,
tab(i)
and move(i)
are built-in matching expressions,
but &pos := i
is not a matching expression, since it does
not return the substring of &subject between the former and new values
of &pos
.
Suppose that expr1
and expr2
are matching expressions.
Then which of the following also are matching expressions?
expr1 & expr2
expr1 | expr2
expr1 || expr2
x := expr1
if expr then expr1 else expr2
while expr1
every expr1
A matching procedure is a procedure whose call is a matching expression.
An example is
procedure Arb()
local pos, s
pos := &pos
every s := &subject[pos:&pos := &pos to *&subject + 1] do
suspend s
&pos := pos
end
which corresponds to the SNOBOL4 pattern ARB
. This procedure
can be written considerably more compactly; what is the most concise possible
form?
The requirements that an expression must meet in order to be a matching
expression can be modified to produce a variety of different classes of
"patterns". One possibility is suggested by John Polstra:
A useful extension to the idea of the matching procedure is
what I call the transforming procedure. A transforming procedure is just
like a matching procedure, except that it returns a variable which is a
substring of &subject
. Assignment to a call of a transforming
procedure can then be used for modifying the subject, as could assignment
to tab(i)
and move(i)
before Version 5 of Icon.
A useful general transforming procedure is the following:
procedure xform(p)
suspend &subject[.&pos:p() & &pos]
end
If p
is a parameterless matching procedure, then p()
is the corresponding matching expression and xform(p)
is the
corresponding transforming expression. Using co-expressions, a more general
version (allowing parameters) can be written.
The remark "before Version 5" refers to Version 4 of Icon, which
is now obsolete.
Problems with Dereferencing
The arguments of functions are not dereferenced until all arguments are
evaluated. Consequently, the expression
write(s,s := "a")
writes aa
, regardless of the value that s had prior to the
evaluation of this expression.
Such expressions generally are considered to be bad style and they rarely
occur in programs, at least in such an obvious form. More subtle and perplexing
problems may occur because subscripting expressions in Icon are variables
and are polymorphic. Consider
x[y] := z
Here x
may be a string, a list, a table, or even a record.
Now consider the expressions
x := "hello world"
x[3] := (x := "abc")
What happens when the value of x
changes between the time it
is subscripted and the time the assignment to the subscripted variable is
made? What about
x := "hello world"
x[3] := (x := "ab")
Here the subscript of x
is in range when it appeared on the
left side of the assignment but is out of range when the assignment is made?
What about
x := "hello world"
x[3] := (x := [1,2,3,4])
where the type of the value of x
is changed? Or even
x := "hello world"
x[3] := (x := 397)
where the type is changed, but to one that is coercible to the previous
type?
Granted that such expressions are unlikely to occur in "real"
programs, they must be accounted for by the semantics of Icon and implementations
must handle them properly.
Syntactic Pitfalls
The Icon translator automatically inserts semicolons between expressions
on adjacent lines in cases where the token at the end of the first line
is legitimate as the end of an expression (an "ender") and the
token at the beginning of the second line is legitimate as the beginning
of an expression (a "beginner"). Thus, semicolon insertion makes
it possible to write programs without having to put semicolons at the ends
of expressions.
All prefix operators are beginners. Since many prefix operators also are
legal in infix operators, semicolon insertion some-times can produce unexpected
results. For example,
x
| y
is translated as if it had been written
x; | y
not as
x | y
(Identifiers are both beginners and enders.) Note that
x; | y
is syntactically correct, if a bit unlikely. It is advisable to guard against
unexpected translations of infix operations by putting the infix operator
at the end of the first line, not at the beginning of the second. Thus,
x |
y
is translated as
x | y
since no operator is an ender and hence a semicolon is not inserted.
The semicolon insertion mechanism does not take into account situations
in which a line ends with an ender and the next line begins with a beginner,
but in which the lines do not form complete expressions. Thus,
write(i
+ j)
is translated as if it were
write(i; +j)
and is diagnosed as a syntactic error, although
i
+ j
is translated as if it were
i; + j
which is syntactically correct.
In the case of expressions with optional arguments, semicolon insertion
may produce mysterious effects if care is not taken. For example,
return
x
is translated as if it were
return;
x
This occurs because the argument of return
is optional, making
it an ender. When this code segment is evaluated, the null value is returned
and x
is never evaluated. If the problem is not recognized,
it appears that x
always has the null value, even though it
may obviously have a different value.
As another example, consider the following code segment that was produced
by a SNOBOL4 programmer who is used to having an omitted right argument
of assignment default to the empty string:
s[1] :=
return s
On the face of it, this is erroneous in Icon. However, since :=
is not an ender, a semicolon is not inserted and this code segment is translated
as if it were
s[1] := return s
Although this is a strange expression, it is both syntactically and semantically
correct. When the right argument of the assignment is evaluated, a return
occurs and the assignment is never completed. Unless the translator's interpretation
of this expression is recognized, the effect during program execution may
appear mysterious.
Fortunately, semicolon insertion works well in practice. Observation of
the rules of program layout given above avoids all these problems.
Trivia Corner
What is the longest string of distinct prefix operators which, when applied
to a value, might compute a meaningful result? (You may assume any value
that you wish.) What if the prefix operators need not be distinct?
Icon home page