Programming Corner from Icon Newsletter 15
June 7, 1984; Version 5
Assignment to Subscripted Strings
In the last Newsletter, the semantics of expressions such as
x[i:j] := (x := expr)
were posed, where x
is string-valued when the subscripting
expression is evaluated, but in which expr
changes the value
of x
before the (left) assignment is made to replace the subscripted
string. (Expressions such as x[i]
and x[i+:j]
are just special cases of x[i:j]
.)
While expressions like this are uncommon (and generally are considered to
be in poor style), they are legal and therefore must be well-defined and
handled properly in the implementation. (It should be no surprise that all
the possibilities were not considered in the initial design and that there
were several bugs related to these matters in the early versions of the
implementation.)
This is a case where efficiency and implementation considerations influenced
language design. The problem is that the transator cannot, in general, determine
whether an expression such as x[i:j]
will have a value assigned
to it. Even if x[i:j]
is the target of an assignment operation,
the assignment never may be made because of failure in evaluation elsewhere
in the expression. In the case of
return x[i:j]
the translator has even less information, since the use of the returned
expression depends on the context in which the function containing this
return is called. For these reasons, the translator treats all expressions
usch as x[i:j]
in the same way. (There is the potential here
for an implementation optimization, since there are many situations in which
the translator could determine that a subscripting expression is not the
target of an assignment.) When an expression like x[i:j]
is
evaluated, if the value of x[i:j]
is a string, a trapped variable
is produced. A trapped variable is a special kind of variable that points
to a small block of data which contains enough information to assign a new
value to x
if an assignment is made to x[i:j]
.
This information consists of the variable x
and the location
of the substring in x
. Every string subscripting expression
produces a trapped variable. Although the block of data that is created
usually is used only transiently, it causes a certain amount of storage
throughput.
Now consider what happens if the value of x
is changed before
an assignment is made to x[i:j]
. Since the value of x
can be changed to anything, the assignment cannot be made blindly -- the
position of the replaced string might be out of range, even if the new value
of x
is a string. Consequently, the type of x
is checked when assignment is about to be made to x[i:j]
. If
the value of x
is a string, its length is checked to be sure
the sub-string specified by i:j
is still in range. If it is,
the assignment is made, even if the value of x
is different
from what it was when x[i:j]
was evaluated. Thus, in
x := "hello world"
x[3] := (x := "abc")
the value of x
becomes "ababc"
. On the
other hand, if the value of x
is a string, but it is too short,
run-time error 205 (value out of range) occurs, as in
x := "hello world"
x[3] := (x := "ab")
One might well argue that assignment to x[i:j]
should be an
error if the value of x
has changed, even if the substring
is still in range. After all, such a situation seems more likely to be an
error than an intentional computation. Here, however, there is an efficiency
consideration. In order to be able to detect that the value of x
has changed, it would be necessary to save the value of x
as
well as the variable x
in the trapped variable.
Furthermore, this would have to be done for every evaluation of a string
subscripting expression. The result would be substantially higher storage
throughput just to treat a pathological case more elegantly.
From a language design viewpoint, a somewhat more radical alternative would
be to bind the value of x
to x
at the time x[i:j]
is evaluated, so that
x := "hello world"
x[3] := (x := "abc")
would change the value of x
to "heabclo world"
.
This solution also would require saving the value of x
in the
trapped variable -- additional overhead that again does not seem justified
for such a pathological situation.
Returning to the situation as it actually is handled, given that any string
value for x
that is long enough is acceptable, the next question
is what to do if the value of x
is not a string when the assignment
is made to x[i:j]
? In consonance with Icon's general philosophy
of converting types automatically whenever possible, if the value of x
can be converted to a string, it is. Thus,
x := "hello world"
x[3] := (x := 397)
changes the value of x
to "39397"
. Weird,
maybe, but consistent with the result of concatenating two integers -- which
is, after all, what this expression amounts to.
If the value of x
cannot be converted to a string, a run-time
error (103) occurs, as in
x := "hello world"
x[3] := (x := [1,2,3,4])
Note that these problems are essentially problems of dereferencing -- when
and how the value of x
is determined when assignment is made
to x[i:j]
. There are a number of other situations in Icon in
which dereferencing is a problem. One is string scanning, which will be
discussed in the next Newsletter, along with more material on matching expressions.
Trivia Corner
In the last Newsletter, the following problem was posed:
What is the longest string of distinct prefix operators which,
when applied to a value, might compute a meaningful result? (You may assume
any value that you wish.) What if the prefix operators need not be distinct?
For distinct prefix operators, one possibility is
|+=-?*~\@^!x
It might go like this: Let x
be a list of co-expressions. Generate
one, refresh and activate it, being sure the result is non-null. Assuming
the result is a cset, use the size of its complement to provide a range
for a randomly selected integer. Negate this integer. Match the equivalent
string in &subject
and convert the result back to an integer.
Repeat the whole process (whatever that means). Enough!
Strictly speaking, repeated alternation is a control structure, not an operator,
but it is denoted with operator syntax. Note that the prefix operators .
and /
are not included in the expression above. They can be
added, but not in a "meaningful" way.
If the prefix operators do not have to be distinct, there is no specific
limit on the number that can occur. Consider, for example,
=== ... ==x
The expression =x
matches x
in &subject
,
==x
matches two consecutive occurrences of x
,
and so on.
What about expressions such as
??? ... ??x
Pitfalls
Steve Wampler contributes the following program, in which the procedure
tally
echos its argument and tallies it in the table count.
In the main procedure, empty input lines are converted into the more prominent
marker <empty line>
. Or are they? What does this program
actually do? What does it take to fix the problem?
global count
procedure main()
count := table(0)
while line := read() do
tally(("" ~== line) | "<empty line>")
. . .
end
procedure tally(s)
count[s] +:= 1
write(s)
end
Icon home page