Programming Corner from Icon Newsletter 29
February 14, 1989; Icon Version 7
Table Keys
If T
is a table, !T
generates the values in the
elements of T
, not the keys (entry values). This probably was
a design mistake, since it's possible to get the values from the keys, but
not the other way around. So ...
Problem: Write a procedure key(T)
that generates the
keys from table T
.
Comment: The only way to do this is to convert T
to
a list.
Approach 1: Create a list of key/value lists:
procedure key(T)
L := sort(T)
every pair := !L do
suspend pair[1]
end
Whenever a suspend
appears in the do
clause of
an every expression, look to see if both are needed, since suspend
also forces its argument to generate all its results. Thus,
procedure key(T)
L := sort(T)
suspend (!L)[1]
end
Note the parentheses: !L[1]
groups as !(L[1])
,
not what you want.
One more refinement: get rid of the identifier L
and
just put the call of sort in the suspend expression:
procedure key(T)
suspend (!sort(T))[1]
end
Approach 2: Create a list of alternate key/value pairs:
procedure key(T)
L := sort(T,3)
while x := get(L) do { # get key
suspend x
get(L) # discard value
}
end
Observation: The while/suspend
construction suggests
a more compact technique as in every/suspend
. Use the "make
a generator from a non-generator" paradigm:
procedure key(T)
L := sort(T,3)
every x := |get(L) do { # get key
suspend x
get(L) # discard value
}
end
Now the every/suspend
can be collapsed as usual:
procedure key(T)
L := sort(T,3)
suspend |get(L) do
get(L)
end
Note the use of the optional do
clause for suspend
.
This could also be written more compactly (but opaquely) as
procedure key(T)
L := sort(T,3)
suspend |1(get(L),get(L))
end
However, it's not possible to get rid of the identifier L
,
since the list must be an argument of two separate function calls.
Which approach is best? At the bottom line, both are obscure and make hard
reading. The first approach produces the most compact code and with no auxillary
identifier. However, the list of lists produced by the default option for
sorting takes up a lot of room, which may be a practical consideration.
Note also that both of these methods do something more than generate the
key -- they generate them in sorted order, which is different from !T
.
Unique Values in a List
Here's another problem: Write a procedure uniquelem(L)
that returns a list containing only the distinct values of L
-- that is, filtering out duplicates.
Approach 1: If order doesn't matter, there's a very simple way:
procedure uniqelem(L)
return sort(set(L))
end
This takes advantage of the fact that distinct values can be members of
a set only once.
This procedure, in fact, produces the elements in sorted order (since the
only way to convert a set to a list is to sort it).
Although simple, this method requires both constructing a set that is discarded
and also sorting it (perhaps unnecessarily).
Approach 2: If you want to preserve the order of the elements of
the list, you have to do something like this:
procedure uniqelem(L)
local L1, x
L1 := [ ]
every x := !L do
if x === !L1 then next else put(L1,x)
return L1
end
Of course, this can be very slow if L
is large and has mostly
distinct elements.
Note that it won't do to rephrase the test and addition of an element as
if x ~=== !L1 then put(L1,x)
since the test succeeds if x
is not the same as (say) the first
element of L
1 but is the same as (say) the second. Be careful
of inverting logic in expressions that contain generators.
Also, never do something like
while x := get(L) do ...
since this destroys the list L
, a pointer to which is passed
in as an argument and "belongs" to someone else, who may not appreciate
having the list trashed.
Can you think of a better method?
Data Backtracking
We were recently asked the following question: Why are
while move(1) do { ... }
and
every |move(1) { ... }
different?
The difference has to do with data backtracking. move(1)
increments
&pos
, but it suspends (like find(s)
). It does
this not so that it can produce another result if it's resumed (there is
only one way to increment &pos
by 1), but instead to restore
&pos
to its previous position (data backtracking).
On other hand, the control clause of while-do
is bounded. In
a bounded expression, suspended generators are discarded, once a result
is produced. Thus, in
while move(1) do { ... }
the change to &pos
by move(1)
is irreversible,
since move(1)
won't be resumed.
In every-do
, on the other hand, the control clause is not bounded
-- in fact, the control clause must be able to generate a sequence of results
for every-do
to have the effect it does. Thus, in
every |move(1) do { ... }
move(1)
is resumed after the do
clause is evaluated.
It restores &pos
, and then repeated alternation causes
move(1)
to be evaluated again, with the same value of &pos
as before -- an endless loop.
The problem in understanding this probably is not in what while-do
does -- most folks find that to be what they expect. It's in every-do
that the problem with understanding arises. A simple prescription
that will do for most cases is "don't use every-do
in
string scanning". That may keep you out of trouble, but a deeper understanding
of expression evaluation in Icon will show you both why you usually don't
want to use every-do
in string scanning and also why you sometimes
might want to use it.
Quiz: Assuming s := "Hello world!"
, what does
the following expression write?
every write(s ? move(1 to 10))
Icon home page