Programming Corner from Icon Newsletter 21


June 10, 1986 ; Icon Version 6

Solutions to Previous Problems

1. If the default value for a table is an empty list, as in
t := table([ ])
unexpected things may happen if operations are performed on this list. It is important to understand that there is only one default value associated with a table. For example, if a program changes the contents of the default value, as in
while word := getword() do put(t[word],lineno)
no new elements are added to t; instead every reference to t[word] produces the default value, to which the value of lineno is added. There is never any assignment to t[word].

On the other hand, in
while word := getword() do
   t[word] |||:= [lineno]
the list concatenation operation creates a new list and assigns it to t[word] every time the do clause is evaluated. The first time this happens for a particular value of word, the empty list default value is concatenated with a list containing the value of lineno and this new list is assigned. The default value itself is never modified.

2. The upper bound on the number of results that find(s1,s2) can produce is max(*s2 - *s1 + 1,0).

3. The difference between
return x
and
suspend x
fail
is simply an extra resumption in the second case if another result is needed in the context in which the corresponding procedure is called. This extra resumption is detectable in trace output, but otherwise does not affect program behavior. However, if the identifier x is replaced by an arbitrary expression, the two cases may produce very different results. Even if this expression itself only produces a single result, it is resumed in the second case, and this resumption may produce side effects. For example,
return tab(i)
and
suspend tab(i)
fail
may behave very differently. In the first case, assuming tab(i) itself succeeds, &pos is left set to i. In the second case, if the call is resumed, tab(i) is resumed and it restores &pos to its former value. The latter form generally is used in matching procedures to assure that scanning state variables are restored to conform to the matching protocol. In such cases, return tab(i) would be an error. Another way to think about it is that return limits its argument to at most one result. Thus,
return expr
and
suspend expr \ 1
fail
are equivalent (except for the extra resumption).

4. The outcome of a looping expression such as
while expr1 do expr2
need not be failure. If a break expression in either expr1 or expr2 is evaluated, the outcome of the looping expression is the outcome of the argument of the break expression.

It is common to omit the argument of a break expression. In this case, the argument defaults to a null value. Consequently, if the break expression in
while expr1 do {
   . . .
   break
   . . .
   }
is evaluated, the outcome of the looping expression is the null value. In fact, if this effect is not wanted,
break &fail
can be used to assure the outcome of the looping expression is failure.

However, the argument of a break expression can be a generator. For example, if
break 1 to 5
is evaluated in a looping expression, the result sequence for the looping expression is {1, 2, 3, 4, 5}.

5. Icon programmers usually have no interest in the intermediate "ucode" that is produced by the translator to serve as input to the interpreter for Icon's virtual machine. However, getting into the internals of the implementation at this level can give insight into what goes on when an Icon program is actually executed. Hence the posed problem of finding the shortest possible Icon program whose translation contains at least one instance of every different ucode instruction.

In Version 5.10 of Icon, there are 80 different executable ucode instructions. (There also are ucode declarations, which are not of interest here.) Some of these instructions are simple stack-manipulation operations. The bulk correspond to Icon's operators -- there is a different ucode instruction for every different source-language operator, except for the augmented-assignment operators.

As a start, therefore, any program that produces every different ucode instruction must have at least one instance of every operator. Beyond that, there are ucode instructions related to control structures and various specialized constructions. In the absence of a list of all the ucode instructions, they can be determined empirically. Since ucode is printable text, experimentation is easy. For example,
icont -c inst.icn
produces the ucode file inst.u1. Getting at the ucode instruction set this way is a good exercise it illuminates aspects of Icon that few programmers ever think about.

Once all the ucode instructions are determined, the problem becomes one of finding a minimal program that produces all of them. Some things about the syntax of Icon programs can be learned by trying this.

Here is the shortest known program that produces at least one instance of every different ucode instruction under Version 5.10 of Icon (Version 6.0 has a slightly different ucode instruction set):
procedure y();initial|0.0[suspend(,)to"":create+-
?~=!@^*./\x/x*x%x^x<x<=x=x>=x>x~=x++x--x**x|| x<<x<<=x==x>>=x>>x~==x|||x+:=x<-x:=:x<-
>x~===x&x.x?:=x\&pos["]]-case[]of{1:return};end
The program actually consists of a single 182-character line to avoid increasing its size with newline characters. We have broken it into separate lines here to fit it on the page.

Of course, if this program is executed, it immediately terminates with a run-time error message, but that is not the issue here.

To compound this lunacy, other questions can be posed. What Icon program produces the smallest ucode file that contains at least one instance of every different ucode instruction? Does the program above do this? Is it possible to simultaneously minimize the sizes of the Icon program and its corresponding ucode file?

As an aside, the size of a ucode file in Version 5.10 depends on the name of the file in which its source code is contained. What is the shortest file name for a source-language program that icont will accept?

Icon home page