Programming Corner from Icon Newsletter 30

June 4, 1989 ; Icon Version 7

Pointer Semantics, Graphs, and Sets

Icon uses pointer semantics for structure values. What this means is that a list, for example, is a pointer to a collection of the values that the list contains. A pointer, which is a memory address in implementation terms, is small and fixed in size, even though the collection of values in the list may be arbitrarily large.

An assignment in Icon, such as

L := list(1000)

merely copies the pointer produced by list(1000) into L; the 1,000 values are irrelevant as far as the assignment is concerned -- they are not even referenced.

The use of pointer semantics has several consequences. Some are good (efficiency in handling structure values) and some are (at least potentially) bad, such as unintentional sharing of structure values via pointers. These issues are discussed in the Icon language book.

What may not be obvious is how pointers to structures in an Icon program can be used to reflect, in a natural way, structures that exist in the problem domain. Consider, for example, a simple directed graph:

In order to perform computations on such a structure, it is necessary to represent them in some way in a program. One way to do this is to represent each node in the graph by a set. Then the values in the set are pointers: arcs to the nodes to which the node points. For example, the program structures for the graph shown above are:

A := set()
B := set()
C := set()
insert(A,B)
insert(A,C)
insert(B,B)

The important conceptual point is that a set is a pointer to a collection of pointers to other sets. A slightly different visualization of the structures in the programming domain illustrates this:

Thus, an arc is represented by a (pointer to) a set and a node is represented by the values in the set.

The ease of manipulating this program representation of graphs is illustrated by procedures to compute the transitive closure of a node (the node and all nodes reachable from it by a succesion of arcs):

procedure closure(n)
   S := set()
   insert(S,n)          # start with the node itself
   return more(S,n)
end

procedure more(S,n)
   every n1 := !n do    # process reachable nodes
      if not member(S,n1) then {   # skip ones already in
         insert(S,n1)   # add new node
         more(S,n1)     # recurse
         }
   return S
end

Note that a set is used to keep track of nodes already accumulated.

There are several problems that arise in computations on graphs that may require a somewhat more sophisticated representation of structures. For example, if the values are associated with arcs (or if there may be more than one arc between two nodes), the set-of-sets approach is inadequate. In such cases, a record type can be used for arcs, as in

record arc(value,node)

where the value field contains the value associated with the arc and the node field contains the set to which the arc points. Suppose, for example, that the arcs in the example above are weighted:

Then the graph can be represented in the program as follows:

insert(A,arc(2.0,B))
insert(A,arc(1.5,C))
insert(B,arc(3.7,B))

Exercise: Modify the procedure closure given above to handle this representation of directed graphs.

Two-Way Tables

Programs that manipulate graphs generally need to be able to read a representation of a graph in string form and write results in string form. For example, the (unweighted) form of the graph above might be represented as

A->B
A->C
B->B

Exercise: Write a procedure to read this representation of unweighted graphs and build the corresponding program structures.

One problem is associating labels for the nodes with corresponding program structures. The natural solution in Icon is to use a table in which the keys (entry values) are the labels and the assigned values are the corresponding sets. Written out explicitly for the graph above, this might be:

Node := table()
Node["A"] := A
Node["B"] := B
Node["C"] := C

Consequently, Node["A"] produces the node (set) labelled A. Such a table might be used, for example, in constructing a graph from its string representation as posed in the exercise above.

On the other hand, the converse may be needed. For example, in writing out the results of a computation on a graph (such as the transitive closure of a node), the labels associated with nodes may be needed.

Since any kind of value can be used as a key into a table, a table like the one above, with the keys and entry values reversed, can be used:

Label := table()
Label[A] := "A"
Label[B] := "B"
Label[C] := "C"

It is not necessary to have two tables, however. Since the keys in a table need not all be of the same type, the same table can be keyed with both the labels and the nodes (sets):

Graph := table()
Graph["A"] := A
Graph["B"] := B
Graph["C"] := C
Graph[A] := "A"
Graph[B] := "B"
Graph[C] := "C"

Such a ''two-way'' table keeps all the information needed to associate labels with nodes and vice versa in one structure. Subscript it with a label to get the corresponding node and subscript with a node to get the corresponding label.

Icon home page