CSc 352: Graphs and Reachability

1. Basic Definitions

In mathematical terms, a structure such as this
is called a graph. A graph consists of a set of vertices (the circles in the figure above) and a set of edges (the lines joining the vertices in the figure above). In this case, the edges aren't directed, i.e., ``an edge from X to Y'' is the same as ``an edge from Y to X''. If there is an edge between two vertices X and Y, then X and Y are said to be neighbors.

The reachability problem for a graph is the following problem:

Given two vertices a and b in a graph, is it possible to go from vertex a to vertex b following only edges in the graph? In other words, is there a sequence of vertices
v0, v1, ..., vn
such that:
  1. v0 = a;
  2. vn = b; and
  3. for each i (0 <= i < n), there is an edge between vertices vi and vi+1.

2. Reachability: An Algorithm

We can specify the ``is reachable from'' relation between vertices as follows:

This specifies a natural recursive algorithm for determining reachability. The problem with this algorithm, as stated, is that it can get into an infinite loop when processing a cycle. As an example, consider the figure shown above. Suppose we want to know whether vertex j is reachable from e. One possible execution of the algorithm above might be as follows:

is j reachable from e?
is j reachable from f? [using the edge between e and f]
is j reachable from g? [using the edge between f and g]
is j reachable from h? [using the edge between g and h]
is j reachable from f? [using the edge between h and f]
...

and the execution is stuck in a loop.

We can fix the problem by marking each vertex as we visit it, and making sure that we never visit a marked vertex a second time. To determine whether a vertex B is reachable from a vertex A, the algorithm now is:

  1. mark all vertices ``unvisited'';
  2. return Reachable(A, B).

where the function Reachable() is defined as follows:

function Reachable(vertex x, vertex y)
  1. if x == y return TRUE;
  2. else for each unvisited neighbor z of x do:
    1. mark z ``visited'';
    2. if Reachable(z, y) return TRUE.
end /* Reachable */

Note that the step ``mark all vertices unvisited'' is done outside the function Reachable(), and has to be carried out for each query processed.

3. Data Structure

The algorithm given above suggests that we want a data structure that has the following: