"Hello world." Description.
Program.
Summing the elements of an array.
Description. Program.
Quadrature. Description.
Iterative program.
Recursive program.
Parallel recursive program.
Quicksort. Description.
Program.
Matrix multiplication. Description.
Sequential program.
Using co statements.
Using processes.
Distributed matrix multiplication.
Description.
Program using message passing.
Finding patterns in a file.
Description.
Program.
Concurrent file search. Description.
Program.
Using virtual machines.
Critical section simulation using rendezvous.
Description.
Program.
Readers/writers simulation using "monitors" and semaphores.
Description.
Program.
Programming graphical user interfaces.
Description.
Program to draw a simple window.
Many of these applications are also discussed in the MPD book. Section references are given below.
The most general MPD program consists of multiple virtual machines, each of which contains resources and globals. A virtual machine is an address space. A resource is similar to a Java class: it declares a pattern for a class of objects, each of which is created dynamically. A resource exports operations; its body contains variables, procedures, and processes that implement the operations and/or perform background tasks. A global is a collection of variables and operations that are shared by all resources (and other globals) in the same virtual machine (address space).
Virtual machines are created dynamically and placed on physical machines, such as those in a local-area network. An MPD program that uses virtual machines is called a distributed program. Instances of resources (i.e., objects) are also created dynamically; they are placed on virtual machines. A global is created once when it is first imported.
Every MPD program contains at least one resource. The simplest MPD program consists of a single resource. One resource in every program is treated as the "main" resource. The main resource can have any name the programmer chooses; it does not have to be named main. The main resource is typically the last one that is compiled and must be the last one linked. It may not be imported by any other resource nor have any parameters.
Before an MPD program is executed, the MPD runtime system automatically creates one virtual machine and one instance of the main resource. The virtual machine is located on the physical machine that is used to start the MPD program. The instance of the main resource is placed in that virtual machine. The MPD program is then executed beginning with the first declaration or statement in the body of the main resource. When the program terminates, the final block in the main resource is executed, if there is one. (See the Matrix Multiplication section below for an example.)
Operations are the most important mechanism in MPD (and its predecessor SR). They give the language its power and uniqueness. An operation can be invoked synchronously by means of the call statement or asynchronously by means of the send statement. An operation can be implemented by a proc (procedure) or by an input statement. The four combinations of these mechanisms provide the following functionality:
| Invocation | Implementation | Effect |
| --- | --- | --- |
| call | proc (procedure) | Procedure call |
| send | proc (procedure) | Fork a new process |
| call | input statement | Rendezvous |
| send | input statement | Message passing |
MPD also provides several mechanisms that are abbreviations of common combinations of the above mechanisms. A procedure declaration is simply an operation plus a proc. The fork statement is an abbreviation for send. The receive statement is an abbreviation for an input statement that merely receives a message. A semaphore is an abbreviation for a parameterless operation. Finally, the V() and P() operations on semaphores are abbreviations for parameterless send and receive statements, respectively. The example programs in this tutorial illustrate all of these language mechanisms.
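The following sketch pulls these combinations together in one resource. It is an illustrative program of our own, not one of this tutorial's example files, and the names combos, task, request, and server are hypothetical:

# illustrative sketch of the four invocation/implementation combinations
resource combos()
  op task(int x)       # serviced by a proc below
  op request(int x)    # serviced by an input statement below

  proc task(x) {
    write("task received", x)
  }

  process server {
    for [k = 1 to 2] {   # service the two invocations made below
      in request(x) ->
        write("request received", x)
      ni
    }
  }

  call task(1)      # procedure call
  send task(2)      # forks a new process executing task
  call request(3)   # rendezvous with the input statement
  send request(4)   # message passing: send and continue
end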
The syntax and semantics of MPD are based on a few simple principles, which help make it easy to read or to write MPD programs.
The program below, and in file hello.mpd, illustrates a single-resource program:
# a simple MPD program with a single resource
# it prints a string followed by a newline character
resource hello()
  write("Hello world.")
end

The first two lines are comments. The resource is named hello. The body of the resource contains a single statement that calls the pre-defined write() function.
The name of the resource is followed by parentheses, hello(), because in general resources can have parameters. MPD employs the consistent rule that whenever a component can have parameters, its declaration contains parentheses, even if the parameter list is empty.
When the above program is executed, the MPD runtime system creates one instance of hello. The body of that instance writes the string "Hello world." (without quote marks) followed by a newline character; then the program terminates. In general write converts its arguments to strings, prints them on one line separated by blanks, and appends a newline.
MPD also provides C's printf statement, so the above program would have the exact same effect if the write statement were replaced by
printf("Hello world.\n")

Notice that the write and printf statements above are not followed by semicolons. This is because semicolons are for the most part optional in MPD. The exact rule is that they are required to separate declarations and statements, but they can always be replaced by newline characters. Hence, semicolons are optional except when there is more than one declaration or statement on the same line, as in:
writes("Hello "); write("world.")

The above line produces exactly the same output as the single write or printf statements above. Like write, writes converts its arguments to strings and prints them on one line; however, writes does not insert blanks between arguments and does not append a newline character.
The program below, and in file sum.mpd, illustrates command-line arguments, intermixed declarations and statements, dynamic arrays, and simple for loops.
# iterative program to sum elements of an array
# store in a file such as "sum.mpd"
# compile by executing "mpd sum.mpd"
# run by executing "a.out size"
resource sum()
  # declare and read first command-line argument
  int size; getarg(1, size);
  # declare and initialize array a
  int a[1:size];   # same as int a[size]
  for [i = 1 to size] {
    a[i] = i;
  }
  # calculate then print a[1] + ... + a[size]
  int total = 0;
  for [i = 1 to size] {
    total = total + a[i];   # same as total += a[i]
  }
  write("the sum of 1 ...", size, "is", total);
end

The comments at the start of the program indicate how to compile and execute it. The comments inside the "main" resource, sum(), describe the purpose of each component of the program. Here we have put a semicolon at the end of each declaration and statement, as in C; as described earlier, these could be replaced by newlines.
Declarations and statements can appear in any order in MPD. The only requirement is that an identifier be declared before it is used. By allowing declarations to follow statements, it is possible to have array sizes depend on input values. Above, the upper bound for array a is the value size of the first command-line argument. This value must be an integer, because the declared type of size is int.
The declaration of an array can specify both the lower and upper bounds for subscripts, as in a[1:size] above. The bounds can be values of any ordered type, such as int. The lower bound is optional; the default is 1, not 0 as in C. Hence, int a[1:size] is equivalent to int a[size].
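For example, the following declarations are all legal; this is a small illustration based on the rules just described:

int a[1:size]      # indices 1 through size; same as int a[size]
int b[0:size-1]    # C-style indexing, 0 through size-1
int c[5:9]         # five elements, indices 5 through 9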
The for statements in the above program are also different than those in C. In particular, an MPD for statement uses what is called a quantifier to specify a bound variable and the range of values over which to iterate. Above, the single quantifier [i = 1 to size] is used to iterate over all elements of array a. The bound variable, i above, is a new variable that is declared implicitly; its scope extends to the end of the for statement. Quantifiers are enclosed in brackets [ ... ] rather than parentheses because they are most commonly used in conjunction with arrays. In general, quantifiers can specify more than one bound variable and associated range, as in
for [i = 1 to N, j = 1 to N] ...
The write statement in the above program prints four items. These are written on the same line separated by single blank characters, and followed by a newline character. The same effect could be achieved using the printf function as follows:
printf("the sum of 1 ... %d is %d\n", size, total);
Now consider the quadrature problem described in Section 1.5 of the MPD book. The goal is to approximate the integral of a function f(x) from a to b, namely to approximate the area between f(x) and the x axis from x equals a to x equals b. Below we describe three programs that solve the problem in different ways, then we discuss the tradeoffs between the programs.
Iterative Program. The first program, quad.iter.mpd, uses an iterative algorithm. In particular, the program uses a fixed number of subintervals, approximates the area of each using a trapezoid, then sums the areas to approximate the total area.
The first two lines in the resource declare and read three command-line arguments, as follows:
int intervals; getarg(1, intervals);
real a, b; getarg(2, a); getarg(3, b);

Variables a and b are of type real, which is equivalent to C's type double.
Next, we have the declaration of the function to integrate.
procedure f(real x) returns real fx {
  fx = sin(x) * exp(x);
}

The formal parameter is declared as in C. However, the name and type of the function's return value follow the parameters, as in SR. Hence there is no need for a return statement; the function's value, fx, is returned automatically at the end of the function body. The body of the function calls two of MPD's predefined functions: sin(x) and exp(x).
The main part of the resource calculates the area as follows using the trapezoidal rule:
real width = (b-a)/intervals;   # distance between intervals
real fleft = f(a), fright, area = 0.0;
# fleft is used to compute f(x) just once per value of x
int start = age();   # start time, in milliseconds
for [x = (a + width) to b by width] {
  fright = f(x);
  area += (fleft + fright) * width / 2;   # trapezoidal rule
  fleft = fright;
}
int finish = age();   # finish time, in milliseconds

This code also illustrates how to use MPD's age() function to calculate the execution time of the main loop of the program. (Each call of age() returns the number of milliseconds since the MPD program started execution; this time includes the time taken by the MPD runtime system to initialize key data structures and to create an instance of the main resource.)
Finally, the resource writes two lines. The first contains the command-line arguments, the second contains the calculated area and elapsed time.
write("intervals =", intervals, " a =", a, " b = ", b);
write(" area =", area, " time = ", finish-start);
Recursive Program. The second program, quad.recurse.mpd, uses recursion. It implements the adaptive quadrature algorithm using the divide-and-conquer paradigm as described in the MPD book.
The command-line arguments and function f(x) are declared as in the first program. However, the area is computed using the following recursive function:
procedure quad(real left, real right, real fleft, real fright, real lrarea) returns real area {
  real mid = (left+right)/2;
  real fmid = f(mid);
  real larea = (fleft + fmid) * (mid-left) / 2;     # left area
  real rarea = (fmid + fright) * (right-mid) / 2;   # right area
  if (abs((larea+rarea) - lrarea) > epsilon) {
    # recurse to integrate both halves
    larea = quad(left, mid, fleft, fmid, larea);
    rarea = quad(mid, right, fmid, fright, rarea);
  }
  area = larea + rarea;
}

The remainder of the resource makes the first call to quad(), calculates the elapsed time, and prints the results:
int start = age();    # start time, in milliseconds
real area = quad(a, b, f(a), f(b), (f(a)+f(b))*(b-a)/2);
int finish = age();   # finish time, in milliseconds
write("epsilon =", epsilon, " a =", a, " b = ", b);
write(" area =", area, " time = ", finish-start);
Parallel Recursive Program. The third program, quad.co.mpd, uses recursive parallelism. It uses the same algorithm as the second program, but the recursive calls are executed in parallel because they are independent. For this it uses the co statement as follows:
co larea = quad(left, mid, fleft, fmid, larea)
// rarea = quad(mid, right, fmid, fright, rarea)
oc;
Performance and Accuracy Tradeoffs. The iterative program will execute faster than the recursive program for the same input -- unless the function is smooth -- because for loops are faster than function calls. However, the recursive program will in general compute a better result because it adapts to the shape of the curve defined by f(x).
On a single processor, the recursive program will be faster than the parallel recursive program, because it takes longer to fork a process than to call a function. Moreover, the parallel program might cause the MPD runtime system to run out of memory and hence to abort the program. This is because the parallel recursive program can create a large number of processes, each process needs its own stack, and the default stack size is 40,000 bytes. (Execute the MPD linker command "mpdl -l" to see the MPD runtime defaults for the maximum number of processes and the size of process stacks. These limits can be changed; see the mpdl man page for how to do so.)
Despite these limitations, the third program can run faster than the second program on a shared-memory multiprocessor. Or one could use a combination of the three programs. For example, try dividing the problem into a fixed number of intervals -- one per processor -- and using the recursive algorithm (program two) for each interval. Exercises 1.7 and 1.8 in the MPD book explore these tradeoffs and describe how to limit the number of processes in the parallel recursive program.
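A rough sketch of that hybrid approach follows. It assumes the declarations of a, b, f(), and quad() from the earlier programs, plus a hypothetical fourth command-line argument PR giving the number of parallel tasks; the exercises in the MPD book may structure the code differently:

# hybrid sketch: split [a,b] into PR fixed-width intervals, then
# integrate each interval adaptively, with the calls running in parallel
int PR; getarg(4, PR)   # hypothetical argument: number of parallel tasks
real width = (b-a)/PR
real left[1:PR], fl[1:PR], fr[1:PR], part[1:PR]
for [k = 1 to PR] {
  left[k] = a + (k-1)*width
  fl[k] = f(left[k]); fr[k] = f(left[k]+width)
}
co [k = 1 to PR]
  part[k] = quad(left[k], left[k]+width, fl[k], fr[k], (fl[k]+fr[k])*width/2)
oc
real area = 0.0
for [k = 1 to PR] { area += part[k] }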
Quicksort is a classic sorting algorithm. It can be implemented in MPD by the recursive program in quick.mpd. That program also illustrates several additional features of MPD: operation headers, proc declarations, var parameters, file access, string variables, and string comparison.
The first line in the body of the resource declares two "constants", using the C convention of upper-case identifiers. The next line contains what is called an op declaration:
op sort(var string[*] a[1:*])   # forward declaration for sort()

An op declaration specifies the header of a function (or of a message queue, as we shall see). The body is declared later using a proc declaration, as described below. The sort operation above has one formal parameter, which is an array of strings. The parameter is passed by value/result, which is indicated by the var keyword. (The other options are value (the default), result, and reference; these are denoted by val, res, and ref, respectively.)
In the op declaration above, the maximum size of the strings and the upper bound for the array are both unspecified, which is denoted by *. Each time sort is called, these take on actual values determined by the type of the actual parameter in the call.
The next few lines read the name of an input file and then open the file:
# read input file name, then open the file
string[20] fname; getarg(1, fname);
file fd = open(fname, READ);
if (fd == null) {
  write("cannot open file", fname);
  stop(1);
}

A string variable has a maximum length and a current length. The maximum length of fname above is 20. The default initial value of a string is the empty string "", so the initial length of fname is zero. The current length of a string changes every time it is assigned a new value, so after getarg is called above, the length of fname will be the number of characters in the first command-line argument. An MPD program aborts if a string is assigned too long a value. The predefined function length(s) returns the current length of string s.
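For example, here is a small illustration of these rules:

string[20] s       # maximum length 20; initial value "", so length 0
s = "pattern"      # current length is now 7
write(length(s))   # prints 7
# assigning s a value longer than 20 characters would abort the program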
Returning to the program: after reading the filename, it attempts to open the file in READ mode. The open() function returns a file pointer (descriptor), or a null pointer if the file cannot be opened. Assuming the file could be opened, it is read by the while loop in the following code:
# declare input array and read input file
int size = 1;
string[MAXLINE] data[MAXFILE];
while (size <= MAXFILE & read(fd, data[size]) != EOF) {
  size++
}
if (size > MAXFILE) {
  write("input file is longer than", MAXFILE, "lines");
  stop(1);
}
size--;

The while loop reads the input file into array data, assuming the file is not too long. Boolean expressions in MPD use what is called "short circuit" evaluation; this means that the above while loop will terminate if size <= MAXFILE is false, without evaluating the second condition. Stated differently, the condition in the above while loop ensures that there is space for the next line in array data before reading that line. The first argument to read above is the file pointer fd; read returns the predefined constant value EOF when the end of file is reached.
The next few lines in the program call the sort() procedure and then write the results to standard output:
# sort the file, then print the result
sort(data[1:size]);
for [i = 1 to size] {
  write(data[i]);
}

The actual parameter to sort is specified using what is called an array slice, data[1:size]. These are all the entries in array data that contain lines from the input file. The final part of the program is the quicksort procedure itself. It is written using a proc declaration, because the header of the procedure has already been declared (using an op declaration). Only the names of formal parameters are given in a proc declaration; their types and modes (var in this case) come from the op declaration.
proc sort(a) {
  if (ub(a) <= 1) { return; }   # base case
  string[MAXLINE] pivot = a[1];
  int lx = 2, rx = ub(a);
  while (lx <= rx) {   # partition the array based on the pivot
    if (a[lx] <= pivot) {
      lx++
    } else {
      a[lx] :=: a[rx]; rx--
    }
  }
  a[rx] :=: a[1];      # swap pivot line into place
  sort(a[1:rx-1]);     # sort "left" half
  sort(a[lx:ub(a)]);   # sort "right" half
}

The body of sort partitions the array based on the first (pivot) line, then recursively sorts the left and right halves. The call of predefined function ub(a) returns the upper bound of array a for the current call of sort. Strings in MPD can be compared directly, just like "normal" values. Above, a[lx] <= pivot is true if string a[lx] occurs no later than string pivot in lexicographic order, based on the ASCII values of characters from left to right in the two strings. Notice that array slices have again been used in the calls of sort.
Procedure declarations are closely related to op and proc declarations. In particular, a procedure in MPD is simply an abbreviation for an op immediately followed by a proc.
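For example, the following two forms declare the same function. This is a sketch with an illustrative name, and it assumes that a proc repeats the name of the return value just as it repeats the names of the parameters:

# abbreviated form
procedure double(int x) returns int y { y = 2*x }

# equivalent long form (a program would contain one form or the other)
op double(int x) returns int y
proc double(x) returns y { y = 2*x }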
Sequential Program. File mm.seq.mpd contains a complete sequential program for multiplying two n by n matrices. After reading a command-line argument for the size n of the matrices, the program declares and initializes the matrices as follows:
real a[n,n] = ([n] ([n] 1.0)),
     b[n,n] = ([n] ([n] 1.0)),
     c[n,n];

The initializers for a and b each specify n² values of 1.0. In particular, they specify n vectors, each having n values of 1.0. We have initialized both source matrices to ones so that it is easy to verify that the output is correct (every element of c will be n).
Matrices can be declared and referenced as above, or they can be declared and referenced as vectors of vectors, as in C. For example, the following declaration is also legal in MPD:
real m[n][n];   # same as: real m[1:n][1:n]

However, recall that the default lower bounds in MPD are one, not zero, as specified in the above comment.
The next part of the program performs the computation proper:
for [i = 1 to n, j = 1 to n] {   # compute n**2 inner products
  real sum = 0.0;
  for [k = 1 to n] {
    sum += a[i,k]*b[k,j];
  }
  c[i,j] = sum;
}

The outer for loop uses a double quantifier, which declares two new variables i and j and assigns each of them values from 1 to n. The values are assigned in row-major order, which means that j varies more rapidly than i.
The body of the outer for loop declares and initializes sum. The declaration could be placed outside the loop, but the initialization has to occur each time the loop body is entered. The inner for loop computes the inner product of row a[i,*] and column b[*,j].
The last piece of code in the program prints the results on the standard output file. It uses writes() to print a row on one line, then uses write() with no parameters to append a newline.
# print result, by rows
for [i = 1 to n] {
  for [j = 1 to n] { writes(c[i,j], " "); }   # one line per row
  write();   # append a newline
}
Parallel Program Using co Statements. File mm.co.mpd contains a parallel program that computes each inner product in parallel using a co statement. The first and last parts of the program are identical to those for the sequential program. The difference between the two programs is that the nested for loops in the sequential program are replaced by a procedure declaration and a co statement, as follows:
# compute inner product of a[i,*] * b[*,j]
procedure inner(int i, int j) {
  real sum = 0.0;
  for [k = 1 to n] { sum += a[i,k] * b[k,j] }
  c[i,j] = sum;
}

# compute n**2 inner products concurrently
co [i = 1 to n, j = 1 to n] inner(i,j) oc

The procedure computes one inner product. Its body is exactly the same as the body of the outer for loop in the sequential program.
The co statement calls the procedure n² times, in parallel. The quantifier in the co statement is exactly the same as the quantifier on the outer for loop in the sequential program.
The co statement causes n² processes to be created. Each gets a different pair of values for i and j, and each executes one call inner(i,j). When all of the processes have terminated, the co statement terminates, and the results are printed.
Parallel Program Using Processes. The third program, in mm.process.mpd, also computes n² inner products in parallel. However, it uses a process declaration that combines the procedure declaration and co statement of the previous program.
process inner[i = 1 to n, j = 1 to n] {
  real sum = 0.0;
  for [k = 1 to n] { sum += a[i,k] * b[k,j] }
  c[i,j] = sum;
}

This specifies an array of n² fine-grained processes: inner[1,1], inner[1,2], ..., inner[n,n]. The quantifier in the process heading is exactly the same as the quantifier with the co statement in the previous program. The body of the process is exactly the same as the body of procedure inner in the previous program.
Although an array of processes is very similar to a co statement plus the procedure called by the co statement, there is an important semantic difference. A process is a declaration. When the declaration is processed, the new processes are created and started, then the main process (the one which processes the declaration) continues to execute. In particular, the above process declaration is implemented essentially as follows:
for [i = 1 to n, j = 1 to n] {
  fork inner(i,j);   # fork a new instance of procedure inner()
}

Thus, if the code to print the results were to appear immediately after the process declaration, the printing would start while results are being computed! (Try this out to see what happens; the result in general is nondeterministic.)
If there is something to be done after all processes have terminated, you can either program synchronization code to have the outer process wait for all the new processes to terminate, or, better yet, you can use a final block as in the third matrix multiplication program.
# wait for processes to terminate, then print result, by rows
final {
  for [i = 1 to n] {
    for [j = 1 to n] { writes(c[i,j], " "); }   # one line per row
    write();   # append a newline
  }
}

A final block in the main resource is not executed until all processes in the program have terminated (or blocked). It is easy for the MPD runtime system to detect when this has occurred; it is hard for the programmer to do so. Hence, final blocks are often very useful.
In general, any resource may have a final block. That block is executed when the resource is destroyed. In fact, this is what happens for the main resource. Recall that the MPD runtime system implicitly creates an instance of the main resource. The MPD runtime system also destroys that instance when the program terminates; that destroy causes the final code, if any, to be executed. When that code terminates, the program ceases to exist.
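For example, a resource other than the main resource might use a final block as follows; the resource and its names are illustrative, not from the example files:

resource logger()
  int count = 0
  op note()
  proc note() { count++ }
  final {
    # executed when this instance of logger is destroyed
    write("logger handled", count, "notes")
  }
end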
File mm.dist.mpd contains a distributed program that uses the coordinator/worker paradigm. The program illustrates the use of message passing in MPD. We have put all the processes in the same resource for convenience. This means that the program as listed can only be executed on a single processor or a shared-memory multiprocessor. However, the program can readily be modified to use multiple resources and virtual machines and hence to execute on a distributed collection of machines, such as a cluster of workstations. We show how to construct a distributed program in the section below on concurrent file search.
The header comment in mm.dist.mpd uses MPD's multi-line comment /* ... */. The start of the resource declares and reads the command-line arguments; these values are global to the processes in the resource.
The message channels are created using op declarations as follows:
op data[numWorkers] (real m[*,*]);     # channels to Workers
op result[numWorkers] (real m[*,*]);   # channels to Coordinator

Each op declaration specifies an array of channels, one per worker. The messages sent to these channels are matrices of reals having unspecified sizes; the actual size of each message is determined by the send statements in the program.
The program uses an array of Worker processes. Each Worker computes a strip of the result matrix c. The Coordinator process first sends strips of matrix a and all of matrix b to each worker, then it receives results from each worker. We use an array of result channels to ensure that each strip of results is stored in the correct strip of matrix c. Notice the use of array slices and * in the send and receive statements. For example,
send data[w](a[startRow:endRow,*]);

sends stripSize rows of a to Worker w. Similarly,
receive result[w](c[startRow:endRow,*]);

receives a strip of results from Worker w and stores them in the appropriate strip of matrix c.
The matrices in each Worker are declared to be just as big as necessary. Hence, the send and receive statements in the Workers transfer entire matrices. In the case of local matrices a and c, these are strips of the corresponding full matrices in the Coordinator.
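To make the structure concrete, a Worker process consistent with the above description might look roughly as follows. This is a sketch, not the code in mm.dist.mpd, and it assumes that numWorkers, stripSize, n, and the channel declarations shown earlier are visible in the resource:

process Worker[w = 1 to numWorkers] {
  real a[stripSize,n], b[n,n], c[stripSize,n]   # local matrices, sized exactly
  receive data[w](a)   # receive my strip of matrix a
  receive data[w](b)   # receive all of matrix b
  for [i = 1 to stripSize, j = 1 to n] {   # inner products for my strip
    real sum = 0.0
    for [k = 1 to n] { sum += a[i,k]*b[k,j] }
    c[i,j] = sum
  }
  send result[w](c)    # send back my strip of matrix c
}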
global defs   # shared definitions
  const int MAXPATTERN = 20
  const int MAXFILENAME = 20
  const int MAXLINE = 120
end

A constant -- indicated by the const prefix on each declaration -- is initialized when it is declared; later assignments to a constant are not allowed.
Program execution begins, as usual, with the main resource (named grep in this case). The first thing that resource does is to import defs. This causes the constants in global defs to be initialized and makes their names visible in the remainder of the resource. A subsequent import of defs (on the same virtual machine) will be ignored, because a global is only created once per virtual machine. In particular, a global is created the first time that it is imported by a resource or another global.
The find procedure uses several of MPD's string processing operations.
procedure find(string[*] pattern, string[*] line) {
  # if line contains pattern, write out line
  int i = 1, plen = length(pattern)
  while (i <= (length(line) - plen + 1)) {
    if (pattern == line[i:i+plen-1]) {
      write(filename || ":", line); return
    }
    i++
  }
}

The length() function returns the length of a string. The expression line[i:i+plen-1] selects a substring, which in this case is compared to pattern. Finally, the write statement uses the string concatenation operator ||.
The co statement in the resource is programmed as follows:
co find(pattern, line[ln]) // result = doread(3-ln) oc

Note that each arm calls a procedure in the resource. In MPD an arm of a co statement may only call a procedure or send to an operation. For implementation reasons, MPD does not allow arms to contain statement lists or to make direct calls of pre-defined functions. Hence, the above co statement calls the doread function, which in turn calls the pre-defined read() function.
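A doread function with the behavior just described might look like this. The sketch assumes the file pointer fd and the two-element array of line buffers declared elsewhere in the resource; the actual example file may differ:

# wrapper so that the co arm calls a user procedure, which in
# turn calls the pre-defined read() function
procedure doread(int ln) returns int r {
  r = read(fd, line[ln])   # fill the buffer that is not being searched
}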
Suppose we want to search for a pattern in multiple files. The files are independent, so they can be searched in parallel by using one process for each file. File cgrep.mpd contains an MPD program that implements this strategy. The program contains a global component and two resources. The global component defines constants as in the previous example.
The first resource, grep, finds all occurrences of string pattern in file filename. These values are parameters to the resource, and hence they are declared in the resource header.
resource grep(string[*] pattern, string[*] filename)
  import defs
  file fd = open(filename, READ)

  procedure find(string[*] pattern, string[*] line) {
    # if line contains pattern, write out line
    int i = 1
    int plen = length(pattern)
    while (i <= (length(line) - plen + 1)) {
      if (pattern == line[i:i+plen-1]) {
        write(filename || ":", line); return
      }
      i++
    }
  }

  process search {
    # find all instances of pattern in filename
    string[MAXLINE] line
    while (read(fd, line) != EOF) {
      find(pattern, line)
    }
  }
end grep

The resource body imports the global, opens filename, then starts process search. That process reads each line from the file and calls internal procedure find to determine whether the line contains pattern.
The second resource, main, is the last one that is compiled (and linked), and hence it is the main resource in the program.
resource main()
  import defs, grep
  # read command line arguments and create instances
  # of resource grep
  string[MAXPATTERN] pattern; getarg(1, pattern)
  string[MAXFILENAME] filename
  for [i = 2 to numargs()] {
    getarg(i, filename)
    create grep(pattern, filename)
  }
end

The first line imports both defs and grep. The import of defs causes the global to be created and initialized. The import of grep makes its header visible in main. The for loop creates new instances of grep, passing each one the pattern and a file name. A create statement terminates as soon as the resource executes its top-level declarations and statements (or when the resource executes a return or reply statement). In this program, the top-level declarations and statements in grep are: (1) the import clause (which has no effect, because main has already created defs), (2) the call of open, (3) the procedure declaration (which has no effect), and (4) the process declaration (which starts process search). Hence, the create statement above terminates when each new instance of grep has begun a new search.
Program cgrep.mpd can readily be extended to use virtual machines and hence to allow the different resources to execute on different processors. File cgrep.vm.mpd contains a distributed version.
Only three changes are required to use virtual machines: (1) declare one or more capabilities for virtual machines, (2) create one or more virtual machines, and (3) create resources on those virtual machines. All three of these changes are in the main resource in program cgrep.vm.mpd. The first two are in the lines
cap vm vc
vc = create vm() on "paloverde"

The first line declares vc as a capability for a virtual machine. The effect of the create statement is to create a new virtual machine on the host named "paloverde". The third change is in the resource creation statement in the for loop:
create grep(pattern, filename) on vc

This now creates a new instance of grep on the virtual machine pointed to by capability variable vc; above, that virtual machine is on host "paloverde".
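The same pattern extends to several hosts. For example, one might create one virtual machine per host name, as in the following hypothetical variation (hosts and numhosts are illustrative; this is not the code in cgrep.vm.mpd):

# one virtual machine per host name in hosts[1:numhosts]
cap vm vc[1:numhosts]
for [i = 1 to numhosts] {
  vc[i] = create vm() on hosts[i]
}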
The program in cgrep.vm.mpd also illustrates the use of an external operation. In particular, we have added the following four lines to the top of resource grep:
external gethostname(res string[*] s, int namelen)
string[30] hname
gethostname(hname, maxlength(hname))
write("vm host:", hname)

The external operation, gethostname, is a Unix system call that returns the name of the host on which the function is called. The above code calls this function and then prints the result. Consequently, the MPD program prints the name of each host on which virtual machines are created (only "paloverde" in this case).
External operations can be used to access many functions in libraries that are written in C and that are accessible to the MPD linker. These include the Unix system libraries and window systems. For example, MPDWin is a very useful package for creating graphical user interfaces.
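Other Unix system calls can be declared the same way. The sketch below uses getpid(), which returns the process id; it assumes that an external declaration may specify a return value in the same way an op declaration does:

external getpid() returns int pid
write("process id:", getpid())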
File cs.simulation.mpd contains a simulation of the critical section problem. (Figure 8.20 of the MPD book contains an SR version of the program.) The program illustrates one use of MPD's rendezvous primitives, operation restrictions, a global component with a body, and MPD's random() and nap() functions.
The global component exports two operations that are invoked when a "user" process enters and exits a critical section. The body of the global contains a background process that repeatedly waits for an invocation of CSenter() and then waits for an invocation of CSexit(). The operation restriction {call} indicates that CSenter() must be called. There is no restriction on the CSexit() operation, so it can be invoked by call or send statements. (The program uses send.)
global CS
  op CSenter(int id) {call},   # must be called
     CSexit()                  # may be invoked by call or send
body CS
  process arbitrator {
    while (true) {
      in CSenter(id) by id ->
        write("user", id, "in its CS at", age())
      ni
      receive CSexit()
    }
  }
end

The CSenter() operation is serviced by an input statement. The one above waits until there is at least one call; the invocation that minimizes the value of parameter id is then serviced. This gives preference to the lower numbered users. The CSexit() operation is serviced by a receive statement. Recall that receive is simply an abbreviation for a simple input statement, in this case in CSexit() -> skip ni.
The main resource reads the command line arguments, then forks an array of user processes.
resource main()
  import CS
  int numusers, rounds; getarg(1, numusers); getarg(2, rounds)

  process user[i = 1 to numusers] {
    for [j = 1 to rounds] {
      call CSenter(i)
      nap(int(random()*100))
      send CSexit()
      nap(int(random()*1000))
    }
  }
end

Each user process repeatedly calls CSenter() and then sends to CSexit(). After each of these invocations, a user process naps for a random interval. For example, the call
nap(int(random()*100))

delays the caller for a number of milliseconds that is a (pseudo-)random value between 0 and 100. The random() function returns a real, and the nap() function requires an int, so the int() conversion function is used to cast the real to an int. There is a conversion function for each MPD type; its name is the same as the type's name. (MPD implicitly converts ints to reals in expressions, but the programmer has to do other type conversions, as above.)
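For example, here is a small illustration of these conversions:

real r = random()*1000   # random() returns a real
int ms = int(r)          # explicit conversion; nap() requires an int
nap(ms)
real half = real(ms)/2   # conversion function in the other direction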
File rw.simulation.mpd contains a simulation of the readers/writers problem. The program illustrates one way to implement a monitor in MPD, in this case using a global component and semaphores.
The global component encapsulates the "database" (here, simply an integer). The interface (spec) part of the global exports two operations. The body of the global implements these operations using proc declarations. The body also contains two internal procedures, startread() and endread(), which are called by the read() proc. Semaphores are used as shown in Figure 4.10 of the MPD book to synchronize reads and writes.
global DataBase
  op read(var int data)    # read a value
  op write(val int data)   # write a value
body DataBase
  int database = 0         # the "value" of the database
  sem rw = 1, mutexR = 1
  int nr = 0

  procedure startread() {
    P(mutexR); nr++; if (nr == 1) { P(rw) }; V(mutexR)
  }
  procedure endread() {
    P(mutexR); nr--; if (nr == 0) { V(rw) }; V(mutexR)
  }
  proc read(data) {
    startread(); nap(500); data = database; endread()
  }
  proc write(data) {
    P(rw); nap(1000); database = data; V(rw)
  }
end DataBase

The read() and write() procedures above use MPD's predefined function nap() to simulate the passage of time while reading and writing. This makes for more interesting output.
The main resource in the program is similar to the one in the critical section program described above. The main difference is that there are two arrays of background processes, one for readers and one for writers.
resource main()
  import DataBase
  int numReaders; getarg(1, numReaders)
  int numWriters; getarg(2, numWriters)
  int rounds; getarg(3, rounds)

  process Reader[i = 1 to numReaders] {
    int myvalue
    nap(int(random()*1000))   # to let everybody get created
    for [j = 1 to rounds] {
      DataBase.read(myvalue)
      write("reader", i, "read", myvalue)
      nap(int(random()*1000))
    }
  }

  process Writer[i = 1 to numWriters] {
    int myvalue = i
    nap(int(random()*1000))   # to let everybody get created
    for [j = 1 to rounds] {
      DataBase.write(myvalue)
      write("writer", i, "wrote", myvalue)
      nap(int(random()*1000))
    }
  }
end
Here we describe a simple use of the MPDWin package. The program in file window.hello.mpd opens a small window on the programmer's workstation and writes the message "Hello World" in that window. The header comment indicates how to compile and link the package into an MPD program.
resource hello()
  import MPDWin;   # access the global defining MPDWin
  int interval; getarg(1, interval);   # seconds to display window
  winWindow mywin = WinOpen("", "Hello", null, UseDefault, 100, 30);
  if (mywin == null) {
    write("Sorry, cannot open window"); stop(1);
  }
  WinSetForeground(mywin, "yellow");   # set foreground color
  WinDrawString(mywin, winPoint(10,20), "Hello World");
  WinFlush(mywin);
  nap(interval*1000);   # sleep for interval seconds
  WinClose(mywin);
end

The first line in the resource imports the global component that defines the interface to the MPDWin package. That global defines several types and operations. By convention, type names begin with win and operation names begin with Win. The above program opens a new window, sets the foreground color to yellow, draws the string "Hello World", flushes the window buffer so that the string gets printed, sleeps for interval seconds, and finally closes the window. The mpdwin man page describes each of these functions and their arguments.
The MPDWin package contains dozens of functions that support sophisticated graphics programming. The MPD distribution contains larger examples that illustrate the use of many of these. The implementation of MPDWin also has two interesting attributes that make it relatively easy to create complex interfaces. First, MPDWin makes the X11 library "thread safe", so that the MPD programmer can use multiple processes to control one or more windows. Second, MPDWin converts asynchronous external events, such as mouse button pushes, into a stream of messages; the MPD programmer services external events by receiving messages from the event channel.
Last updated February 8, 2001