Errata and Clarifications

Foundations of Multithreaded, Parallel, and Distributed Programming is now in its fifth printing. If you do not know which printing you have, follow this link.

Twenty-nine mistakes in the first printing were corrected in the second and later printings. However, careful readers have found several additional mistakes. Corrections and clarifications are listed below. The most significant corrections are flagged with three asterisks.

If you have a copy of the first printing, you also need to examine the errata for that printing.

p. 16, line 5 in program at top of page --- c[ij] should be c[i,j]

p. 26, first line of second paragraph of Section 1.9 --- a concurrent program has "two or more" processes, not "one or more".

p. 30, second program from the bottom --- in the printf statement in the comment, there should be a quote mark after \n

p. 40, line 9 --- replace "solved" by "solve"

p. 47, line 7 of second to last paragraph --- the first process in Figure 2.1 is the consumer, and the second process is the producer.

p. 47, last paragraph --- there are actually three buffers. The two buffers referred to in line 5 are the two that are local to the processes (one each); the third is the shared buffer.

p. 48, Figure 2.1, fourth line of process 1 --- there is a race condition that can cause the first process to break out of the loop before examining the last line. This happens if the second process fills the buffer with the last line, then reads the end of file and sets done to true. To fix this, change the fourth line of process 1 to:

     if (done and buffer not full) break;

p. 48, Figure 2.1, third line of process 2 --- I assume that read returns EOF after the last line of data is read (as in Unix), so change this line to "read next line of input into line2 or set EOF after last line"

*** p. 49, first sentence after first program fragment --- replace "smaller" by "larger"; the program is looking for a[i] that are larger than the current maximum

*** p. 51, first paragraph of Section 2.4.1 --- the description of an atomic action is incomplete. As stated, an atomic action must not have an intermediate state that is visible to other processes. In addition, no other process can be allowed to change the variables accessed by an atomic action while the action is executing. Both attributes are required in order for an atomic action to make an indivisible state change.

p. 51, program display at bottom --- to make the program clearer, change the declaration line to:

     int x, y = 0, z = 0;

p. 52, definition of At-Most-Once Property --- this definition is tricky and leads to lots of confusion. Although it is correct, it would probably be best to say in clause (a) that x is not referenced by another process. A statement such as x = x+1 is certainly not atomic if x is read by another process, and it can lead to confusion if x is written by another process.

*** p. 54, third sentence after display (2.3) --- The sentence is incomplete, in the same way that the discussion of atomic actions on page 51 is incomplete (see above). The sentence needs a third clause, as follows:
"In particular, B is guaranteed to be true when execution of S begins, no internal state in S is visible to other processes, and no other process can change the variables referenced in B or S while the await statement is executing."

p. 55, second line after the display while (not B); -- the loop spins until B becomes true rather than false.

*** p. 56, first paragraph of Section 2.5 --- replace the second and third sentences by the following: "In particular, the producer process repeatedly reads input lines and passes them on to the consumer process. The consumer determines those lines that contain the desired pattern and prints them."

p. 58, fourth line from bottom of page --- change "if formula" to "if a formula"

p. 61, first program fragment below Figure 2.3 --- the text talks about incrementing x, so change "x = 1;" to "x = x+1;"

*** p. 73, lines 2 and 3 of the second to last paragraph --- replace "where GOOD is equivalent to BAD" by "where GOOD is equivalent to not BAD."

p. 84, Exercise 2.12 --- there is a missing oc at the end of second line of the program.

*** p. 86, exercise 2.16 --- this problem cannot be solved using only the technique of weakened assertions. Developing noninterfering proofs also requires using auxiliary (history) variables to record for each process whether it has completed or not yet started it's await statement.

p. 101, second sentence below Figure 3.4 --- The two instances of "true" should be replaced by "false", so the sentence reads "Instead of simply spinning until a TS instruction returns false, we can increase the likelihood that it returns false by using the following entry protocol:".

p. 105, last line of second paragraph (line 13 from the top) --- change "to find that they are true" by "to find that they are false".

p. 112, predicate BAKERY at the top of the page --- delete "(turn[i] > 0) =>" from the third line. In words, the predicate should say that if process i is in its critical section, then turn[i]>0 and the value of turn[j] for all other j is either zero or greater than turn[i].

pp. 133-34, Section 3.6.1 --- The first paragraph of the section talks about using PR processes on PR processors. In Figure 3.20, the upper bound on the array of processors should be PR, not P. In addition, the display right below the figure should use PR, the number of processes, instead of n, the number of tasks.

p. 136, Figure 3.21 --- Delete the line "if (w == 1)" at the end of the Worker processes. Any worker could be the last one to become idle and hence could be the only one to break out of the while loop. As the program is written, the other n-1 workers are left waiting to remove a task from the bag.

*** p. 143, exercise 3.5 (and exercise 3.6) --- There is a significant oversight in the description of the SC instruction and its implementation. An SC can be allowed to succeed only if it follows an LL from the same location and there has not been any intervening store to the location by any processor. (It is OK if there are multiple LLs from the location, but only one of them can be allowed to have an SC that succeeds.)

This is implemented as follows. Each processor has a location register. The LL instruction sets this register to the address of the variable that is loaded. The SC instruction succeeds only if the store is to the same location used by the last LL executed by the processor. If the executing process is interrupted, location is set to zero.

In addition, every store instruction, including SC, that successfully writes to the address in location clears the address in every location register by setting all of them to zero. (This is implemented by having processor caches "watch" for stores to the address in their location register.)

To summarize, the implementations of LL, SC, and stores are as follows:

  LL(register, variable)  # load locked -- same as in text
    < register = variable; location = &variable; >

  SC(variable, value)     # store conditional
    < if (location == &variable)
          { variable = value;
            set ALL location registers to 0;
            return (1); }
      else
          return (0); >

  Other Store instructions:
    < if (store_address == location)
          set ALL location registers to 0 >   # some invalid address

*** p. 156, second paragraph of Section 4.2.1 --- the discussion is wrong, and there is no simple fix. Replace the paragraph by the following:
"Semaphores were conceived in part to make the critical section problem easy to solve. Figure 3.2 presented a solution in which lock is false when no process is in its critical section and lock is true otherwise. We could equally well have used a different variable free with the interpretation that free is true when no process is in its critical section and free is false otherwise. Then the entry protocol would wait for free to be true and set it to false, and the exit protocol would set free to true. Moreover, we could represent free by an integer and use 1 for true and 0 for false."

*** p. 158, paragraph 1 --- There is a major error in the discussion of using semaphores to implement an N-process barrier. There is a race condition if there is only one semaphore per process; the situation is almost identical to the problem discussed at the top of page 123 for flag barriers. In particular, each process needs to use an array of arrive semaphores, with one semaphore per stage of the barrier. Even though semaphore signals (V operations) are remembered, you still have to ensure that a signal intended for one process is is not seen by another.

p. 163, Figure 4.5 --- the invariant for empty and full is incorrect; the sum of their values can range from 0 to n.

p. 164, line 3 --- replace "Section 4.3" by "Section 4.4".

p. 185, line 3 --- delete the space between # and include

*** p. 216, 6 lines from the bottom --- replace "were zero" by "were not zero" in the phrase "a signaled writer would have to go back to sleep if nr were zero."

p. 231, Figure 5.13 --- in procedure request, the test position != -1 is redundant in the elseif clause.

p. 241, first sentence of second paragraph --- replace the sentence by "The notify method awakens one of the threads on the delay queue, if there is one. The choice of which thread is awakened is arbitrary and left to the implementation."

p. 241, last sentence of third paragraph --- replace the sentence by the following two sentences: "It can also lead to deadlock. If a thread holds the lock on one object and another thread holds the lock on a second object, the threads will deadlock if they both call methods in the object locked by the other." (A thread in Java can, however, call methods in objects for which it already holds the lock.)

p. 245, second paragraph after the code fragment --- replace startWrite by endRead.

p. 271, dispatcher() procedure --- it would be best to start the timer only when dispatching a new process (i.e., when executing is zero); otherwise a process could possibly run forever.

p. 282, line 2 --- replace "delay queue is empty" by "delay queue is not empty".

p. 304, Figure 7.4 --- there is a missing } before process Client.

p. 305, Figure 7.5 --- replace the two instances of res_type by result_type (on lines 5 and 8 of the Figure)

p. 317, Figure 7.14 --- there is a missing link between P0 and P4.

p. 344, 3 lines from bottom of page --- UDP stands for "User Datagram Protocol", not "Unreliable Datagram Protocol"

p. 346, line 21 of Figure 7.18 --- delete the space in !inputFile.ex ists()

p. 347, line 9 of Figure 7.19 --- change System.ex it(1); to System.exit(1);

p. 371, line 20 --- change in to in1

p. 377, Figure 8.6 --- In line 4 of the Waiter process, delete the parenthesis just before eating[left[i]]. In the Philosopher process, the operations in the call statements should be named Table.getforks and Table.relforks.

p. 381, Figure 8.10 --- the input statements use incorrect syntax. They should both be programmed as

     in deposit(v) -> othervalue = v; ni

p. 391, Figure 8.15 --- in the send statement at the end of procedure close(), replace FileServer by FileServer[i]

*** p. 392, last paragraph --- In the third and fourth sentences, the write weight should be n-1, not n-2.

*** p. 396, line 11 of Figure 8.16 --- delete or comment out the following line:

     System.setSecurityManager(new RMISecurityManager());

This line worked for early versions of Java but causes an exception for current versions, because the default security policy now prevents loading classes over the network.

p. 399, second line from bottom of page --- change "(Section 9.2)" to "(Section 8.2)"

p. 402, line 3 and Figure 8.17 --- An Ada requeue statement can appear only within an accept statement or entry body, not within a procedure. In lines 2 and 10 in Figure 8.17, the phrase "procedure Arrive" needs to be replaced by "entry Arrive."

p. 426, Figure 9.1(a) --- in the first arm of the input statement in process manager, add condition "and nextRow < n" so that getTask is accepted only when there is another task. Also, in the body of that arm of the input statement, subscript variable i should be replaced by row.

p. 433, third full paragraph -- on line three, change "then its sends" to "then it sends"; at the end of the third sentence, change "then it sends to the worker above" to "then it receives from the worker above".

*** p. 436, line 8 of Figure 9.4 --- the line if (p != q) should be if (p != i or q != j)

p. 447, line 8 of paragraph 1 --- change 2n(n-1) to n(n-1) (there are n(n-1)/2 nodes in a complete graph)

p. 450, line 20 of Figure 9.12 --- replace else # k == ECHO { by else { # k == ECHO

p. 456, next to last line of Figure 9.13 --- the sender in the condition in the if statement refers to the sender of the message that was just removed from the Helper's message queue, not the sender of the last message received by the Helper.

*** p. 464, lines 8-15 of Figure 9.18 --- the indentation is wrong in both the first and second printings of the book. The actions of T[i] upon receiving the token should be

     if (token == nc)
       announce termination and halt;
     if (color[i] == red)
       { color[i] = blue; token = 0; }
     else
       token++;
     set j to index of channel for next edge in cycle C;
     send ch[j](token);

p. 494, third line from the bottom --- remoteSend takes essentially the same actions as receiveChan, not sendChan. In fact, it would be best to rename remoteSend to remoteReceive because it processes an incoming message, not an outgoing one.

*** p. 537, program near top of page --- as written, maxdiff will always be 0.0. This is because grid and new are the same when the first for loop terminates. One way to fix the program is to iterate MAXITER-1 times in the first for loop, and then do one last computation of new grid points before computing the maximum difference.

*** pp. 539-40 --- To use the optimization described at the start of the paragraph that begins at the bottom of p. 539, you have to initialize the boundary values of new to four times the boundary values of grid. This is because the optimization assumes that all points in new are 4 times too large after the first loop, and it compensates by dividing by 16 (multiplying by 0.0625) instead of 4 in the second loop.

Acknowledgements

I gratefully thank the following individuals for finding many of the above errors:

Geoffrey Anderson, Jon Beck, Joseph Berrios, Gianni Campanile, Richard Carver, Alex Chernyak, Murray Cole, Kerry Davis, Rahul Desai, Abhijit Dixit, David Garcia Quintas, Veronica Gaspes, Lindsay Groves, Brian Hanley, Carl Hauser, David Hemmendinger, Martin Henz, Einar Broch Johnsen, Apache Le Phong, Stefano Marinoni, Anne Hierholtz, R. Huang, S. Krishnaprasad, Joshua Louie, David Lowenthal, Aleardo Manacero Jr., Govan Marounsi, Marcelo Moreno, Jude Nelson, Bjarte Oestvold, Dennis Peters, Daniel Sguillaro, Stephen Siegel, Guy Tremblay, Umberto Villano, Richard Walker, Qiong Zhang, Yang Zhou.

Last updated November 9, 2010