Lecture 4
Today: ways to parallelize programs
Terminology
heavy-weight process -- Unix process, SR virtual machine
light-weight process (SR language)
or thread (Java language & SR implementation)
these are within a heavy-weight (swappable) process and
are often multiplexed
Parallelizing Programs
how: identify independent parts
read set -- variables that are read (only)
write set -- variables that are written (and possibly read)
independence: the write set of each process is disjoint from
the read and write sets of other processes
[NB: the def. is wrong in the first printing of the book!]
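The independence condition above can be illustrated with a small sketch (a hypothetical Python example, not from the lecture): two workers whose write sets are disjoint slices of `out`, and whose shared read set `data` is read-only, so they can run in parallel with no synchronization and still give a deterministic result.

```python
import threading

# data is in both read sets (read-only); the write sets are disjoint:
# double_left writes out[0:3], double_right writes out[3:6].
data = [1, 2, 3, 4, 5, 6]
out = [0] * 6

def double_left():           # write set: out[0:3]
    for i in range(3):
        out[i] = 2 * data[i]

def double_right():          # write set: out[3:6]
    for i in range(3, 6):
        out[i] = 2 * data[i]

t1 = threading.Thread(target=double_left)
t2 = threading.Thread(target=double_right)
t1.start(); t2.start()
t1.join(); t2.join()
print(out)   # deterministic: [2, 4, 6, 8, 10, 12]
```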
granularity: coarse -- 1 process per processor
fine -- 1 process per independent part (action)
style: iterative -- data parallelism
recursive -- recursive parallelism
distinct -- task parallelism
activation method: static -- process declaration, once
dynamic -- co statement, possibly many times
examples of programs we have seen:
matrix mult. -- fine, iterative, static
quicksort and quadrature -- fine, recursive, dynamic
program 1 assignment and today -- coarse, independent, static or dynamic
Example -- Find Problem (Section 2.2)
goal: find all instances of pattern in filename (simple Unix grep)
sequential algorithm:
read a line
do not EOF ->
look for pattern and write line
read next line
od
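The sequential algorithm above can be sketched directly in Python (the names `find` and `matches` are illustrative, not from the lecture):

```python
import io

def find(pattern, f):
    # Sequential find: read a line; while not EOF, look for the
    # pattern (record matching lines), then read the next line.
    matches = []
    line = f.readline()
    while line:                       # "do not EOF ->"
        if pattern in line:
            matches.append(line.rstrip("\n"))
        line = f.readline()           # read next line
    return matches

text = io.StringIO("foo bar\nbaz\nbar none\n")
print(find("bar", text))   # ['foo bar', 'bar none']
```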
what can be parallelized?
ask class for ideas
the most feasible approach is to do the look (pattern search) and the read in parallel

what is required for independence?
two or more buffers (or one-at-a-time access to one buffer)
this is mutual exclusion
take turns using buffers
this is condition synchronization:
read into empty buffer; find pattern in full buffer; then swap
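As a warm-up, here is a minimal Python sketch of condition synchronization on a single one-line buffer (an assumption for illustration, not the SR code): Input deposits a line only when the buffer is empty, and Find removes one only when it is full.

```python
import threading

buffer = None
full = False                         # condition: is the buffer full?
cond = threading.Condition()
lines = ["a", "bb", "ccc"]
lengths = []                         # stand-in for "write line"

def input_proc():
    global buffer, full
    for line in lines + [None]:      # None signals EOF
        with cond:
            while full:              # wait until the buffer is empty
                cond.wait()
            buffer, full = line, True
            cond.notify()

def find_proc():
    global buffer, full
    while True:
        with cond:
            while not full:          # wait until the buffer is full
                cond.wait()
            line, full = buffer, False
            cond.notify()
        if line is None:
            break
        lengths.append(len(line))    # "look for pattern" stand-in

t1 = threading.Thread(target=input_proc)
t2 = threading.Thread(target=find_proc)
t1.start(); t2.start(); t1.join(); t2.join()
print(lengths)   # [1, 2, 3]
```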
Algorithm Styles
(a) "co inside do" -- dynamic processes
read first line
do not EOF ->
co find // read oc
od
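Style (a) can be sketched in Python (again an assumption for illustration): on every loop iteration, the find of the current line and the read of the next line run as two freshly created threads, which are joined before the next iteration -- dynamic processes, like `co ... oc` inside `do ... od`.

```python
import io
import threading

def find_all(pattern, f):
    matches = []
    line = f.readline()                   # read first line
    while line:                           # do not EOF ->
        nxt = [None]
        def find(l=line):                 #   co find
            if pattern in l:
                matches.append(l.rstrip("\n"))
        def read():                       #   // read oc
            nxt[0] = f.readline()
        a = threading.Thread(target=find)
        b = threading.Thread(target=read)
        a.start(); b.start()
        a.join(); b.join()
        line = nxt[0]
    return matches                        # od

text = io.StringIO("foo bar\nbaz\nbar none\n")
print(find_all("bar", text))   # ['foo bar', 'bar none']
```

Note the per-iteration thread creation cost is exactly why this fine-grained, dynamic style can be slower than style (b).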
(b) "do inside co" -- static processes
process Input
  ...
end
process Find
  ...
end
the two process declarations together are equivalent to
"co Input() // Find() oc" executed once in the "main" thread
SR Program for style (a) -- transparency
Go over the code; it illustrates style (a) as well as several
additional aspects of SR: strings, file I/O, global components, timing
SR Program for style (b)
(i) one buffer: copy lines from Input to buffer and from buffer to Find
(ii) double buffering: two buffers, then swap roles
can access them in place as long as take turns
Approach (ii) is more efficient, so that is what we will do
process Input process Find
... ...
do true -> do true ->
SYNCH SYNCH
read into buffer look at buffer
SYNCH if EOF -> exit fi
if EOF -> exit fi SYNCH
od od
SYNCH marks synchronization points; we will start looking at how to
implement them next time; you are to figure out the details for your first program
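One way the SYNCH points could be realized (a sketch under assumptions -- semaphores rather than the SR mechanisms we will see next time): two buffers alternate roles, with an empty/full semaphore pair per buffer.

```python
import threading

buffers = [None, None]
empty = [threading.Semaphore(1), threading.Semaphore(1)]
full = [threading.Semaphore(0), threading.Semaphore(0)]
lines = ["alpha", "beta", "gamma"]
found = []

def input_proc():
    for i, line in enumerate(lines + [None]):  # None marks EOF
        b = i % 2                              # alternate buffers
        empty[b].acquire()                     # SYNCH: wait until free
        buffers[b] = line                      # read into buffer
        full[b].release()                      # SYNCH: hand to Find

def find_proc():
    i = 0
    while True:
        b = i % 2
        full[b].acquire()                      # SYNCH: wait for a line
        line = buffers[b]                      # look at buffer
        empty[b].release()                     # SYNCH: return buffer
        if line is None:                       # if EOF -> exit fi
            break
        if "am" in line:                       # pattern search stand-in
            found.append(line)
        i += 1

t1 = threading.Thread(target=input_proc)
t2 = threading.Thread(target=find_proc)
t1.start(); t2.start(); t1.join(); t2.join()
print(found)   # ['gamma']
```

With two buffers, Input can fill one while Find scans the other, so the lines never need to be copied between the processes.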
Summary of Issues for above program
(1) synchronization -- alternate access to the buffers
(2) efficiency -- avoid copying lines by using two buffers
(3) termination -- need extra communication between Input and Find
(4) generality -- bounded buffers (Unix pipes)
we'll soon see how to handle all of these
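Point (4) can be previewed with a Python sketch (an illustration, not the SR solution): generalizing the two buffers to a bounded buffer, as a Unix pipe does; `queue.Queue` handles the mutual exclusion and condition synchronization internally.

```python
import queue
import threading

buf = queue.Queue(maxsize=4)        # bounded buffer, like a pipe
results = []

def producer():
    for line in ["one", "two", "three"]:
        buf.put(line)               # blocks when the buffer is full
    buf.put(None)                   # EOF marker (termination, point 3)

def consumer():
    while True:
        line = buf.get()            # blocks when the buffer is empty
        if line is None:
            break
        results.append(line.upper())

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()
print(results)   # ['ONE', 'TWO', 'THREE']
```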