Lecture 4

Today:  ways to parallelize programs


Terminology

   heavy-weight process  -- Unix process, SR virtual machine

   light-weight process (SR language)
             or thread (Java language & SR implementation)

      these are within a heavy-weight (swappable) process and
      are often multiplexed


Parallelizing Programs

   how:  identify independent parts

        read set -- variables that are read (only)
        write set -- variables that are written (and possibly read)

        independence:  the write set of each process is disjoint from
           the read and write sets of other processes
           [NB:  the def. is wrong in the first printing of the book!]
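The independence test can be checked mechanically from the read and write sets. A small sketch (in Python rather than SR; the variable names are invented for illustration):

```python
# Sketch: check the independence condition above.
# Two processes are independent when the write set of each is
# disjoint from the read and write sets of the other.

def independent(rw_a, rw_b):
    """rw_a and rw_b are (read_set, write_set) pairs."""
    read_a, write_a = rw_a
    read_b, write_b = rw_b
    return (write_a.isdisjoint(read_b | write_b) and
            write_b.isdisjoint(read_a | write_a))

# p1 and p2 only share read-only variable "a": independent.
p1 = ({"a", "b"}, {"x"})
p2 = ({"a", "c"}, {"y"})
print(independent(p1, p2))   # True

# p3 writes a variable p1 reads: not independent.
p3 = ({"c"}, {"a"})
print(independent(p1, p3))   # False
```

Note that shared read-only variables are allowed; only writes must be private.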

   granularity:  coarse -- 1 process per processor
                 fine -- 1 process per independent part (action)

   style:  iterative -- data parallelism
           recursive -- recursive parallelism
           distinct  -- task parallelism

   activation method:  static -- process declaration, once
                       dynamic -- co statement, possibly many times

   examples of programs we have seen:
      matrix mult. -- fine, iterative, static
      quicksort and quadrature -- fine, recursive, dynamic
      program 1 assignment and today -- coarse, distinct (task), static or dynamic


Example -- Find Problem (Section 2.2)

   goal:  find all instances of pattern in filename (simple Unix grep)

   sequential algorithm:
      read a line
      do not EOF ->
         look for pattern and write line
         read next line
      od
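The sequential loop above, transliterated into Python as a sketch (the guarded do-od becomes a while loop; input comes from any file-like object):

```python
# Sequential find: read a line, then look for the pattern,
# repeated until EOF.  Returns (and prints) the matching lines.
import sys
import io

def find(pattern, f):
    matches = []
    line = f.readline()          # read a line
    while line:                  # do not EOF ->
        if pattern in line:      #    look for pattern
            matches.append(line) #    ... and write line
            sys.stdout.write(line)
        line = f.readline()      #    read next line
    return matches               # od

find("par", io.StringIO("a par line\nno match\nanother par\n"))
```

Note that the look and the read are strictly alternated here; the parallel versions below overlap them.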

   what can be parallelized?
      ask class for ideas
      most feasible approach is to do look and read in parallel

   what is required for independence?
      two or more buffers (or one at a time access to one buffer)
         this is mutual exclusion
      take turns using buffers
         this is condition synchronization:
            read into empty buffer; find pattern in full buffer; then swap
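The two kinds of synchronization can be sketched with Python threading primitives (SR is the course language; this is only an illustration, with invented names):

```python
# Mutual exclusion vs. condition synchronization, in miniature.
import threading

buffer_lock = threading.Lock()   # mutual exclusion: one-at-a-time
                                 # access to the shared buffer
buffer_full = threading.Event()  # condition synchronization: Find
                                 # waits until Input signals "full"
shared = []

def input_proc():
    with buffer_lock:            # enter critical section
        shared.append("a line")  # read into the (empty) buffer
    buffer_full.set()            # signal: buffer is now full

def find_proc(results):
    buffer_full.wait()           # wait for the condition "full"
    with buffer_lock:
        results.append(shared.pop())  # consume the full buffer

results = []
t1 = threading.Thread(target=input_proc)
t2 = threading.Thread(target=find_proc, args=(results,))
t2.start(); t1.start()
t1.join(); t2.join()
print(results)                   # ['a line']
```

The lock keeps accesses from overlapping; the event orders them (Find cannot run before Input has filled the buffer).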


Algorithm Styles

   (a)  "co inside do"  -- dynamic processes

         read first line
         do not EOF ->
            co find // read oc
         od

    (b)  "do inside co" -- static processes

         process Input            this is equivalent to
            ...                      co Input() // Find() oc
         end                      executed once in "main" thread
         process Find
            ...
         end


SR Program for style (a) -- transparency

   Go over the code; it illustrates style (a) as well as several
   additional aspects of SR:  strings, file I/O, global components, timing


SR Program for style (b)

   (i)  one buffer: copy lines from Input to buffer and from buffer to Find

   (ii) double buffering:  two buffers, then swap roles
          can access them in place as long as the processes take turns

   Approach (ii) is more efficient, so that is what we will do

   process Input                      process Find
      ...                                ...
      do true ->                         do true ->
        SYNCH                               SYNCH
        read into buffer                    look at buffer
        SYNCH                               if EOF -> exit fi
        if EOF -> exit fi                   SYNCH
      od                                 od

   SYNCH marks the synchronization points; we will start looking at how
   to implement them next time; you are to figure out the details for
   your first program


Summary of Issues for the above program

   (1) synchronization     alternate access
   (2) efficiency          avoid copying lines by using two buffers
   (3) termination         need extra communication between Input and Find
   (4) generality          bounded buffers (Unix pipes)

   we'll soon see how to handle all of these
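As a preview of issue (4), a bounded buffer generalizes the two-buffer scheme. A sketch using Python's queue.Queue, which provides the mutual exclusion and condition synchronization internally, much as a Unix pipe does; None again handles termination (issue 3):

```python
# A bounded buffer between Input and Find, analogous to a Unix pipe.
import threading, queue, io

buf = queue.Queue(maxsize=2)    # bounded: at most 2 lines in flight

def input_proc(f):
    for line in f:
        buf.put(line)           # blocks when the buffer is full
    buf.put(None)               # termination signal for Find

def find_proc(pattern, matches):
    while True:
        line = buf.get()        # blocks when the buffer is empty
        if line is None:
            break
        if pattern in line:
            matches.append(line)

matches = []
f = io.StringIO("x par\ny\nz par\n")
t1 = threading.Thread(target=input_proc, args=(f,))
t2 = threading.Thread(target=find_proc, args=("par", matches))
t1.start(); t2.start(); t1.join(); t2.join()
print(matches)                  # ['x par\n', 'z par\n']
```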