Lecture 12
Review -- Barriers and Data Parallel Algorithms
synchronous execution -- SIMD machines
every processor does the same thing at the same time; hardware barriers
asynchronous execution -- MIMD machines
divide up data
use SPMD (single program, multiple data) programming style: an array of identical workers
implement barriers in software (or use library)
note: barriers were NOT a part of the original Pthreads library (newer POSIX versions add pthread_barrier_wait, but not every system provides it)
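A minimal sketch of a software barrier with SPMD-style workers. Python is used only for brevity; the names CounterBarrier and run_workers are made up for this illustration and do not come from the lecture or the textbook. Every worker runs the same function on its own share of the data, and nobody reads the partial sums until all workers have reached the barrier.

```python
import threading


class CounterBarrier:
    """Reusable counter barrier built from a condition variable (illustrative)."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.count = 0        # workers that have arrived in this round
        self.generation = 0   # round number; lets the barrier be reused safely
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            my_generation = self.generation
            self.count += 1
            if self.count == self.num_workers:
                # Last arrival: start a new round and release everyone.
                self.generation += 1
                self.count = 0
                self.cond.notify_all()
            else:
                while my_generation == self.generation:
                    self.cond.wait()


def run_workers(data, num_workers):
    """Each worker sums its stripe of data, then all workers read every partial sum."""
    barrier = CounterBarrier(num_workers)
    partial = [0] * num_workers
    totals = [0] * num_workers

    def worker(w):
        partial[w] = sum(data[w::num_workers])  # divide up the data
        barrier.wait()                          # all partial sums are now written
        totals[w] = sum(partial)

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

The "or use library" option in Python would be threading.Barrier, which offers the same wait() operation.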
Parallel Scientific Computing (intro to Part 3)
goal: speedup on LARGE problems (or solve an even larger problem)
speedup: T1 / Tp for p processors
start with a good algorithm and optimized sequential code
the challenge for a parallel program is to minimize overheads:
create processes once
have good load balancing
minimize the need for synchronization and use efficient algorithms
for critical sections and barriers
Scientific Modeling (Chapter 11)
simulation of phenomena
start with a model
step through time: for [t = start to finish] {
  compute
  BARRIER
  update
  BARRIER
}
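The compute/BARRIER/update/BARRIER loop can be sketched as follows. This is only an illustration under assumed details: the "model" is a row of values with fixed endpoints where each interior point moves to the average of its two neighbors, and the function name simulate is hypothetical. The first barrier keeps any worker from overwriting values that another worker is still reading; the second keeps any worker from starting the next step early.

```python
import threading


def simulate(values, num_steps, num_workers):
    """Time-stepped simulation with one barrier after each phase (illustrative)."""
    n = len(values)
    cur = list(values)   # state at the current time step
    new = list(values)   # scratch space for the next time step
    barrier = threading.Barrier(num_workers)

    def worker(w):
        # Interior points only; the two endpoints stay fixed.
        mine = range(1 + w, n - 1, num_workers)
        for _ in range(num_steps):
            for i in mine:                       # compute: read cur, write new
                new[i] = (cur[i - 1] + cur[i + 1]) / 2.0
            barrier.wait()
            for i in mine:                       # update: copy new into cur
                cur[i] = new[i]
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cur
```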
models
PDEs -- grid computations
particles/bodies -- particle computations (n-body problems)
linear equations -- matrix computations
some models are deterministic; some are stochastic
Grid Computations (Section 11.1)
applications: weather, fluid (air) flow, plasma physics, etc.
example: Laplace's equation
diagram for grid (mesh) of points
boundary -- constant above
interior -- compute steady state
in practice: changing boundaries, multiple attributes
(e.g., think about weather modeling)
parallelization -- the idea of domain decomposition:
divide area into blocks or strips of points; assign a worker to each
iterative techniques (slow to fast)
Jacobi iteration
Gauss-Seidel
SOR (red/black)
multigrid
[This year I gave a BRIEF overview of the ideas; in many years I say more
and have the students implement some of these for their parallel project.]
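A minimal sequential sketch of Jacobi iteration for Laplace's equation, the slowest of the techniques listed above. Assumptions made for the illustration: the grid is a list of lists whose outer ring holds the constant boundary values, iteration stops when the largest change falls below eps, and the names jacobi, eps, and max_iters are hypothetical. Each new interior value is the average of its four neighbors in the old grid.

```python
def jacobi(grid, eps=1e-4, max_iters=10000):
    """Return (steady-state grid, iterations used); boundary values stay fixed."""
    n_rows, n_cols = len(grid), len(grid[0])
    cur = [row[:] for row in grid]
    new = [row[:] for row in grid]   # also carries a copy of the boundary
    for iteration in range(1, max_iters + 1):
        max_diff = 0.0
        for i in range(1, n_rows - 1):
            for j in range(1, n_cols - 1):
                new[i][j] = (cur[i - 1][j] + cur[i + 1][j] +
                             cur[i][j - 1] + cur[i][j + 1]) / 4.0
                max_diff = max(max_diff, abs(new[i][j] - cur[i][j]))
        cur, new = new, cur          # swap roles instead of copying
        if max_diff < eps:
            return cur, iteration
    return cur, max_iters
```

Because every new value depends only on old values, the two inner loops can be split into strips or blocks of points, one per worker, with a barrier after each sweep.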
Particle Computations (Section 11.2)
applications: particle interactions due to chemical bonding,
gravity, electrical charge, fluid flow, etc.
gravitational N-body problem (your programming project)
initialize bodies
for time step {
  calculate forces
  move bodies
}
body: position p (x, y, z)
      velocity v (x, y, z)
      mass     m constant
      force    f (x, y, z)
force:
F = (G * m[i] * m[j]) / r**2, where r is the distance between bodies i and j
direction: vector from one body to the other
magnitude: symmetric ("equal and opposite")
force on a body is the vector sum of the forces from all other bodies
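A small sketch of the force calculation for one body, assuming positions are 3-element lists and no two bodies share a position (r would be zero). The function name force_on and the parameter g are illustrative choices. The magnitude comes from the formula above, and the direction is the unit vector from body i toward body j.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def force_on(i, pos, mass, g=G):
    """Vector sum of the gravitational forces on body i from all other bodies."""
    total = [0.0, 0.0, 0.0]
    for j in range(len(pos)):
        if j == i:
            continue
        d = [pos[j][k] - pos[i][k] for k in range(3)]   # vector from i to j
        r = math.sqrt(sum(c * c for c in d))            # distance between them
        magnitude = g * mass[i] * mass[j] / (r * r)
        for k in range(3):
            total[k] += magnitude * d[k] / r            # magnitude * unit vector
    return total
```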
change in position and velocity (moving a body) -- leapfrog scheme
F = m * a, so a = F / m
dv = a * DT
dp = v * DT + (a/2) * DT**2   [approximation]
   = (v + dv/2) * DT
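The move step can be written out directly from the formulas above. This is a sketch with a hypothetical function name (move_body) and with vectors stored as 3-element lists; it returns new lists rather than updating in place.

```python
def move_body(p, v, f, m, dt):
    """Advance one body by one time step; returns (new position, new velocity)."""
    a = [fk / m for fk in f]                                 # a = F / m
    dv = [ak * dt for ak in a]                               # dv = a * DT
    dp = [(vk + dvk / 2.0) * dt for vk, dvk in zip(v, dv)]   # dp = (v + dv/2) * DT
    new_v = [vk + dvk for vk, dvk in zip(v, dv)]
    new_p = [pk + dpk for pk, dpk in zip(p, dp)]
    return new_p, new_v
```

For example, with m = 2, f = (4, 0, 0), v = (1, 0, 0), and DT = 0.5: a = 2, dv = 1, and dp = (1 + 0.5) * 0.5 = 0.75 in the x direction.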
N-Body Programs
transparency for Figure 11.9 -- sequential program
parallelization -- divide bodies among workers
use a barrier after each phase
basic algorithm: compute the force for every pair of bodies (each pair twice, once per body)
sketch the basic idea
minimizing overheads:
create workers once
compute forces into "private" arrays; add them when ready to move bodies
in other words, avoid critical sections by:
calculate phase: read p, m; compute f
move phase: read f, m; compute new p and v
load balancing: blocks, stripes, reverse stripes -- see Figure 11.10
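A sketch of the three assignment patterns, to show why the choice matters. Two assumptions are made here and are not taken from Figure 11.10 itself: the reverse-stripe pattern is read as dealing bodies out 0..P-1 and then P-1..0, and the cost of body i is taken to be n-1-i pair computations (as in a loop where body i handles only pairs (i, j) with j > i). All function names are illustrative.

```python
def block_owner(i, n, num_workers):
    """Blocks: contiguous ranges of bodies per worker."""
    return i * num_workers // n


def stripe_owner(i, num_workers):
    """Stripes: deal bodies out round-robin."""
    return i % num_workers


def reverse_stripe_owner(i, num_workers):
    """Reverse stripes: deal out 0..P-1, then P-1..0, and repeat."""
    pos = i % (2 * num_workers)
    return pos if pos < num_workers else 2 * num_workers - 1 - pos


def work_per_worker(owners, num_workers):
    """Total pair computations per worker when body i costs n-1-i."""
    n = len(owners)
    work = [0] * num_workers
    for i in range(n):
        work[owners[i]] += n - 1 - i
    return work
```

For 8 bodies and 2 workers under these assumptions, the work splits 22/6 with blocks, 16/12 with stripes, and 14/14 with reverse stripes.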
better algorithms (see Section 11.2)
exploit symmetry -- 1/2 the work of the basic algorithm, but still O(n**2)
Barnes-Hut -- hierarchical, O(n log n)
Fast Multipole Method -- also hierarchical and O(n log n), but can be better (O(n) when bodies are uniformly distributed)
[Note: I'll be assigning your programming project on this next class.]
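A sketch of the calculate-forces phase that combines two ideas from this section: exploiting symmetry and accumulating into private arrays. It is not the program from Figure 11.9; the names compute_forces and pair_force, the stripe assignment, and the unit choice G = 1.0 are assumptions made for the illustration. With symmetry, the worker that handles pair (i, j) updates the force on both bodies, so each worker writes only to its own array and no critical section is needed.

```python
import threading

G = 1.0  # assumed units so the numbers stay readable


def pair_force(p_i, p_j, m_i, m_j):
    """Force on body i due to body j (the force on j is the negation)."""
    d = [b - a for a, b in zip(p_i, p_j)]
    r2 = sum(c * c for c in d)
    r = r2 ** 0.5
    magnitude = G * m_i * m_j / r2
    return [magnitude * c / r for c in d]


def compute_forces(pos, mass, num_workers):
    n = len(pos)
    # private[w][i] is worker w's contribution to the force on body i.
    private = [[[0.0] * 3 for _ in range(n)] for _ in range(num_workers)]
    total = [[0.0] * 3 for _ in range(n)]
    barrier = threading.Barrier(num_workers)

    def worker(w):
        mine = private[w]
        for i in range(w, n, num_workers):        # stripes of bodies
            for j in range(i + 1, n):             # each pair is visited once
                f = pair_force(pos[i], pos[j], mass[i], mass[j])
                for k in range(3):
                    mine[i][k] += f[k]            # force on i
                    mine[j][k] -= f[k]            # equal and opposite on j
        barrier.wait()                            # all private arrays are complete
        for i in range(w, n, num_workers):        # add them up for my bodies
            for k in range(3):
                total[i][k] = sum(private[v][i][k] for v in range(num_workers))

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

The private arrays cost extra memory (one force array per worker), which is the price paid for avoiding locks in the inner loop.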
Matrix Computations (Section 11.3) [very brief mention]
take a look if you are interested
arise in things like optimization problems -- e.g., modeling the stock
market or the economy