CSc 520 - Principles of Programming Languages
10 : Memory Management -- GC -- Generational Collection
Christian Collberg
Department of Computer Science
University of Arizona
- Works best for functional
and logic languages (LISP, Prolog, ML, ...) because
- they rarely modify allocated cells
- newly created objects only
point to older objects ((CONS A B)
creates a new two-pointer
cell with pointers to old objects),
- new cells are shorter lived
than older cells, and old objects are
unlikely to die anytime soon.
- Generational Collection therefore
- divides the heap into generations,
1#1 is the youngest, 2#2 the oldest.
- allocates new objects in 1#1.
- GC's only newer generations.
- We have to keep track of back pointers (from
old generations to new).
Functional Language:
3#3
- A new object (created at time 4#4)
points to older objects.
Object Oriented Language:
5#5
- A new object (created at time 6#6)
is inserted into an older object,
which then points to the news object.
GenerationalCollection1
GenerationalCollection2
- Since old objects
(in
7#7) are rarely changed (to
point to new objects) they are
unlikely to point into 1#1.
- Apply the GC only to the youngest
generation (1#1), since it is most likely
to contain a lot of garbage.
- Use the stack and globals as roots.
- There might be some back pointers,
pointing from an older generation into
1#1. Maintain a special set of such pointers,
and use them as roots.
- Occasionally GC older (8#8)
generations.
- Use either mark-and-sweep or copying collection
to GC 1#1.
Remembering Back Pointers
Remembered List
After each pointer update
x.f := 9#9, the
compiler adds code to insert x in a
list of updated memory locations:
10#10
Remembered Set
As above, but set a bit
in the updated object so that it is
inserted only once in the list:
11#11
Card marking
- Divide the heap into ``cards''
of size 12#12.
- Keep an array dirty of
bits, indexed by card number.
- After a pointer update
x13#13.f := 9#9, set
the dirty bit for card c that x is on:
14#14
Page marking I
- Similar to Card marking, but let the
cards be virtual memory pages.
- When x is updated the VM system automatically
sets the dirty bit of the page that x is on.
- We don't have to insert any extra code!
Page marking II
- The OS may not let us read the VM system's
dirty bits.
- Instead, we write-protect the page x is
on.
- On an update x13#13.f := 9#9
a protection fault is generated.
We catch this fault and set a dirty bit manually.
- We don't have to insert any extra code!
Cost of Garbage Collection
- The size of the heap is 15#15, the amount of reachable
memory is 16#16, the amount of memory reclaimed is
17#17.
GCCost1
18#18
GCCost-Generational
- Assume the youngest generation (1#1) has 10% live data,
i.e. 19#19.
- Assume we're using copying collection for 1#1.
20#20
GCCost-Generational
20#20
- If 21#21 kilobytes in 1#1, then 22#22 megabyte.
- In other words, we've wasted about 900 kilobytes, to get 2.5
instruction/word GC cost (for 1#1).
Christian Collberg
2008-04-29