Section 20: Garbage Collection
In order to talk about garbage collection, we first need a memory model. We define the heap as a tuple containing the list of values in the heap and related information (we'll get to that later). We pass the heap as a parameter into the interpreter. The interpreter now returns a memory location along with the new state of the heap. This avoids complicating our interpreter with destructive operations to maintain the heap. We use big-step environment semantics.
For the sake of realism, we store all closures as well as their environments on the heap. Consider the following semantics on a memory cell:
type pointer = Lo of int type cell = M_St of pointer * pointer | M_Cn of int | M_Lm of pointer * debruijn | M_En of pointer * pointer | M_Nu
We need a deref operator to translate a memory location into its contents. We also need an allocate operator, which adds a memory cell to the heap, extending the heap by one.
(* deref: location -> (int * pointer * heap) -> cell *) let deref l (s,t,h) = match l with Lo i -> (List.nth h (s-i)) | _ -> raise (Error "Illegal reference") (* alloc: debValue -> (int * pointer * heap) -> (location * (int * pointer * heap)) *) let alloc v (s,t,h) = (* let (s,t,h) = gc (s,t,h) in *) (Lo(s+1),(s+1,t,v::h))
We also need an additional helper function to push stack frames onto the call stack, which is stored in the heap.
let push l (s,t,h) = let (l',(s',_,h')) = alloc (M_St(l,t)) (s,t,h) in (s',l',h')
This is where the "related information" mentioned earlier begins to play a role. The heap is a 3-tuple. The first item is the total size of the heap. The second is the top stack frame of the call stack. A stack frame (type M_St) consists of a pointer to the environment of the current function call and pointer to the stack frame beneath this one. This is necessary to keep the garbage collector from collecting the environments of earlier function calls. The final element in the tuple is the actual heap. At this point, we can examine the evaluator itself:
let rec evH env e (s,t,h) = match e with Cn i -> alloc (M_Cn i) (s,t,h) | Vr 0 -> let env = (deref env (s,t,h)) in (match env with M_En(v,_) -> (v,(s,t,h)) | _ -> raise (Error "Unbound variable")) | Vr n -> let env = (deref env (s,t,h)) in (match env with M_En(_,vs) -> evH vs (Vr (n-1)) (s,t,h) | _ -> raise (Error "Unbound variable")) | Lm e' -> alloc (M_Lm (env, e')) (s,t,h) | Ap (e1,e2) -> let (l',(s',_,h')) = evH env e1 (s,t,h) in (match (deref l' (s',t,h')) with (M_Lm (env',e)) -> let (l'',(s'',_,h'')) = evH env e2 (s',t,h') in let (env''',(s''',_,h''')) = alloc (M_En(l'',env')) (s'',t,h'') in evH env''' e (push env''' (s''',t,h''')) | _ -> raise (Error "Applying non-function")) let ev8 e = let (l,(s,t,h)) = evH (Lo 1) e (1, Lo 0, [M_Nu; M_Nu]) in deref l (s,t,h)
We now have a memory model that we can use for discussing garbage collection.
|Side Note: We don't necessarily need to keep the call stack in the heap if we guarantee that execution will never return to a function once it has called another function. How might we guarantee this? By converting to CPS first.|
|Caveat: This interpreter is also somewhat cheating. While temporary values (in particular anonymous closures) are stored in the heap, there is no record of them being stored anywhere. This would be a problem if you tried to garbage collect this heap at an inopportune time. CPS also solves this problem; how else might you choose to fix this problem?|
The simplest kind of memory management scheme is reference counting. In reference counting, a mutable integer (the reference count) is attached to each allocated memory cell. If anyone dereferences a cell, the reference count is incremented. If anyone deletes a reference to the cell, the reference count is decremented. If the reference count is ever zero, the cell can be deleted (i.e. cells with reference counts equal to zero can be treated as free cells).
type cellRC = M_St of pointer * pointer * ref int | M_Cn of int * ref int | M_Lm of pointer * debruijn * ref int | M_En of pointer * pointer * ref int | M_Nu * ref int
Which reference counting is a good solution for many applications, there are some problems:
The most common garbage collectors are tracing garbage collectors. That is, they trace through the heap following references from one memory cell to the next to determine which memory is live and which is dead.
One critical issue that must be addressed is where to start. That is, suppose the heap consists exclusively of two memory cells with no references to either. Which one is live? Are they both live? Are either of them alive?
To solve this problem, recall that the type for the language's memory model contained a M_St cell. The M_St cell is the stack frame of the current function call, and contains a pointer to the current environment and a pointer to the previous function call's stack frame (and consequently its environment). And so on. Because all state is kept in the environment (except as noted in the caveat above), this means that all live state in heap is reachable from the top stack frame. This is where we start.
The most basic tracing garbage collector is the mark-and-sweep collector. The mark and sweep collector performs garbage collection in two phases. In the first, the collector iterates over the pointer graph within the heap; as a cell is reached, it is marked as live. This necessitates adding a boolean reference to the tuple of each memory cell
In the second phase, we iterate over the entire heap (ie all cells, live, dead, and empty). If a cell is marked live, we unmark it (so it can be collected later if necessary). If the cell is not marked live, we delete it. We cannot simply remove the item from the heap, however; since the heap is represented as a list, removing an item would change the "memory location" of all subsequent memory cells. Instead, we replace the cell with a M_Em, or empty cell.
Each empty cell contains a pointer to the next empty cell in the heap. The system keeps track of where the first empty cell. This results in a linked list of empty cells in the heap. This list is called the free list. In memory allocation, the first item of the free list is allocated and the free list pointer is updated to reflect the new state of the free list. At program initialization, the entire heap is in the free list.
This type declaration expresses the changes required to the memory model to implement a mark-and-sweep collector:
type cellMS = M_St of pointer * pointer * ref bool | M_Cn of int * ref bool | M_Lm of pointer * debruijn * ref bool | M_En of pointer * pointer * ref bool | M_Nu * ref bool | M_Em of pointer
Mark-and-sweep collectors can result in a highly fragmented heap. That is, empty cells might be interspersed with live cells in no apparent pattern. This is a problem if the language supports non-uniform memory allocation. That is, if a program attempts to allocate a very large data structure, the allocator must search the free list for a contiguous block of memory to return (if such a contiguous block even exists). Mark-and-sweep collectors must also search the entire heap, even if very little memory is live.
Copying collectors (also known as Cheney collectors, after the researcher who devised the technique) solve both of these problems by splitting the heap. Memory is only ever allocated in one half of the heap. The garbage collection iterates over the graph of live cells, copying the live cells to the other half of the heap.
The one issue is, what to do if garbage collection reaches a cell which has already been migrated over. To solve this, when a cell has been migrated over, the old cell is replaced with a M_Fo, which contains a forwarding pointer to the location in the new heap. These forwarding pointers ensure that no links are lost during garbage collection.
This type declaration expresses the changes required to the memory model to implement a copying collector:
type cellCP = M_St of pointer * pointer | M_Cn of int | M_Lm of pointer * debruijn | M_En of pointer * pointer | M_Nu | M_Fo of pointer
One issue with both of these garbage collectors is that they require traversing the entire memory usage graph every time garbage collection is run. In the case of copying collectors in particular, a long-lived data structure will be copied back and forth from one heap to the other countless times.
Generational collectors are a meta-class of garbage collectors that solve this problem by providing multiple generations of heaps. Memory is initially allocated in the lowest level of heaps, called the nursery. As the memory survives multiple garbage collections, it graduates, and moves up to the mature heap. The nursery tends to be smaller than the mature heap, so gets garbage collected fairly often. However, since short-lived temporaries rarely graduate to the mature heap, the mature heap needs to be collected far less frequently. Note that the garbage collection techniques used to collect the nursery and mature heap are independent choices.
At this point we must revisit the issue of starting points. When garbage collecting the nursery, we don't want to devote any effort to examining the mature heap. However, pointers from the mature heap to the nursery should keep the nursery item live. We can solve this by maintaining a list of pointers from the mature heap into the nursery, and use this list as an additional starting point for garbage collection.
One issue that has not been addressed at all in class is concurrency (multiple threads apparently (or actually) executing at the same time). Concurrency is an especially critical issue with respect to garbage collection. In particular, the traditional view of garbage collection is that nothing else can be happening in parallel with the garbage collection. How might we loosen this restriction?
We've talked about garbage collection for LC. Garbage collection is standard in languages such as Java, O'Caml, Scheme, Perl, and Python, and garbage collection libraries exist for C.
The rest of these notes take a different approach towards describing garbage collection.
From our previous lecture, we can see that our machine requires five registers:
=M=: the program text
=env=: the lexical context [variable-value pairs]
=k=: the control context [list of frames]
=val=: the result value from evaluating the contents of
=param-val=: the value of a function's parameter
=M= is a pointer into the program code.
a stack, which can be implemented as a pointer into a separate array. The other
three are registers that (may) directly point to allocate data structures such
as closures and lists.
Let us name the following expressions
M1 = (lambda (x) (+ x 3)) M2 = (lambda (f) (+ (f 7) 4)) M3 = (lambda (z) (- z y))
and consider the evaluation of
(M1 (M2 (let (y 2) M3)))
We will study the evaluation of this expression by looking at ``snapshots'' of the machine at various stages.
We have evaluated
=k= = [appR -> <M1,empty>] =env= = empty =val= = <M1,empty>
We have evaluated the left and right terms from Snapshot 1, and are
about to apply the closure formed from
=k= = [appR -> <M2,empty> , appR -> <M1,empty>] =env= = empty =val= = <M3,[<y,2>]>
We are just done evaluating the subtraction inside
=k= = [+R -> 4 , appR -> <M1,empty>] =env= = [<z,7> , <y,2>] =val= = 5
Note that we have opened up the environment of the closure bound to
We are in the midst of the addition inside the closure
=k= = [+R -> 3] =env= = [<x,9>] =val= = 9
However, recall that there are several old fragments of environment still
to be found in memory, such as
If we look carefully in the final step, there are many items that were formerly in the environment that are unnecessary for the remaining evaluation. However, these unnecessary items are still present in memory and could potentially cause our program to exhaust available memory before finishing its task. Hence, we should try to recycle such unnecessary memory. Id est:
Assume we divide up available memory into two halves, called ``memory 1'' and
``memory 2''. Say we begin by allocating in memory 1, and hit the boundary. Then
we can switch our current half to memory 2, copy the tree of reachable memory
from the memory 1 into memory 2, and proceed with the computation. This copying
is done by picking a register, each one in turn, and walking pointers into memory
until we hit a
cons cell; we copy this into the new memory 2, and
repeat the procedure along each component of the cell. The process is repeated
when memory 2 is exhausted, switching the rôles of the two parts.
This method might make intuitive sense, but what if we have sharing in our
language? In LC, we currently have no way of checking sharing constraints (as
eq? in Scheme), but it is reasonable to assume we might be
called upon to do so. In addition, if we duplicated objects, we would in fact
use more space in the new half than in the old one, which would ruin the purpose
of our attempt at recycling memory. To prevent this, when we visit a cell, we
have to indicate that it has been forwarded; then, if it is visited
again, the appropriate sharing relationship can be mimicked in the new half.
Thus, with the help of this process, which is called garbage collection, if the two memory banks are of equal size, and if there are indeed unreachable objects in the exhausted space, then we will have space left over in the new bank, and we can proceed with our allocation. However, there are two problems:
A simple model of the garbage collector might look like this:
(define gc (lambda (ptr) (cond ((null? ptr) null) ((cons? ptr) (cons_mem1 (gc (car_mem2 ptr)) (gc (cdr_mem2 ptr)))))))
but this loses sharing. So we have to break
cons_mem1 up into its
two constituent parts: allocation and initialization.
((cons? ptr) (let ((new (alloc ...))) (make-as-forwarded ptr) (init_mem1 new (gc (car_mem2 ptr)) (gc (cdr_mem2 ptr))) new))
However, this still doesn't check for forwarding. A simple modification takes care of that:
((cons? ptr) (if (forwarded? ptr) ... (let ((new (alloc ...))) (make-the-orange-thing) (init_mem1 new (gc (car_mem2 ptr)) (gc (cdr_mem2 ptr))) new)))
In summary, the traditional view of garbage collection is roughly as follows:
Recently, a new view of garbage collection has been emerging. In this view,
The new view of gc has given rise to new gc algorithms. The new algorithms reconstruct
the types of all phrases, including memory cells, at run-time and use type information
to determine which cells are live or dead. For example, if an implicitly polymorphic
system is used and a cell has type
alpha, the program evaluation
will work for all possible values in that cell. In particular, it will work if
the cell's content is replaced by the null pointer. Doing so frees all memory
that the cell (may) point to.
The new view is logical: it distinguishes between truly live and provably live cells, between truth and provability. As always, the latter is an approximation of the former. It is another indication of how tightly logic and computation are intertwined.