COMP 311 Principles of Programming Languages (Fall 2004)

Section 20: Garbage Collection

LC With a Heap

In order to talk about garbage collection, we first need a memory model. We define the heap as a tuple containing the list of values in the heap and related information (we'll get to that later). We pass the heap as a parameter into the interpreter. The interpreter now returns a memory location along with the new state of the heap. This avoids complicating our interpreter with destructive operations to maintain the heap. We use big-step environment semantics.

For the sake of realism, we store all closures as well as their environments on the heap. Consider the following semantics on a memory cell:

type pointer = Lo of int

type cell =
    M_St of pointer * pointer
  | M_Cn of int
  | M_Lm of pointer * debruijn
  | M_En of pointer * pointer
  | M_Nu

We need a deref operator to translate a memory location into its contents. We also need an allocate operator, which adds a memory cell to the heap, extending the heap by one.

(* deref: location -> (int * pointer * heap) -> cell *)

let deref l (s,t,h) =
  match l with
      Lo i -> (List.nth h (s-i))
    | _ -> raise (Error "Illegal reference")

(* alloc: debValue -> (int * pointer * heap) -> (location * (int * pointer * heap)) *)

let alloc v (s,t,h) =
(* let (s,t,h) = gc (s,t,h) in *)
  (Lo(s+1),(s+1,t,v::h))

We also need an additional helper function to push stack frames onto the call stack, which is stored in the heap.

let push l (s,t,h) =
  let (l',(s',_,h')) = alloc (M_St(l,t)) (s,t,h) in
    (s',l',h')

This is where the "related information" mentioned earlier begins to play a role. The heap is a 3-tuple. The first item is the total size of the heap. The second is the top stack frame of the call stack. A stack frame (type M_St) consists of a pointer to the environment of the current function call and pointer to the stack frame beneath this one. This is necessary to keep the garbage collector from collecting the environments of earlier function calls. The final element in the tuple is the actual heap. At this point, we can examine the evaluator itself:

let rec evH env e (s,t,h) =
  match e with
      Cn i -> alloc (M_Cn i) (s,t,h)
    | Vr 0 -> let env = (deref env (s,t,h)) in
	(match env with 
	     M_En(v,_) -> (v,(s,t,h))
	   | _ -> raise (Error "Unbound variable"))
    | Vr n -> let env = (deref env (s,t,h)) in
	(match env with
	     M_En(_,vs) -> evH vs (Vr (n-1)) (s,t,h)
	   | _ -> raise (Error "Unbound variable"))
    | Lm e' -> alloc (M_Lm (env, e')) (s,t,h)
    | Ap (e1,e2) -> let (l',(s',_,h')) = evH env e1 (s,t,h) in
	(match (deref l' (s',t,h')) with
	     (M_Lm (env',e)) ->
               let (l'',(s'',_,h'')) = evH env e2 (s',t,h') in
	       let (env''',(s''',_,h''')) = alloc (M_En(l'',env')) (s'',t,h'') in
		 evH env''' e (push env''' (s''',t,h'''))
	   | _ -> raise (Error "Applying non-function"))

let ev8 e = let (l,(s,t,h)) = evH (Lo 1) e (1, Lo 0, [M_Nu; M_Nu]) in
  deref l (s,t,h)

We now have a memory model that we can use for discussing garbage collection.

Side Note: We don't necessarily need to keep the call stack in the heap if we guarantee that execution will never return to a function once it has called another function. How might we guarantee this? By converting to CPS first.

Caveat: This interpreter is also somewhat cheating. While temporary values (in particular anonymous closures) are stored in the heap, there is no record of them being stored anywhere. This would be a problem if you tried to garbage collect this heap at an inopportune time. CPS also solves this problem; how else might you choose to fix this problem?

Reference Counting

The simplest kind of memory management scheme is reference counting. In reference counting, a mutable integer (the reference count) is attached to each allocated memory cell. If anyone dereferences a cell, the reference count is incremented. If anyone deletes a reference to the cell, the reference count is decremented. If the reference count is ever zero, the cell can be deleted (i.e. cells with reference counts equal to zero can be treated as free cells).

type cellRC =
    M_St of pointer * pointer * ref int
  | M_Cn of int * ref int
  | M_Lm of pointer * debruijn * ref int
  | M_En of pointer * pointer * ref int
  | M_Nu * ref int

Which reference counting is a good solution for many applications, there are some problems:

Application-specific: In many cases, reference counting must be specifically tailored for a particular application. For example, reference counting schemes are generally coded into operating systems and used effectively to maintain internal data structures. Implementing reference counting as a general-purpose memory management mechanism for a language is another matter entirely.
Overhead: Reference counters must be updated whenever a pointer is set; either a counter has to be incremented, one has to be decremented, or both.
Cannot handle cycles: This is the big one. Suppose node a has a pointer to node b which has a pointer to node a. Will either of these be marked as garbage? Could they be garbage?

Tracing Garbage-Collection

The most common garbage collectors are tracing garbage collectors. That is, they trace through the heap following references from one memory cell to the next to determine which memory is live and which is dead.

One critical issue that must be addressed is where to start. That is, suppose the heap consists exclusively of two memory cells with no references to either. Which one is live? Are they both live? Are either of them alive?

To solve this problem, recall that the type for the language's memory model contained a M_St cell. The M_St cell is the stack frame of the current function call, and contains a pointer to the current environment and a pointer to the previous function call's stack frame (and consequently its environment). And so on. Because all state is kept in the environment (except as noted in the caveat above), this means that all live state in heap is reachable from the top stack frame. This is where we start.

Mark-and-Sweep Collector

The most basic tracing garbage collector is the mark-and-sweep collector. The mark and sweep collector performs garbage collection in two phases. In the first, the collector iterates over the pointer graph within the heap; as a cell is reached, it is marked as live. This necessitates adding a boolean reference to the tuple of each memory cell

In the second phase, we iterate over the entire heap (ie all cells, live, dead, and empty). If a cell is marked live, we unmark it (so it can be collected later if necessary). If the cell is not marked live, we delete it. We cannot simply remove the item from the heap, however; since the heap is represented as a list, removing an item would change the "memory location" of all subsequent memory cells. Instead, we replace the cell with a M_Em, or empty cell.

Each empty cell contains a pointer to the next empty cell in the heap. The system keeps track of where the first empty cell. This results in a linked list of empty cells in the heap. This list is called the free list. In memory allocation, the first item of the free list is allocated and the free list pointer is updated to reflect the new state of the free list. At program initialization, the entire heap is in the free list.

This type declaration expresses the changes required to the memory model to implement a mark-and-sweep collector:

type cellMS =
    M_St of pointer * pointer * ref bool
  | M_Cn of int * ref bool
  | M_Lm of pointer * debruijn * ref bool
  | M_En of pointer * pointer * ref bool
  | M_Nu * ref bool
  | M_Em of pointer

Mark-and-sweep collectors can result in a highly fragmented heap. That is, empty cells might be interspersed with live cells in no apparent pattern. This is a problem if the language supports non-uniform memory allocation. That is, if a program attempts to allocate a very large data structure, the allocator must search the free list for a contiguous block of memory to return (if such a contiguous block even exists). Mark-and-sweep collectors must also search the entire heap, even if very little memory is live.

Copying Collector

Copying collectors (also known as Cheney collectors, after the researcher who devised the technique) solve both of these problems by splitting the heap. Memory is only ever allocated in one half of the heap. The garbage collection iterates over the graph of live cells, copying the live cells to the other half of the heap.

The one issue is, what to do if garbage collection reaches a cell which has already been migrated over. To solve this, when a cell has been migrated over, the old cell is replaced with a M_Fo, which contains a forwarding pointer to the location in the new heap. These forwarding pointers ensure that no links are lost during garbage collection.

This type declaration expresses the changes required to the memory model to implement a copying collector:

type cellCP =
    M_St of pointer * pointer
  | M_Cn of int
  | M_Lm of pointer * debruijn
  | M_En of pointer * pointer
  | M_Nu
  | M_Fo of pointer

Generational Collector

One issue with both of these garbage collectors is that they require traversing the entire memory usage graph every time garbage collection is run. In the case of copying collectors in particular, a long-lived data structure will be copied back and forth from one heap to the other countless times.

Generational collectors are a meta-class of garbage collectors that solve this problem by providing multiple generations of heaps. Memory is initially allocated in the lowest level of heaps, called the nursery. As the memory survives multiple garbage collections, it graduates, and moves up to the mature heap. The nursery tends to be smaller than the mature heap, so gets garbage collected fairly often. However, since short-lived temporaries rarely graduate to the mature heap, the mature heap needs to be collected far less frequently. Note that the garbage collection techniques used to collect the nursery and mature heap are independent choices.

At this point we must revisit the issue of starting points. When garbage collecting the nursery, we don't want to devote any effort to examining the mature heap. However, pointers from the mature heap to the nursery should keep the nursery item live. We can solve this by maintaining a list of pointers from the mature heap into the nursery, and use this list as an additional starting point for garbage collection.

Garbage Collection Esoterica

Concurrency Issues in Garbage Collection

One issue that has not been addressed at all in class is concurrency (multiple threads apparently (or actually) executing at the same time). Concurrency is an especially critical issue with respect to garbage collection. In particular, the traditional view of garbage collection is that nothing else can be happening in parallel with the garbage collection. How might we loosen this restriction?

Conservative Garbage Collection

We've talked about garbage collection for LC. Garbage collection is standard in languages such as Java, O'Caml, Scheme, Perl, and Python, and garbage collection libraries exist for C.

C??

How To Eat Your Memory and Have It, Too

The rest of these notes take a different approach towards describing garbage collection.

From our previous lecture, we can see that our machine requires five registers:

=M=: the program text
=env=: the lexical context [variable-value pairs]
=k=: the control context [list of frames]
=val=: the result value from evaluating the contents of =M=
=param-val=: the value of a function's parameter

=M= is a pointer into the program code. =k= holds a stack, which can be implemented as a pointer into a separate array. The other three are registers that (may) directly point to allocate data structures such as closures and lists.

Let us name the following expressions

M1 = (lambda (x) (+ x 3))
M2 = (lambda (f) (+ (f 7) 4))
M3 = (lambda (z) (- z y))

and consider the evaluation of

(M1 (M2 (let (y 2) M3)))

We will study the evaluation of this expression by looking at ``snapshots'' of the machine at various stages.

Snapshot 1:

We have evaluated M1 and are in the process of evaluating the argument to the resulting closure.
=k= = [appR -> <M1,empty>] =env= = empty =val= = <M1,empty>

where =val= shares its contents with =k=.

Snapshot 2:

We have evaluated the left and right terms from Snapshot 1, and are about to apply the closure formed from M2.
=k= = [appR -> <M2,empty> , appR -> <M1,empty>] =env= = empty =val= = <M3,[<y,2>]>

Snapshot 3:

We are just done evaluating the subtraction inside M3, which is bound to f.
=k= = [+R -> 4 , appR -> <M1,empty>] =env= = [<z,7> , <y,2>] =val= = 5

Note that we have opened up the environment of the closure bound to f in showing the value of =env=.

Snapshot 4:

We are in the midst of the addition inside the closure <M1,empty>; the x has just been evaluated.
=k= = [+R -> 3] =env= = [<x,9>] =val= = 9

However, recall that there are several old fragments of environment still to be found in memory, such as [<z,7> , <y,2>] from Snapshot 3.

If we look carefully in the final step, there are many items that were formerly in the environment that are unnecessary for the remaining evaluation. However, these unnecessary items are still present in memory and could potentially cause our program to exhaust available memory before finishing its task. Hence, we should try to recycle such unnecessary memory. Id est:

Memory is a forest, rooted in registers.
As the computation progresses, some portions of it become unreachable.
Therefore, memory is reusable.

Assume we divide up available memory into two halves, called ``memory 1'' and ``memory 2''. Say we begin by allocating in memory 1, and hit the boundary. Then we can switch our current half to memory 2, copy the tree of reachable memory from the memory 1 into memory 2, and proceed with the computation. This copying is done by picking a register, each one in turn, and walking pointers into memory until we hit a cons cell; we copy this into the new memory 2, and repeat the procedure along each component of the cell. The process is repeated when memory 2 is exhausted, switching the rôles of the two parts.

This method might make intuitive sense, but what if we have sharing in our language? In LC, we currently have no way of checking sharing constraints (as with eq? in Scheme), but it is reasonable to assume we might be called upon to do so. In addition, if we duplicated objects, we would in fact use more space in the new half than in the old one, which would ruin the purpose of our attempt at recycling memory. To prevent this, when we visit a cell, we have to indicate that it has been forwarded; then, if it is visited again, the appropriate sharing relationship can be mimicked in the new half.

Thus, with the help of this process, which is called garbage collection, if the two memory banks are of equal size, and if there are indeed unreachable objects in the exhausted space, then we will have space left over in the new bank, and we can proceed with our allocation. However, there are two problems:

What if everything is reachable? Then we are forced to signal an error and halt computation. (Note that this doesn't mean there aren't ``unusable'' objects in memory, just that our notion of reachability isn't strong enough to distinguish these objects. The objects that are truly necessary are said to be live.)
The collector itself needs space for recursion and computations. We know we can get rid of recursion using CPS, which also tells us how many registers we need (which is fixed). The remaining variable is the depth of the stack, but this is proportional to the depth of the tree being copied. Using these insights, it is possible to write the collector that uses a small, fixed amount of additional memory.

A simple model of the garbage collector might look like this:

(define gc
  (lambda (ptr)
    (cond
      ((null? ptr) null)
      ((cons? ptr) (cons_mem1 (gc (car_mem2 ptr))
		     (gc (cdr_mem2 ptr)))))))

but this loses sharing. So we have to break cons_mem1 up into its two constituent parts: allocation and initialization.

((cons? ptr)
  (let ((new (alloc ...)))
    (make-as-forwarded ptr)
    (init_mem1 new (gc (car_mem2 ptr)) (gc (cdr_mem2 ptr)))
    new))

However, this still doesn't check for forwarding. A simple modification takes care of that:

((cons? ptr)
  (if (forwarded? ptr)
    ...
    (let ((new (alloc ...)))
      (make-the-orange-thing)
      (init_mem1 new (gc (car_mem2 ptr)) (gc (cdr_mem2 ptr)))
      new)))

Perspective

In summary, the traditional view of garbage collection is roughly as follows:

Reachability corresponds to liveness.
Non-reachable memory can be garbage collected.
The registers are the roots from which to perform the sweep.

Recently, a new view of garbage collection has been emerging. In this view,

Every program evaluation state in the register machine corresponds to memory, registers and the program text.
It is impossible to decide whether any given cell in some machine state is live or dead.
Every algorithm that conservatively identifies live cells (ie, does not mistakenly claim some cell to be dead when it is useful) is a garbage collection algorithm.

The new view of gc has given rise to new gc algorithms. The new algorithms reconstruct the types of all phrases, including memory cells, at run-time and use type information to determine which cells are live or dead. For example, if an implicitly polymorphic system is used and a cell has type alpha, the program evaluation will work for all possible values in that cell. In particular, it will work if the cell's content is replaced by the null pointer. Doing so frees all memory that the cell (may) point to.

The new view is logical: it distinguishes between truly live and provably live cells, between truth and provability. As always, the latter is an approximation of the former. It is another indication of how tightly logic and computation are intertwined.

Back to course website