Lecture 6: Domains Supporting Recursive Definitions

## A Thumbnail Sketch of Computability Theory

For the moment, let us confine ourselves to monolithic programming languages: languages that can only express partial functions mapping the natural numbers N (bitstrings) to the natural numbers. (We could restrict Jam to such a form for example by omitting all operations involving lists and either banning functions as arguments or prohibiting them in top level function definitions.)

Consider any such programming language P. To qualify as a legitmate programming language, the programs must have a "surface syntax" representation consisting of sequences of symbols over a finite alphabet A. Can it express all total functions mapping N to N? The answer is no. We will present two proofs of this fact. The first proof is a simple counting argument. How many different total functions are there mapping N to N? An infinite number. How many programs are there in P? An infinite number. (We will assume programs in P can be arbitrarily long; otherwise, it is obvious that there aren't enough programs to compute all the total functions on the natural numbers.) But infinite sets are not necessarily comparable in size. You are presumably aware that there are far more real numbers than rationals (or integers), while the set of rational numbers and the set of integers are the same size (cardinality). Two infinite sets A and B are the same size if there can be put into one to one correspondence, i.e., if there exists a function f:A->B that is that is one-to-one (for all x,y in A [f(x)=f(y) -> x=y]) and onto (for every z in B, there exists x in A such that f(x)=z).

 Exercise: Show that the rationals and the natural numbers can be put into one-to-one correspondence. Hint: depict the set of rational as a two dimensional table of unbounded width and height and traverse the table along each finite length diagonal.

It is easy to show that the set of all programs P has the same cardinality (has the same size) as the set of natural numbers. Simply sort the programs in P by length and then alphabetize them (using any ordering you choose on the alphabet A for P). Such an enumeration of programs places them in one to one correspondence with the natural numbers N.

Now consider how many functions that there are mapping N to N. We know that this set must be at least as big as the set characteristic functions for all subsets of the natural numbers (the power set of N). The characteristic function fS for subset S is defined by the equation

```fS(n) = if n in S then 1 else 0
```

One of the most basic results of set theory is the fact the the power set of N is larger than N. It is not difficult to prove this fact by a diagonalization argument. We will present the proof for the natural numbers N but it easily generalizes to any infinite set S. Assume that the power set Pow(N) can be put into one-to-one correspondence with the natural numbers. Let S0, S1, S2, ... be an enumeration of N that establishes that correspondence. Then define S' as {i | i not in Si}. In other words, natural number i belongs to S' iff it does not belong to Si. Since S' in a member of Pow(N), it must appear in the enumeration S0, S1, ... somewhere, say Sj. Is j a member of S' = Sj. Assume that j is an element Sj. Then by the definition of S', j is not a member of S', yielding a contradiction. Similarly, if j is not a member of S, then j is not a member of Sj, implying that j is a member of S, again yielding a contradiction. Hence, no such enumeration S0, S1, ... can exist. Now we have all of the machinery at hand required to prove that there are more total functions mapping N to N. The number of functions is at least as large as the power set of N which is strictly larger than N. But the set of programs in P is countable, has the same size as the natural Numbers N. Hence there are more functions than programs.

## A Diagonalization Construction of a Function that is Not Computable

Enumerate the programs of P in order of ascending length and alphabetically among programs of the same length. Let Pi be the partial function (mapping N to N) computed by the ith program in this enumeration. Consider a table listing all the functions Pi. This listing can be visualized as a two dimensional infinite table listing all possible programs on the vertical axis and and all possible inputs (elements of N) to each Pi on the horizontal axis:

```    0  1  2  3 ...
P0  x  x  x  x
P1  x  x  x  x
.
.
.
```

Define P' as the function that returns Pi(i)+1 if Pi(i) terminates and 0 if Pi(i) diverges. If we identify 0 with false and non-zero values with true then P' reports whether or not the computation Pi(i) converges. Can P' be computed by some program Pj in our enumeration of all programs? It is easy to prove that the answer is no by contradiction. Assume j exists such that Pj = P'. Since P' terminates for all inputs, the computation P'(j) terminates. Hence, the computation Pj(j) terminates implying that P'(j) = Pj(j)+1 = P'(j)+1, a contradiction. Hence, P' is not a computable function.

## Computation over General Domains

Most real programs are not monolithic. They incrementally produce outputs in response to incrementally gathering information about outputs. This model of computation is more general than the model on which conventional computability theory is based. It is aplty characterized by the following slogan.

Slogan: Computation is successive approximation

Let us explore what properties domains (sets of data values) and functions on those domains must have to support recursive defintions. In the detour in Lecture 6, we informally showed how to construct the fixed-point of a function mapping sets to sets. In this lecture, we study this issue more rigorously in a general setting. Why does this question matter? It shows the boundaries of what can be expressed computationally. Much of mathematics lies outside the reach of computation.

We begin by reviewing a standard definition from discrete mathematics.

A partial order is a set D together with a relation <= (which we will read as "approximates") on D (a subset of D x D) with the following three properties:

• (reflexivity) For all d in D: d <= d.
• (anti-symmetry) For all d1, d2 in D: d1 <= d2 and d2 <= d1 implies d1 = d2.
• (transitivity) For all d1, d2, d3 in D: d1 <= d2 and d2 <= d3 implies d1 <= d3.

Some examples include:

• 2^N (the set of all subsets of the natural numbers N) under the subset ordering.
• the set of finite alphabetic strings under the prefix ordering.
• the natural numbers 0, 1, ... under the usual <= ordering.

A partial order does not have enough structure to support the general solution of fixed point equations. We need to add the two restrictions: chain-completeness and consistent-completeness. A chain C in a partial order (D,<=) is a (possibly empty) denumerable sequence c1, c2, ..., cn, ... such that

```c1 <= c2 <=  ... <= cn <= ...
```

An upper bound for a chain C in (D,<=) is an element d in D such that ci <= d for all ci in C. A partial order (D,<=) is chain-complete (or simply complete) iff every chain C has a least upper bound. (An upper bound c* for C is least if c* approximates every upper bound for C.) We typically abbreviate the term complete partial order as cpo. Note that chain can be empty, implying that every cpo has a least element. We call this least element "bottom" and typically denote it bot. As we will see, bot corresponds to divergence.

Of the three examples of partial orders given above, only the first one is complete. In the partial order of finite alphabetic strings, a chain of the form

```"a" <= "aa" <= "aaa" <= ...
```

has no upper bound, much less a least upper bound. Similarly, in the partial order (N,<=), the chain

```0 <= 1 <= 2 <= ...
```

has no upper bound. In essence, these spaces are missing the "limit" points of chains of approximations.

The notion of cpo described above is still too broad to characterize potentially domains for incremental computation. The problem is that such domains do not have enough finite approximations to express all possible finite snapsnots of the results of computations. Consider the following cpo (nicknamed "pentagon" or "home-plate")

```      o       o
|\     /|
| \   / |
|  \ /  |
|   X   |
|  / \  |
| /   \ |
|/     \|
A o       o B
\     /
\   /
\ /
o
```

where each circle ("o") represents a domain element. The two elements A and B are labeled because they have an upper bound but no least upper bound. A computation yielding a result in this domain could describe an answer by outputting both A and B since these elements are "consistent", i.e. have an upper bound. But what element of the domain would would be the meaning of such computation if it produces no additional output? There are two distinct upper bounds for {A,B}. We can rule out this problem by forcing the domain D to include least upper bounds for all bounded subsets (subsets such that every member is less than a particular element of D). In essence, this property asserts that given any set of approximations to the answer to a computation, there is unique element of the domain that summarizes all of this approximation information.

The pentagon domaih fails this test because the set {A,B} is bounded but has no least upper bound.

We formalize this property as consistent completeness. A cpo [D,<=] is consistently complete iff every bounded subset S of D has a least upper bound. A subset S of D is bounded iff there exists d in D such that forall s in S, s <= d.

So we assert that a domain for incremental computation is a consistently complete cpo. In fact, we will use the term domain as a synonym for consistently complete cpo.

## Topological Properties of Computable Functions

In the preceding section, we identified the topological (approximation structure) properties that a data domain for incremental computation must satisfy. Now let us examine some of the properties that any computable function on such a domain must satisfy. First, any function performing incremental computation on an input from domain A producing output in domain B must be monotonic: for inputs a1, a2 in A, a1 <= a2 implies f(a1) <= f(a2).

Example: Consider a function f on lazy streams of integers that maps `lazy-cons(1,bottom)` to `1`, `lazy-cons(1,2)` to `0`, and everything else to `bottom`. This function is not monotonic because `bottom` (the meaning of an infinite loop) approximates 2 yet `f(lazy-cons(1,loop()))` does not approximate `f(lazy-cons(1,2))`.

Monotoncity is not the only topological constraint on incrementally computable functions. Consider a function f that on infinitely streams of integers that maps every finite approximation

```lazy-cons(n1, lazy-cons(n2, ... lazy-cons(nk, bottom) ...))
```

to bottom, but maps every infinite stream

```lazy-cons(n1, lazy-cons(n2, ... lazy-cons(nk, ...) ...))
```

to `true`. A computable function cannot produce this behavior because it can never determine that its input is infinite; all it sees are progressively better finite approximations. For this reason, computable functions must be continous: given any chain of approximations a0 <= a1 < = ..., f(lub{ai| i = 0,1, ...}) = lub{ f(ai) | i = 0,1, ...}. In other words, the behavior of a continuous function on an infinite element is simply its cumulative behavior on the finite approximations to the infinite element. Example Consider a function f on lazy streams of integers that maps all elements except

```zeroes = lazy-cons(0,lazy-cons(0, ... (lazy-cons(0, ...))))
```

(the infinite stream of zeroes) to `bottom` and maps zeroes to null.

This function is clearly monotonic since it differs from `bottom` only on a maximal element (`zeroes`). Yet it is not continuous because the function is bottom for every element of the the chain of finite approximations to `zeroes` (finite streams of zeroes ending in `bottom`). By the definition of continuity, the function would have to yield `bottom` on `zeroes` to be continuous at `zeroes`, but it doesn't.

## Common CPO's in Programming Languages

The domain of data values supported by a programming language forms a cpo. Assume that we prohibit functions as arguments and return values in LC. Then the only kinds of semantic objects in the language are numbers and unary functions mapping numbers to numbers. We can view this domain as a cpo as follows:

• The numbers form a "flat" cpo where all distinct numbers m, n are incomparable, i.e., `m <= n` implies `m = n`. To satisfy both the chain completeness and consisteny completeness properties (since both definitions apply to the empty subset), we must add an extra element bot such that bot ` <= n` for every integer `n`. The element bot represents a non-terminating computation.
• The strict partial functions mapping integers to integers are ordered as follows: F <= G iff F is a subset of G (when both F and G are interpreted as sets of ordered pairs). A function mapping cpo A to cpo B is strict iff it maps bot in A to bot in B.

The everywhere undefined function (empty set) obviously approximates all other functions. Numbers and functions are incomparable. The special element bot approximates all functions as well as all numbers. Why? An attempt to define a function can diverge, which is different from defining the everywhere divergent function. (The latter produces a closure which can be identified using the {\tt function?} primitive. It diverges only when it is applied to an argument.)

## The Meaning of Recursive Definitions

Consider the function definition written in Scheme:

```(define fact
(lambda (n)
(if (zero? n)
1
(* n (fact (sub1 n))))))
```

What meaning in the space of strict partial functions mapping integers to integers should we assign to `fact`. We need to find a function fact in this space such that

```F(fact) = fact
```

We can construct a functional corresonding to this function that maps an estimate for the meaning of `fact` into a better estimate:

```(define FACT
(lambda (fact)
(lambda (n)
(if (zero? n)
1
(* n (fact (sub1 n))))))
```

Given the empty function, `FACT` constructs the partial function {(0,1)} (that diverges everywhere except for returning 1 when the input is 0). All computable functions including `FACT` are monotonic: if ```x <= y``` then `FACT(x)` approximates `FACT(y)`. Given this property and the property that `FACT` is continuous (all computable functions are continuous), we can deduce (by repeatedly appealing to the monotonicity of FACT):

```empty <= FACT(empty) <= ... FACT(...FACT(emtpy)) <= ...
```

The preceding sequence is clearly a chain. Hence, it must have a least upper bound which we write `fix(FACT)` because it is least fixed fixed-point of the functional `FACT`. This function the meaning of `fact` determined by the recursive definition above. In general, every computable function mapping a cpo D to D has a least fixed-point. Hence, we can interpret any recursive definition as the least fixed-point of the corresponding function.

## The Full Domain of LC

We made the simplifying assumption that functions are never passed as arguments or returned as values in describing the domain of LC above. What happens if we relax these restrictions? Let D denote (stand for) the domain of LC. It must satisfy the following equation

```D = Zbot + (D -> D)
```

where `Zbot` is the integers augmented by bot, + means the standard union operation except bottom elements are coalesced, and `(D -> D)` means the set of all strict continuous functions mapping `D` into `D`. (An function f mapping the cpos ` D1, D2, ... Dn` into the cpo `D0` is strict iff ` f(d1,d2,...,dn) = ` bot if any `di` = bot.)

This equation can be interpreted as a recursive definition of the domain D. We can solve this equation by constructing a functional on domains and taking its least fixed-point. But a lot of technical machinery must be developed. First we have to show that the set of domains forms a cpo. Then we have show that the `+` and `->` operations (functions) on domains are continuous. This deeper inquiry into the semantics of LC is the subject matter of Comp 511.