Section 15: Implicit Polymorphism
In the previous lecture, we introduced polymorphism and distinguished between its explicit and implicit forms. When confronted with the former, the programmer writes down the types of all variables and explicitly abstracts over both values and types. For the latter, the programmer simply omits all type information from the program and tells the language implementation to derive the types. However, we left this process of derivation unspecified.
Consider the following examples:
(lambda (x _) x): we can use any type in the type language in place of
(lambda (x _) (+ x 0)): we can only put in
int, since the argument is used in an addition.
(lambda (x _) (x 0)): since the argument is applied, we know its type must be of the form
a -> b. The type accepted by the argument,
a, must be a number; its result type can still be any legal type. Hence, we can assign the type
(int -> _)for this expression.
Is it still true that
(lambda (x _) (x x)) cannot be typed? Yes.
The type _ must be some type definable in our type language.
|We can work around the restriction on self application by introducing
(datatype circtype ([circproc circproc? ((circtype -> T) proc)]) ...)
((proc x) x)
Of course, the latter is cheating in some sense since we no longer have
true self application anymore. But we can then type
(circproc (lambda (x circtype) <any value of type T>))
obviously works. In general, what can we do if we are given an expression
of the form
|First, we show the typing of |
T |- (lambda (x alpha) x) : alpha -> beta
Now we type the body of the procedure. Since
T [ x/alpha ] |- x : alpha
However, since the type of the result of the procedure is that of
T [ x/alpha ] |- x : beta
Combining these facts, we see that the type of the procedure is
This example illustrates the process of deriving a type. We use the
We now consider a more involved case:
((lambda (x _) x) (lambda (y _) y))
We begin by assuming a new type for the whole expression:
T |- ((lambda (x _) x) (lambda (y _) y)) : gamma
Now we type the sub-terms. Each one is a procedure, so each is given an ``arrow'' type.
T |- (lambda (x alpha) x) : delta -> epsilon T |- (lambda (y beta) y) : phi -> psi
From this, we derive two equations:
gamma = epsilon delta = phi -> psi
The former holds since the type of the overall expression is that returned by the applied procedure, the latter by the typing of arguments to procedures.
Next, we can conclude that
T [ x/alpha ] |- x : alpha T [ y/beta ] |- y : beta
Each of these induces several more equations:
From these, we can in turn conclude that
delta = phi -> psi = beta -> beta epsilon = alpha = delta = beta -> beta gamma = epsilon = beta -> beta
which gives the type
The process of solving symbolic constraints is similar to other forms of linear equation solving including Gaussian elimination. In fact, solving symbolic constrains is a surprisingly easy because the algebraic structure of abstract syntax (called free algebras by mathematicians) is so simple. The general algorithm for solving symbolic constraints is called unification and was invented by Robinson at Rice (in the Philosophy department) in the mid-60s.
In spite of all this machinery, expressions like
(let ((f (lambda (x _) x))) (if (f true) (f 5) (f 6)))
still cannot be typed. The problem is that the rule we have adopted for
requires us to make a guess before we check the body; no matter which type we
choose, we will run afoul. So we propose that, instead, we use the copying rule
let that was proposed in the previous lecture. What is the property
of this rule that enables us to type this expression? It is that it copies
the blanks. The ``copying''
let typing rule copies places for
However, this new rule has one very deleterious effect. There could be a
in the binding for
f; there could be another
its binding; and so forth. When we copy the code, this leads to an
exponential blow-up in the resulting (copied) code that needs to be typed. Thus,
in practice, we want this typing rule, but would like to avoid having to duplicate
code. In effect, this means that we want to put the
_'s in the
environment, ie, we want to bind type variables at
and free them at uses.
Idea: Introduce type schemas with close and open operations.
Our type schemas will take the following shapes:
ts = t_alpha | (forall (alpha ... beta) t_alpha)
In the above expression, we would get the type
alpha -> alpha for
f, which is
(forall (alpha) (alpha -> alpha)). This
also shows that type inference for implicit polymorphism can be thought of as
a translation process from TLC syntax without types into XPolyLC.
In the following example, we quantify over multiple type variables:
(let ((k (lambda (x alpha) (lambda (y beta) x)))) ...)
The shape of type inferred for
k is then
(forall (alpha beta)
(alpha -> (beta -> alpha))).
With type schemas in hand, we try to produce a better
T |- E : t T[ x/CLOSE(t) ] |- B : s -------------------------------------- T |- (let (x E) B) : s
When we look up identifiers in the type environment, we might sometimes get a
CLOSEd object instead of a regular type. Hence, we need another typing
OPEN, to cover these cases:
T |- x : t [ alpha.../beta... ]
This rule is read as follows. Say an identifier
x is looked up in
the type environment
T, and it has the type
is a type schema, inserted into the environment through a
then we replace the type variables quantified over in the schema (the
with fresh type variables (the
beta...), then proceed as
before. Since a new set of type variables is instantiated at each use, the same
code can be given different types in different uses, so our canonical example
can be typed.
Unfortunately, these simple versions of
are flawed. Suppose we had this program:
(lambda (y _) (let ((f (lambda (x _) y))) (if (f true) (+ (f true) 5) 6)))
y is assigned type
x is assigned
beta. Then the type of
(alpha -> beta).
Our typing rule has naïvely closed over all the type variables lexically
visible. At each use of
f, new type variables are created. In the
beta_1 = bool and
alpha_1 = bool; in the second,
alpha_2 = int and
beta_2 = bool. The type checker is
satisfied, and an expression that ``type-faults'' at run-time has passed the type
In the rule for
let, if we restrict
CLOSE to only
close over variables introduced by the polymorphic
let, then we
get the desired behavior. The introduction rule is now
T |- E : t T[ x/CLOSE(t,T) ] |- B : s ---------------------------------------- T |- (let (x E) B) : s
with the same elimination rule, and we can formally write down a specification
CLOSE (t,T) = (forall (alpha...) t)
alpha... = ftv(t) - ftv(T) (
ftv computes the set
of free type variables in expressions and type environments). With this new typing
rule, the type schema for
(forall (beta) (beta -> alpha)),
so we now have to unify
alpha with both
which fails and results in an error.
A more detailed discussion of implicit parametric polymorphism appears in
the PowerPoint presentations entitled