COMP 311 Principles of Programming Languages (Fall 2004)

Programming Assignment 5:
Typed Jam

Assigned: Friday, October 22, 2004
Due: 12:00 noon, Wednesday, November 10, 2004

Files

No supporting code. Use your own interpreter from Assignment 4.

Overview

Preparation Create a programs/5 directory within your comp311 directory. All of your files for this assignment should be stored in the programs/5 subdirectory. Do not put files in subdirectories below programs/5.

Teamwork You are required to do this assignment and all other programming assignments using pair programming. When you submit your assignment, indicate the composition of your team (names, student ids, and email addresses) in your README file.

If you cannot find a partner and would like one, send an email to comp311@rice.edu and we will try to find one for you. Teams of more than two members are not permitted.

Task Your assignment is to convert Jam to a typed programming language.

Step 1 As the first step in the project, you will strip your solution to Assignment 4 down to a family of two interpreters: call-by-value/eager (value/value) and call-by-value/lazy (value/need).

In Java, the public interface to the interpreter is the same as in Assignment 4 except for the fact the the Interpreter class will only support two public methods: eagerEval() and lazyEval() which correspond to valueValue() and valueNeed() in Assignment 4.

In Scheme, the public interface will be same as in Assignment 4 except that the Scheme functions make-eval-file and make-eval-string which return a function of one argument that is either ’eager or ’lazy.

make-eval-file: string -> (cons-mode symbol -> jam Value)
make-eval-string: string -> (cons-mode symbol -> jam Value)

In OCaml, the public interface will be the same as in Assignment 4 except that the OCaml functions eval_string and eval_file take a string and just one mode, either EAGER or LAZY, and return a value:

type mode = EAGER | LAZY;;
 
eval_file: string -> (mode -> value)
eval_string: string -> (mode -> value)

Step 2 As the second step of the assignment, you will modify the parser and AST representation to

Hence, the grammar looks like this:

Expressions

Exp               ::= Term { Binop Exp }
                    | if Exp then Exp else Exp
                    | let Def+ in Exp
                    | map TypedIdList to Exp
                    | "{" PropStatementList "}"
Term              ::= Unop Term
                    | Factor { ( ExpList ) }
                    | Prim ( ExpList )
                    | Null
                    | Int
                    | Bool
Factor            ::= ( Exp ) | Id
ExpList           ::= { PropExpList }
PropExpList       ::= Exp | Exp , PropExpList
TypedIdList       ::= { PropTypedIdList }
PropTypedIdList   ::= TypedId | TypedId, PropTypedIdList
TypedId           ::= Id : Type
PropStatementList ::= Exp | Exp ; PropStatementList

Definitions

Def ::= TypedId := Exp ;

Primitive Constants, Operators, and Operations

Null  ::= null : Type
Bool  ::= true | false
Unop  ::= Sign | ~ | ! | ref
Sign  ::= "+" | -
Binop ::= Sign | "*" | / | = | != | < | > | <= | >= | & | "|" | <-
Prim  ::= null? | cons? | cons | first | rest

Identifiers

Id ::= AlphaOther {AlphaOther | Digit}*

Please note that Id does not contain the anything that matches Prim or the keywords if, then, else, map, to, let, in, null, true, and false.

Types

Type         ::= unit
               | int
               | bool
               | list Type
               | ref Type
               | (TypeList -> Type)
TypeList     ::= { PropTypeList }
PropTypeList ::= Type | Type, PropTypeList

Numbers

Int ::= Digit+

Note that there are three type constructors list, <-, and ref. You will need to define an abstract syntax for types.

You are not responsible for producing slick error diagnostics for syntax errors. You may still prohibit the omitted primitive operations recognized by the lexer/parser from being used as variable names.

Step 3 Before you can interpret a typed program, you must type check it.

In Java, define a static method in the Interpreter class called checkTypes that takes an expression (an AST) as an argument, type checks it, and returns an expression suitable for interpretation. You can simply use inheritance to define an AST representation for typed program syntax that extends the untyped AST representation. If your type checker encounters a type error, it should abort the computation by throwing a TypeException with an error message explaining the type error. TypeException looks like this:

class TypeException extends RuntimeException {
    TypeException(String msg) { super(msg); }
}

In Scheme, define a top-level function check-types that takes an expression (AST) as its only argument, type checks it, and returns an expression suitable for interpretation. To represent ASTs with types, you can either add type fields to your untyped AST struct definitions or you can introduce new typed AST struct definitions and convert the AST back to untyped form after your type check it. If your type checker encounters a type error, it should abort the computation by invoking error with a message explaining the type error. The error message must contain the word "type".

In OCaml, define a top-level function check_type that takes an expression (AST) as its only argument, type checks it, and returns an expression suitable for interpretation. To represent ASTs with types, you can either add type fields to your untyped AST struct definitions or you can introduce new typed AST struct definitions and convert the AST back to untyped form after your type check it. If your type checker encounters a type error, it should abort the computation by throwing a TypeException with an error message explaining the type error. TypeError is:

exception EvalError of string;;

In all three languages, the program should first invoke the parser, then the context-sensitive syntax checker from Assignment 3, then the type checker, and then the interpreter.

The type-checking method or function should rely on the tree-walking functions or methods that you developed in Assignment 3. It behaves like an "abstract" interpreter that takes a type environment (mapping variables to types) as input and returns a type instead of a value as output. Note that the primitive operations and operators all have types. Moreover, the primitives cons, first, rest, null?, and cons? and and the operators !, ref, <-, = have schematic types that must be matched against their contexts to determine
how they are instantiated:

cons      : 'a , list 'a -> list 'a
first     : list 'a      -> 'a
rest      : list 'a      -> list 'a
ref       : 'a           -> ref 'a
null?     : 'a           -> bool
cons?     : 'a           -> bool
!         : ref 'a       -> 'a
<-        : ref 'a , 'a  -> unit
=         : 'a , 'a      -> bool
!=        : 'a , 'a      -> bool

In these type declarations, the symbol 'a stands for any specific type (a specific type cannot contain the special symbol 'a). Only these primitives and binary operators have polymorphic type; every program-defined function must have a specific (non-polymorphic) type. The symbol 'a designating an undetermined type never appears in actual program text.

To simplify the type checking process, the revised syntax prohibits these polymorphic operations from being used as data values (stored in data structures, passed as arguments, or returned as results). To use cons as a data value, you must wrap it in a map which forces it to have a specific type. Hindley-Milner polymorphism (used in ML) is slightly more flexible; it allows you to use polymorphic operations as data values, but their reconstructed types must not be polymorphic!

From the perspective of type-checking, the if-then-else construct behaves like a ternary polymorphic function with type

if-then-else : (bool, 'a, 'a -> 'a)

The remaining primitives and operators have the following specific types:

(unary) -  : (int -> int)
~          : (bool -> bool)
+          : (int, int -> int)
(binary) - : (int, int -> int)
*          : (int, int -> int)
/          : (int, int -> int)
<          : (int, int -> bool)
<=         : (int, int -> bool)
&          : (bool, bool -> bool)
|          : (bool, bool -> bool)

Some type correct applications of primitives can still generate run-time errors because the collection of definable types and type-checking machinery are too weak to capture the domain of primitive functions precisely. For example, the following expressions are all type-correct:

1/0
first(null : int)
rest(null : int)

To type the applications of polymorphic primitives and operators, your type checker will need to match their schematic types against the types the the checker produces for their inputs. Every program sub-expression other than one of the schematic primitives or operators has a specific (non-schematic) type. We augmented null with type annotation to obtain this property.

Consider the following example. Assume that the type environment assigns x the type int, then the unary operator ref in the application ref x has type ( int -> ref int ).

Testing You are expected to write unit tests for all of your non-trivial methods or functions. For more information, refer back to Assignment 1.

Test that your type checker works and produces code suitable for your interpreter. As usual we will provide a sample Unit testing suite to help you get started on writing your unit tests and sample grading scripts so you can test that your program supports the proper interfaces before submitting it for grading.

Choice of Language Generic Java, Scheme, and O'Caml are the only languages permitted for this assignment.

Extra Credit (100 points)

The extra credit part of this assignment is a complete extra assignment with the identifying project name 5xc. Solutions must be submitted using this project name rather than 5. As preparation for the extra credit part, create a programs/5xc directory within you comp311 directory. The extra credit assignment is not a substitute for the regular assignment!

The input language to the extra credit assignment is the same as for Assignment 4 except for the elimination of the primitives list?, number?, function?, ref?, and arity. The public interface supported by the interpreter is the same as for Assignment 5 except for the fact that input programs do not have type annotations.

Your assignment is to implement a type checker that performs Hindley-Milner type inference using unification. The types for variables are determined by this inference process rather than program annotations. Type errors may be manifested as unification failures.

Testing and Submitting Your Program

The directory tests will contain Java, O'Caml and Scheme test files Assign5Test.java, assign5test.ml, and assign5test.ss containing some very simple unit tests for your program. While these files are being prepared, you can add type annotations to the test files Assign4Test.java, assign4test.ml and assign4test.ss to create test files for this assignment.

Your program directory should only contain the files that you want to submit. We will subsequently provide you with test scripts similar to the ones we provided for Assignment 4. Running these scripts will alert you to any problems with your directory configuration. We might also place a few addtional test programs in the directory tests.

Make sure that your program passes the sample unit tests and works with our grading script before submitting it. You must provide a README file in the directory programs/5 with the following format:

The first two lines must contain the email addresses, one per line, of the two members of your team. If you are working alone, the second line must be blank. For example, if the two team members have email addresses cork@rice.edu and camus@rice.edu then the first two lines of the README file must be:
```
cork@rice.edu
camus@rice.edu
```
(in either order) and nothing else.
The remainder of the file must give the external documentation for the program including (i) a short contract for each of the methods in the intended user interface, (ii) an outline of the program, and (iii) a short description of the testing process you used to confirm the correctness of your program.

Any test data files for your program must be stored in the programs/5 directory. Do not store anything in subdirectories below programs/5.

Each procedure or method in your program should include a short comment stating precisely what it does. For routines that parse a particular form of program phrase, give grammar rules describing that form.

To submit your program, make sure that everything that you want to submit is located in the directory programs/5 and type the command

turnin311 -t[scheme|java|ocaml] 5

from within the comp311/programs directory. For example, if you have an O'Caml version you would execute

turnin311 -tocaml 5

The command will inform you whether or not your submission succeeded. Only submit one copy of your program per team. If you need to resubmit an improved version your program, submit it from the same account as the original so that the old version is replaced.

Note if turnin311 is not on your path, you will need to add /home/comp311/bin to your path:

export PATH=$PATH:/home/comp311/bin

Back to course website

Assigned: Friday, October 22, 2004 Due: 12:00 noon, Wednesday, November 10, 2004

Files

Overview

Extra Credit (100 points)

Testing and Submitting Your Program

Assigned: Friday, October 22, 2004
Due: 12:00 noon, Wednesday, November 10, 2004