Programming Assignment 5:
Typed Jam

 

Assigned: Friday, October 22, 2010
Due: 12:00 noon, Monday, November 8, 2010

Files

No supporting code. Use your own interpreter from Assignment 4.

Some Simple Test Inputs

Overview

Preparation   Create a programs/5 directory within your comp311 directory. All of your files for this assignment should be stored in the programs/5 subdirectory. Do not put files in subdirectories below programs/5.

Teamwork   You are strongly encouraged to do this assignment and all other programming assignments using pair programming. When you submit your assignment, indicate the composition of your team (names, student ids, and email addresses) in your README file.

If you cannot find a partner and would like one, send an email to comp311@rice.edu and we will try to find one for you. Teams of more than two members are not permitted.

Task   Your assignment is to convert Jam to a typed programming language.

Step 1   As the first step in the project, you will strip your solution to Assignment 4 down to a family of two interpreters: call-by-value/eager (value/value) and call-by-value/lazy (value/need).

The public interface to the interpreter is the same as in Assignment 4 except for the fact the the Interpreter class will only support two public methods: eagerEval() and lazyEval() which correspond to valueValue() and valueNeed() in Assignment 4.

Save you solution to this step in a separate directory. You will use it as the starting point for Project 6.

Step 2  As the second step of the assignment, you will modify the parser and AST representation to

Hence, the grammar looks like this:

Expressions

Exp               ::= Term { Binop Exp }
                    | if Exp then Exp else Exp
                    | let Def+ in Exp
                    | map TypedIdList to Exp
                    | "{" PropStatementList "}"
Term              ::= Unop Term
                    | Factor { ( ExpList ) }
                    | Prim ( ExpList )
                    | Null
                    | Int
                    | Bool
Factor            ::= ( Exp ) | Id
ExpList           ::= { PropExpList }
PropExpList       ::= Exp | Exp , PropExpList
TypedIdList       ::= { PropTypedIdList }
PropTypedIdList   ::= TypedId | TypedId, PropTypedIdList
TypedId           ::= Id : Type
PropStatementList ::= Exp | Exp ; PropStatementList

Definitions

Def ::= TypedId := Exp ;

Primitive Constants, Operators, and Operations

Null  ::= null : Type
Bool  ::= true | false
Unop  ::= Sign | ~ | ! | ref
Sign  ::= "+" | -
Binop ::= Sign | "*" | / | = | != | < | > | <= | >= | & | "|" | <-
Prim  ::= null? | cons? | cons | first | rest

Identifiers

Id ::= AlphaOther {AlphaOther | Digit}*

Please note that Id does not contain the anything that matches Prim or the keywords if, then, else, map, to, let, in, null, true, and false.

Types

Type         ::= unit
               | int
               | bool
               | list Type
               | ref Type
               | (TypeList -> Type)
TypeList     ::= { PropTypeList }
PropTypeList ::= Type | Type, PropTypeList

Numbers

Int ::= Digit+

Note that there are three type constructors list, <-, and ref. You will need to define an abstract syntax for types.

You are not responsible for producing slick error diagnostics for syntax errors. You may still prohibit the omitted primitive operations recognized by the lexer/parser from being used as variable names.

Step 3   Before you can interpret a typed program, you must type check it.

Define a static method in the Interpreter class called checkTypes that takes an expression (an AST) as an argument, type checks it, and returns an expression suitable for interpretation. You can simply use inheritance to define an AST representation for typed program syntax that extends the untyped AST representation. If your type checker encounters a type error, it should abort the computation by throwing a TypeException with an error message explaining the type error. TypeException looks like this:

class TypeException extends RuntimeException {
    TypeException(String msg) { super(msg); }
}

The top-level methods eagerEval() and lazyEval() should first invoke the parser, then the context-sensitive syntax checker from Assignment 3, then the type checker, and finally the appropriate evaluation method.

The type-checking method should rely on the tree-walking methods that you developed in Assignment 3. It behaves like an "abstract" interpreter that takes a type environment (mapping variables to types) as input and returns a type instead of a value as output. Note that the primitive operations and operators all have types. Moreover, the primitives cons, first, rest, null?, and cons? and and the operators !, ref, <-, = have schematic types that must be matched against their contexts to determine
how they are instantiated:

cons      : 'a , list 'a -> list 'a
first     : list 'a      -> 'a
rest : list 'a -> list 'a
ref : 'a -> ref 'a null? : 'a -> bool cons? : 'a -> bool ! : ref 'a -> 'a
<- : ref 'a , 'a -> unit
= : 'a , 'a -> bool != : 'a , 'a -> bool

In these type declarations, the symbol 'a stands for any specific type (a specific type cannot contain the special symbol 'a). Only these primitives and binary operators have polymorphic type; every program-defined function must have a specific (non-polymorphic) type. The symbol 'a designating an undetermined type never appears in actual program text.

To simplify the type checking process, the revised syntax prohibits these polymorphic operations from being used as data values (stored in data structures, passed as arguments, or returned as results). To use cons as a data value, you must wrap it in a map which forces it to have a specific type. Hindley-Milner polymorphism (used in ML) is slightly more flexible; it allows you to use polymorphic operations as data values, but their reconstructed types must not be polymorphic!

From the perspective of type-checking, the if-then-else construct behaves like a ternary polymorphic function with type

if-then-else : (bool, 'a, 'a -> 'a)

The remaining primitives and operators have the following specific types:

(unary) -  : (int -> int)
~          : (bool -> bool)
+ : (int, int -> int)
(binary) - : (int, int -> int)
* : (int, int -> int)
/ : (int, int -> int)
< : (int, int -> bool)
<= : (int, int -> bool)
& : (bool, bool -> bool)
| : (bool, bool -> bool)

Some type correct applications of primitives can still generate run-time errors because the collection of definable types and type-checking machinery are too weak to capture the domain of primitive functions precisely. For example, the following expressions are all type-correct:

1/0
first(null : int)
rest(null : int)

To type the applications of polymorphic primitives and operators, your type checker will need to match their schematic types against the types the the checker produces for their inputs. Every program sub-expression other than one of the schematic primitives or operators has a specific (non-schematic) type. We augmented null with type annotation to obtain this property.

Consider the following example. Assume that the type environment assigns x the type int, then the unary operator ref in the application ref x has type ( int -> ref int ).

Testing   You are expected to write unit tests for all of your non-trivial methods or functions. For more information, refer back to Assignment 1.

Test that your type checker works and produces code suitable for your interpreter. As usual we will provide a sample Unit testing suite to help you get started on writing your unit tests and sample grading scripts so you can test that your program supports the proper interfaces before submitting it for grading.

Extra Credit (100 points)

The extra credit part of this assignment is a complete extra assignment with the identifying project name 5xc. Solutions must be submitted using the project name 5xc rather than 5. As preparation for the extra credit part, create a programs/5xc directory within you comp311 directory. The extra credit assignment is not a substitute for the regular assignment!

The input language to the extra credit assignment is the same as for Assignment 4 except for the elimination of the primitives list?, number?, function?, ref?, and arity (which were also eliminated in Assignment 5). Note that the other changes in input syntax for Assignment 5 do not apply to the extra credit assignment.

The public interface supported by the interpreter is the same as for Assignment 5 except for the fact that input programs do not have type annotations.

Your assignment is to implement a type checker that performs Hindley-Milner type inference using unification. The types for variables are determined by this inference process rather than program annotations. Type errors may be manifested as unification failures.

Testing and Submitting Your Program

The directory tests will contain Java test file Assign5Test.java containing some very simple unit tests for your program. While these files are being prepared, you can add type annotations to the test files Assign4Test.java to create test files for this assignment.

Your program directory should only contain the files that you want to submit.

Make sure that your program passes the sample unit tests and works with our sample grading script before submitting it. Create a README file in the your directory program/5 that

Make sure that your README file is structured correctly:

  1. It has to be named README (all upper case, no .txt or anything)
  2. It has to be in the assignment's "root directory", e.g. ~/comp311/programs/5
  3. The first two lines should only contain your usernames and nothing else. The third line should be blank (if you're working alone, leave lines 2 and 3 blank), e.g.:
    mgricken
    dmp

    This is where the readme text starts.

Any test data files for your program must be stored in the programs/5 directory. Do not store anything in subdirectories below programs/5.

Each procedure or method in your program should include a short comment stating precisely what it does. For routines that parse a particular form of program phrase, give grammar rules describing that form.

Please make sure to remove or disable all debug output before submitting. Excessive use of print statements considerably slows your interpreter and thus testing your project down. If your interpreter generates a timeout error because debug output slows it down, you may not get credit for the test cases that took too long.

To submit your program, make sure that everything that you want to submit is located in the directory comp311/programs/5 on CLE@R and type the command

/coursedata/courses/comp311/bin/turnin311 5

from within the comp311/programs directory.

The command will inform you whether or not your submission succeeded. Only submit one copy of your program per team. If you need to resubmit an improved version your program, submit it from the same account as the original so that the old version is replaced.

Back to course website