COMP 311 Principles of Programming Languages (Fall 2004)

Programming Assignment 3:
Lazy Evaluation and Recursive Definitions

Assigned: Friday, September 24, 2004
Due: 12:00 noon, Friday, October 8, 2004

Files

Supporting Code for

Overview

Preparation Create a programs/3 directory within your comp311 directory. All of your files for this assignment should be stored in the programs/3 subdirectory.

Teamwork You are required to do this assignment and all other programming assignments using pair programming. When you submit your assignment, indicate the composition of your team (names, student ids, and email addresses) in your README file.

If you cannot find a partner and would like one, send an email to comp311@rice.edu and we will try to find one for you. Teams of more than two members are not permitted.

Task Your task is to extend and modify your interpreter from Assignment 2 or one of the solutions for Assignment 2 to:

support the complete set unary and binary operators in JAM,
perform the context-sensitive checking required to confirm that the input program is a legal program;
enforce safety in Jam programs by detecting illegal operations before they are executed triggering a Jam or Scheme run-time error.
change the semantics of the let construct to support recursive bindings ("recursive let"); and
and support the optional lazy evaluation of the cons constructor using both the call-by-name and the call-by-need implementation of lazy evaluation.

The details of the each of these extensions is in the following paragraphs.

Support All Jam unary and binary operators In the last assignment the numerical unary operator + (plus) and the binary operators > (greater than),>= (greater than or equal) were excluded. The last assignment also excluded support for != (not equal), which follows the same basic semantics as =, except it is the negation (i.e. if one of the operands is a closure, the result is true). In this assignment you will need to add support for all of these operators.

Context-Sensitive Checking The context sensitive constraints on Jam syntax are most easily by performing a separate "checking pass" over the abstract syntax right after it has been constructed. There are two context-sensitive conditions that Jam programs must satisfy to be well-formed:

There are no free variables.
The same variable name cannot appear twice in the same parameter list (of a map) or the same collection of let names (introduced in a let binding).

If you find either of these errors, throw a SyntaxException or SyntaxError, whatever is appropriate in your language of implementation, with a descriptive error message and abort execution. Please refer to the section about the language you use for details about exceptions/errors.

The context-sensitive checker should be run before the expression is evaluated. Note that the context-sensitive checker does not care which unary or binary operator was used, or how many arguments were passed to a primitive function.

Safety Make your interpreter responsible to catching all Jam run-time errors. Do not rely on Java, O'Caml, or Scheme run-time checking. All Jam run-time error messages should be of type EvalException or EvalError, whichever is appropriate, generated directly by your interpreter, not by the underlying Scheme, O'Caml, or Java language implementation.

Lazy cons For all three of your interpreters from Assignment 3, you will add two lazy cons evaluation options that defer the evaluation of both arguments of a cons construction. The data object representing the deferred evaluation is called a suspension. Given a cons node, the first and rest operations evaluate the corresponding argument expression in the environment where the expression would have been evaluated without laziness.

Lazy cons obeys the following equations:

first(cons(M,N)) = M
rest(cons(M,N)) = if list?(N) then N else error

for all Jam expresions M and N, including expressions that diverge or generate run-time errors. Recall that list? is simply the disjunction of null? and cons?.

Note: R5RS Scheme regrettably defines list? as a predicate that must scan the entire list to ascertain that it is terminated by the empty list. Since R5RS does not force the second argument of cons to be a list, the “end” of an R5RS “list” can be any data value. Interpreterting an innocous looking predicate (akin to null? and cons?) as an operation that must traverse the entire list (O(N) time) is bad language design choice. In past versions of the DrScheme teaching languages, list? was defined as (lambda (o) (or (null? o) (cons? o))). I don’t know what the current version does.

The lazy versions of cons postpone evalution of argument expressions until they are demanded by a first or rest operation. You can use exactly the same suspension mechanism to support call-by-name and call-by-need lazy cons that you might have used to support call-by-name and call-by-need bindings in the previous assignment. In this case, the suspensions are stored in a cons structure rather than a binding structure the environment. An embedded suspension is not evaluated until it is "probed" by a first or rest operation. Call-by-name lazy evaluation re-evaluates suspensions every time that they are probed. Call-by-need lazy evaluation evaluates a given suspension only once and caches the resulting value in the data constructor (cons cell in the case of Jam).

Please change the function that converts Jam values to a string representation (toString, string_of_value, etc.) so that lists are still displayed as lists in the form (1 2 3) as before, regardless of the cons mode used. You may want to abort processing lists longer than, say, 150 elements and have them displayed ending in 148 149 150...). However, we will not test this behavior. We will just require that lists up to 150 elements are displayed the same way as in the last assignment.

Recursive Let The recursive version of the let construct introduces a collection of recursive bindings just as we discuseed in class. We will not restrict the right hand sides of recursive let constructions to maps because many useful recursive definitions in Jam with lazy cons violate this restriction. Hence, the implementation of recursive let closely follows the implementation of the generalized version of rec-let construct in LC.

The only difference is that Jam recursive let may introduce several mutually recursive definitions where the corresponding closures all close over the new environment. Since we want invalid forward references in a Jam program to produce a meaningful error message rather than diverging, we will use an explicit list to represent the environment of bindings and destructively update it to create the self-references.

In a call-by-value application, we will initially bind each variable introduced in a let to a special value ((void) in Scheme, () in O'Caml ("unit"), and null in Java) that is not a legal Jam value representation and will abort the computation if it is used as the input to any Jam operation. The interpreter evaluates the right-hand sides of the definitions in a recursive let in left-to-right order and destructively modifies each new binding in the environment as soon as the right-hand side has been determined.

In a call-by-name application, we bind each variable to a suspension (a closure of no arguments or thunk) for each right-hand side. Since the suspension wraps the right hand side of each let binding inside a lambda, the recursive environment can be constructed directly using Scheme letrec or define. No mutation is required.

Note that the validity of definitions in a recursive let depends on which semantics is used for Jam. Definitions that are valid for call-by-name or call-by-need lazy evaluation may be invalid in other evaluation modes. By this measure, lazy evaluation is better than conventional ("eager") evaluation and call-by-name/call-by-need is better than call-by-value.

Testing You are expected to write unit tests for all of your non-trivial methods or functions. For more information, refer back to Assignment 1.

Scheme Version The interface for this assignment is an extension of the interface for the preceding assignment. The evaluator produced by invoking (make-eval-file ...) or (make-eval-string ...) or is a procedure p that takes two symbols as arguments. The first symbol specifies whether procedures argument are passed by ’value, ’name, or ’need and the second symbol specifies whether the arguments to the cons data constructor are passed by ’value, ’name, or ’need.

make-eval-file: string -> (argument-mode symbol cons-mode symbol -> jam Value)
make-eval-string: string -> (argument-mode symbol cons-mode symbol -> jam Value)

In the case of an evaluation or error, you must error with a message containing 'eval' for evaluation errors or 'syntax' for syntax errors. For example: "(error 'biop-eval "too many arguments")" Also, lexer errors must contain 'lexer', parser errors must contain 'parser'.

Java Version In Assignment 2, the Interpreter supported three evaluation methods

callByValue()
callByName()
callByNeed()

In this assigment, the Interpreter class must support nine evaluation methods:

valueValue()
nameValue()
needValue()
valueName()
nameName()
needName()
valueNeed()
nameNeed()
needNeed()

where the first word in the method name specifies the policy for evaluating program-defined procedure applications (Jam maps) and the second word specifies the policy for evaluating applications of the data constructor cons. Hence the first three methods in the preceding list correspond to the three evaluation methods in Assignment 2.

As in Assignment 2, the Interpreter class must support two constructors: Interpreter(String filename) which takes input from the specified file and Interpreter(Reader inputStream) which takes input from the specified Reader. A given Interpreter object should support the repeated invocation of any of the nine public evaluator methods. As a result, the same Interpreter object can be used to test all forms of evaluation (as long as none of the evaluations diverges).

In summary:

/** file Interpreter.java **/
class EvalException extends RuntimeException {
    EvalException(String msg) { super(msg); }
}
 
class SyntaxException extends RuntimeException {
    SyntaxException(String msg) { super(msg); }
}
 
class Interpreter {
   Interpreter(String fileName) throws IOException;
   Interpreter(Reader reader);
   
   public JamVal valueValue();
   public JamVal nameValue();
   public JamVal needValue();
   
   public JamVal valueName();
   public JamVal nameName();
   public JamVal needName();
   
   public JamVal valueNeed();
   public JamVal nameNeed();
   public JamVal needNeed();
}

O'Caml Version If you do this assignment in O'Caml, your program should be stored in the file eval.ml. It should support two publicly visible function: eval_file and eval_string. eval_file takes in a a filename specifying the input file. eval_string takes in a string representing the Jam program to evaluate. Both procedures then take in the argument evaluation mode and the cons mode and return a value. You will need to extend the definition of value below to include any other Jam values you will use in the assignment. In summary:

exception EvalError of string;;
 
exception SyntaxError of string;;
 
type mode = VALUE | NAME | NEED;;
type value =
   VNum of int
 | VList of value list
 | VBool of bool
 | VPrim of string
 | ... (* add your own types here *)
;;
 
eval_file: string -> mode -> mode -> value
eval_string: string -> mode -> mode -> value

let rec string_of_value = fun v ->
  match v with
  | VNum(i) -> string_of_int i
  | VBool(true) -> "true"
  | VBool(false) -> "false"
  | VList([]) -> "()"
  | VList(v::vlr) ->
    (* change this so that your lazy lists are displayed identically *)
    let rec string_of_list =
      fun l -> (
                 match l with
                   | [] -> ""
                   | f::r -> " " ^ (string_of_value f) ^ (string_of_list r)
                 )
    in "(" ^ (string_of_value v) ^ (string_of_list vlr) ^ ")"
  | VPrim(s) -> s
  | _ -> "Output not supported."
;;

We use the string_of_value function in the unit testing of your solutions, so please make sure that you support this function. It does not output maps; if you want to do that for testing purposes, please create a different function. Please do not add it to string_of_value.

Choice of Language Generic Java, Scheme, and O'Caml are the only languages permitted for this assignment.

General Implementation Issues

To produce intelligible output for computations involving lazy cons, your interpreter must force the complete evaluation of the answer up to a large depth bound (which is a constant in your program). In Java, your toString() method for your lazy cons classes should use this depth bound to determine what to print. In Scheme and O'Caml, you will need to write a separate function applied only at the top-level to do this forcing.

If a computation produces an infinite answer, the large bound (on the order of 1000-10000) on forcing depth prevents the forcing computation from diverging. An interactive user interface that allowed the user to request the expansion of an unevaluated node (to some depth) in an answer would be more satisfactory, but building such an interface is beyond the scope of a routine programming assignment.

For call-by-need lazy evaluation, the forcing operation can be implemented simply as a traversal that inspects the first and rest fields of every lazy cons node up to the specified depth and then returns its (modified) input. For call-by-name lazy evaluation, however, you will have to construct a new list using conventional "strict" (call-by-value) cons nodes.

Testing and Submitting Your Program

The directory tests will contain Java, O'Caml and Scheme test files Assign3Test.java, assign3test.ml, and assign3test.ss containing some very simple unit tests for your program. While these files are being prepared, you can make minor modifications to the test files Assign2Test.java, assign2test.ml and assign2test.ss to create corresponding (and exceedingly incomplete) test files for this assignment.

Your program directory should only contain the files that you want to submit. We will subsequently provide you with test scripts similar to the ones we provided for Assignment 2. Running these scripts will alert you to any problems with your directory configuration. We will also place a few addtional test programs in the directory tests.

Make sure that your program passes the sample unit tests and works with our grading script before submitting it. You must provide a README file in the directory programs/3 with the following format:

The first two lines must contain the email addresses, one per line, of the two members of your team. If you are working alone, the second line must be blank. For example, if the two team members have email addresses cork@rice.edu and camus@rice.edu then the first two lines of the README file must be:
```
cork@rice.edu
camus@rice.edu
```
(in either order) and nothing else.
The remainder of the file must give the external documentation for the program including (i) a short contract for each of the methods in the intended user interface, (ii) an outline of the program, and (iii) a short description of the testing process you used to confirm the correctness of your program.

Any test data files for your program must be stored in the programs/3 directory. Please make sure you submit your unit tests! You cannot get credit for parts not submitted!

Each procedure or method in your program should include a short comment stating precisely what it does. For routines that parse a particular form of program phrase, give grammar rules describing that form.

To submit your program, make sure that everything that you want to submit is located in the directory programs/3 and type the command

turnin311 -t[scheme|java|ocaml] 3

from within the comp311/programs directory. For example, if you have an O'Caml version you would execute

turnin311 -tocaml 3

The command will inform you whether or not your submission succeeded. Only submit one copy of your program per team. If you need to resubmit an improved version your program, submit it from the same account as the original so that the old version is replaced.

Note if turnin311 is not on your path, you will need to add /home/comp311/bin to your path:

export PATH=$PATH:/home/comp311/bin

Back to course website

Assigned: Friday, September 24, 2004 Due: 12:00 noon, Friday, October 8, 2004

Files

Overview

General Implementation Issues

Testing and Submitting Your Program

Assigned: Friday, September 24, 2004
Due: 12:00 noon, Friday, October 8, 2004