COMP 311 Principles of Programming Languages (Fall 2004)

Programming Assignment 2: Three Interpreters for Jam

Assigned: Monday, September 10, 2004
Due: 12:00 noon, Wednesday, September 22, 2004

Files

Supporting Code for

Some Simple Test Inputs

Overview

Preparation Create a programs/2 directory within your comp311 directory. All of your files for this assignment should be stored in the programs/2 subdirectory.

Teamwork You are required to do this assignment and all other programming assignments using pair programming. When you submit your assignment, indicate the composition of your team (names, student ids, and email addresses) in your README file.

If you cannot find a partner and would like one, send an email to comp311@rice.edu and we will try to find one for you. Teams of more than two members are not permitted.

Task Your task is to write three interpreters for Jam: one that performs call-by-value (big-step) evaluation, one that performs call-by-name evaluation, and one that performs call-by-need evaluation. The input to your interpreter will be a Jam program in the concrete syntax defined in Assignment 1. We are providing parsers written in Java, O'Caml and Scheme that accept this syntax, but you are encouraged to use your solution to Assignment 1 as the parser for this assignment. If you use your own parser, make sure that the map construct accepts an empty parameter list. The Jam grammar is easier to read if the the last line of the definition of <exp> is revised to read

map { <prop-id-list> } to <exp>

where the curly braces { } state that the <prop-id-list> is optional. You may use the same lexer with your parser that you used in Assignment 1.

Call-by-value and call-by-name We will defer the discussion of call-by-need evaluation until the next section. Call-by-need evaluation is a simple an "optimization" call-by-name evaluation, so we must explain call-by-name evaluation first. The difference between call-by-value evaluation call-by-name evaluation is straightforward. Consider a substitution-based (syntactic) semantics for Jam analogous to the one we gave in Lecture 4 for the language LC. In call-by-value evaluation, the arguments in a procedure application (class App in the Java representation of Jam abstract syntax) are reduced to values before the arguments are substituted into the procedure body. In call-by-name evaluation, the argument expressions are substituted into the procedure body without being evaluated! These argument expressions are evaluated if and when their values are demanded during the evaluation of the body. This difference is codified by the differences in the reduction rules for procedure applications.

In call-by-value,

(map to

where $V_1 \; \dots \; V_n$ are values. Hence the procedure body in an application is not evaluated until the arguments are reduced values.

In call-by-name,

(map to

where $M_1 \; \dots \; M_n$ are expressions. This rule performs the substitution immediately rather than reducing $M_1, \dots, M_n$ to values first.

Since call-by-name defers the evaulation of arguments in procedure applications, some of those arguments may never be evaluated in some cases. As a result, there are Jam programs that converge to an answer for call-by-name but diverge for call-by-value (because some argument in a procedure application diverges that is never demanded in call-by-name evaluation).

For the sake of efficiency, your interpreter should maintain an environment of variable bindings instead of performing explicit substitutions. In the environment, each variable must be bound to a representation that can be evaluated on demand. Warning: beware of scoping problems in your call-by-name interpreter. Make sure that you evaluate each argument expression in the proper environment. Hint: it may be helpful to think of each argument expression is a procedure of no arguments.

If you are doing the assignment in Java, we recommend that you use the generic PureList class in PureList.java to represent an environment. If you are doing the assignment in Ocaml or Scheme, you may represent environments either as lists or as functions.

Call-by-need The call-by-name approach to evaluation is wasteful because it re-evaluates an argument expression every time the corresponding parameter is evaluated. By using the list representation for environments and performing destructive operations on data, it is possible to eliminate this wasteful practice and only evaluate each argument expression once, the first time its value is demanded. This evaluation protocol is called call-by-need. For functional languages like Jam, there is no difference in the observable behavior or call-by-name and call-by-need, except that the latter is generally more efficient.

Testing You are expected to write unit tests for all of your non-trivial methods or functions. For more information, refer back to Assignment 1. Since their is no straight-forward way to inspect closures or environments, the direct output of environments or closures will not be tested. We can still however test aspects of a closure. For example: arity(map x to x).

Scheme Version If you do the assignment in Scheme, your program should be stored in the file eval.ss. It should support two publicly visible functions: make-eval-string and make-eval-file. make-eval-string takes in a string representing the JAM program. make-eval-file takes in a filename specifying the input file. Both return a procedure that when applied to the argument 'value evaluates the program using call-by-value, when applied to the argument 'name evaluates the program using call-by-name, and when applied to the argument 'need evaluates the program using call-by-need.

make-eval-file: string -> (mode symbol -> jam Value)
make-eval-string: string -> (mode symbol -> jam Value)

In the case of an evaluation error, you must error with a message containing 'eval'. For example: "(error 'biop-eval "too many arguments")" Also, lexer errors contain 'lexer', parser errors container 'parser'.

Java Version If you do the assignment in Java, the visible part of your program should include the toplevel class: Interpreter containing the methods JamVal callByValue(), JamVal callByName(), and JamVal callByNeed(). The interface JamVal is defined in the provided parser library files.

The Interpreter class should support two constructors: Interpreter(String filename) which takes input from the specified file and Interpreter(Reader inputStream) which takes input from the specified Reader. Store your source program in a file named Interpreter.java. In summary:

/** file Interpreter.java **/
class EvalException extends RuntimeException {
    EvalException(String msg) { super(msg); }
}

class Interpreter {
   Interpreter(String fileName) throws IOException;
   Interpreter(Reader reader);

   public JamVal callByValue();
   public JamVal callByName();
   public JamVal callByNeed();
}

O'Caml Version If you do this assignment in O'Caml, your program should be stored in the file eval.ml. It should support two publially visible function: eval_file and eval_string. eval_file takes in a a filename specifying the input file. eval_string takes in a string representing the JAM program to evaluate. Both procedures then take in the mode evaluation and return a value. You will need to extend the definition of value below to include any other JAM values you will use in the assignment. In summary:

exception EvalError of string;;

type mode = VALUE | NAME | NEED;;
type value =
   VNum of int
 | VList of value list
 | VBool of bool
 | VPrim of string
 | ... (* add your own types here *)
;;

eval_file: string -> (mode -> value)
eval_string: string -> (mode -> value)

let rec string_of_value = fun v ->
  match v with
  | VNum(i) -> string_of_int i
  | VBool(true) -> "true"
  | VBool(false) -> "false"
  | VList([]) -> "()"
  | VList(v::vlr) ->
    let rec string_of_list =
      fun l -> (
                 match l with
                   | [] -> ""
                   | f::r -> " " ^ (string_of_value f) ^ (string_of_list r)
                 )
    in "(" ^ (string_of_value v) ^ (string_of_list vlr) ^ ")"
  | VPrim(s) -> s
  | _ -> "Output not supported. If you want to test output of other things, please use a function with a name other than string_of_value."
;;

We use the string_of_value function in the unit testing of your solutions, so please make sure that you support this function. It does not output maps; if you want to do that for testing purposes, please create a different function. Please do not add it to string_of_value.

Choice of Language Generic Java, Scheme, and O'Caml are the only languages permitted for this assignment.

Input Language Specification

The following paragraphs define the Jam subset that your interpreter should handle and the new parser's input language.

Your interpreter should accept the Jam language except:

the unary operators + (leaving - and ~), and
the binary operators >, >=, != (leaving +, -, /, *, =, <,<=,&, | ).

If you encounter an unsupported operator, you may raise an evaluation exception/error. Note that all of the primitive Jam operations, namely cons?, null? number?, function?, arity, list?, cons, first, and rest are included in the subset.

The supported operators have exactly the same meaning as their OCaml counterparts (where the unary operator ~ is identified with the OCaml not function, and the binary operator & and | are identified with the OCaml && and || operators) with three exceptions:

/ means the integer division,
= and != means equality on arbitrary objects. If either value contains a function, then return false. By default, equality means structural equality, which is what is meant here.

The meanings of the primitive functions are as follows:

cons?: p -> whether p was created by cons().
cons?(null) = false
null?: p -> whether p is null.
number?: p -> whether p is a number
function?: p -> whether p is a function (ie a map or a primitive op)
arity: p -> how many parameters does the function/primitive p take.
Evaluation error if a non-function is passed in.
list?: p -> whether p is a list (anything constructed by cons(), or null)
cons: p, q -> a cons (a list entry).
p can be anything.
q must be a list (something created by cons() or null); Evaluation error otherwise.
first: p -> The first element of list p
Evaluation error if p was not created by cons() (ie p is not a list or p is null)
rest: p -> The list p with the first element removed
Evaluation error if p was not created by cons() (ie p is not a list or p is null)

The language constructs if, map, let, and application in JAM have the same meaning as if,fun,let, and application in OCaml with one exception. The call-by-name version of Jam uses call-by-name semantics for procedure application. When Jam if is given a non-boolean input in the test position, it aborts.

Your interpreter should abort by raising the OCaml exception EvalError, calling the Scheme procedure error or throwing an EvalException in Java. The Scheme error procedure is documented in the online DrScheme Help facility. You can search for the term "error" in the dialog box at the bottom of the Help Desk. The "error" procedure in Full Scheme (which takes 3 arguments) is incompatible with the "error" procedure in the beginning/intermediate/advanced subsets supported by DrScheme. A good strategy is use Full Scheme (with case-sensitivity) and include the line

(require-library "core.ss")

at the beginning of your program. This library defines support for all the primitives in the dialect of Scheme taught in Comp 210.

The Jam let construct is simply an abbreviation for the corresponding lambda application. Hence

let := ; := ; in

has exactly the same meaning as

map to )()

Hence, a let is a combination of a map and an application. Note that the semantics of let depend on whether the interpreter is using call-by-value or call-by-name.

Lists are created using cons and null, where null is the empty list. The one-element list (1) is created using cons(1,null). Since the constructs returned by cons(...) and null are lists, list? returns true for both of them. The following program evaluates to (true true):

cons(list?(cons(5,null)),cons(list?(null),null))

and and or short-circuit even in call-by-value. That means they look just at their first operand and try to determine the result; only if they cannot determine it from just the first operand do they look at the second at the second operand. For example, true | x evaluates to true regardless of what value x has.

if works in a similar way: It evaluates the test expression. If it is true, it evaluates the consequence expression; if it is false, it evaluates the alternative expression. if never evaluates both the consequence and alternative expressions.

Representation of Program Values

A Jam program manipulates data values that are either numbers, booleans, lists, or procedure representations (closures). To standardize the visible behavior of Jam interpreters, we stipulate that your interpreters must observe the following conventions concerning the representation of Jam values in Java, O'Caml, and Scheme: Jam values must be represented using the JamVal interface and subclasses as defined in each of the three parser library files.

In Scheme and OCaml, you must represent Jam integers and booleans as Scheme and OCaml integers and booleans.

Testing and Submitting Your Program

The files tests/in1 and tests/in2 contain sample input programs for call-by-value. tests/name.in1 and tests/name.in2 contain samples for call-by-name and call-by-need. The files tests/out1 and tests/out2 contain the corresponding answer computed by the program as formatted by the standard Scheme printer in the DrScheme interactions window.

Make sure that your program produces the same output (ignoring output formatting differences in Scheme, O'Caml, and Java) on this sample input before submitting it. Create a README file in the your directory program/2 that

gives the names and userids of both team members (followed by an empty line)
outlines the organization of your program, and
specifies what testing process you used to confirm the correctness of your program.

Your test data files should be stored in the programs/2 directory.

Each procedure or method in your program should include a short comment stating precisely what it does. For routines that parse a particular form of program phrase, give grammar rules describing that form.

To submit your program, make sure that everything that you want to submit is located in the directory programs/2 and type the command

turnin311 -t[scheme|java|ocaml] 2

from within the comp311/programs directory. For example, if you have an O'Caml version you would execute

turnin311 -tocaml 2

The command will inform you whether or not your submission succeeded. Only submit one copy of your program per team. If you need to resubmit an improved version your program, submit it from the same account as the original so that the old version is replaced.

Note if turnin311 is not on your path, you will need to add /home/comp311/bin to your path:

export PATH=$PATH:/home/comp311/bin

Implementation Hints

In your call-by-name interpreter, the argument expressions must be evaluated in the environment corresponding to their lexical context in the program. The free variables in argument expressions behave exactly like the free variables in functions (procedures).

The form of your interpreters should be very similar to the environment-based interpreters presented in class. In OCaml and Scheme, you should represent ground values (non-functions) by their Ocaml or Scheme equivalents, and functions either as explicit closure representations or OCaml or Scheme procedures. In Java, you should represent ground values by their Oject equivalents (where lists are represented using the PureList class in the parser library file) and functions either as closure objects or as anonymous inner classes (akin to Scheme procedures). Implement each of the primitives using the corresponding OCaml, Scheme or Java primitive when possible. If you represent functions as explicit closures in Ocaml or Scheme, or either form in Java, then you will have to implement the function? and arity primitives on your own (which is straightforward).

Your interpreters should return the Scheme, O'Caml, or Java data values representing the answers to the specified Jam computations.

Note that unary and binary operators are not values while primitives are. Unary and binary operators are not legal expressions; hence, they can never be used as values. In addition, note that the binary operators & and | are special: they do not evaluate their second arguments unless it is necessary.

In writing your interpreters: follow the definition of the data. Your interpreters should contain a separate clause for each abstract syntax constructor. If you write the interpreters in Java, you should express it using the visitor pattern.

Remember to "factor out" repeated code patterns as you learned in Comp 210 and Comp 212.

Implement a very small subset of the primitives and operators first and get everything debugged. Then add support for the remaining ones.

Back to course website