Multi-stage Programming Course #:  COMP 511 Instructor:  Walid Taha (DH 3103) and Emir Pasalic (DH 3075) Class time:  MWF 1:00AM – 1:50AM (Office Hours:  M 2:00 – 3:00) Class room: DH 1075

## Introduction

Multi-stage programming is a new approach to building generic programs that do not pay a runtime overhead for their generality. Languages that support multi-stage programming allow us to perform symbolic computation and partial evaluation, making it easier to achieve the goals of multi-stage programming. Such languages are particularly useful for building efficient implementations of other programming languages, including domain-specific ones (DSLs).  Over the last few years, there has significant research into such languages, including issues such as their implementation, type systems, semantics, their use to define macros, and their applications.

The goal of this course is to introduce you to this new technology and to research in this area.  The course includes an introduction to the applications of these languages, the theoretical foundations, and implementation issues. Course work will include reading research papers (25%), discussions (10%), homework exercises (25%), a term project (25%), and a final (15%). Because of the fast pace of the course, late submissions will not be accepted.

Students are encouraged to discuss the material covered in class in homeworks on class mailing list (comp-511-L). Please make sure you subscribe to this list, as any important announcements to the class will be posted there.

## Lectures

 # Date Day Topic Details Reading (Review due this class) Exercises (Due this class) Project activity (Due this class) 1 1/12/05 W Course overview Admin.  Staging, power ex. Lecture notes. 2 1/14/05 F Basics of multi-stage programming Why generation.  Why MSP notation and type systems. Lecture notes. MSP Ch1 Read OCaml Ch1 1/17/05 M No class. 3 1/19/05 W Q & A (mostly on ML). Lecture notes. MSP Ch2 Read OCaml Ch2. HWK1 (inc. OCaml problems) 4 1/21/05 F Renaming, staging, small examples. Lecture Notes. Read OCaml Ch3 5 1/24/05 M Bigger example:  term rewriting. Lecture Notes. MSP Ch3 6 1/26/05 W Lecture Notes. 7 1/28/05 F Language support for implementing domain-specific languages. Lecture notes. MetaOCaml, Template Haskell, and C++ 8 1/31/05 M 9 2/2/05 W Abstract interpretation. Lecture notes. Generating verifiable circuits 10 2/4/05 F 11 2/7/05 M Implementing languages using MSP. Lecture notes. 12 2/9/05 W (Discussion of projects) Lecture notes. Survey of literature 13 2/11/05 F Related systems C-based 14 2/14/05 M C 15 2/16/05 W Lecture notes. Cyclone 16 2/18/05 F 17 2/21/05 M 18 2/23/05 W 19 2/25/05 F Java-based 20 2/28/05 M 21 3/2/05 W 22 3/4/05 F Preliminary report 3/7/05 M No class. 3/9/05 W No class. 3/11/05 F No class. 23 3/14/05 M 24 3/16/05 W 25 3/18/05 F 26 3/21/05 M 27 3/23/05 W Foundations of MSP 28 3/25/05 F 29 3/28/05 M 30 3/30/05 W MSP type systems MSP Ch5, Classifiers 31 4/1/05 F 32 4/4/05 M 33 4/6/05 W RAP type systems Final report 4/8/05 F No class. 34 4/11/05 M Macros 35 4/13/05 W 36 4/15/05 F 37 4/18/05 M 38 4/20/05 W 39 4/22/05 F Reflections on program generation 40 4/25/05 M 41 4/27/05 W 42 4/29/05 F

## Reviews

• What's expected in a paper review/critique?
• The goal of the review/critique is to show that you have carefully read the paper (spend no more than 3 hours on any given paper, including writing the review), that you have understood it, and that you have learnt something from reading it. The best reviews show that you have clearly understood "the moral the story " that the paper tells, and have an appreciation for the details of the work.
• Finally, after you've considered all the above points for your critique, go back and make sure that it is written coherently and in a readable form. Unlike this unsorted list of points that should be contained in your review, your review should not read like a list. It should communicate your judgement about the work, and should use the above aspects to argue the validity of your judgment.
• What's not acceptable in a paper review/critique?
• The review is NOT a summary of the paper. It is NOT a list of ideas in the paper. The review should NOT quote verbatim from the text being read. If you feel you must quote the paper, then limit it to no more than three sentences, and put them in quotation marks ("..."). Your review should contain comments about which parts you found difficult to understand, but you must also include specific questions with such remarks.

## Project Deliverables

Survey of literature

• Make a web-page for your project, with a title, your name, a link to the class web-page. Use it to post all of the deliverables for the class. Make the deliverables for each phase available as one tar file.
• What artifact would you like to stage?
• Why would you expect staging to be useful/novel?
• Have similar applications been considered in the partial evaluation OR runtime code generation literature? If so, it is crucial that you find and discuss these related works carefully, and explain the novelty in your proposed work.
• The references we suggest for each project are only starting points. Your are expected to make extensive use of google, citeseer, and other search tools to find your way in the literature, read several papers, and develop an understand of what's out there that's related to what you are trying to do.
• This document should be prepare as a latex file, with a separate *.bib file containing the papers that you refer to in the body of the document.

Preliminary report

• This report should be an extension of survey, and should incorporate any feedback you received on the survey. The report should be one *.tex file and one *.bib file.
• Complete OCaml code for the unstaged program. This should be in a single file. Code must be executable. Code should be structured to be self explanatory:
• Use simple, short names.
• Use carefully designed datatypes.
• Use carefully chosen function/object abstraction boundaries.
• Use the least amount of in-code documentation possible. In addition, the text should include a concise and self-contained description of the code. MSP and "Gentle Intro" can be used as a style guide. This means that code snippets can be inserted into the document, and all key parts of the code must be clearly explained.
• Is there a natural baseline for this application?
• What will be the early parameters?
• What will be the later parameters?
• What kind of work will staging eliminate?
• Give concrete examples of static inputs, and write expected results of staged program.
• Are there places where you think BTA's may be needed? If so, list these places and explain which BTA's are needed or may be useful.

Final report

• Submit complete MetaOCaml code for unstaged and staged program. As for the previous report, ALL code should be in a single file, and the documentation in the report should follow the guidelines above. Additionally, now all staging issues that arise in the staging the functions must be carefully explained. The file containing the code must be executable under both bytecode and native code modes. For the native code mode, you should include a small make file that will produce the image. When executed, the program should function in its normal mode, but should offer a clear option for the programmer for running all the performance benchmarks that you describe in your report.
• Extensive test cases and benchmarks (one file). The report should contain concrete performance measurements gathered using one of the OCaml benchmarking functions, and a clear explanation of what each of the benchmarks does. The report should also report figures on appropriate baselines, at least for some of the benchmarks. The report should put forth a clear proposal for how these performance measurements should be interpreted.
• The root directly at the URL for the project should contain a file called "your-user-id-here.tar" containing everything in that directory. We will download this file after the end of the semester. Please do not remove this file until we have notified you that we have downloaded it.

## List of Possible Projects

• The Knuth-Morris-Pratt string matching algorithm. (Seth Fogarty)

Suggested starting point:

Mads Sig Ager, Olivier Danvy, Henning Korsholm Rohde: On obtaining Knuth, Morris, and Pratt's string matcher by partial evaluation. ASIA-PEPM 2002: 32-46

Suggested starting points:

C. Consel and S.C. Khoo. Semantics-directed Generation of a Prolog Compiler, In 3rd International Symposium, PLILP'91, August 1991, Germany, LNCS 528, 135 - 146.

Hassan Ait-Kaci. Warren's Abstract Machine: A tutorial reconstruction.

Olivier Danvy, Bernd Grobauer, Morten Rhiger. A Unifying Approach to Goal-Directed Evaluation. SAIG 2001: 108-125

Suggested starting points:

Philip Wadler. GJ, Pizza, and Java.

• A Java virtual machine interpreter and benchmarking it.

Suggested starting point:

S. Thibault, C. Consel, J. Lawall, R. Marlet, and G. Muller. Static and Dynamic Program Compilation by Interpreter Specialization. Higher-Order and Symbolic Computation, 13(3):161--178, September 2000.

• A network packet filter.

Suggested starting point:

D. McNamee, J. Walpole, C. Pu, C. Cowan, C. Krasic, A. Goel, P Wagle, C. Consel, G. Muller, and R. Marlet. Specialization tools and techniques for systematic optimization of system software. ACM Transactions on Computer Systems, 19(2):217--251, May 2001.

• A protocol stack.

Suggested staring point:

S. Bhatia, C. Consel, A.-F. Le Meur, and C. Pu. Automatic Specialization of Protocol Stacks in OS kernels. In Proceedings of the 29th Annual IEEE Conference on Local Computer Networks, Tampa, Florida, November 2004.

• A ray-tracing algorithm.

Suggested starting points:

Peter Holst Andersen. Partial Evaluation Applied to Ray Tracing.

See section on ray tracing in Jones, N.D., Gomard, C.K, and Sestoft, P. Partial Evaluation and Automatic Program Generation. Prentice-Hall. 1993.

## Homework Assignments

### Homework 1

Look up the definitions of the list map and list fold functions. Stage map two times, once assuming the function arrives early and the second assuming the list arrives early. Stage fold assuming the function and constant arrive early, and then assuming the list arrives early. Collect timeing results, and explain carefully the effect of staging on these functions.

Make sure that you use different input values for your timing experiment. Come up with an small but interesting function that takes two inputs and stage it with respect to one of the inputs. Hand in staged program and performance data.

### Homework 2

1) Consider the big-step semantics defined on page 75.

1.a) Write the full derivation of the evaluation of the term

e == ((\lambda f .\lambda x. f x ((\lambda z. f x z) x)) (\lambda a. \lambda b. a)) (\lambda r. r)

That is, find some e' and build the full derivation

e \hookrightarrow e'

It's best to do write this by hand. Don't waste time typesetting any of
this assignment.

1.b) We can define the set of values as follows:

v \in V ::= i | \lambda x.e

Prove that if e \hookrightarrow e' then e' \in V. Give full details
of each step in your proof.

2) Consider the type system in Figure 5.2 on page 80.

2.a) For the term "e" you evaluated above, find a typing assignment. That is,
find one (specific) \Gamma and one (specific) type t such that

\Gamma |- e : t

2.b) Consider only the subset without <>, ~, and run. (Note that this
means you can ignore levels -- they are always zero when there are no
multi-stage constructs). Prove the following substitution lemma:

- if \Gamma |- e_1 : t_1

and \Gamma x:t_1 |- e_2 : t_2

then \Gamma |- e_2 [x:=e_1] : t_2`

Note that this proof only involves the type system.
Again, give full details in your proof.

3) Discuss the type systems for MetaML presented in Chapter 4 and 5.
For each type system, give two new examples term (that is, not discussed in
the class or the chapter). One term should be typeable, and the other
not typable. Explain why such a term would be useful, and explain why it
is either typable or untyable (that is, give typing derivations, or show
that no such typing derivations can exist).

4) (Optional) You may revise your review for Chapter 5 and submit it with
this homework.

### Homework 3

Write the power function in Cyclone, and show what it compiles into.

### Homework 4

Now you that have writen the power function in Cyclone, and show what it compiles into,
mistakes, also include the corrected version). Compare your compiled version based
on what you did by hand to the compiled version generated by the compiler.

We now know that Cyclone is called Popcorn. Popcorn can

### Homework 5

Implement the compilation strategy in the Popcorn paper for the full toy language described in the paper.
A set of examples will be issued during the week. Your compiler will be tested using these examples.

## References

All primary references can be found via links in the course schedule.