Multi-stage Programming

Course #:  COMP 511
Instructor:  Walid Taha (DH 3103) and Emir Pasalic (DH 3075)
Class time:  MWF 1:00AM – 1:50AM (Office Hours:  M 2:00 – 3:00)
Class room: DH 1075

 
   

Introduction

Multi-stage programming is a new approach to building generic programs that do not pay a runtime overhead for their generality. Languages that support multi-stage programming allow us to perform symbolic computation and partial evaluation, making it easier to achieve the goals of multi-stage programming. Such languages are particularly useful for building efficient implementations of other programming languages, including domain-specific ones (DSLs).  Over the last few years, there has significant research into such languages, including issues such as their implementation, type systems, semantics, their use to define macros, and their applications.

The goal of this course is to introduce you to this new technology and to research in this area.  The course includes an introduction to the applications of these languages, the theoretical foundations, and implementation issues. Course work will include reading research papers (25%), discussions (10%), homework exercises (25%), a term project (25%), and a final (15%). Because of the fast pace of the course, late submissions will not be accepted.

Students are encouraged to discuss the material covered in class in homeworks on class mailing list (comp-511-L). Please make sure you subscribe to this list, as any important announcements to the class will be posted there.

Lectures

#

Date

Day

Topic

Details

Reading (Review due this class)

Exercises (Due this class)

Project activity (Due this class)

1

1/12/05

W

Course overview

Admin.  Staging, power ex. Lecture notes.

     

2

1/14/05 F

Basics of multi-stage programming

Why generation.  Why MSP notation and type systems. Lecture notes.

MSP Ch1

Read OCaml Ch1

 
  1/17/05 M No class.  
   

3

1/19/05 W  

Q & A (mostly on ML). Lecture notes.

MSP Ch2

Read OCaml Ch2. HWK1 (inc. OCaml problems)

 

4

1/21/05 F  

Renaming, staging, small examples. Lecture Notes.


Read OCaml Ch3

 

5

1/24/05 M  

Bigger example:  term rewriting. Lecture Notes.

MSP Ch3

   

6

1/26/05 W   Lecture Notes.      

7

1/28/05 F  

Language support for implementing domain-specific languages. Lecture notes.

MetaOCaml, Template Haskell, and C++

   

8

1/31/05 M   Lecture notes.      

9

2/2/05 W  

Abstract interpretation. Lecture notes.

Generating verifiable circuits


 

10

2/4/05 F   Lecture notes.      

11

2/7/05 M  

Implementing languages using MSP. Lecture notes.

A Gentle Intro


 

12

2/9/05 W

(Discussion of projects) Lecture notes.

    Survey of literature

13

2/11/05 F

Related systems

C-based


   

14

2/14/05 M     `C    

15

2/16/05 W   Lecture notes.

Cyclone


 

16

2/18/05 F          

17

2/21/05 M    

 

   

18

2/23/05 W   Lecture notes.      

19

2/25/05 F  

Java-based

 

   
20 2/28/05 M          

21

3/2/05 W   Lecture notes.

Tempo, Jumbo


 

22

3/4/05 F         Preliminary report
  3/7/05 M No class.        
  3/9/05 W No class.        
  3/11/05 F No class.        

23

3/14/05 M    

 

   

24

3/16/05 W     Metaphor, MetaAspectJ    

25

3/18/05 F

 

 

 


 

26

3/21/05 M          

27

3/23/05 W Foundations of MSP

MSP Ch4, Sound Reductions for Untyped CBN MetaML

   

28

3/25/05 F          

29

3/28/05 M  

 

 


 

30

3/30/05 W   MSP type systems MSP Ch5, Classifiers    

31

4/1/05 F  

 

   

32

4/4/05 M          

33

4/6/05 W  

RAP type systems

Generating heap-bounded programs  

  Final report
  4/8/05 F No class.        

34

4/11/05 M Macros        

35

4/13/05 W     An Empirical Analysis of C Preprocessor Use.,    

36

4/15/05 F      
 

37

4/18/05 M          

38

4/20/05 W    

Programmable Syntax Macros, Type-Safe, Generative, Binding Macros in MacroML

   

39

4/22/05 F

Reflections on program generation

       

40

4/25/05 M    

Beyond Templates,   A New Approach to Data Mining for Software Design

   

41

4/27/05

W

         

42 

4/29/05

         
               
               
           
 

Reviews

Project Deliverables

Survey of literature

Preliminary report

  • Use simple, short names.
  • Use carefully designed datatypes.
  • Use carefully chosen function/object abstraction boundaries.
  • Use the least amount of in-code documentation possible. In addition, the text should include a concise and self-contained description of the code. MSP and "Gentle Intro" can be used as a style guide. This means that code snippets can be inserted into the document, and all key parts of the code must be clearly explained.

Final report

List of Possible Projects

Suggested starting point:

Mads Sig Ager, Olivier Danvy, Henning Korsholm Rohde: On obtaining Knuth, Morris, and Pratt's string matcher by partial evaluation. ASIA-PEPM 2002: 32-46

Suggested starting points:

C. Consel and S.C. Khoo. Semantics-directed Generation of a Prolog Compiler, In 3rd International Symposium, PLILP'91, August 1991, Germany, LNCS 528, 135 - 146.

Hassan Ait-Kaci. Warren's Abstract Machine: A tutorial reconstruction.

Olivier Danvy, Bernd Grobauer, Morten Rhiger. A Unifying Approach to Goal-Directed Evaluation. SAIG 2001: 108-125

Suggested starting points:

Philip Wadler. GJ, Pizza, and Java.

Walid Taha. Method Invocation in Object-Oriented Languages.

Suggested starting point:

S. Thibault, C. Consel, J. Lawall, R. Marlet, and G. Muller. Static and Dynamic Program Compilation by Interpreter Specialization. Higher-Order and Symbolic Computation, 13(3):161--178, September 2000.

Suggested starting point:

D. McNamee, J. Walpole, C. Pu, C. Cowan, C. Krasic, A. Goel, P Wagle, C. Consel, G. Muller, and R. Marlet. Specialization tools and techniques for systematic optimization of system software. ACM Transactions on Computer Systems, 19(2):217--251, May 2001.

Suggested staring point:

S. Bhatia, C. Consel, A.-F. Le Meur, and C. Pu. Automatic Specialization of Protocol Stacks in OS kernels. In Proceedings of the 29th Annual IEEE Conference on Local Computer Networks, Tampa, Florida, November 2004.

Suggested starting points:

Peter Holst Andersen. Partial Evaluation Applied to Ray Tracing.

See section on ray tracing in Jones, N.D., Gomard, C.K, and Sestoft, P. Partial Evaluation and Automatic Program Generation. Prentice-Hall. 1993.

Homework Assignments

Homework 1

Download MetaOCaml, install it, read the README-META file, read and run the examples in the mex directory.

Look up the definitions of the list map and list fold functions. Stage map two times, once assuming the function arrives early and the second assuming the list arrives early. Stage fold assuming the function and constant arrive early, and then assuming the list arrives early. Collect timeing results, and explain carefully the effect of staging on these functions.

Make sure that you use different input values for your timing experiment. Come up with an small but interesting function that takes two inputs and stage it with respect to one of the inputs. Hand in staged program and performance data.

Homework 2

1) Consider the big-step semantics defined on page 75.

1.a) Write the full derivation of the evaluation of the term

e == ((\lambda f .\lambda x. f x ((\lambda z. f x z) x))
(\lambda a. \lambda b. a))
(\lambda r. r)

That is, find some e' and build the full derivation

e \hookrightarrow e'

It's best to do write this by hand. Don't waste time typesetting any of
this assignment.

1.b) We can define the set of values as follows:

v \in V ::= i | \lambda x.e

Prove that if e \hookrightarrow e' then e' \in V. Give full details
of each step in your proof.

2) Consider the type system in Figure 5.2 on page 80.

2.a) For the term "e" you evaluated above, find a typing assignment. That is,
find one (specific) \Gamma and one (specific) type t such that

\Gamma |- e : t

2.b) Consider only the subset without <>, ~, and run. (Note that this
means you can ignore levels -- they are always zero when there are no
multi-stage constructs). Prove the following substitution lemma:

- if \Gamma |- e_1 : t_1

and \Gamma x:t_1 |- e_2 : t_2

then \Gamma |- e_2 [x:=e_1] : t_2

Note that this proof only involves the type system.
Again, give full details in your proof.

3) Discuss the type systems for MetaML presented in Chapter 4 and 5.
For each type system, give two new examples term (that is, not discussed in
the class or the chapter). One term should be typeable, and the other
not typable. Explain why such a term would be useful, and explain why it
is either typable or untyable (that is, give typing derivations, or show
that no such typing derivations can exist).

4) (Optional) You may revise your review for Chapter 5 and submit it with
this homework.

Homework 3

Write the power function in Cyclone, and show what it compiles into.

Homework 4

Now you that have writen the power function in Cyclone, and show what it compiles into,
download the Cyclone system and check the correctness of your source-level definition (if it had
mistakes, also include the corrected version). Compare your compiled version based
on what you did by hand to the compiled version generated by the compiler.

We now know that Cyclone is called Popcorn. Popcorn can
be downloaded from: http://www.cs.cornell.edu/talc/releases.html

Homework 5

Implement the compilation strategy in the Popcorn paper for the full toy language described in the paper.
A set of examples will be issued during the week. Your compiler will be tested using these examples.

References

All primary references can be found via links in the course schedule.

Additional References

Resources

Current Class Projects


Previous Year’s Class Projects

Accomodations for Students with Special Needs

Students with disabilities are encouraged to contact me during the first two weeks of class regarding any special needs. Students with disabilities should also contact Disabled Student Services in the Ley Student Center and the Rice Disability Support Services.