Comp 515—Advanced Compilation for High Performance Computers
Syllabus
Prof. Ken Kennedy
Spring 2006
Introduction
Parallel and vector architectures. The problem of parallel programming. Bernstein's conditions and the role of dependence. Compilation for parallel machines and automatic detection of parallelism.
Dependence Theory
Fundamentals of dependence, dependence in loops, types of dependences. Direction vectors and distance vectors. Fundamental theorem of parallelization.
Dependence Testing
Testing for dependence: separable, gcd and Banerjee tests. Exact dependence testing. Construction of direction and distance vectors.
Preliminary Transformations
Loop normalization, scalar data flow analysis, induction variable substitution, scalar renaming.
Fine-Grain Parallel Code Generation
Loop distribution and its safety. The Kuck vectorization principle. The layered vector code-generation algorithm and its complexity. Loop interchange.
Coarse-Grain Parallel Code Generation
Loop Interchange. Loop Skewing. Scalar and array expansion. Forward substitution. Alignment. Code replication. Array renaming. Node splitting. Pattern recognition. Threshold analysis. Symbolic dependence tests. Parallel code generation and its problems.
Control Dependence
Types of branches. If conversion. Control dependence. Program dependence graph.
Memory Hierarchy Management
The use of dependence in scalar register allocation and management of the cache memory hierarchy. Topics include scalar replacement, unroll-and-jam, loop alignment, cache blocking, and prefetching.
Scheduling for Superscalar and Parallel Machines Machines
Role of dependence. List Scheduling. Software Pipelining. Work scheduling for parallel systems. Guided Self-Scheduling
Interprocedural Analysis and Optimization
Side effect analysis, constant propagation and alias analysis. Flow-insensitive and flow-sensitive problems. Side effects to arrays. Inline substitution, linkage tailoring and procedure cloning. Management of interprocedural analysis and optimization.
Compilation of Other Languages.
C, Verilog, Fortran 90, HPF.
Text
Allen and Kennedy, Optimizing Compilers for Modern Architectures, Morgan-Kaufmann, Second Printing, 2005.
References
Banerjee, Dependence Analysis, Kluwer Academic Publishers.
Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley, 1996.
Wolfe, Optimizing Supercompilers for Supercomputers, MIT Press.
Zima and Chapman, Supercompilers for Parallel and Vector Computers, ACM Press.
Course Requirements
Class Presentations (50%). Each student will be required to present at least one course topic and participate in the discussions of presentations by others.
Oral Examination (50%). There will be a final oral exam.