Overview of the dHPF Compiler


Project Leaders:
Ken Kennedy, John Mellor-Crummey and Vikram Adve

Current Participants:
Arun Chauhan, Chen Ding, Katherine Fletcher, Robert Fowler, Charles Koelbel, Bo Lu, Collin McCurdy, Nat McIntosh, Monika Mevenkamp, Dejan Mircevski, Mike Paleczny, Ajay Sethi, Lisa Thomas, Lei Zhou


Parallel Compiler and Tools Group
Center for Research on Parallel Computation
Rice University


Group Mission:
Develop an integrated compiler and tools that support effective machine-independent parallel programming

Last updated May 1996 by johnmc@cs.rice.edu




Contents:


Motivation

HPF compiler technology must be expanded in 3 directions:



Compiler Goals

  • Extensible platform for experimental research on compilation techniques and programming tools

  • Hand-coded (or better) performance
  • Optimization for multiple architecture classes
  • Programming tools that fully support abstract, high-level programming models



    Fortran D95 Language

    Fortran D95 is designed to support research on data-parallel programming in High Performance Fortran (HPF) and to explore extensions that would broaden HPF's applicability or enhance performance.


    Features




    dHPF Compiler Organization

    Front End
        v
    Preliminary Communication Placement    \
        v                                   |  Parallelism and
    Computation Partitioning                |  Communication Placement
        v ^                                 |
    Communication Refinement               /
        v
    Code Generation


    Front End

    Purpose: parsing, semantic checking of HPF directives, and preprocessing code for further analysis

    Limitations (May 1996)



    Preliminary Communication Placement

    Purpose: provide feedback to the computation partitioner about where (conservatively) communication might be needed

    Strategy

    Limitations (May 1996)

  • no inspector placement for irregular data accesses



    Computation Partitioning Selection

    Purpose: a framework to evaluate and select from several computation partitioning alternatives, not restricted to the owner-computes rule.
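
To illustrate why looking beyond the owner-computes rule can pay off, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not dHPF's actual cost model: the statement `A(i) = B(i+1) + C(i+1)`, the arrays declared `(1:100)` and BLOCK-distributed over 4 processors, and a cost that simply counts non-local references. The names `owner` and `nonlocal_refs` are hypothetical.

```python
# Hypothetical setup: arrays A, B, C are declared (1:100) and BLOCK-distributed
# over 4 processors (ids 0..3), so each processor owns 25 consecutive elements.
BLOCK = 25


def owner(index):
    """Processor that owns element `index` (1-based) under the assumed BLOCK layout."""
    return (index - 1) // BLOCK


def nonlocal_refs(home_offset, ref_offsets, lo=1, hi=99):
    """Count non-local references if iteration i runs on the owner of element i + home_offset."""
    count = 0
    for i in range(lo, hi + 1):
        executing = owner(i + home_offset)
        count += sum(1 for off in ref_offsets if owner(i + off) != executing)
    return count


# Statement: A(i) = B(i+1) + C(i+1), so the references sit at offsets 0, +1, +1.
REFS = [0, 1, 1]

# Two candidate partitionings: run iteration i on the owner of A(i)
# (owner-computes), or on the owner of B(i+1).
costs = {offset: nonlocal_refs(offset, REFS) for offset in (0, 1)}
best = min(costs, key=costs.get)
```

Under these assumptions, running each iteration on the owner of `B(i+1)` turns two non-local reads per block boundary into one non-local write, so the alternative partitioning is the cheaper of the two candidates.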

    Limitations (May 1996)



    Communication Refinement

    Purpose: given a chosen computation partitioning (CP), determine and optimize the communication it requires [example]

    Limitations (May 1996)



    Code Generation Overview

    Purpose: generate an SPMD node program (Fortran 77 or Fortran 90)
    [source for running example]

    Partitioning and communication code generation are based on the Omega Library for integer set manipulation



    Code Generation for Computation Partitioning



    Omega Library (University of Maryland)



    Omega-Based Framework for Data Parallel Code Generation



    Omega-Based Framework: Example

    Example: SEND set from processor MYPID to processor Q:
    
      #   !HPF$ distribute A(BLOCK) on P(4)      ! sic
      #   do i = 1, 100
      #      ... = A(i-1) + ...                  ! non-local read
      #   enddo
    	
      symbolic MYPID, Q
      
      Loop             := { [i] : 1 <= i <= 100 }
      RefSubscript     := { [i] -> [i-1] }
      # assumed: A(1:100) and MYPID in 0..3, so each processor owns 25 elements
      MyArraySection   := { [i] : 25 * MYPID + 1 <= i <= 25 * MYPID + 25 }
      
      LocalIterSetForQ := { [i] : iter i is executed by processor `Q'}
      ReadSectionForQ  := RefSubscript(LocalIterSetForQ)    # composition
    
      SendToQ          := ReadSectionForQ intersection MyArraySection
           
      codegen SendToQ
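
The set computation above can be mimicked with ordinary Python sets, as a rough sketch rather than a use of the Omega Library itself. The assumptions are spelled out because the example leaves them open: `A` is declared `(1:100)`, processors are numbered 0..3, each owns 25 consecutive elements, and iteration `i` is executed by the owner of `A(i)`. The function names are hypothetical.

```python
BLOCK = 25  # assumed: 100 elements, 4 processors, 25 elements each


def local_array_section(p):
    """Elements of A owned by processor p (assumed A(1:100), ids 0..3)."""
    return set(range(BLOCK * p + 1, BLOCK * p + BLOCK + 1))


def local_iter_set(q):
    """Iterations executed by q, assuming iteration i runs on the owner of A(i)."""
    loop = set(range(1, 101))              # Loop := { [i] : 1 <= i <= 100 }
    return loop & local_array_section(q)


def send_set(mypid, q):
    """Elements that mypid owns and q reads through the reference A(i-1)."""
    read_section = {i - 1 for i in local_iter_set(q)}   # RefSubscript applied to q's iterations
    return read_section & local_array_section(mypid)
```

With these assumptions, `send_set(0, 1)` contains only element 25: processor 1 executes iterations 26..50, reads `A(25)` through `A(49)`, and only `A(25)` lives on processor 0. A caller would skip the case `q == mypid`, where the intersection describes local reads rather than messages.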
    



    Optimizations using the Omega Framework - I

    Loop-splitting to minimize guards for buffer access on potentially non-local references (if iterations can be re-ordered):
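
A minimal Python sketch of the idea, under assumptions chosen for illustration: a processor owns a non-empty block of `A` starting at global index `lo`, the loop body reads `A(i-1)`, and the one non-local value has already been received into a buffer keyed by global index. The function names `guarded_loop` and `split_loop` are hypothetical.

```python
def guarded_loop(local, lo, buf):
    """Every iteration tests whether A(i-1) is local or must come from the buffer."""
    hi = lo + len(local) - 1
    out = []
    for i in range(lo, hi + 1):
        j = i - 1
        if lo <= j <= hi:          # guard evaluated on every iteration
            value = local[j - lo]
        else:
            value = buf[j]
        out.append(2 * value)
    return out


def split_loop(local, lo, buf):
    """Same result after splitting: one peeled iteration reads the buffer, the rest need no guard."""
    hi = lo + len(local) - 1
    out = [2 * buf[lo - 1]]        # iteration i = lo is the only one that reads non-local data
    for i in range(lo + 1, hi + 1):
        out.append(2 * local[i - 1 - lo])   # always local, so no guard is needed
    return out
```

Both functions compute the same values; the split version confines the buffer access to the single boundary iteration, which is legal only because the iterations are independent and may be regrouped.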



    Optimizations using the Omega Framework - II

    Loop-splitting for computation-communication overlap (if iterations can be re-ordered):
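
The following Python sketch shows the ordering this optimization aims for, not the generated SPMD code. A worker thread stands in for a non-blocking receive, and `fetch_boundary` is a hypothetical callable that returns the single remote value a non-empty block needs. The interior iterations, which touch only local data, run while that value is still in flight.

```python
from concurrent.futures import ThreadPoolExecutor


def overlapped_sums(local, fetch_boundary):
    """Compute out[k] = left_neighbour(k) + local[k], where only k = 0 needs remote data."""
    n = len(local)
    out = [None] * n
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(fetch_boundary)      # stand-in for posting a non-blocking receive
        # Interior iterations use local data only, so they overlap with the transfer.
        for k in range(1, n):
            out[k] = local[k - 1] + local[k]
        # The boundary iteration is deferred until the remote value has arrived.
        out[0] = pending.result() + local[0]
    return out
```

For example, `overlapped_sums([1, 2, 3], lambda: 10)` waits for the remote value only after the two interior sums are done. As with guard removal, the reordering is valid only when the iterations carry no dependences on one another.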



    Compiled Examples




    http://www.cs.rice.edu/~dsystem/dhpf/dhpf-overview-96/index.html