Ram Rajamony: Optimally synchronizing loops

Optimally Synchronizing Loops


When a compiler parallelizes a sequential loop for execution on a shared memory multiprocessor, it must add synchronization to enforce data dependences. As synchronization is a high overhead operation, a compiler should insert as little of it as possible. In the process of developing the algorithms that I use in my dissertation research to detect excess synchronization, I discovered that current methods, originating from the mid-80s, add more synchronization than is necessary.

Minimizing the synchronization that must be added to a loop, while preserving the parallelism that can be extracted from it, is an NP-Complete problem. After establishing this, I went on to develop a set of algorithms for inserting synchronization. My first algorithm uses 0-1 integer programming to insert the optimal amount of synchronization. Advances in integer programming technology (see CPLEX for instance) have made this approach feasible for most cases. However, since there is a possibility that the solution may take exponential time, I have developed a second algorithm. This algorithm approximates the minimization problem as a simple problem on interval graphs, a class of graphs with special properties. Although this algorithm may not insert the minimal amount of synchronization, it does better than existing methods. It is also very fast, taking time close to linear in the number of dependences. Since it is fast and improves upon the state-of-the-art, it is a prime candidate for inclusion in commercial shared memory compilers.

Relevant Publications:


<- Back to the research summary page
Ramakrishnan Rajamony
E-mail: (MyLastName) at us.ibm.com [please do not e-mail me at cs.rice.edu]
Last updated at 14:08 CST on Wednesday, February 04, 1998