Minimizing the synchronization that must be added to a loop, while
preserving the parallelism that can be extracted from it, is an NP-Complete
problem. After establishing this, I went on to develop a set of algorithms
for inserting synchronization. My first algorithm uses 0-1 integer
programming to insert the optimal amount of synchronization. Advances in
integer programming technology (see CPLEX
for instance) have made this approach feasible for most
cases. However, since there is a possibility that the solution may take
exponential time, I have developed a second algorithm. This algorithm
approximates the minimization problem as a simple problem on
interval
graphs, a class of graphs with special properties. Although this
algorithm may not insert the minimal amount of synchronization, it does
better than existing methods. It is also very fast, taking time close to
linear in the number of dependences. Since it is fast and improves upon the
state-of-the-art, it is a prime candidate for inclusion in commercial shared
memory compilers.
Relevant Publications: