Guohua Jin and John Mellor-Crummey. Experiences Tuning SMG98 --- a Semicoarsening Multigrid Benchmark based on the hypre Library. Proceedings of the International Conference on Supercomputing, June 22-26, 2002, New York, New York, USA. [pdf]

LLNL's hypre library is an object-oriented library for the solution of sparse linear systems on parallel computers. While hypre facilitates rapid prototyping of complex parallel applications, our experience is that without careful attention to temporal data locality, node performance of applications developed using hypre will fall significantly short of peak performance on architectures based on modern microprocessors. In this paper, we describe our experiences analyzing and tuning the performance of SMG98, a benchmark that exercises hypre's semicoarsening multigrid solver. In the original code, the lack of temporal data reuse in the registers and caches significantly hurts performance. We describe a variety of techniques we applied to hand-tune the performance of hypre's semicoarsening multigrid solver. We expect that similar strategies will be applicable to other solvers and codes based on hypre as well. We present performance measurements of SMG98 on both SGI Origin and Compaq Alpha platforms. Overall, our optimizations improve the node performance of SMG98 by nearly a factor of two on large problems.