Department of Computer Science

News

Mellor-Crummey Wins $2M DOE ECP Award

November 10, 2016

John Mellor-Crummey

John Mellor-Crummey, Professor of Computer Science and Electrical and Computer Engineering, recently won a Department of Energy (DOE) grant to fund research and development of performance measurement and analysis tools for forthcoming exascale supercomputers.

Exascale computers are needed to solve complex problems related to the DOE's research priorities, which include enhancing renewable energy resources, developing advanced materials, and designing experimental facilities. The goal for exascale systems is to perform one billion billion calculations in a second. (There are nine zeroes in one billion: 1,000,000,000. The number of calculations per second in an exascale system will contain 18 zeroes.)

No single processor could ever compute this fast, so researchers depend on parallel computing to partition calculations across multiple processors working together. Such systems require a complex software stack to provide the instructions that direct each of the processors to solve specific parts of large-scale problems.

"Understanding how to tailor applications to exploit massive parallelism and complex hardware resources within and across nodes is a problem that must be solved if we are to harness the power of exascale systems," said Mellor-Crummey.

One of the daunting challenges facing future generations of supercomputers is energy consumption. To make exascale supercomputers possible, computer scientists are struggling to balance the need for increased computational power against the need for energy efficiency. The DOE created the Exascale Initiative to meet the growing demand for computing power, with the aim of deploying capable exascale computing systems by 2023. To this end, the DOE's Exascale Computing Project (ECP) is funding efforts like Mellor-Crummey's to develop the technologies necessary to make exascale computing possible.

Mellor-Crummey said that his group's work as part of the exascale computing initiative is focused on extending Rice University's HPCToolkit software, which is used to analyze application performance on systems around the world ranging from desktops to supercomputers. "Today, HPCToolkit is used to pinpoint performance and scalability bottlenecks in a diverse set of open science and national security applications," said Mellor-Crummey. "Our proposal seeks to enhance HPCToolkit to meet the needs of developers working at all levels of the software stack on emerging and future extreme-scale systems."

The $2M DOE ECP award will be used by Mellor-Crummey and his collaborators at the University of Wisconsin-Madison to develop a new generation of tools that can cope with the growing complexity of architectures used for extreme-scale computing as well as dramatic changes to the software stack. "Providing insights into the behaviors of these emerging components will require integration of performance monitoring interfaces to support observation and interpretation," he said.

The goal of a thousand-fold increase in processing power (from the current petascale to the projected exascale) can be met only by changing approaches to both hardware and software. Mellor-Crummey said, "The extreme parallelism of exascale systems both within and across nodes will require redesign of tool capabilities for monitoring, data collection, data analysis, and presentation. Preparing tools for exascale must begin immediately, otherwise tools will never be able to manage the complexity and scale of the hardware and software of next-generation systems."