John Mellor-Crummey's Publications by Area
[Note: If you are interested in a copy of a paper or publication that
is not available for download below, email me and I'll be happy to
provide it.]
Table of Contents
Data Parallel Compiler and Run-time Technology
Parallel Programming Environments
Memory Hierarchy Management
Grid Computing
Compiler Technology for Domain-Specific Languages
Parallel Debugging
Multiprocessor Synchronization
Performance Modeling
Performance Analysis
Compiler Technology
Parallel Applications
Operating Systems
Architectures
Data Parallel Compiler and Run-time Technology
-
Cristian Coarfa, Yuri Dotsenko, John Mellor-Crummey. Experiences with Sweep3D Implementations in Co-array Fortran.
In Proceedings of the Los Alamos Computer Science Institute 5th Annual Symposium (LACSI 2004),
Santa Fe, New Mexico, October 2004.
[abstract]
[pdf]
-
Yuri Dotsenko, Cristian Coarfa, John Mellor-Crummey, Daniel Chavarria-Miranda. Experiences with Co-array Fortran on Hardware Shared Memory Platforms. In Proceedings of the 17th International Workshop on
Languages and Compilers for Parallel Computing (LCPC 2004), West Lafayette, Indiana, September 2004.
[abstract]
[pdf]
-
Yuri Dotsenko, Cristian Coarfa, and John Mellor-Crummey.
A Multi-platform Co-Array Fortran Compiler.
In Proceedings of the 13th International Conference on
Parallel Architecture and Compilation Techniques (PACT 2004),
Antibes Juan-les-Pins, France, September, 2004.
[abstract]
[pdf]
-
Cristian Coarfa, Yuri Dotsenko, Jason Eckhardt, and John Mellor-Crummey.
Co-Array Fortran Performance and Potential: An NPB Experimental Study.
Proceedings of the 16th Intl. Workshop on Languages and Compilers for Parallel Computing,
College Station, TX, Oct 2003.
[abstract]
[pdf]
-
Alain Darte, John Mellor-Crummey, Robert Fowler, and Daniel
Chavarria-Miranda.
Generalized multipartitioning of multi-dimensional arrays for parallelizing
line-sweep computations.
Journal of Parallel and Distributed Computing
63(9), September 2003, Pages 887-911.
[abstract]
[pdf]
-
Daniel Chavarria-Miranda and John Mellor-Crummey.
An Evaluation of Data-Parallel Compiler Support for Line-Sweep Applications.
The Journal of Instruction-Level Parallelism, vol. 5, February 2003
(http://www.jilp.org/vol5).
Special issue with selected papers from:
The Eleventh International Conference on Parallel Architectures and Compilation Techniques, September 2002.
Guest Editors: Erik Altman and Sally McKee. [pdf].
-
Cristian Coarfa, Yuri Dotsenko, Daniel Chavarria-Miranda, and John Mellor-Crummey.
An Emerging Co-Array Fortran Compiler.
Extended poster abstract.
Proceedings of the Los Alamos Computer Science Institute Third Annual Symposium,
October, 2002, Santa Fe, NM. Published on CD-ROM.
[abstract]
[pdf]
-
Daniel Chavarria-Miranda and John Mellor-Crummey.
An Evaluation of Data-Parallel Compiler Support for Line-Sweep Applications.
Proceedings of PACT'02: Eleventh International Conference on Parallel
Architectures and Compilation Techniques, September 2002,
Charlottesville, VA.
Selected as Best Student Paper.
[pdf]
-
Daniel Chavarría-Miranda, Alain Darte,
Robert Fowler, and John
Mellor-Crummey. Generalized Multipartitioning
for Multi-dimensional Arrays. In
Proceedings of International Parallel and Distributed Processing
Symposium, Fort Lauderdale, FL, April 2002. Selected as Best
Paper in Algorithms.
[abstract],
[ps],
[pdf]
-
John Mellor-Crummey, Vikram Adve, Bradley Broom, Daniel
Chavarria-Miranda, Robert Fowler, Guohua Jin, Ken Kennedy, and Qing Yi. Advanced
optimization strategies in the Rice dHPF compiler.
Concurrency and Computation: Practice
and Experience. 14:741-767, 2002.
[abstract],
[ps],
[pdf]
-
Daniel Chavarría-Miranda, Alain Darte, Robert Fowler, and John Mellor-Crummey.
On efficient parallelization of line-sweep computations. Research Report
2001-45, Laboratoire de l'Informatique du Parallélisme, École
Normale Supérieure de Lyon, November 2001.
-
Alain Darte, John Mellor-Crummey, Robert Fowler, and Daniel
Chavarría-Miranda. Generalized multipartitioning. In Proceedings of the Los
Alamos Computer Science Institute Second Annual Symposium, Santa Fe,
NM, October 2001.
-
Daniel Chavarría-Miranda, John Mellor-Crummey, and Trushar Sarang.
Data-parallel compiler support for multipartitioning. In European Conference
on Parallel Computing (Euro-Par), Manchester, United Kingdom, August
2001.
-
Alain Darte, John Mellor-Crummey, Robert Fowler, and Daniel
Chavarría-Miranda. On efficient parallelization of line-sweep computations. In
9th Workshop on Compilers for Parallel Computers, Edinburgh, Scotland,
June 2001.
-
Bradley Broom, Daniel Chavarría-Miranda, Guohua Jin, Rob Fowler,
Ken Kennedy, and John Mellor-Crummey. Overpartitioning with the Rice dHPF
compiler. In Proceedings of the 4th Annual HPF User Group meeting,
Tokyo, Japan, October 2000.
-
Kai Zhang, John Mellor-Crummey, and Robert Fowler. Compilation and runtime
optimizations for software distributed shared memory. In Proceedings
of the Fifth Workshop on Languages, Compilers, and Runtime Systems for
Scalable Computers, Lecture Notes in Computer Science 1915, pages 182-191,
Rochester, NY, May 2000. Springer-Verlag.
-
Daniel Chavarría-Miranda and John Mellor-Crummey. Towards compiler
support for scalable parallelism. In Proceedings of the Fifth Workshop
on Languages, Compilers, and Runtime Systems for Scalable Computers,
Lecture Notes in Computer Science 1915, pages 272-284, Rochester, NY, May
2000. Springer-Verlag.
-
John Mellor-Crummey and Vikram Adve. Simplifying control flow in compiler-generated
parallel code.
International Journal of Parallel Programming, 26(5),
1998.
-
Collin McCurdy and John Mellor-Crummey. An evaluation of computing paradigms
for n-body simulations on distributed memory architectures. In Proceedings
of the Seventh ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming, May 1999.
-
Vikram Adve, Guohua Jin, John Mellor-Crummey, and Qing Yi. High Performance
Fortran Compilation Techniques for Parallelizing Scientific Codes. In Proceedings
of SC98: High Performance Computing and Networking, Orlando, FL, Nov
1998.
-
Vikram Adve and John Mellor-Crummey. Using Integer Sets for Data-Parallel
Program Analysis and Optimization. In Proceedings of the SIGPLAN '98
Conference on Programming Language Design and Implementation, Montreal,
Canada, June 1998.
-
Bo Lu and John Mellor-Crummey. Compiler optimization of implicit reductions
for distributed memory multiprocessors. In Proceedings of the 12th International
Parallel Processing Symposium, Orlando, FL, March 1998.
-
John Mellor-Crummey and Vikram Adve. Simplifying control flow in compiler-generated
parallel code (extended abstract). In Proceedings of the Tenth International
Workshop on Languages and Compilers for Parallel Computing, Lecture
Notes in Computer Science 1366, Minneapolis, MN, August 1997. Springer-Verlag.
A full version of this paper was selected for publication in a special
issue of the International Journal of Parallel
Programming.
-
Vikram Adve and John Mellor-Crummey. ``Advanced Code Generation for High
Performance Fortran''.
Languages, Compilation Techniques and Run Time
Systems for Scalable Parallel Systems (Recent Advances and Future Perspectives)
(D. P. Agrawal and S. Pande, editors), Lecture Notes in Computer Science.
Springer-Verlag, Berlin, 1997.
-
G. Roth, J. Mellor-Crummey, K. Kennedy, and R. G. Brickner. Compiling stencils
in High Performance Fortran. In Proceedings of SC'97: High Performance
Networking and Computing, San Jose, CA, November 1997.
-
K. Kennedy, J. Mellor-Crummey, and G. Roth. Optimizing Fortran 90 shift
operations on distributed-memory multicomputers. In Languages and Compilers
for Parallel Computing, Eighth International Workshop, Columbus, OH,
August 1995. Springer-Verlag.
-
John Mellor-Crummey, Vikram Adve, and Charles Koelbel. The compiler's role
in analysis and tuning of data parallel programs. In Proceedings of
the Workshop on Environments and Tools for Parallel and Scientific Computing,
pages 211-220, Townsend, TN, May 1994.
-
Seema Hiranandani, Ken Kennedy, John Mellor-Crummey, and Ajay Sethi. Compilation
techniques for block-cyclic distributions. In Proc. of the 1994 International
Conference on Supercomputing, Manchester, England, July 1994.
-
Uli Kremer, John Mellor-Crummey, Ken Kennedy, and Alan Carle. Automatic
data layout for distributed-memory machines in the D programming environment.
In Proc. of AP'93 Intl. Workshop on Automatic Distributed Memory Parallelization,
Automatic Data Distribution, and Automatic Parallel Performance Prediction,
pages 108-123, Saarbrücken, Germany, March 1993.
-
Seema Hiranandani, Ken Kennedy, John Mellor-Crummey, and Ajay Sethi. Advanced
compilation techniques for FORTRAN D. Technical Report CRPC-TR93338, Center
for Research on Parallel Computation, Rice University, October 1993.
Parallel Programming Environments
-
Vikram Adve, Jhy-Chun Wang, John Mellor-Crummey, Dan Reed, Mark Anderson,
and Ken Kennedy. An integrated compilation and performance analysis environment
for data parallel programs. In Proceedings of Supercomputing '95,
San Diego, CA, November 1995.
-
Vikram Adve, Jhy-Chun Wang, John Mellor-Crummey, Dan Reed, Mark Anderson,
and Ken Kennedy. Integrating compilation and performance analysis for data
parallel programs. In Proceedings of the Workshop on Debugging and Performance
Tuning of Parallel Computing Systems, Chatham, MA, October 1995.
-
Vikram Adve, Alan Carle, Elana Granston, Seema Hiranandani, Ken Kennedy,
Charles Koelbel, Ulrich Kremer, John Mellor-Crummey, and Scott Warren.
Requirements for data parallel programming environments.
IEEE Transactions
on Parallel and Distributed Technology, 2(3):48-58, Fall 1994.
-
Keith D. Cooper, Mary W. Hall, Robert T. Hood, Ken Kennedy, Kathryn S.
McKinley, John M. Mellor-Crummey, Linda Torczon, and Scott K. Warren. The
ParaScope parallel programming environment. In Proceedings of the IEEE,
volume 81, February 1993. Submitted by invitation.
Memory Hierarchy Management
-
John Mellor-Crummey and John Garvin.
Optimizing Sparse Matrix Vector Multiply using Unroll-and-jam.
International Journal of High Performance Computing Applications,
18(2), Summer 2004.
[pdf]
-
Guohua Jin and John Mellor-Crummey.
On Reducing Storage Requirement of Scientific Applications.
Proceedings of the Los Alamos Computer Science Institute Fourth Annual Symposium,
October, 2003,
Santa Fe, NM. Published on CD-ROM.
[abstract]
[pdf]
-
Apan Qasem, Guohua Jin, and John Mellor-Crummey.
Improving Performance with Integrated Program Transformations.
Technical Report TR03-419, Dept. of Computer Science, Rice University,
October, 2003.
[abstract]
[pdf]
-
John Mellor-Crummey and John Garvin.
Optimizing Sparse Matrix Vector Multiply using Unroll-and-jam.
Proceedings of the Los Alamos Computer Science Institute Third Annual Symposium,
October, 2002, Santa Fe, NM. Published on CD-ROM.
[abstract]
[pdf]
-
Robert Fowler, John Mellor-Crummey, Guohua Jin and Apan Qasem.
A Source-to-source Loop Transformation Tool.
Extended poster abstract.
Proceedings of the Los Alamos Computer Science Institute Third Annual Symposium,
October, 2002, Santa Fe, NM. Published on CD-ROM.
-
Guohua Jin and John Mellor-Crummey.
Experiences Tuning SMG98 --- a Semicoarsening Multigrid Benchmark based
on the hypre Library.
Proceedings of the International Conference on Supercomputing,
June 22-26, 2002, New York, New York, USA.
[abstract]
[pdf]
-
Guohua Jin, John Mellor-Crummey, and Robert Fowler. Increasing temporal locality with skewing and
recursive blocking. In Proceedings of SC2001, Denver, CO, Nov
2001. Distributed on CD-ROM.
[abstract],
[ps],
[pdf]
-
John Mellor-Crummey, David Whalley, and Ken Kennedy. Improving memory hierarchy
performance for irregular applications using data and computation reorderings.
International Journal of Parallel Programming, 29(3), June 2001.
-
John Mellor-Crummey, David Whalley, and Ken Kennedy. Improving memory hierarchy
performance for irregular applications. In Proceedings of the 13th ACM
International Conference on Supercomputing, pages 425-433, Rhodes,
Greece, June 1999. [pdf]
-
Ervan Darnell, John Mellor-Crummey, and Ken Kennedy. Automatic software
cache coherence through vectorization. In Proceedings of 1992 International
Conference on Supercomputing, July 1992.
Grid Computing
- Ken Kennedy, Mark Mazina, John Mellor-Crummey, Keith Cooper, Linda
Torczon, Fran Berman, Andrew Chien, Holly Dail, Otto Sievert,
Dave Angulo, Ian Foster, Dennis Gannon, Lennart Johnsson, Carl
Kesselman, Ruth Aydt, Daniel Reed, Jack Dongarra, Sathish Vadhiyar,
and Rich Wolski.
Toward a Framework for Preparing and
Executing Adaptive Grid Programs.
Proceedings of NSF Next Generation Systems Program Workshop (International Parallel and Distributed Processing
Symposium 2002), Fort Lauderdale, FL, April 2002.
[abstract]
[pdf]
- Francine Berman, Andrew Chien, Keith Cooper, Jack Dongarra, Ian
Foster, Dennis Gannon, Lennart Johnsson, Ken Kennedy, Carl Kesselman,
John Mellor-Crummey, Dan Reed, Linda Torczon, and Rich Wolski. The
GrADS project: Software support for high-level grid application
development. International Journal of High Performance Computing
Applications, 15(4), Winter 2001.
Compiler Technology for Domain-Specific Languages
-
Ken Kennedy, Bradley Broom, Keith Cooper, Jack Dongarra, Rob Fowler, Dennis
Gannon, Lennart Johnsson, John Mellor-Crummey and Linda Torczon. Telescoping
languages: A strategy for automatic generation of scientific problem-solving
systems from annotated libraries.
Journal of Parallel and Distributed
Computing. 61(12), 1803-1826, Dec 2001.
[abstract],
[pdf]
Parallel Debugging
-
John M. Mellor-Crummey. Compile-time support for efficient data race detection
in shared-memory parallel programs. In Proc. ACM/ONR Workshop on Parallel
and Distributed Debugging, pages 129-139, San Diego, CA, May 1993.
Available as SIGPLAN NOTICES, 28(12), December 1993.
-
John M. Mellor-Crummey. Compile-time support for efficient data race detection
in shared-memory parallel programs. In Proc. Supercomputer Debugging
Workshop '92, Dallas, TX, October 1992.
-
John M. Mellor-Crummey. On-the-fly detection of data races for programs
with nested fork-join parallelism. In Proc. of Supercomputing '91,
pages 24-33, Albuquerque, NM, November 1991.
[abstract]
[pdf]
-
Robert Hood, Ken Kennedy, and John Mellor-Crummey. Parallel program debugging
with on-the-fly anomaly detection. In Supercomputing 1990, pages
74-81, November 1990.
-
Thomas J. LeBlanc, John M. Mellor-Crummey, and Robert J. Fowler. Analyzing
parallel program executions using multiple views.
Journal of Parallel
and Distributed Computing, 9:203-217, June 1990.
- R. J. Fowler, T. J. LeBlanc, and J. M. Mellor-Crummey. An
integrated approach to parallel program debugging and performance
analysis on large-scale multiprocessors. In Proc. of the
SIGPLAN/SIGOPS Workshop on Parallel and Distributed Debugging,
pages 163-173, Madison, WI, May 1988. Special issue of SIGPLAN
Notices, 24(1), Jan. 1989.
- John M. Mellor-Crummey. Debugging and Analysis of Large-Scale
Parallel Programs. PhD thesis, Department of Computer Science,
University of Rochester, September 1989. Available as Technical report
URCS-TR-312.
-
John M. Mellor-Crummey and Thomas J. LeBlanc. A software instruction counter.
In Proc. of the 3rd International Conference on Architectural Support
for Programming Languages and Operating Systems, pages 78-86, Boston,
MA, April 1989.
-
Thomas J. LeBlanc and John M. Mellor-Crummey. Debugging parallel programs
with Instant Replay.
IEEE Transactions on Computers, C-36(4):471-482,
April 1987.
-
John M. Mellor-Crummey and Thomas J. LeBlanc. Instrumentation for distributed
systems. In Proc. of the Workshop on Instrumentation for Distributed
Computing Systems, pages 16-18, Sanibel Island, FL, January 1987.
Multiprocessor Synchronization
- Michael L. Scott and
John M. Mellor-Crummey. Fast, contention-free combining tree barriers
for shared-memory multiprocessors. International Journal of
Parallel Programming, 22(4), 1994. [postscript] [src code for adaptive
barriers]
- John M. Mellor-Crummey and Michael L. Scott. Algorithms for
scalable synchronization on shared-memory multiprocessors. ACM
Transactions on Computer Systems, 9(1):21-65, February 1991.
[pdf]
[src code for
mutual exclusion locks and barriers]
- John M. Mellor-Crummey and Michael L. Scott. Synchronization
without contention. In Proc. of the 4th International Conference
on Architectural Support for Programming Languages and Operating
Systems, pages 269-278, Palo Alto, CA, April 1991. [pdf] [src code for
mutual exclusion locks and barriers]
- John M. Mellor-Crummey and Michael
L. Scott. Scalable reader-writer
synchronization for shared-memory multiprocessors.
In Proc. of the 3rd ACM Symposium on Principles and Practice of
Parallel Programming, pages 106-113, Williamsburg, VA, April
1991. [pdf] [src code for reader-writer
locks]
-
John M. Mellor-Crummey. Concurrent queues: Practical fetch-and-phi algorithms.
Technical Report 229, Department of Computer Science, University of Rochester,
November 1987.
[abstract]
[pdf]
Performance Modeling
-
Gabriel Marin and John Mellor-Crummey.
Cross Architecture Performance Predictions for Scientific Applications
Using Parameterized Models. In Proceedings of the Joint
International Conference on Measurement and Modeling of Computer Systems, June 2004.
[abstract]
[pdf]
-
Gabriel Marin and John Mellor-Crummey.
Building parameterized performance models for black-box applications.
Extended poster abstract.
Proceedings of the Los Alamos Computer Science Institute Third Annual Symposium,
October, 2002, Santa Fe, NM. Published on CD-ROM.
[pdf]
Performance Analysis
-
John Mellor-Crummey, Robert Fowler, Gabriel Marin, and Nathan Tallent.
HPCView: A tool for top-down analysis of node performance.
The Journal of Supercomputing, 23, 81-104, 2002.
Special Issue with selected papers from the 2001 Los Alamos Computer Science Institute
Symposium.
[pdf]
-
John Mellor-Crummey, Robert Fowler, and Gabriel Marin. HPCView: A tool
for top-down analysis of node performance. In Proceedings of the Los
Alamos Computer Science Institute Second Annual Symposium, Santa Fe,
NM, October 2001. Distributed on CD-ROM.
[abstract],
[ps],
[pdf]
-
John Mellor-Crummey, Robert Fowler, and David Whalley. Tools for application-oriented
performance tuning. In Proceedings of the 15th ACM International Conference
on Supercomputing, Sorrento, Italy, June 2001.
[abstract],
[pdf]
-
John Mellor-Crummey, Robert Fowler, and David Whalley. On providing useful
information for analyzing and tuning applications. In Joint International
Conference on Measurement & Modeling of Computer Systems, Cambridge,
MA, June 2001. (Poster abstract.)
[abstract],
[ps]
Compiler Technology
-
Mary Hall, John Mellor-Crummey, Alan Carle, and Rene Rodriguez. FIAT: a
framework for interprocedural analysis and transformation. In Proc.
Workshop on Compilers for Parallel Processing, Portland, OR, August
1993.
Parallel Applications
-
S. L. Lin, J. Mellor-Crummey, B. M. Pettitt, and G. N. Phillips, Jr. Molecular
dynamics on a distributed-memory multiprocessor.
Journal of Computational
Chemistry, 13(8):1022-1035, 1992.
Operating Systems
-
Thomas J. LeBlanc, John M. Mellor-Crummey, Neal M. Gafter, Lawrence A.
Crowl, and Peter C. Dibble. The Elmwood multiprocessor operating system.
Software--Practice
and Experience, 19(11):1029-1056, November 1989.
Architectures
-
John M. Mellor-Crummey. Experiences with the BBN Butterfly. In Proc.
of the 1988 COMPCON, pages 101-104, San Francisco, CA, February 1988.
IEEE. Invited paper.