Book:
Optimizing Compilers for Modern Architectures (with Randy Allen), Morgan Kaufmann Publishers, San Francisco, 2002. Chinese edition: 2004, Second printing: 2005.
Book Edited:
Sourcebook of Parallel Computing (with Jack Dongarra, Ian Foster, Geoffrey Fox, William Gropp, Linda Torczon, and Andy White), Morgan Kaufmann Publishers, San Francisco, 2003.
Patent:
Digital Computer Register Allocation and Code Spilling Using Interference Graph Coloring (with P. Briggs, K. D. Cooper, and L. Torczon), (Serial Number: 08/027,937).
Papers and Book Chapters:
1. “A Global Flow Analysis Algorithm,” International Journal of Computer Mathematics, Gordon and Breach, Section A, Volume 3, (1971), pages 5–15.
2. “Index Register Allocation in Straight Line Code and Simple Loops,” Design and Optimization of Compilers (R. Rustin, editor), Prentice-Hall, Englewood Cliffs, NJ, (1972), pages 51–63.
3. “Safety of Code Motion,” International Journal of Computer Mathematics, Gordon and Breach, Section A, Volume 3, (1972), pages 117–130.
4. “Review of A Mathematical Theory of Global Program Optimization by M. Schaefer,” SIAM Review, Volume 16, Number 4, (October 1974), pages 565–566.
5. “An Introduction to the Set-Theoretic Language SETL” (with J. Schwartz), Computers and Mathematics with Applications, Pergamon Press, Volume 1, (1975), pages 97–119.
6. “Node Listings Applied to Data Flow Analysis,” Conference Record of the Second ACM Symposium on Principles of Programming Languages, Palo Alto, CA, (January 1975), pages 10–21.
7. “Profitability Computations on Program Flow Graphs” (with J. Cocke), Computers and Mathematics with Applications, Pergamon Press, Volume 2, (1976), pages 145–159.
8. “PLANET: A Simulation Approach to PERT” (with R. Thrall), Computers and Operations Research, Pergamon Press, Volume 4, (1976), pages 313–325.
9. “Automatic Generation of Efficient Evaluators for Attribute Grammars” (with S. K. Warren), Conference Record of the Third ACM Symposium on Principles of Programming Languages, Atlanta, GA, (January 1976), pages 32–49.
10. “A Comparison of Two Algorithms for Global Data Flow Analysis,” SIAM Journal on Computing, Volume 5, Number 1, (March 1976), pages 158–180.
11. “Graph Grammars and Global Program Data Flow Analysis” (with R. Farrow and L. Zucconi), Seventeenth Annual Symposium on Foundations of Computer Science, Houston, TX, (October 1976), pages 42–56.
12. “Applications of a Graph Grammar for Program Control Flow Analysis” (with L. Zucconi), Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, CA, (January 1977), pages 72–85.
13. “An Algorithm for Reduction of Operator Strength” (with J. Cocke), Communications of the ACM, Volume 20, Number 11, (November 1977), pages 850–856.
14. “Use-Definition Chains with Applications,” Journal of Computer Languages, Volume 3, Number 3, (1978), pages 163–179.
15. “A Survey of Compiler Optimization Techniques,” Le Point sur la Compilation (M. Amirchahy and N. Néel, editors), INRIA, Le Chesnay, France, (1978), pages 115–161.
16. “Optimization of Vector Operations in an Extended Fortran Compiler,” Proceedings of the 1978 LASL Workshop on Vector and Parallel Processors, Los Alamos, NM, (September 1978), pages 238–251.
17. “A Deterministic Attribute Grammar Evaluator Based on Dynamic Sequencing” (with J. Ramanathan), ACM Transactions on Programming Languages and Systems, Volume 1, Number 1, (July 1979), pages 142–160.
18. “The Early Development of Programming in the USSR” (English version with A. Ershov and M. Shura-Bura), A History of Computing in the Twentieth Century (N. Metropolis, J. Howlett, and G. C. Rota, editors), Academic Press, New York, (1980), pages 137–196.
19. “Vector Mode Computation” (with J. Huang and A. Liles, Jr.), IBM Technical Disclosure Bulletin, Volume 23, Number 5, (October 1980), pages 2171–2172.
20. “A Survey of Data Flow Analysis Techniques,” Program Flow Analysis: Theory and Applications (N. D. Jones and S. S. Muchnick, editors), Prentice-Hall, Englewood Cliffs, NJ, (1981), pages 5–54.
21. “Reduction of Operator Strength” (with F. Allen and J. Cocke), Program Flow Analysis: Theory and Applications (N. D. Jones and S. S. Muchnick, editors), Prentice-Hall, Englewood Cliffs, NJ, (1981), pages 79–101.
22. “Pathlistings Applied to Data Flow Analysis” (with J. Ramanathan), Acta Informatica, Volume 16, Fascicle 3, (1981), pages 253–273.
23. “Conversion of Control Dependence to Data Dependence” (with J. R. Allen, C. Porterfield, and J. Warren), Conference Record of the Tenth Annual ACM Symposium on Principles of Programming Languages, Austin, TX, (January 1983), pages 177–189.
24. “Automatic Loop Interchange” (with J. R. Allen), Proceedings of the SIGPLAN '84 Symposium on Compiler Construction, SIGPLAN Notices, Volume 19, Number 6, (June 1984), pages 233–246.
25. “Efficient Computation of Flow-Insensitive Interprocedural Summary Information” (with K. D. Cooper), Proceedings of the SIGPLAN '84 Symposium on Compiler Construction, SIGPLAN Notices, Volume 19, Number 6, (June 1984), pages 247–258.
26. “Efficient Computation of Flow-Insensitive Interprocedural Summary Information—A Correction” (with K. D. Cooper), SIGPLAN Notices, Volume 23, Number 4, (April 1988), pages 35–42.
27. “PFC: A Program to Convert Fortran to Parallel Form” (with J. R. Allen), Supercomputers: Design and Applications (K. Hwang, editor), IEEE Computer Society Press, (August 1984), pages 186–203.
28. “A Programming Environment for Fortran” (with R. T. Hood), Proceedings of the Eighteenth Hawaii International Conference on System Sciences, Western Periodicals, North Hollywood, CA, Volume II (Software), (January 1985), pages 625–637.
29. “A Parallel Programming Environment” (with J. R. Allen), IEEE Software, Volume 2, Number 4, (July 1985), pages 21–29.
30. “The Impact of Interprocedural Analysis and Optimization on the Design of a Software Development Environment” (with K. D. Cooper and L. Torczon), Proceedings of the SIGPLAN '85 Symposium on Language Issues in Programming Environments, SIGPLAN Notices, Volume 20, Number 7, (July 1985), pages 107–116.
31. “Programming Language Support for Supercomputers” (with R. T. Hood), Frontiers of Supercomputing (N. Metropolis, D. Sharp, W. Worlton, and K. Ames, editors), University of California Press, Berkeley, CA, (1986), pages 282–311.
32. “Programming Environments for Supercomputers” (with J. R. Allen), Supercomputers: Algorithms, Architectures, and Scientific Computation (F. Matsen and T. Tajima, editors), University of Texas Press, Austin, TX, (1986), pages 19–38.
33. “PTOOL: A Semi-Automatic Parallel Programming Assistant” (with J. R. Allen, D. Baumgartner, and A. Porterfield), Proceedings of the 1986 International Conference on Parallel Processing, IEEE Computer Society Press, Washington, D.C., (1986), pages 164–170.
34. “Optimization of Compiled Code in the Programming Environment” (with K. D. Cooper and L. Torczon), Proceedings of the Nineteenth Hawaii International Conference on System Sciences, Western Periodicals, North Hollywood, CA, Volume II (Software), (January 1986), pages 492–502.
35. “Interprocedural Optimization: Eliminating Unnecessary Recompilation” (with K. D. Cooper and L. Torczon), Proceedings of the SIGPLAN '86 Symposium on Compiler Construction, SIGPLAN Notices, Volume 21, Number 7, (July 1986), pages 58–67.
36. “Interprocedural Constant Propagation” (with D. Callahan, K. D. Cooper, and L. Torczon), Proceedings of the SIGPLAN '86 Symposium on Compiler Construction, SIGPLAN Notices, Volume 21, Number 7, (July 1986), pages 152–161.
37. “The Impact of Interprocedural Analysis and Optimization in the Programming Environment” (with K. D. Cooper and L. Torczon), ACM Transactions on Programming Languages and Systems, Volume 8, Number 4, (October 1986), pages 491–523.
38. “Editing and Compiling Whole Programs” (with K. D. Cooper, L. Torczon, A. Weingarten, and M. Wolcott), Proceedings of the ACM SIGSOFT/SIGPLAN Symposium on Practical Software Development Environments, SIGPLAN Notices, Volume 22, Number 1, (January 1987), pages 92–101.
39. “Efficient Recompilation of Module Interfaces in a Software Development Environment” (with R. T. Hood and H. Muller), Proceedings of the ACM SIGSOFT/SIGPLAN Symposium on Practical Software Development Environments, SIGPLAN Notices, Volume 22, Number 1, (January 1987), pages 180–189.
40. “Automatic Decomposition of Scientific Programs for Parallel Execution” (with J. R. Allen and D. Callahan), Conference Record of the Fourteenth Annual Symposium on Principles of Programming Languages, Munich, Germany, (January 1987), pages 63–76.
41. “Parallel Programming Support in ParaScope” (with D. Callahan, K. D. Cooper, R. T. Hood, L. Torczon, and S. K. Warren), Parallel Computing in Science and Engineering (R. Dierstein, D. Muller-Wichards, and H. Wacker, editors), Lecture Notes in Computer Science 295, Springer-Verlag, Berlin, (June 1987), pages 91–106.
42. “Automatic Translation of Fortran Programs to Vector Form” (with J. R. Allen), ACM Transactions on Programming Languages and Systems, Volume 9, Number 4, (October 1987), pages 491–542.
43. “A Practical Environment for Scientific Programming” (with A. Carle, K. D. Cooper, R. T. Hood, L. Torczon, and S. K. Warren), IEEE Computer, Volume 20, Number 11, (November 1987), pages 75–89.
44. “Analysis of Interprocedural Side Effects in a Parallel Programming Environment” (with D. Callahan), Journal of Parallel and Distributed Computing, Volume 5, (1988), pages 517–550.
45. “Interprocedural Side-Effect Analysis in Linear Time” (with K. D. Cooper), Proceedings of the SIGPLAN '88 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 23, Number 7, (July 1988), pages 57–66.
46. “Estimating Interlock and Improving Balance for Pipelined Machines” (with D. Callahan and J. Cocke), Journal of Parallel and Distributed Computing, Volume 5, Number 4, (August 1988), pages 334–358.
47. “Compiling Programs for Distributed-Memory Multiprocessors” (with D. Callahan), Journal of Supercomputing, Volume 2, Number 2, (October 1988), pages 151–169.
48. “ParaScope: A Parallel Programming Environment” (with D. Callahan, K. D. Cooper, R. T. Hood, and L. Torczon), The International Journal of Supercomputer Applications, Volume 2, Number 4, (December 1988), pages 84–99.
49. “Performance of Parallel Processors” (with H. Flatt), Parallel Computing, Volume 12, Number 1, (October 1989), pages 1–20.
50. “The ParaScope Editor: An Interactive Parallel Programming Tool” (with V. Balasundaram, U. Kremer, K. McKinley, and J. Subhlok), Proceedings: Supercomputing '89, Reno, NV, (November 1989), pages 540–550.
51. “Fast Interprocedural Alias Analysis” (with K. D. Cooper), Conference Record of the Sixteenth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, Austin, TX, (January 1989), pages 49–59.
52. “Virtual Shared Memory for Distributed-Memory Machines” (with H. Zima), Proceedings of the Fourth Conference on Hypercubes, Concurrent Computers, and Applications, Monterey, CA, (March 1989), pages 361–366.
53. “Compile-Time Detection of Race Conditions in a Parallel Program” (with V. Balasundaram), Proceedings of the 1989 ACM International Conference on Supercomputing, Crete, Greece, (June 1989), pages 175–185.
54. “A Technique for Summarizing Data Access and Its Use in Parallelism-Enhancing Transformations” (with V. Balasundaram), Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 24, Number 7, (July 1989), pages 41–53.
55. “Coloring Heuristics for Register Allocation” (with P. Briggs, K. D. Cooper, and L. Torczon), Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 24, Number 7, (July 1989), pages 275–284.
56. “Blocking Linear Algebra Codes for Memory Hierarchies” (with S. Carr), Proceedings of the Fourth SIAM Conference on Parallel Processing for Scientific Computing, Chicago, IL, (December 1989), pages 400–405.
57. “Experience with Interprocedural Analysis of Array Side Effects” (with P. Havlak), IEEE Transactions on Parallel and Distributed Systems, Volume 2, Number 3, (1990).
58. “Analyzing and Visualizing Performance of Memory Hierarchies” (with D. Callahan and A. Porterfield), Performance Instrumentation and Visualization (M. Simmons and R. Koskela, editors), ACM Press, Frontier Series, New York, (1990), pages 1–26.
59. “Analysis of Event Synchronization in a Parallel Programming Tool” (with D. Callahan and J. Subhlok), Proceedings of the Second ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, SIGPLAN Notices, Volume 25, Number 3, (March 1990), pages 21–30.
60. “An Interactive Environment for Data Partitioning and Distribution” (with V. Balasundaram, G. Fox, and U. Kremer), Proceedings of the Fifth Distributed-Memory Computing Conference, Volume II (Architecture, Software Tools, and Other General Issues), Charleston, SC, (April 1990), pages 1160–1170.
61. “Constructing the Procedure Call Multigraph” (with D. Callahan, A. Carle, and M. W. Hall), IEEE Transactions on Software Engineering, Volume 16, Number 4, (April 1990), pages 483–487.
62. “Improving Register Allocation for Subscripted Variables” (with D. Callahan and S. Carr), Proceedings of the ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 25, Number 6, (June 1990), pages 53–65.
63. “Parallel Program Debugging with On-the-Fly Anomaly Detection” (with R. T. Hood and J. Mellor-Crummey), Proceedings: Supercomputing '90, New York, NY, (November 1990), pages 74–81.
64. “Loop Distribution with Arbitrary Control Flow” (with K. McKinley), Proceedings: Supercomputing '90, New York, NY, (November 1990), pages 407–416.
65. “Compiling Scientific Code for Complex Memory Hierarchies” (with S. Carr), Proceedings of the Twenty-Fourth Annual Hawaii International Conference on System Sciences, Volume I (Architectures and Engineering Technologies), IEEE Computer Society Press, Los Alamitos, CA, (January 1991), pages 536–544.
66. “Software Prefetching” (with D. Callahan and A. Porterfield), Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, SIGPLAN Notices, Volume 26, Number 4, (April 1991), pages 40–52.
67. “Analysis and Transformation in the ParaScope Editor” (with K. McKinley and C. W. Tseng), Proceedings of the 1991 ACM International Conference on Supercomputing, Cologne, Germany, (June 1991), pages 433–447.
68. “Practical Dependence Testing” (with G. Goff and C. W. Tseng), Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 26, Number 6, (June 1991), pages 15–29.
69. “A Static Performance Estimator to Guide Data Partitioning Decisions” (with V. Balasundaram, G. Fox, and U. Kremer), Proceedings of the Third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, SIGPLAN Notices, Volume 26, Number 7, (July 1991), pages 213–223.
70. “An Implementation of Interprocedural Bounded Regular Section Analysis” (with P. Havlak), IEEE Transactions on Parallel and Distributed Systems, Volume 2, Number 3, (July 1991), pages 350–360.
71. “Interactive Parallel Programming Using the ParaScope Editor” (with K. McKinley and C. W. Tseng), IEEE Transactions on Parallel and Distributed Systems, Volume 2, Number 3, (July 1991), pages 329–341.
72. “Interprocedural Transformations for Parallel Code Generation” (with M. W. Hall and K. McKinley), Proceedings: Supercomputing '91, Albuquerque, NM, (November 1991), pages 424–434.
73. “Compiler Optimizations for Fortran D on MIMD Distributed-Memory Machines” (with S. Hiranandani and C. W. Tseng), Proceedings: Supercomputing '91, Albuquerque, NM, (November 1991), pages 86–100.
74. “An Overview of the Fortran D Programming System” (with S. Hiranandani, C. Koelbel, U. Kremer, and C. W. Tseng), Languages and Compilers for Parallel Computing (U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, editors), Lecture Notes in Computer Science 589, Springer-Verlag, Berlin, (1992), pages 18–34.
75. “A Static Performance Estimator in the Fortran D Programming System” (with V. Balasundaram, G. Fox, and U. Kremer), Languages, Compilers, and Run-Time Environments for Distributed-Memory Machines (J. Saltz and P. Mehrotra, editors), North-Holland, Amsterdam, The Netherlands, (1992), pages 119–138.
76. “Compiler Support for Machine-Independent Parallel Programming in Fortran D” (with S. Hiranandani and C. W. Tseng), Languages, Compilers, and Run-Time Environments for Distributed-Memory Machines (J. Saltz and P. Mehrotra, editors), North-Holland, Amsterdam, The Netherlands, (1992), pages 139–176.
77. “Procedure Cloning” (with K. D. Cooper and M. W. Hall), Proceedings of the 1992 International Conference on Computer Languages, Oakland, CA, (April 1992), pages 96–105.
78. “Evaluating Parallel Languages for Molecular Dynamics Computations” (with T. Clark, R. von Hanxleden, C. Koelbel, and L. Scott), Proceedings of the 1992 Scalable High Performance Computing Conference, IEEE Computer Society Press, Williamsburg, VA, (April 1992), pages 98–105.
79. “Software Support for Irregular and Loosely Synchronous Problems” (with A. Choudhary, G. Fox, S. Hiranandani, C. Koelbel, S. Ranka, and J. Saltz), Computing Systems in Engineering, Volume 3, Numbers 1–4, (June 1992), pages 43–52.
80. “Relaxing SIMD Control Flow Constraints Using Loop Transformations” (with R. von Hanxleden), Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation, SIGPLAN Notices, Volume 27, Number 7, (July 1992), pages 188–199.
81. “Evaluation of Compiler Optimizations for Fortran D on MIMD Distributed-Memory Machines” (with S. Hiranandani and C. W. Tseng), Proceedings of the ACM 1992 International Conference on Supercomputing, Washington, D.C., (July 1992), pages 1–14.
82. “Optimizing for Parallelism and Data Locality” (with K. McKinley), Proceedings of the 1992 ACM International Conference on Supercomputing, Washington, D.C., (July 1992), pages 323–334.
83. “Automatic Software Cache Coherence through Vectorization” (with E. Darnell and J. Mellor-Crummey), Proceedings of the 1992 ACM International Conference on Supercomputing, Washington, D.C., (July 1992), pages 129–138.
84. “Compiling Fortran D for MIMD Distributed-Memory Machines” (with S. Hiranandani and C. W. Tseng), Communications of the ACM, Volume 35, Number 8, (August 1992), pages 66–80.
85. “Efficient Call Graph Analysis” (with M. W. Hall), ACM Letters on Programming Languages and Systems, Volume 1, Number 3, (September 1992), pages 227–242.
86. “Vector Register Allocation” (with J. R. Allen), IEEE Transactions on Computers, Volume 41, Number 10, (October 1992), pages 1290–1317.
87. “Compiling Fortran 77D and 90D for MIMD Distributed-Memory Machines” (with A. Choudhary, G. Fox, S. Hiranandani, C. Koelbel, S. Ranka, and C. W. Tseng), Communications of the ACM, Volume 35, Number 8, (October 1992), pages 66–80.
88. “Compiler Blockability of Numerical Algorithms” (with S. Carr), Proceedings: Supercomputing '92, Minneapolis, MN, (November 1992), pages 114–124.
89. “Interprocedural Compilation of Fortran D for MIMD Distributed-Memory Machines” (with M. W. Hall, S. Hiranandani, and C. W. Tseng), Proceedings: Supercomputing '92, Minneapolis, MN, (November 1992), pages 522–534.
90. “Compiler Analysis for Irregular Problems in Fortran D” (with R. Das, R. von Hanxleden, C. Koelbel, and J. Saltz), Proceedings of the Fifth Workshop on Languages and Compilers for Parallel Computing, New Haven, CT, (revised January 1993), pages 97–111.
91. “The ParaScope Parallel Programming Environment” (with K. D. Cooper, M. W. Hall, R. T. Hood, K. McKinley, J. Mellor-Crummey, L. Torczon, and S. K. Warren), Proceedings of the IEEE, Volume 81, Number 2, (February 1993), pages 244–263.
92. “Unified Compilation of Fortran 77D and 90D” (with A. Choudhary, G. Fox, S. Hiranandani, C. Koelbel, S. Ranka, and C. W. Tseng), ACM Letters on Programming Languages and Systems, Volume 2, Numbers 1–4, (March–December 1993), pages 95–114.
93. “A Methodology for Procedure Cloning” (with K. D. Cooper and M. W. Hall), Computer Languages, Volume 19, Number 2, (April 1993), pages 105–117.
94. “Experiences Using the ParaScope Editor: An Interactive Parallel Programming Tool” (with M. W. Hall, T. Harvey, N. McIntosh, K. McKinley, J. Oldham, M. Paleczny, and G. Roth), Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, San Diego, CA, (May 1993), pages 33–43.
95. “Analysis and Transformation in an Interactive Parallel Programming Tool” (with K. McKinley and C. W. Tseng), Concurrency: Practice and Experience, Volume 5, Number 7, (October 1993), pages 575–602.
96. “Preliminary Experiences with the Fortran D Compiler” (with S. Hiranandani and C. W. Tseng), Proceedings: Supercomputing '93, Portland, OR, (November 1993), pages 338–350.
97. “Cache Coherence Using Local Knowledge” (with E. Darnell), Proceedings: Supercomputing '93, Portland, OR, (November 1993), pages 720–729.
98. “Maximizing Loop Parallelism and Improving Data Locality via Loop Fusion and Distribution” (with K. McKinley), Languages and Compilers for Parallel Computing (U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, editors), Lecture Notes in Computer Science, Number 768, Springer-Verlag, Berlin, (1993), pages 301–320.
99. “Automatic Data Layout for Distributed-Memory Machines in the D Programming Environment” (with A. Carle, U. Kremer, and J. Mellor-Crummey), Automatic Parallelization—New Approaches to Code Generation, Data Distribution, and Performance Prediction, Wiesbaden, Germany, (1993), pages 121–152.
100. “Scalar Replacement in the Presence of Conditional Control Flow” (with S. Carr), Software Practice and Experience, Volume 24, Number 1, (January 1994), pages 51–77.
101. “Compiler Technology for Machine-Independent Parallel Programming,” International Journal of Parallel Programming, Volume 22, Number 1, (January 1994), pages 79–97.
102. “Give-N-Take—A Balanced Code Placement Framework” (with R. von Hanxleden), Proceedings of the ACM SIGPLAN '94 Conference on Programming Language Design and Implementation, (March 1994), pages 107–120.
103. “Evaluating Compiler Optimizations for Fortran D” (with S. Hiranandani and C. W. Tseng), Journal of Parallel and Distributed Computing, Volume 21, (April 1994), pages 27–45.
104. “Parallelization of Linearized Applications in Fortran D” (with L. Liebrock), International Parallel Processing Symposium 1994, Washington, D.C., (April 1994), pages 51–60.
105. “Design and Implementation of the D Editor” (with S. Hiranandani, C. W. Tseng, and S. Warren), Proceedings of the Second Workshop on Environments and Tools for Parallel Scientific Computing, SIAM, Townsend, TN, (May 1994), pages 1–10.
106. “Context Optimization for SIMD Execution” (with G. Roth), Proceedings of the Scalable High Performance Computing Conference, Knoxville, TN, (May 1994).
107. “Integrated Support for Task and Data Parallelism” (with K. M. Chandy, I. Foster, C. Koelbel, and C. W. Tseng), International Journal of Supercomputing Applications, Volume 8, Number 1, (Summer 1994), pages 80–98.
108. “Compilation Techniques for Block-Cyclic Distributions” (with S. Hiranandani, J. Mellor-Crummey, and A. Sethi), Proceedings of the 1994 International Conference on Supercomputing, Manchester, England, (July 1994), pages 392–403.
109. “Automatic Data Layout Using 0-1 Integer Programming” (with R. Bixby and U. Kremer), Proceedings of the International Conference on Parallel Architecture and Compilation Techniques, Montreal, Canada, published in Parallel Architectures and Compilation Techniques (A-50), North-Holland: Amsterdam, The Netherlands, (August 1994), pages 111–121.
110. “Requirements for Data-Parallel Programming Environments” (with V. Adve, A. Carle, E. Granston, S. Hiranandani, C. Koelbel, U. Kremer, J. Mellor-Crummey, C. W. Tseng, and S. Warren), IEEE Transactions on Parallel and Distributed Technology, Volume 2, Number 3, (Fall 1994), pages 48–58. (Formerly entitled: “The D System: Support for Data-Parallel Programming”, CRPC–TR94378.)
111. “Value-Based Distributions and Alignments in Fortran D” (with R. von Hanxleden and J. Saltz), Journal of Programming Languages, Special Issue on Compiling and Run-Time Issues for Distributed Address Space Machines, Volume 2, Number 3, (September 1994), pages 259–282.
112. “The D Editor: A New Interactive Parallel Programming Tool” (with S. Hiranandani, C. W. Tseng, and S. Warren), Proceedings of Supercomputing '94, (November 1994), pages 733–742.
113. “Improving the Ratio of Memory Operations to Floating-Point Operations in Loops” (with S. Carr), ACM Transactions on Programming Languages and Systems, Volume 16, Number 6, (November 1994), pages 1768–1810.
114. “Compiler Support for Out-of-Core Arrays on Parallel Machines” (with C. Koelbel and M. Paleczny), The Fifth Symposium of the Frontiers of Massively Parallel Computation, (February 1995).
115. “Combining Dependence and Data-Flow Analyses to Optimize Communication” (with N. Nedeljkovic), Proceedings of the 9th International Parallel Processing Symposium, Santa Barbara, CA, (April 1995), pages 340–346.
116. “Management of the NHSE—A Virtual Distributed Digital Library” (with Shirley Browne, J. Dongarra, and T. Rowan), Second International Conference on Theory and Practice of Digital Libraries, (June 1995), pages 57–64.
117. “Efficient Address Generation for Block-Cyclic Distributions” (with N. Nedeljkovic and A. Sethi), The 9th ACM International Conference on Supercomputing, Barcelona, Spain, (July 1995).
118. “A Linear-Time Algorithm for Computing the Memory Access Sequence in Data-Parallel Programs” (with N. Nedeljkovic and A. Sethi), Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, SIGPLAN, (July 1995).
119. “Optimizing Fortran 90 Shift Operations on Distributed-Memory Multicomputers” (with J. Mellor-Crummey and G. Roth), Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing (LCPC '95), Columbus, OH, (August 1995).
120. “A Model and Compilation Strategy for Out-of-Core Data Parallel Programs” (with R. Bordawekar, A. Choudhary, C. Koelbel, and M. Paleczny), Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, SIGPLAN Notices, (August 1995), pages 1–17.
121. “Index Array Flattening through Program Transformation” (with R. Das, P. Havlak, and J. Saltz), Proceedings of Supercomputing '95, (August 1995).
122. “Integrating Compilation and Performance Analysis for Data Parallel Programs” (with V. Adve, M. Anderson, J. C. Wang, J. Mellor-Crummey, and D. Reed), Proceedings of the Workshop on Debugging and Performance Tuning of Parallel Computing Systems, (October 1995).
123. “An Integrated Compilation and Performance Analysis Environment for Data Parallel Programs” (with V. Adve, M. Anderson, J. Mellor-Crummey, D. Reed, and J. C. Wang), Proceedings of Supercomputing '95, (November 1995).
124. “Automatic Data Layout for High Performance Fortran” (with U. Kremer), Proceedings of Supercomputing '95, San Diego, CA, (December 1995).
125. “The National HPCC Software Exchange” (with J. C. Browne, T. Disz, J. Dongarra, G. Fox, S. I. Green, K. Hawick, K. Moore, B. Olson, J. Pool, T. Rowan, R. Stevens, and R. Wade), IEEE Computational Science and Engineering, Volume 2, Number 2, (1995), pages 62–69.
126. “Communication Generation for Cyclic Distributions” (with N. Nedeljkovic and A. Sethi), Languages, Compilers, and Run-Time Systems for Scalable Computers, Kluwer Academic Publishers, Boston, MA, (1995), pages 185–197.
127. “Interprocedural Analysis and Optimization” (with K. D. Cooper, M. W. Hall, and L. Torczon), The Communications on Pure and Applied Mathematics 48: 947–1003 (1995).
128. “Optimal Register Assignment to Loops for Embedded Code Generation” (with D. Kolson, A. Nicolau, and N. Dutt), IEEE 8th International Symposium on System Synthesis (ISSS), (September 1995).
129. “A Method for Register Allocation to Loops in Multiple Register File Architectures” (with D. Kolson, A. Nicolau, and N. Dutt), IEEE 10th International Parallel Processing Symposium (IPPS), (April 1996).
130. “Optimal Register Assignment to Loops for Embedded Code Generation” (with D. J. Kolson, A. N. Nicolau, and N. Dutt), ACM Transactions on Design Automation of Electronic Systems 1(2): 251–279, (April 1996).
131. “Cross-Loop Reuse Analysis and Its Application to Cache Optimizations” (with K. Cooper and N. McIntosh), Proceedings of the Ninth Workshop on Languages and Compilers for Parallel Computing, San Jose, CA, (August 1996), pages 1–19.
132. “Dependence Analysis of Fortran90 Array Syntax” (with G. Roth), Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '96), (August 9–11, 1996).
133. “Resource-Based Communication Placement Analysis” (with A. Sethi), Proceedings of the Ninth Workshop on Languages and Compilers for Parallel Computing (LCPC'96), San Jose, CA, (August 1996), pages 369–388.
134. “Interprocedural Compilation of Fortran D” (with M. W. Hall, S. Hiranandani, and C. W. Tseng), Journal of Parallel and Distributed Computing, Volume 38, Number 2, (November 1996), pages 114–129.
135. “A Communication Placement Framework with Unified Dependence and Data-Flow Analysis” (with A. Sethi), Proceedings of the Third International Conference on High Performance Computing, India, (also available 1996 International Conference of High Performance Computing), Best Systems Paper Award, Digital Equipment (India), (1996).
136. “Parallelization Support for Coupled Grid Applications with Small Meshes” (with Lorie M. Liebrock), Concurrency—Practice and Experience 8(8): 581–615 (1996).
137. “Experiences in Data-Parallel Programming” (with T. W. Clark and R. von Hanxleden), Scientific Programming, Volume 6, (1997), pages 153–158.
138. “Optimizing Java: Theory and Practice” (with Z. Budimlic), Concurrency: Practice and Experience, Volume 9, Number 6, (1997), pages 445–463.
139. “Compiling Stencils in High Performance Fortran” (with G. Roth, J. Mellor-Crummey, and R. G. Brickner), Proceedings: Supercomputing '97, San Jose, CA, (November 1997), (also available as CRPC–TR97725-S).
140. “A Nationwide Parallel Computing Environment” (with C. F. Bender, J. Connolly, J. L. Hennessy, M. K. Vernon, and L. Smarr), Communications of the ACM, Volume 40, Number 11, (November 1997), pages 62–72.
141. “Loop Fusion in High Performance Fortran” (with G. Roth), Proceedings of the 12th ACM International Conference on Supercomputing, Melbourne, Australia, (July 1998), pages 125–132.
142. “Automatic Data Layout for Distributed Memory Machines” (with U. Kremer), ACM Transactions on Programming Languages and Systems (TOPLAS), (December 1997).
143. “Compilers, Libraries, Languages,” Computational Grids: The Future of High-Performance Distributed Computing (I. Foster and C. Kesselman, editors), Morgan Kaufmann Publishers, Inc., (August 1998), pages 181–204.
144. “Status and Perspective of HPC—Discussion on HPC with Professor Ken Kennedy” (with T. Watanabe and H. Katayama), NEC Research and Development, Volume 39, Number 4, (October 1998), pages 343–351.
145. “Information Technology Research: Investing in our Future” (with PITAC Committee), National Coordination Office for Computing, Information, and Communications, Washington, DC, (February 1999).
146. “Prospects for Scientific Computing in Polymorphic, Object-Oriented Style” (with Z. Budimlic), Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, San Antonio, TX, (March 1999).
147. “Improving Cache Performance in Dynamic Applications through Data and Computation Reorganization at Run Time” (with C. Ding), Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'99), Atlanta, GA, (May 1999), pages 229–241. (Also available in ACM SIGPLAN Notices, Volume 34, Number 5).
148. “Improving Memory Hierarchy Performance for Irregular Applications” (with J. Mellor-Crummey and D. Whalley), Proceedings of the 13th ACM International Conference on Supercomputing, Rhodes, Greece, (June 1999), pages 425–433.
149. ÒThe Cost of Being Object-Oriented: A Preliminary StudyÓ (with Z. Budimlic and J. Piper), Scientific Computing, Volume 7, Number 2, (1999), pages 87–95.
150. ÒInter-array Data RegroupingÓ (with C. Ding), Proceedings of the 12th International Workshop on Languages and Compilers for Parallel Computing, San Diego, CA, (August 1999).
151. ÒBandwidth-Based Performance Tuning and PredictionÓ (with C. Ding), Proceedings of IASTED International Conference on Parallel Computing and Distributed Systems, Cambridge, MA, (November 1999).
152. ÒMemory Bandwidth Bottleneck and its Amelioration by a CompilerÓ (with C. Ding), Proceedings of the 2000 International Parallel and Distributed Processing Symposium, Cancun, Mexico, (May 2000).
153. ÒTelescoping Languages: A Compiler Strategy for Implementation of High-Level Domain-Specific Programming Systems,Ó Proceedings of the 14th International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun, Mexico, (May 2000).
154. ÒFast Greedy Weighted Fusion,Ó Proceedings of the 2000 International Conference on Supercomputing, Santa Fe, NM, (May 2000), pages 131–140.
155. “Transforming Loops to Recursion for Multi-Level Memory Hierarchies” (with Q. Yi and V. Adve), Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2000), Vancouver, British Columbia, (June 2000).
156. “Overpartitioning with the Rice dHPF compiler” (with B. Broom, D. Chavarria-Miranda, G. Jin, R. Fowler, and J. Mellor-Crummey), Proceedings of the 4th Annual HPF User Group meeting, Tokyo, Japan, (October 2000).
157. “A Balanced Code Placement Framework” (with R. von Hanxleden), ACM Transactions on Programming Languages and Systems 22(5): 816–860, (September 2000).
158. “Improving effective bandwidth through compiler enhancement of global cache reuse” (with C. Ding), Proceedings of the 2001 International Parallel and Distributed Processing Symposium, San Francisco, CA, (April 2001), (selected as one of four “best papers” in the conference).
159. “KelpIO: A Telescope-Ready Domain-Specific I/O Library for Irregular Block-Structured Applications” (with B. Broom and R. Fowler), Proceedings of the 2001 IEEE International Symposium on Cluster Computing and the Grid, Brisbane, Australia, (May 2001) (selected as one of two “best papers” in the cluster category).
160. “Reduction in Strength of Procedures: An Optimizing Strategy for Telescoping Languages” (with A. Chauhan), Proceedings of the 2001 International Conference on Supercomputing, Sorrento, Italy (June 2001).
161. “Improving Memory Hierarchy Performance for Irregular Applications Using Data and Computation Reorderings” (with J. Mellor-Crummey and D. Whalley), International Journal of Parallel Programming 29(3), (June 2001).
162. “JaMake: A Java Compiler Environment” (with Z. Budimlic), Proceedings of the International Conference on Large Scale Scientific Computations (ICLSSC 2001), Sozopol, Bulgaria, (June 2001).
163. “Scalarizing Fortran 90 Array Syntax” (with Y. Zhao), Proceedings of the Second Annual Symposium of the Los Alamos Computer Science Institute, Santa Fe, NM, (October 2001).
164. “Telescoping Languages: A Strategy for Automatic Generation of Scientific Problem-Solving Systems from Annotated Libraries” (with B. Broom, K. Cooper, J. Dongarra, R. Fowler, D. Gannon, L. Johnsson, J. Mellor-Crummey, and L. Torczon), Journal of Parallel and Distributed Computing 61(12): 1803–1826, (December 2001).
165. “What Are the Top Ten Most Influential Parallel and Distributed Processing Concepts of the Past Millennium?” (with M. Theys, S. Ali, H. Siegel, K. Chandy, K. Hwang, L. Sha, K. Shin, M. Snir, L. Snyder, T. Sterling), Journal of Parallel and Distributed Computing 61(12): 1827–1841, (December 2001).
166. “The GrADS Project: Software Support for High-Level Grid Application Development” (with F. Berman, A. Chien, K. Cooper, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, C. Kesselman, J. Mellor-Crummey, D. Reed, L. Torczon, and R. Wolski), International Journal of High Performance Applications and Supercomputing 15(4): 327–344, (Winter 2001).
167. “Fast Greedy Weighted Fusion,” International Journal of Parallel Programming 29(5): 463–491, (October 2001).
168. “High Performance Fortran 2.0” (with C. Koelbel), in Compiler Optimizations for Scalable Parallel Systems, S. Pande and D. Agrawal, editors, Lecture Notes in Computer Science, Springer-Verlag, Berlin Heidelberg, Germany, (2001).
169. “Reuse Distance Analysis for Scientific Programs” (with Y. Zhong and C. Ding), Proceedings of the 6th ACM Workshop on Languages, Compilers, and Runtime Systems for Scalable Computers (LCR’02), Washington, DC, (March 2002).
170. “KelpIO: A Telescope–ready Domain-specific I/O Library for Irregular Block-structured Applications” (with B. Broom and R. Fowler), Future Generation Computer Systems 18: 449–460, (2002).
171. “Toward a Framework for Preparing and Executing Adaptive Grid Programs” (with D. Angulo, R. Aydt, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, C. Kesselman, D. Reed, S. Vadhiyar and R. Wolski), Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2002), Fort Lauderdale, FL, (April 2002).
172. “A Rice University Perspective on Software Engineering Licensing,” (with M. Vardi), Communications of the ACM 45 (11): 94–95, (November 2002).
173. “Copy Coalescing and Live–Range Identification Without An Interference Graph” (with Z. Budimlic, K. Cooper, T. Harvey, and T. Oberg), Proceedings of the 2002 ACM SIGPLAN Conference on Programming Language Design and Implementation, Berlin, Germany, (June 2002).
174. “Advanced Optimization Strategies in the Rice dHPF Compiler,” (with B. Broom, D. Chavarria–Miranda, R. Fowler, G. Jin, and J. Mellor-Crummey), HPF Special Issue of Concurrency—Practice and Experience 14 (8–9): 741–767, (August 2002).
175. “Reducing and Vectorizing Procedures for Telescoping Languages” (with A. Chauhan), ICS'01 Special Issue of International Journal of Parallel Programming 30 (4), (August 2002).
176. “Vizer: A system to Vectorize Intel x86 Binaries” (with K. Cooper and A. Dasgupta), Proceedings of the Third Annual Symposium of the Los Alamos Computer Science Institute, Santa Fe, NM, October 14–15, 2002.
177. “Improving Memory Hierarchy Performance Through Combined Loop Interchange and Multi–Level Fusion” (with Q. Yi), Proceedings of the Third Annual Symposium of the Los Alamos Computer Science Institute, Santa Fe, NM, October 14–15, 2002.
178. “Almost–whole–program compilation” (with Z. Budimlic), in Proceedings of the Joint ACM Java Grande — ISCOPE 2002 Conference, Seattle, Washington, November 3–5, 2002.
179. “Parallelism” (with J. Dongarra and A. White), in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 1, pp. 1–12, (Fall 2002).
180. “Parallel Programming Considerations” (with J. Dongarra, I. Foster, G. Fox, W. Gropp, and D. Reed), in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 3, pp. 42–68, (Fall 2002).
181. “Software Technologies” (with I. Foster), in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 3, pp. 42–68, (Fall 2002).
182. “Languages and Compilers” (with C. Koelbel), in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 12, pp. 343–365, (Fall 2002).
183. “Reusable Software Algorithms” in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 17, pp. 461–467, (Fall 2002).
184. “Wrap–up and Signposts to the Future” (with A. White), in Sourcebook for Parallel Computing, Morgan Kaufmann Publishers, San Francisco, chapter 25, pp. 723–728, (Fall 2002).
185. “Languages, Compilers, and Run-Time Systems” in The Grid 2: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, San Francisco, chapter 25, pp. 491–512, (Fall 2003).
186. “Slice–hoisting for Array-size Inference in MATLAB” (with A. Chauhan), Proc. 16th Workshop on Languages and Compilers for Parallel Computing (LCPC03), College Station, TX, 2-4 October 2003.
187. “Automatic Type–Driven Library Generation of Telescoping Languages” (with A. Chauhan and C. McCosh), in the Proceedings of the ACM International Conference for High Performance Computing and Communications (SC2003), Phoenix, Arizona, November 15–21, 2003.
188. “Improving Effective Bandwidth through Compiler Enhancement of Global Cache Reuse” (with C. Ding), Journal of Parallel and Distributed Computing 64 (2004), pages 108–134.
189. “Transforming Complex Loop Nests For Locality” (with V. Adve and Q. Yi), The Journal of Supercomputing, Volume 27: pages 219–264, 2004.
190. “New Grid Scheduling and Rescheduling Methods in the GrADS Project” (with K. Cooper, A. Dasgupta, C. Koelbel, A. Mandal, G. Marin, M. Mazina, J. Mellor-Crummey, F. Berman, H. Casanova, A. Chien, H. Dail, X. Liu, A. Olugbile, O. Sievert, H. Xia, L. Johnsson, B. Liu, M. Patel, D. Reed, W. Deng, C. Mendes, Z. Shi, A. YarKhan, J. Dongarra), in the 18th International Parallel and Distributed Processing Symposium (IEEE IPDPS 2004), April 2004.
191. “Retrospective: Coloring Heuristics for Register Allocation” (with P. Briggs, K. Cooper and L. Torczon), in 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, Kathryn S. McKinley, Editor, ACM SIGPLAN Notices, Volume 39, Number 4, April 2004.
192. “Retrospective: Interprocedural Side-Effect Analysis in Linear Time” (with K. Cooper), in 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, Kathryn S. McKinley, Editor, ACM SIGPLAN Notices, Volume 39, Number 4, April 2004.
193. “Retrospective: Interprocedural Constant Propagation” (with D. Callahan, K. Cooper and L. Torczon), in 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, Kathryn S. McKinley, Editor, ACM SIGPLAN Notices, Volume 39, Number 4, April 2004.
194. “Retrospective: Automatic Loop Interchange” (with J. Allen), in 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, Kathryn S. McKinley, Editor, ACM SIGPLAN Notices, Volume 39, Number 4, April 2004.
195. “Retrospective: Improving Register Allocation for Subscripted Variables” (with D. Callahan and S. Carr), in 20 Years of the ACM SIGPLAN Conference on Programming Language Design and Implementation (1979–1999): A Selection, Kathryn S. McKinley, Editor, ACM SIGPLAN Notices, Volume 39, Number 4, April 2004.
196. “Improving Memory Hierarchy Performance Through Combined Loop Interchange and Multi-Level Fusion” (with Q. Yi), International Journal of High Performance Computing Applications, Volume 18, Number 2, May 2004.
197. “Automatic Blocking of QR and LU Factorizations for Locality” (with Q. Yi, H. You, K. Seymour, and J. Dongarra), in Proceedings of the ACM SIGPLAN Workshop on Memory Systems Performance (MSP 2004), June 2004.
198. “An Automatic Source-level Transformer for Improving the Performance of DSP Applications in Matlab” (with Arun Chauhan), in Proceedings of the IASTED International Conference on Signal and Image Processing (SIP 2004), August 2004.
199. “Defining and Measuring the Productivity of Programming Languages” (with C. Koelbel and R. Schreiber), in the International Journal of High Performance Computing Applications Special Issue, Vol. 18, No. 3, Fall 2004.
200. “Automatic Tuning of Whole Applications Using Direct Search and a Performance-based Transformation System” (with Apan Qasem and John Mellor-Crummey), Proceedings of the LACSI Symposium, Los Alamos Computer Science Institute, Los Alamos, NM, October 2004.
201. “Scalarization using Loop Alignment and Loop Skewing” (with Y. Zhao), The Journal of Supercomputing 31(1), pp. 5-46, January 2005.
202. “Telescoping Languages: A System for Automatic Generation of Domain Languages” (with Bradley Broom, Arun Chauhan, Rob Fowler, John Garvin, Charles Koelbel, Cheryl McCosh, and John Mellor-Crummey), Proceedings of the IEEE 93 (2), pp. 387–408, February 2005.
203. “Scalarization on Short Vector Machines” (with Yuan Zhao), in Proceedings of the 2005 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, TX, March 2005.
204. “Compiling Almost–Whole Java Programs” (with Z. Budimlic), Concurrency and Computation: Practice and Experience 17(5-6), pp. 573-587. April-May 2005.
205. “Task Scheduling Strategies for Workflow-based Applications in Grids” (with Jim Blythe, Sonal Jain, Ewa Deelman, Yolanda Gil, Karan Vahi, and Anirban Mandal), in Proceedings of the International Symposium on Cluster Computing and the Grid 2005 (CCGrid05), May 2005.
206. “New Grid Scheduling and Rescheduling Methods in the GrADS Project” (with Fran Berman, Henri Casanova, Andrew Chien, Keith Cooper, Holly Dail, Anshuman Dasgupta, Wei Deng, Jack Dongarra, Lennart Johnsson, Ken Kennedy, Charles Koelbel, Bo Liu, Xin Liu, Anirban Mandal, Gabriel Marin, Mark Mazina, John Mellor-Crummey, Celso Mendes, Alex Olugbile, Mitul Patel, Dan Reed, Zhiao Shi, Otto Sievert, Huaxia Xia, and Asim YarKhan), in International Journal of Parallel Programming 33 (2-3), pp. 209-229, June 2005.
207. “Scheduling Strategies for Mapping Application Workflows onto the Grid” (with A. Mandal, Charles Koelbel, Gabriel Marin, John Mellor-Crummey, Bo Liu, and Lennart Johnsson), in Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing (HPDC 2005), Research Triangle Park, NC, July 24-27, 2005.
208. “A Cache-conscious Profitability Model for Empirical Tuning of Loop Fusion” (with Apan Qasem), in Proceedings of the 2005 International Workshop on Languages and Compilers for Parallel Computing, Hawthorne, NY, October 20-22, 2005.
Accepted for Publication:
1. “Automatic Tuning of Whole Applications Using Direct Search and a Performance-based Transformation System” (with Apan Qasem and John Mellor-Crummey), to appear in a special issue of the Journal of Supercomputing titled “Computer Science in Support of High-Performance Applications” (Rod Oldehoeft, editor).