August 27: Simultaneous Multithreading and the Case for Chip Multiprocessing
Simultaneous multithreading: maximizing on-chip parallelism,
Dean Tullsen, Susan Eggers, and Henry Levy,
In 25 Years of the International Symposia on Computer Architecture
(Selected Papers) (Barcelona, Spain, June 27 - July 02, 1998).
G. S. Sohi, Ed. ISCA '98. ACM Press, New York, NY, 533-544.
(First published in ISCA '95.)
The case for a single-chip multiprocessor,
Kunle Olukotun, Basem Nayfeh, Lance Hammond, Ken Wilson, and Kunyung Chang,
In Proceedings of
the Seventh International Conference on Architectural Support for
Programming Languages and Operating Systems (Cambridge, Massachusetts,
United States, October 01 - 04, 1996). ASPLOS-VII. ACM Press, New
York, NY, 2-11.
A single-chip multiprocessor,
Lance Hammond, Basem Nayfeh, Kunle Olukotun.
Computer 30(9):79-85, September
1997. DOI=http://dx.doi.org/10.1109/2.612253
Evaluation of design alternatives for a multiprocessor microprocessor,
Basem Nayfeh, Lance Hammond, and Kunle Olukotun,
in Proceedings of the 23rd Annual International Symposium on Computer
Architecture (Philadelphia, Pennsylvania, United States, May 22 - 24,
1996). ISCA '96. ACM Press, New York, NY, 67-77.
September 4: Piranha and Hydra: A Tale of Two Chip Multiprocessors
Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing,
L. A. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk,
S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese,
in Proceedings of the International Symposium on Computer
Architecture (ISCA),
pp. 282-293, June 2000.
The Stanford Hydra CMP,
Lance Hammond, Ben Hubbert, Michael Siu, Manohar Prabhu,
Mike Chen, and Kunle Olukotun,
IEEE Micro, pp. 71-84, March-April 2000.
IBM POWER6 microarchitecture, H. Q. Le, W. J. Starke,
J. S. Fields, F. P. O'Connell, D. Q. Nguyen, B. J. Ronchetti,
W. M. Sauer, E. M. Schwarz, and M. T. Vaden, in IBM Journal of
Research and Development, Volume 51, Number 6, 2007.
September 11: Intel's Tiled Multicore
Integration Challenges and Tradeoffs
for Tera-scale Architectures. Mani Azimi, Naveen Cherukuri,
D. N. Jayasimha, Akhilesh Kumar, Partha Kundu, Seungjoon Park, Ioannis
Schoinas, Aniruddha S. Vaidya. Intel Technology Journal, August 2007.
ftp://download.intel.com/technology/itj/2007/v11i3/1-integration/vol11-i3-art01.pdf
The Implementation of the Cilk-5 Multithreaded Language
by Matteo Frigo, Charles E. Leiserson, and Keith H. Randall.
1998 ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI), Montreal, Canada, June 1998.
WaveScalar.
S. Swanson, K. Michelson, A. Schwerin, and M. Oskin.
In 36th Annual International Symposium on Microarchitecture (MICRO-36), December 2003.
The Distributed Microarchitecture of the TRIPS Prototype
Processor.
K. Sankaralingam, R. Nagarajan, P. Gratz, R. Desikan, D. Gulati,
H. Hanson, C. Kim, H. Liu, N. Ranganathan, S. Sethumadhavan,
S. Sharif, P. Shivakumar, W. Yoder, R. McDonald, S. Keckler, and
D. Burger.
In 39th ACM/IEEE International Symposium on
Microarchitecture (MICRO), pages 480-491, 2006.
Foundations of the C++ concurrency memory model,
H. Boehm and S. V. Adve,
in Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation (Tucson, AZ, USA, June 07 - 13, 2008). PLDI '08. ACM, New York, NY, 68-78. DOI= http://doi.acm.org/10.1145/1375581.1375591
October 13: Java Memory Model
The Java Memory Model, J. Manson, W. Pugh, and S. V. Adve,
in Proceedings of the Symposium on Principles of Programming Languages (POPL), January 2005.
October 17: Implementing Nested Data Parallelism
Implementation of a Portable Nested Data-Parallel Language.
Guy E. Blelloch, Siddhartha Chatterjee,
Jonathan C. Hardwick, Jay Sipelstein, and Marco Zagha.
Technical Report CMU-CS-93-112, School of Computer Science,
Carnegie Mellon University, Pittsburgh, PA. 1993.
(An earlier version of this paper appeared in "Proceedings of the
4th ACM SIGPLAN Symposium on Principles and Practice of Parallel
Programming", San Diego, May 1993.)
Detecting data races in Cilk programs that use locks,
G. Cheng, M. Feng, C.E. Leiserson, K. Randall, and A.F. Stark,
in
Proceedings of the Tenth Annual ACM Symposium on Parallel Algorithms
and Architectures (Puerto Vallarta, Mexico, June 28 - July 02,
1998). SPAA '98. ACM, New York, NY, 298-309.
November 4: Scheduling
Thread scheduling for multiprogrammed multiprocessors,
Nimar S. Arora, Robert D. Blumofe, and C. Greg Plaxton,
in Proceedings of the Tenth Annual ACM Symposium on Parallel
Algorithms and Architectures (Puerto Vallarta, Mexico, June 28 - July
02, 1998). SPAA '98. ACM Press, New York, NY, 119-129.
Cache-Fair Thread Scheduling for Multicore Processors
Alexandra Fedorova, Margo Seltzer, and Michael D. Smith. Submitted to
Operating Systems Design and Implementation, 2006.
(Available as Technical Report TR-17-06, Division of Engineering and
Applied Sciences, Harvard University, October 2006.)
November 6: Scheduling and Shared Cache
Provably efficient scheduling for languages with fine-grained parallelism.
Blelloch, G. E., Gibbons, P. B., and Matias, Y. 1995.
In Proceedings of the Seventh Annual ACM Symposium on Parallel Algorithms
and Architectures (Santa Barbara, California, United States, June 24 -
26, 1995). SPAA '95. ACM Press, New York, NY, 1-12.
Effectively sharing a cache among threads,
Guy E. Blelloch and Phillip B. Gibbons, in Proceedings of the 16th
Annual ACM Symposium on Parallelism in Algorithms and Architectures
(Barcelona, Spain, June 27 - 30, 2004). SPAA '04. ACM Press, New York,
NY, 235-244.
Transactional memory, J. Larus and C. Kozyrakis, in
Commun. ACM 51, 7 (Jul. 2008), 80-88.
December 2: Hardware Support for Transactional Memory
Transactional
Memory: Architectural Support for Lock-free Data Structures,
Maurice Herlihy and J. Eliot B. Moss, in Proceedings of the 20th Annual
International Symposium on Computer Architecture, San Diego,
California, 1993, ACM Press, New York, NY, USA, 289-300.
ISCA most influential paper award, 2008.
Nonblocking transactions without indirection using alert-on-update,
Michael Spear, Arrvindh Shriraman, Luke Dalessandro, Sandhya
Dwarkadas, and Michael .
In Proceedings of the Nineteenth Annual ACM Symposium
on Parallel Algorithms and Architectures (San Diego, California, USA,
June 09 - 11, 2007). SPAA '07. ACM Press, New York, NY, 210-220.
December 4: Software Transactional Memory
Software transactional memory for dynamic-sized data
structures,
Maurice Herlihy, Victor Luchangco, Mark Moir, and William N. Scherer,
III, In Proceedings of the Twenty-Second Annual Symposium on
Principles of Distributed Computing (Boston, Massachusetts, July 13 -
16, 2003). PODC '03. ACM Press, New York, NY, 92-101.
Understanding Tradeoffs in Software Transactional Memory, Dice, D. and Shavit, N. 2007.
In Proceedings of the International Symposium on
Code Generation and Optimization (March 11 - 14, 2007). Code
Generation and Optimization. IEEE Computer Society, Washington, DC,
21-33.