CSE 613 (#50842): Parallel Programming, Fall 2013

Lecture Time and Location. TuTh 10:00 am - 11:20 am, Melville Library N4006, West Campus

Instructor. Rezaul A. Chowdhury (rezaul{at}cs{dot}stonybrook{dot}edu)
Office Hours. TuTh 12:30 pm - 2:00 pm, 1421 Computer Science Building

Course Description. We will explore algorithms and techniques for programming state-of-the-art shared-memory (e.g., multicores) and distributed-memory parallel computers. The course will include both theoretical and programming components. Topics to be covered include: analytical modeling of parallel programs, bounds on parallel performance, scheduling, synchronization, programming using the message-passing paradigm and for shared address-space platforms, parallel algorithms for dense matrix operations, sorting, searching, graphs, computational geometry, and dynamic programming, concurrent data structures, and transactional memory.

This course is supported by an educational grant from XSEDE (Extreme Science and Engineering Discovery Environment). We will use the computing environment provided by XSEDE for all homeworks and projects. Students will have access to the following supercomputing resources: Stampede, Kraken, Lonestar, Trestles and Keeneland.

Prerequisites. Background in algorithms analysis (e.g., CSE 373 or CSE 548) and programming languages (e.g., C/C++) is required (or consent of instructor). Computer architecture background (e.g., CSE 320 or CSE 502) will be helpful, but not essential.

Recommended Textbooks.

  1. Ananth Grama, George Karypis, Vipin Kumar, and Anshul Gupta. Introduction to Parallel Computing (2nd Edition), Addison Wesley, 2003.
  2. Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming (1st Edition), Morgan Kaufmann, 2008.
  3. Thomas Cormen, Charles Leiserson, Ronald Rivest, and Clifford Stein. Introduction to Algorithms (3rd Edition), MIT Press, 2009. (chapter 27 on Multithreaded Algorithms)
  4. Joseph JáJá. An Introduction to Parallel Algorithms (1st Edition), Addison Wesley, 1992.
  5. Peter Pacheco. Parallel Programming with MPI (1st Edition), Morgan Kaufmann, 1996.

Course Requirements. There will be 3 homework assignments (with both theory and programming components), one exam (in-class final), and a final group project. Each student will be responsible for scribing one lecture. The course grade will be based on the following.

Download the Latex template for scribe notes.

Blackboard. Some course documents will be available through Blackboard.

Lecture Schedule.

Date Topic Notes / Reading Material
Tue, Aug 27 Introduction -
Thu, Aug 29 Introduction (continued) -
Thu, Sep 5 Analytical Modeling of Parallel Programs
  • Chapter 5 (Analytical Modeling of Parallel Programs), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Tue, Sep 10 Analytical Modeling of Parallel Programs (continued)
The Cilk++ Concurrency Platform
Thu, Sep 12 The Cilk++ Concurrency Platform (continued)
Tue, Sep 17 The Cilk++ Concurrency Platform (continued)
  • Matteo Frigo, Pablo Halpern, Charles E. Leiserson, and Stephen Lewin-Berlin“Reducers and other Cilk++ hyperobjects”, Proceedings of the 21st ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 79-90, 2009.
Thu, Sep 19 Scheduling and Work Stealing
  • Project Proposal Due
  • Chapter 27 (Multithreaded Algorithms), Introduction to Algorithms (3rd Edition) by Cormen et al.
  • Nimar S. Arora, Robert D. Blumofe, and Charles G. Plaxton, “Thread Scheduling for Multiprogrammed Multiprocessors”, Proceedings of the 10th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 119-129, 1998.
Tue, Sep 24 Scheduling and Work Stealing (continued)
Parallel Matrix Multiplication and Mergesort
  • HW1 Out
  • Chapter 4 (Divide-and-Conquer), Introduction to Algorithms (3rd Edition) by Cormen et al.
Thu, Sep 26 Parallel Matrix Multiplication and Mergesort (continued)
  • Chapter 27 (Multithreaded Algorithms), Introduction to Algorithms (3rd Edition) by Cormen et al.
Tue, Oct 1 The Master Theorem
  • Chapter 4 (Divide-and-Conquer), Introduction to Algorithms (3rd Edition) by Cormen et al.
Thu, Oct 3 Parallel Quicksort and Selection
  • Chapter 2 (Basic Techniques), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Tue, Oct 8 Parallel Quicksort and Selection (continued)
  • Chapter 9 (Randomized Algorithms), Section 9.6.3 (A Parallel Randomized Quicksort Algorithm), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Thu, Oct 10 Parallel Quicksort and Selection (continued)
  • Chapter 4 (Searching, Merging, and Sorting), Section 4.5 (Selection), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Tue, Oct 15 Parallel Connected Components
  • Chapter 2 (Basic Techniques), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Thu, Oct 17 Parallel Connected Components (continued)
  • Guy E. Blelloch and Bruce M. Maggs, “Parallel Algorithms” (Section 4.3), The Computer Science Handbook, Editor: Allen Tucker, Chapman & Hall/CRC, 2004.
  • HW1 Due
Tue, Oct 22 Parallel Connected Components (continued)
  • HW2 Out
Thu, Oct 24 Project Progress Report (in-class presentation) -
Tue, Oct 29 Graph Algorithms: Minimum Spanning Trees (and Radix Sort)
Thu, Oct 31 Graph Algorithms: Minimum Spanning Trees (and Radix Sort) (continued) -
Tue, Nov 5 Graph Algorithms: Minimum Spanning Trees (and Radix Sort) (continued) -
Thu, Nov 7 Graph Algorithms: Minimum Spanning Trees (and Radix Sort) (continued) -
Tue, Nov 12 The Message Passing Interface (Guest Lecturer: Jesmin Jahan Tithi)
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 2 (Message-Passing Computing), Parallel Programming (2nd Edition) by Wilkinson & Allen
  • HW2 Due
Thu, Nov 14 Distributed-Memory Algorithms: Dense Matrices
  • Chapter 2 (Parallel Programming Platforms), Section 2.5.1 (Message Passing Costs in Parallel Computers), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 4 (Basic Communication Operations), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 8 (Dense Matrix Algorithms), Section 8.2.2 (Cannon's Algorithm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 10 (Graph Algorithms), Section 10.4.2 (Floyd's Algorithm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Tue, Nov 19 Distributed-Memory Algorithms: Dense Matrices (continued)
  • HW3 Out
Thu, Nov 21 Distributed-Memory Algorithms: Sorting and Searching
  • Chapter 9 (Sorting), Section 9.5 (Bucket and Sample Sort), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Section 6.6.10 (Example: Sample Sort), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Tue, Nov 26 Distributed-Memory Algorithms: Sorting and Searching (continued) -
Tue, Dec 3 Distributed-Memory Algorithms: Sorting and Searching (continued)
  • Chapter 11 (Search Algorithms for Discrete Optimization Problems), Sections 11.4.1-11.4.4 (Parallel Depth-First Search), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • John H. Reif, “Depth-First Search is Inherently Sequential”, Information Processing Letters, vol. 20 (5), pp. 229-234, 1985.
Thu, Dec 5 Final Exam (in-class) HW3 Due

Programming Resources.