CSE 613 (#51928): Parallel Programming, Spring 2019

Lecture Time and Location. MF 1:00 pm - 2:20 pm, Room 2120 (Old Computer Science Building), West Campus

Instructor. Rezaul A. Chowdhury (rezaul{at}cs{dot}stonybrook{dot}edu)
Office Hours. M 4:00 pm - 5:30 pm, F 6:00 pm - 7:30 pm, 239 New Computer Science Building

Course Description. We will explore algorithms and techniques for programming state-of-the-art shared-memory (e.g., multicores) and distributed-memory parallel computers. The course will include both theoretical and programming components. Topics to be covered include: analytical modeling of parallel programs, bounds on parallel performance, scheduling, synchronization, programming using the message-passing paradigm and for shared address-space platforms, parallel algorithms for dense matrix operations, sorting, searching, graphs, computational geometry, and dynamic programming, concurrent data structures, and transactional memory.

This course is supported by an educational grant from XSEDE (Extreme Science and Engineering Discovery Environment). We will use the computing environment provided by XSEDE for all homeworks and projects. Students will have access to the following supercomputing resources (not final yet): Stampede 2 and Comet.

Prerequisites. Background in algorithms analysis (e.g., CSE 373 or CSE 548) and programming languages (e.g., C/C++) is required (or consent of instructor). Computer architecture background (e.g., CSE 320 or CSE 502) will be helpful, but not essential.

Recommended Textbooks.

  1. Ananth Grama, George Karypis, Vipin Kumar, and Anshul Gupta. Introduction to Parallel Computing (2nd Edition), Addison Wesley, 2003.
  2. Maurice Herlihy and Nir Shavit. The Art of Multiprocessor Programming (1st Edition), Morgan Kaufmann, 2008.
  3. Thomas Cormen, Charles Leiserson, Ronald Rivest, and Clifford Stein. Introduction to Algorithms (3rd Edition), MIT Press, 2009. (chapter 27 on Multithreaded Algorithms)
  4. Joseph JáJá. An Introduction to Parallel Algorithms (1st Edition), Addison Wesley, 1992.
  5. Peter Pacheco. Parallel Programming with MPI (1st Edition), Morgan Kaufmann, 1996.

Course Requirements. There will be 3 homework assignments (with both theory and programming components) and a final group project. Each student will be responsible for scribing one lecture. The course grade will be based on the following.

Download the Latex template for scribe notes.

Blackboard. Some course documents will be available through Blackboard.

Lecture Schedule.

Date Topic Notes / Reading Material
Mon, Jan 28 Introduction -
Fri, Feb 1 Class Canceled -
Mon, Feb 4 Introduction (continued) -
Fri, Feb 8 Analytical Modeling of Parallel Programs
Mon, Feb 11 Analytical Modeling of Parallel Programs (continued)

The Cilk++ Concurrency Platform
Fri, Feb 15 The Cilk++ Concurrency Platform (continued)
  • Yuxiong He, Charles E. Leiserson, and William M. Leiserson“The Cilkview Scalability Analyzer”, Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), pp. 145-156, 2010.
Mon, Feb 18 The Cilk++ Concurrency Platform (continued)
Fri, Feb 22 The Cilk++ Concurrency Platform (continued)

Greedy Scheduling

Analysis of a Work Stealing Scheduler
  • Chapter 27 (Multithreaded Algorithms), Introduction to Algorithms (3rd Edition) by Cormen et al.
Mon, Feb 25 Analysis of a Work Stealing Scheduler (continued)
Fri, Mar 1 Introduction to OpenMP
(Guest Lecturer: Md. Abdullah Shahneous Bari)
-
Mon, Mar 4 Introduction to OpenMP
(continued) (Guest Lecturer: Md. Abdullah Shahneous Bari)
-
Fri, Mar 8 Analysis of a Work Stealing Scheduler (continued)

High Probability Bounds
  • Umut A. Acar, Guy E. Blelloch, and Robert D. Blumofe, “The Data Locality of Work Stealing”, Proceedings of the 12th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 1-12, 2000.
  • [optional] Chapter 6 (Algorithms Involving Sequences and Sets), Section 6.9.2 (A Coloring Problem), Introduction to Algorithms - A Creative Approach (1st Edition) by Udi Manber.
  • [optional] Chapter 1 (Introduction), Section 1.1 (A Min-Cut Algorithm), Randomized Algorithms (1st Edition) by Rajeev Motwani and Prabhakar Raghavan.
  • Torben Hagerup and Christine Rüb, “A Guided Tour of Chernoff Bounds”, Information Processing Letters, 33(6), pp. 305-308, 1990.
Mon, Mar 11 Analysis of a Work Stealing Scheduler (continued)

Basic Parallel Algorithmic Techniques
  • Guy E. Blelloch and Bruce M. Maggs, “Parallel Algorithms”, The Computer Science Handbook, Editor: Allen Tucker, Chapman & Hall/CRC, 2004.
Fri, Mar 15 Basic Parallel Algorithmic Techniques (continued)
  • Chapter 2 (Basic Techniques), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Mon, Mar 18 Spring Break -
Fri, Mar 22 Spring Break -
Mon, Mar 25 Basic Parallel Algorithmic Techniques (continued)
Analyzing Divide-and-Conquer Algorithms
  • Chapter 27 (Multithreaded Algorithms), Introduction to Algorithms (3rd Edition) by Cormen et al.
  • Chapter 4 (Divide-and-Conquer), Introduction to Algorithms (3rd Edition) by Cormen et al.
Fri, Mar 29 Analyzing Divide-and-Conquer Algorithms (continued)
Parallel Quicksort and Selection
  • Chapter 9 (Randomized Algorithms), Section 9.6.3 (A Parallel Randomized Quicksort Algorithm), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Mon, Apr 1 Parallel Quicksort and Selection (continued)
  • Chapter 4 (Searching, Merging, and Sorting), Section 4.5 (Selection), An Introduction to Parallel Algorithms (1992) by Joseph JáJá
Fri, Apr 5 The Message Passing Interface
(Guest Lecturer: Zafar Ahmad)
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 2 (Message-Passing Computing), Parallel Programming (2nd Edition) by Wilkinson & Allen
Mon, Apr 8 Parallel Quicksort and Selection (continued)
Parallel Connected Components
  • Guy E. Blelloch and Bruce M. Maggs, “Parallel Algorithms” (Section 4.3), The Computer Science Handbook, Editor: Allen Tucker, Chapman & Hall/CRC, 2004.
Fri, Apr 12 Parallel Connected Components (continued)
Minimum Spanning Trees (and Radix Sort)
  • Guy E. Blelloch and Bruce M. Maggs, “Parallel Algorithms” (Section 4.3), The Computer Science Handbook, Editor: Allen Tucker, Chapman & Hall/CRC, 2004.
Mon, Apr 15 Minimum Spanning Trees (and Radix Sort) (continued)
Fri, Apr 19 Minimum Spanning Trees (and Radix Sort) (continued)
Mon, Apr 22 Maximal Independent Set
Fri, Apr 26 Distributed-Memory Algorithms: Dense Matrices
  • Chapter 2 (Parallel Programming Platforms), Section 2.5.1 (Message Passing Costs in Parallel Computers), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 4 (Basic Communication Operations), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 8 (Dense Matrix Algorithms), Section 8.2.2 (Cannon's Algorithm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Mon, Apr 29 Distributed-Memory Algorithms: Dense Matrices (continued)
  • Chapter 10 (Graph Algorithms), Section 10.4.2 (Floyd's Algorithm), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Fri, May 3 Distributed-Memory Algorithms: Sorting and Searching
  • Chapter 9 (Sorting), Section 9.5 (Bucket and Sample Sort), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • Chapter 6 (Programming Using the Message-Passing Paradigm), Section 6.6.10 (Example: Sample Sort), Introduction to Parallel Computing (2nd Edition) by Grama et al.
Mon, May 6 Distributed-Memory Algorithms: Sorting and Searching (continued)
  • Chapter 11 (Search Algorithms for Discrete Optimization Problems), Sections 11.4.1-11.4.4 (Parallel Depth-First Search), Introduction to Parallel Computing (2nd Edition) by Grama et al.
  • John H. Reif, “Depth-First Search is Inherently Sequential”, Information Processing Letters, vol. 20 (5), pp. 229-234, 1985. (If you are unable access the paper please check Blackboard under Documents/Misc for a local copy)
Fri, May 10 Concurrent Data Structures: Queues and Stacks
  • Chapter 3 (Concurrent Objects), The Art of Multiprocessor Programming (2nd Edition) by Herlihy and Shavit.
  • Chapter 10 (Concurrent Queues and the ABA Problem), The Art of Multiprocessor Programming (2nd Edition) by Herlihy and Shavit.
  • Chapter 11 (Concurrent Stacks and Elimination), The Art of Multiprocessor Programming (2nd Edition) by Herlihy and Shavit.