CS 260r: Topics and Close Readings in Software Systems (2022)
CS 260r is a readings and discussion course about research into software
systems. Our primary work will involve reading and discussing recent (and
less recent) systems papers, capped by a project at the end of the semester.
The topic changes with each offering.
This year’s topic is large-scale distributed and parallel computations. We
will consider some data-centric systems (large-scale databases), but I intend
us to cover many computation-centric systems. Topics will include:
- Distributed computation infrastructure (e.g. OpenMPI, Cilk)
- Distributed data stores and relational-type databases (e.g. Google’s Spanner, Facebook’s Tao)
- Graph databases
- Machine learning parameter server-type systems
- Scientific computing systems (e.g. Dask)
- Blockchain computations
The purpose of this class is for us all to learn. I know systems research, but
large-scale distributed and parallel computations are not my area of
expertise. So I hope we’ll all better understand the landscape of the
literature in this area, including what questions are open, what questions are
settled, how deployed systems relate, and what remains to be done.
In our first unit, I’d like to discuss some of the most important systems and
frameworks in this area.
- Final Project Paper Advice
- On reading research papers
- Project 1: Simulator
January 26: Early scientific computing systems (slides)
- “Standards for Message-Passing in a Distributed Memory Environment”, Walker DW (1992) (link)
- “MPI: A Standard Message Passing Interface”, Walker DW, Dongarra JJ (1996) (link)
- “Monitors, Messages, and Clusters: The p4 Parallel Programming System”, Butler RM, Lusk EL (1994) (link)
- Optional backup reading: “MPI: A Message Passing Interface”, The MPI Forum (1993) (link)
January 31: Map-Reduce and Spark
- “MapReduce: Simplified Data Processing on Large Clusters”, Dean J, Ghemawat S (2004) (link)
- “Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing” (Spark), Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauley M, Franklin MJ, Shenker S, Stoica I (2012) (link)
- Background (read this after MapReduce): “A Bridging Model for Parallel Computation”, Valiant L (1990) (link)
February 2: Incremental MapReduce (slides)
- “DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language”, Yu Y, Isard M, Fetterly D, Budiu M, Erlingsson Ú, Gunda PK, Currey J (2008) (link)
- “Incremental, Iterative Data Processing with Timely Dataflow”, Murray DG, McSherry F, Isard M, Isaacs R, Barham P, Abadi M (2016) (link)
February 7: Distributed machine learning: Parameter servers (Maegan & Albert)
- “Scaling Distributed Machine Learning with the Parameter Server”, Li M, Andersen DG, Park JW, Smola AJ, Ahmed A, Josifovski V, Long J, Shekita EJ, Su BY (2014) (link; ‘Li PDF’ is the PDF)
- “PLink: Discovering and Exploiting Datacenter Network Locality for Efficient Cloud-Based Distributed Training”, Luo L, West P, Krishnamurthy A, Ceze L, Nelson J (2020) (link)
February 9: Modern DML and scientific computing toolkits (slides) (April & Zev)
- “TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems”, Abadi M, Agarwal A, Barham P, Brevdo E, et al. (2015) (link)
- “Dask: Parallel Computation with Blocked algorithms and Task Scheduling”, Rocklin M (2015) (link)
- “A performance comparison of Dask and Apache Spark for data-intensive neuroimaging pipelines”, Dugré M, Hayot-Sasson V, Glatard T (2019) (link) (potentially replicable!)
February 14: PyTorch and dynamic scheduling in DML (Sanket)
- “PyTorch: An Imperative Style, High-Performance Deep Learning Library”, Paszke A, Gross S, Massa F, et al. (2019) (link)
- “PyTorch Distributed: Experiences on Accelerating Data Parallel Training”, Li S, Zhao Y, Varma R, et al. (2020) (link)
February 16: Presentations and discussions on Project 1, Phase 1
- “The TensorFlow partitioning and scheduling problem: it’s the critical path!”, Mayer R, Mayer C, Laich L (2017) (link)
February 21: No class
February 23: Cluster schedulers (slides)
- “Quincy: Fair scheduling for distributed computing clusters”, Isard M, Prabhakaran V, Currey J, et al. (2009) (link)
- “Mesos: A platform for fine-grained resource sharing in the data center”, Hindman B, Konwinski A, Zaharia M, Ghodsi A et al. (2011) (link)
- “Large-scale cluster management at Google with Borg”, Verma A, Pedrosa L, Korupolu M et al. (2015) (link)
February 28: Relational databases (Savvy)
- “Spark SQL: Relational Data Processing in Spark”, Armbrust M, Xin RS, Lian C, Huai Y, Liu D, Bradley JK, Meng X, Kaftan T, Franklin MJ, Ghodsi A, Zaharia M (2015) (link)
- “F1: A distributed SQL database that scales”, Shute J, Vingralek R, Samwel B, Handy B, Whipkey C et al. (2013) (link)
March 2: Graph databases (Jordan & Troy)
- “GraphX: Graph Processing in a Distributed Dataflow Framework”, Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I (2014) (link)
- “Scalability! But at what COST?”, McSherry F, Isard M, Murray DG (2015) (link)
March 7: Simulator presentations
March 9: Final project pitches
March 14, 16: Break
March 21: Incremental MapReduce and successors (Chelse & Esther)
- “NetSolve/D: A Massively Parallel Grid Execution System for Scalable Data Intensive Collaboration”, Beck M, Dongarra J, Plank JS (2005) (link)
- “Dynamic Control Flow in Large-Scale Machine Learning”, Yu Y, Abadi M, Barham P, et al. (2018) (link)
March 23: Byzantine fault-tolerance and blockchain
- “Practical Byzantine Fault Tolerance”, Castro M, Liskov B (1999) (link)
- “Bitcoin: A Peer-to-Peer Electronic Cash System”, Nakamoto S (2008) (link)
March 28: Blockchains and smart contract errors (Hao & Katherine)
- “OmniLedger: A Secure, Scale-Out, Decentralized Ledger via Sharding”, Kokoris-Kogias E, Jovanovic P, Gasser L, Gailly N, Syta E, Ford B (2018) (link)
- “A Concurrent Perspective on Smart Contracts”, Sergey I, Hobor A (2017) (link)
March 30: Styles of distributed computation (Shreyas & Will)
- “Distributed Aggregation for Data-Parallel Computing: Interfaces and Implementations”, Yu Y, Gunda PK, Isard M (2009) (link)
April 4: Consensus (Jaylen)
April 6: Engineering correct distributed systems (Dimitrije & Simas)
- “How Amazon Web Services Uses Formal Methods”, Newcombe C, Rath T, Zhang F, Munteanu B, Brooker M, Deardeuff M (2015) (link)
- “Uncovering Bugs in Distributed Storage Systems during Testing (Not in Production!)”, Deligiannis P, McCutchen M, Thomson P, Chen S, et al. (2016) (link)
April 11: Well-balanced distributed computations (Abe)
- “TritonSort: A Balanced Large-Scale Sorting System”, Rasmussen A, Porter G, Conley M, Madhyastha HV, Mysore RN, Pucher A, Vahdat A (2011) (link)
- “Serving DNNs like Clockwork: Performance Predictability from the Bottom Up”, Gujarati A, Karimi R, Alzayat S, Hao W, Kaufmann A, Vigfusson Y, Mace J (2020) (link)
April 13: Ray (Priyan & Varun)
- “Ray: A Distributed Framework for Emerging AI Applications”, Moritz P, Nishihara R, Wang S, Tumanov A, Liaw R, Liang E, Elibol M, Yang Z, Paul W, Jordan MI, Stoica I (2018) (link)
- “RLlib: Abstractions for Distributed Reinforcement Learning”, Liang E, Liaw R, Nishihara R, Moritz P, Fox R, Goldberg K, Gonzalez J, Jordan M, Stoica I (2018) (link)
April 18: Content delivery networks (Yunho & Andrew)
- “The Akamai Network: A Platform for High-Performance Distributed Applications”, Nygren E, Sitaraman RK, Sun J (2010) (link)
- OPTIONAL: “Democratizing Content Publication with Coral”, Freedman MJ, Freudenthal E, Mazières D (2004) (link); optional because the next paper summarizes its most important points
- “Experiences with CoralCDN: A Five-Year Operational View”, Freedman MJ (2010) (link)
April 20: Distributed system tracing (Nick & Rithvik)
- “X-Trace: A Pervasive Network Tracing Framework”, Fonseca R, Porter G, Katz RH, Shenker S, Stoica I (2007) (link)
- “Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems”, Mace J, Roelke R, Fonseca R (2015) (link)
- OPTIONAL backup: “Fay: Extensible Distributed Tracing from Kernels to Clusters”, Erlingsson Ú, Peinado M, Peter S, Budiu M, Mainar-Ruiz G (2012) (link)
April 25: Distributed file systems
- “Ceph: A Scalable, High-Performance Distributed File System”, Weil SA, Brandt SA, Miller EL, Long DDE, Maltzahn C (2006) (link)
- “Finding a needle in Haystack: Facebook’s photo storage”, Beaver D, Kumar S, Li HC, Sobel J, Vajgel P (2010) (link)
- Some other distributed file systems, if you’re interested: GFS, Tectonic
April 27: Looking back