Projects and Close Readings in Software Systems:
Verified Systems

CS 260r Spring 2017

Lectures:      MW 1–2:30pm, MD 221
Instructor:      Eddie Kohler
Office hours:      Tuesday 10am–noon, by appointment, or any time I’m in my office and available

Material:      Course code repository
Piazza
Course Coq cheatsheet

Resources:      Coq homepage
Download Coq and CoqIDE (or, on Mac, use Homebrew: brew install coq Caskroom/cask/coqide)
Proof General (Coq for Emacs)
Coq cheatsheet
Two Coq textbooks: Software Foundations, Certified Programming with Dependent Types

Problem sets:      see course code repository

Project information

Week 1 Mon 1/23 Course introduction
Wed 1/25 seL4: A verified operating system
“seL4: Formal verification of an operating-system kernel” (Research Highlight), Gerwin Klein, June Andronick, Kevin Elphinstone, Gernot Heiser, David Cock, Philip Derrin, Dhammika Elkaduwe, Kai Engelhardt, Rafal Kolanski, Michael Norrish, Thomas Sewell, Harvey Tuch and Simon Winwood, Communications of the ACM 53(6), June 2010, pp. 107–115

Survey question! The core of the seL4 correctness proof is the forward simulation relation described in §4.4 and illustrated in Figure 6. (I believe insane category theorists would describe this as a kind of “Galois connection.”) Explain how the forward simulation relation would work in the restricted domain of seL4’s scheduling policy (Figure 3). What abstract state (σ) is implied by Figure 3? What concrete state would you expect the implementation to maintain? What does an abstract scheduling operation perform (what does M_1 do)? Finally, what is the relation between abstract and concrete states?
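As a generic illustration of forward simulation (not an answer to the scheduling question, and not seL4’s actual proof), the following Python sketch checks the defining condition by brute force on a toy system: whenever a concrete state s is related to an abstract state σ and the implementation steps s to s', the specification must be able to step σ to some σ' that is related to s'. The counter and parity machines, and every name in the code, are hypothetical.

```python
from itertools import product

def forward_simulates(c_states, a_states, c_step, a_step, related):
    """Brute-force check of forward simulation on finite state spaces.

    c_step(s) and a_step(sigma) return the set of successor states;
    related(s, sigma) is the state relation R.
    """
    for s, sigma in product(c_states, a_states):
        if not related(s, sigma):
            continue
        for s_next in c_step(s):
            # The abstract machine must offer a step that re-establishes R.
            if not any(related(s_next, sigma_next) for sigma_next in a_step(sigma)):
                return False
    return True

# Hypothetical toy system: the concrete machine counts 0..3 cyclically,
# and the abstract machine tracks only the parity of the counter.
C_STATES = range(4)
A_STATES = ("even", "odd")

def c_step(s):
    return {(s + 1) % 4}

def a_step(sigma):
    return {"odd" if sigma == "even" else "even"}

def parity_relation(s, sigma):
    return sigma == ("even" if s % 2 == 0 else "odd")

def broken_relation(s, sigma):
    # Deliberately wrong: relates 0 and 1 to "even", 2 and 3 to "odd".
    return sigma == ("even" if s < 2 else "odd")

print(forward_simulates(C_STATES, A_STATES, c_step, a_step, parity_relation))
print(forward_simulates(C_STATES, A_STATES, c_step, a_step, broken_relation))
```

The first call succeeds because every concrete step flips parity; the second fails because the step from 0 to 1 cannot be matched by any abstract step under the broken relation.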

Week 2 Mon 1/30 The CompCert compiler
“Formal verification of a realistic compiler” (Research Highlight), Xavier Leroy, Communications of the ACM 52(7), 2009, pp. 107–115 (Aaron)

Survey questions! 1. The seL4 paper, and our class discussion on 1/25, discussed the forward simulation that shows the functional correctness of seL4’s implementation relative to its spec. The CompCert paper also discusses functional correctness, in Section 2, but it does not use simulation. Describe at least two ways that Equation (1) differs from a forward simulation.

2. Section 4.3 gives the following relation between original registers and allocated registers: “R(r) = R'(φ(r)) for all pseudo-registers r live at point 𝓁.” Say that CompCert’s designer made a mistake in the definition of “live pseudo-registers,” so that in some cases, a pseudo-register was incorrectly considered dead at point 𝓁 (even though its value was used at some later point). What would be a consequence of this mistake?

Wed 2/1 Coq work
Week 3 Mon 2/6 A verified kernel subsystem
“Jitk: A trustworthy in-kernel interpreter infrastructure”, Xi Wang, David Lazar, Nickolai Zeldovich, Adam Chlipala, Zachary Tatlock, Proc. OSDI 2014 (Jason)

Survey questions! 1. Jitk security filters are defined in SCPL, a new language, but the kernel verifier—the most important contribution—accepts BPF filters as input. SCPL requires an additional proof layer, a burden. What is an advantage of SCPL?

2. The last paragraph of §5.3 seems very precisely worded. When the authors say Lemma 3 “provides a strong sanity check on the internal consistency of our encoder and decoder,” is this simply another way of saying Lemma 3 proves the correctness of their encoder and decoder? Why or why not?

Wed 2/8 A Coq-verified kernel
“CertiKOS: An extensible architecture for building certified concurrent OS kernels”, Ronghui Gu, Zhong Shao, Hao Chen, Xiongnan (Newman) Wu, Jieung Kim, Vilhelm Sjoberg, David Costanzo, Proc. OSDI 2016 (Lily)

Survey question! Section 3.4 concerns the semantics of partially-active multicore machines. Given a program P, and a behavior set denoted [[*]], [[P]]_pt(A) describes the observable behaviors when only the CPUs in A are “active,” in the sense the paper defines. Each element of this set may contain observable actions from *any* CPU, either in A or in (C – A). Is there any conceptual difference between the kinds of actions [[P]]_pt(A) will contain for CPUs in A and those for CPUs in (C – A)?

Week 4 Mon 2/13 More on CertiKOS
Wed 2/15 Using an “autoactive” verifier
“Ironclad Apps: End-to-End Security via Automated Full-System Verification”, Chris Hawblitzel, Jon Howell, Jacob R. Lorch, Arjun Narayan, Bryan Parno, Danfeng Zhang, and Brian Zill, Proc. OSDI 2014 (Rob)

Survey question! Which parts of the Ironclad Apps ecosystem are trusted (i.e., unverified) in the proof of Remote Equivalence?

Week 5 Mon 2/20 Presidents’ Day holiday
Wed 2/22 Verve

Survey question! Give one or more “ensures” clauses for the YieldTo procedure defined in §4.6, in either English text or Boogie-like pseudocode. Your clauses should enforce the invariants on which the Verve Nucleus depends.

Week 6 Mon 2/27 Permutation presentations, dependent types, refine
Wed 3/1 FSCQ
“Using Crash Hoare Logic for Certifying the FSCQ File System”, Haogang Chen, Daniel Ziegler, Tej Chajed, Adam Chlipala, M. Frans Kaashoek, and Nickolai Zeldovich, Proc. SOSP 2015 (Gabe)

Survey question! Consider the “log_commit()” pseudocode in Figure 14. The correctness proof would not go through if any of the disk_sync() calls were removed. Roughly speaking, how would the proof of atomic_two_write() fail if the second call to disk_sync() were removed?

Week 7 Mon 3/6 Verdi: Distributed systems verification
“Verdi: A Framework for Implementing and Formally Verifying Distributed Systems”, James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson, Proc. PLDI 2015 (Hao)

Survey question! The Figure 10 sequence numbering transformer depends on “int” (https://coq.inria.fr/library/Coq.ZArith.Int.html). Would it be possible to use CompCert’s “int32” or “int64” instead? How would the proof need to change?

Wed 3/8 IronFleet
“IronFleet: Proving Practical Distributed Systems Correct”, Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jay Lorch, Bryan Parno, Michael Lowell Roberts, Srinath Setty, and Brian Zill, Proc. SOSP 2015 (Thomas)

Survey questions! 1. The Verdi key-value store specification uses per-key traces (page 364 of Verdi), whereas the IronKV key-value store specification uses a transition relation on an abstract hash-table state (Figure 11 on page 10 of IronFleet). Contrast these approaches.

2 (optional but recommended). The Verdi key-value store implementation (https://github.com/uwplse/verdi/blob/master/systems/VarD.v#L110) supports five operations: get, put, delete, compare-and-swap, and compare-and-delete. Now consider the “high-level spec for IronKV” in Figure 11. Write a new version of this spec that supports at least one additional operation from VarD. (NB: In Dafny, the “::” operator means “such that,” not “cons.”)

Week 8 Mon 3/20 Coq work/project discussion
Wed 3/22 Bugs in verified code
“An Empirical Study on the Correctness of Formally Verified Distributed Systems”, Pedro Fonseca, Kaiyuan Zhang, Xi Wang, and Arvind Krishnamurthy, Proc. EuroSys 2017 (Jeff)

Survey questions! 1. Which of the bugs listed in Figure 3 could be fixed by using a TCP transport shim rather than a UDP transport shim?

2. What properties would the TCP shim need to have to actually fix those bugs?

Week 9 Mon 3/27 Separation logic
“A Primer on Separation Logic (and Automatic Program Verification and Analysis)”, Peter W. O’Hearn, Software Safety and Security: Tools for Analysis and Verification, NATO Science for Peace and Security 33, 2012, sections 1–3 only (Eddie)
Optional alternate material—the research paper: “Local reasoning about programs that alter data structures”, Peter O’Hearn, John Reynolds, and Hongseok Yang, Proc. CSL 2001

Survey questions! 1. Describe the meaning of the “-*” definition from Section 3 in intuitive terms.

2. Give a heap that distinguishes ls(10,10) and ils(10,10) (part of Exercise 4). (A heap “distinguishes” two assertions if it meets one assertion, but not the other.)
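The parenthetical definition of “distinguishes” can be made concrete with a small Python sketch. It deliberately uses two simpler assertions than the exercise, emp (the empty heap) and 10 |-> 3 (exactly one cell), so it does not answer the question. Modelling heaps as dicts and assertions as predicates is an assumption made for illustration, and the helper names are hypothetical.

```python
def emp(heap):
    """emp holds only of the empty heap."""
    return len(heap) == 0

def points_to(addr, value):
    """addr |-> value holds of the heap containing exactly that one cell."""
    return lambda heap: heap == {addr: value}

def distinguishes(heap, p, q):
    """A heap distinguishes p and q if it satisfies exactly one of them."""
    return p(heap) != q(heap)

cell = points_to(10, 3)
print(distinguishes({}, emp, cell))               # emp holds, 10 |-> 3 does not
print(distinguishes({10: 3}, emp, cell))          # only 10 |-> 3 holds
print(distinguishes({10: 3, 11: 0}, emp, cell))   # neither holds
```

The last heap satisfies neither assertion, so it does not distinguish them even though the two assertions are not equivalent.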

Wed 3/29 Concurrent separation logic
“Concurrent separation logic”, Stephen Brookes and Peter W. O’Hearn, ACM SIGLOG News 3(3), July 2016, Section 2 (David)
“Resources, Concurrency and Local Reasoning”, Peter W. O’Hearn, Theoretical Computer Science 375(1–3), May 2007, Sections 6–7
Week 10 Mon 4/3 Weak memory models
“GPS: Navigating Weak Memory with Ghosts, Protocols, and Separation”, Aaron Turon, Viktor Vafeiadis, and Derek Dreyer, Proc. OOPSLA ’14 (Michael)
Note! This paper has an extensive appendix. Only print pp. 1–17.

Survey questions! 1. Which of the following relations must be transitive? sb, mo, rf, sw, hb.

2. Why does concurrent separation logic avoid needing protocol assertions (box things)?

Wed 4/5 RCU
“Verifying Read-Copy-Update in a Logic for Weak Memory”, Joseph Tassarotti, Derek Dreyer, and Viktor Vafeiadis, Proc. PLDI ’15 (Thomas)

Survey question! Given a singly-linked RCU list of ints, the following code returns the first int (or -1 if the list is empty):

head = rcuReadStart(list);
i = head ? head->data : -1;
rcuQuiescentState(list, list->wcounter);
return i;

This code should verify given the RCU specifications in Figure 3, but if you switched the order of the i assignment with rcuQuiescentState, the code would be incorrect and would not verify. Why not? Give an intuitive answer, a specific answer involving Figure 3, or, preferably, both.

Week 11 Mon 4/10 Crash refinement
“Push-Button Verification of File Systems via Crash Refinement”, Helgi Sigurbjarnarson, James Bornholt, Emina Torlak, and Xi Wang, Proc. OSDI 2016 (Dan and Ezra)

Survey question! Give at least one example of a common file system operation or optimization that satisfies the no-op definition (definition 5). How does the no-op definition make proofs easier?

Wed 4/12 ExpressOS
“Verifying Security Invariants in ExpressOS”, Haohui Mai, Edgar Pek, Hui Xue, Samuel T. King, and P. Madhusudan, Proc. ASPLOS ’13 (Crystal)

Survey question! In §6 we are told that “It is important for the ExpressOS kernel to maintain proper mappings between the file descriptors (fd) of the user-level helper and those of the application.” Important for security, for functional correctness, or both? Explain your answer.

Week 12 Mon 4/17 Project proposal discussion
Wed 4/19 Boogie
“Boogie: A Modular Reusable Verifier for Object-Oriented Programs”, Mike Barnett, Bor-Yuh Evan Chang, Robert DeLine, Bart Jacobs, and K. Rustan M. Leino, Proc. FMCO [Formal Methods for Components and Objects] 2005 (Richard)

Survey questions! 1. Allocation in BoogiePL is represented by “havoc e; assume …;”. What would be different if this translation used “assert” instead of “assume”?

2. The “assume” clause for the allocation in Figure 3 reads “e ≠ null /\ typeof(e)=Example /\ Heap[e,allocated]=false”. Briefly explain the purposes of these three clauses.

Week 13 Mon 4/24 Cocoon
“Correct by Construction Networks Using Stepwise Refinement”, Leonid Ryzhyk, Nikolaj Bjørner, Marco Canini, Jean-Baptiste Jeannin, Cole Schlesinger, Douglas B. Terry, and George Varghese, Proc. NSDI 2017

Survey question! Cocoon can be analogized to a compiler such as CompCert. What is Cocoon’s output language, and what assumptions does Cocoon verification make about that output language?

Wed 4/26 Project presentations

Future reading
CertiKOS
“Deep Specifications and Certified Abstraction Layers”, Ronghui Gu, Jérémie Koenig, Tahina Ramananandro, Zhong Shao, Xiongnan (Newman) Wu, Shu-Chun Weng, Haozhong Zhang, and Yu Guo, Proc. POPL 2015
Verve/Boogie/Ironclad
Certified assembly languages
“The Bedrock Structured Programming System: Combining Generative Metaprogramming and Hoare Logic in an Extensible Program Verifier”, Adam Chlipala, Proc. ICFP 2013
“Mostly-Automated Verification of Low-Level Programs in Computational Separation Logic”, Adam Chlipala, Proc. PLDI 2011
Verifying subsystems and applications
“Push-Button Verification of File Systems via Crash Refinement”, Helgi Sigurbjarnarson, James Bornholt, Emina Torlak, and Xi Wang, Proc. OSDI 2016
“Functional Models of Hadoop MapReduce with Application to Scan”, Kiminori Matsuzaki, Int. J. Parallel Prog. 2016. This seems like a bad paper. We generally try to read good papers, but it still might be worth seeing how little is required for publication in certain kinds of commercial journal.
“Investigating Safety of a Radiotherapy Machine Using System Models with Pluggable Checkers”, Stuart Pernsteiner, Calvin Loncaric, Emina Torlak, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Jonathan Jacky, Proc. CAV 2016.
“From Network Interface to Multithreaded Web Applications: A Case Study in Modular Program Verification”, Adam Chlipala, Proc. POPL 2015
“How to Make Ad Hoc Proof Automation Less Ad Hoc”, Georges Gonthier, Beta Ziliani, Aleksandar Nanevski, and Derek Dreyer, J. Functional Programming 23(4), 2013
Concurrency
“Automated and modular refinement reasoning for concurrent programs”, Shaz Qadeer, Serdar Tasiran, and Chris Hawblitzel, Proc. CAV 2015
Distributed systems
“Synthesis of Self-Stabilizing and Byzantine-Resilient Distributed Systems”, Roderick Bloem, Nicolas Braud-Santoni and Swen Jacobs, Proc. CAV 2016
“Model Checking at Scale: Automated Air Traffic Control Design Space Exploration”, Marco Gario, Alessandro Cimatti, Cristian Mattarei, Stefano Tonetta and Kristin Yvonne Rozier, Proc. CAV 2016
“Ivy: Safety Verification by Interactive Generalization”, Oded Padon, Kenneth McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham, Proc. PLDI 2016
“Formal specification and verification of CRDTs”, Peter Zeller, Annette Bieniusa, and Arnd Poetzsch-Heffter, Proc. FORTE 2014
More whole-OS verification
“Verifying Security Invariants in ExpressOS”, Haohui Mai, Edgar Pek, Hui Xue, Samuel T. King, and P. Madhusudan, Proc. ASPLOS 2013
“Translation Validation for a Verified OS Kernel”, Thomas Sewell, Magnus Myreen, and Gerwin Klein, Proc. PLDI 2013
Formal methods falling short of verification
“How Amazon Web Services Uses Formal Methods”, Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff, Communications of the ACM 58(4), pp. 66–73