4/7 Distributed tracing

Reading

Today we read two papers about profiling and performance debugging in systems. The first paper, though not frequently cited, is an excellent example of a debugging narrative. The second paper is an award winner.

  1. The Mystery Machine: End-to-end performance analysis of large-scale Internet services, Michael Chow, David Meisner, Jason Flinn, Daniel Peek, Thomas F. Wenisch (OSDI 2014)

  2. Pivot Tracing: Dynamic causal monitoring for distributed systems, Jonathan Mace, Ryan Roelke, Rodrigo Fonseca (SOSP 2015)

Reading questions

Both the Mystery Machine and Pivot Tracing center on the happens-before relation. Compare how the two papers abstract happens-before relationships. Could the Mystery Machine benefit, directly or indirectly, from Pivot Tracing's happened-before join abstraction?