
Part 2: Testing and defensive programming

Overview

• We want programs to work right
• Let’s program so that our programs are more likely to work right

Terms

• Validation: “a process designed to increase our confidence that a program will function as we intend it to” [LG86]
• Debugging: “the process of ascertaining why a program is not functioning properly”
• Defensive programming: “the practice of writing programs in a way designed specifically to ease the process of validation and debugging”

Validation 1: Testing

• Check that the code produces the correct answer
• Requires a notion of a correct answer—the specification

Unit tests

• Check that a program unit, such as a function or class, works according to its spec
• Black-box testing: Uses the interface to the program unit, not the implementation
• Grey-box or white-box testing: Uses both the interface and the implementation

Integration tests

• Check that a whole program behaves as expected
• Often harder to create than unit tests because a whole program’s spec is usually big and unwieldy
• Unix nerds represent
• But there are shortcuts
• Simply encoding a program’s current behavior can help catch unexpected changes

Testing goals

• Testing is a pragmatic process
• An ethical necessity
• But complete (exhaustive) testing is usually impossible, and treating a passing test suite as proof of correctness is a mistake
• Finite budget of time available for testing and development
• How to use that budget most effectively?
• Want to find all the bugs
• Relates to the different ways bugs can occur
• Want to find the most important bugs
• Relates to frequency, consequences of failure, …
• “Our goal must be to find a reasonably small set of tests that will allow us to approximate the information we would have obtained through exhaustive testing.”

Example specification

// This function treats a and b as mathematical integers
// and returns min(INT_MAX, max(a + b, INT_MIN)).


(Ref: Testing with pictures)

Coverage

• Metric for test suite quality
• Specifications and implementations both have paths or branches
• Coverage measures how many of those distinct paths are tested by the test suite
• Specification path coverage: paths through the specification
• Implementation path coverage: paths through the code
• Implementation statement coverage: fraction of lines of code evaluated

Validation 2: Proving

• Prove formally that code matches its spec
• Awesome progress recently, still (too) rare in practice
• Focus of research effort in programming languages
• “Beware of bugs in the above code; I have only proved it correct, not tried it.” —Don Knuth
• Some forms of proving are common!
• Type safety eliminates some errors (assuming compiler correctness)
• Memory safety, static checking, …

Defensive programming

• Program so bugs are less likely
• Program so bugs are caught earlier
• Program so test suites are easier to write

Invariants

• Logical statements about a program that must hold in every correct execution

Classes of invariant

• Representation invariants
• Property of a data structure’s representation
• Functions that modify the data structure are allowed to break the invariant, but they must restore it before returning
• Preconditions
• Property of arguments to a function, or system state before a function is called
• Postconditions
• Property of function return values, or system state after the function returns
• Loop invariants, etc.

Assertions

• Executable check that an invariant holds
• In C/C++: assert(EXPR)
• EXPR should not have side effects
• Ref: Use of Assertions

Assertion patterns

• Invariant checks

• Precondition invariants at top of function
• Representation invariants: extract to a separate method, such as check; call on entry, on exit, wherever
• Postcondition invariants at bottom of function
• Failsafes

• “I think that X is true at this point in the code”
• But X is not immediately obvious from context
• Common to add and remove these as you debug
• Assertion expansion

• Assertion failure messages may not have sufficient context
• gdb can help (set a breakpoint at panic, use backtrace)
• Or expand the assertion:

assert(!yields_ || contains(yields_));
=>
if (!(!yields_ || contains(yields_))) {
log_printf("CRAP %p vs. %p\n", this, yields_);
}
assert(!yields_ || contains(yields_));

• Or use assert_eq, assert_ge, etc.

Test suite construction

• A great test suite is repeatable
• Otherwise, can’t evaluate the fix
• Randomness can achieve good coverage without much thinking
• Use deterministic randomness for repeatability: srand
• Boundary conditions
• E.g., INT_MAX + 1, INT_MAX + (INT_MIN + 1)
• Rarely found by naive random testing
• Ref: “Random testing is carpet bombing for software”; Ref
• Unit should export functions useful for testing
• “it is worth your while to write a considerable amount of code whose only purpose is to help you examine intermediate results”
• For example, a function that prints statistics about your buddy allocator to the log—or even the contents of your free lists
• Look for ways to expand invariants
• E.g., Keep a statistic that can be computed more than one way; assert the two calculations equal

Debugging

• Debugging is science
• “The crux of the scientific method is to
1. begin by studying already available data,
2. form a hypothesis that is consistent with those data, and
3. design and run a repeatable experiment that has the potential to refute the hypothesis.”
• Narrowing-down hypotheses: “This bug is caused by multiprocessor interactions.” (refutation: run with NCPU=1); “This bug is caused by system calls.” (refutation: try to cause it with interrupts); etc.
• The final hypothesis: “This bug will be fixed by this fix.”
• Not infrequently, we don’t understand the bug until it’s fixed
• But try to understand the bug even after “fixing” it!

Exercise

• Defensive programming for buddy allocation
• Representation invariants?
• Testing strategies?
• Functions useful for testing?