Eraser: A Dynamic Data Race Detector for Multi-Threaded Programs
As a part of my OS class this semester (Fall 2016), we have been reading a bunch of very interesting papers. I wish to blog every paper I read for this class, but I don’t get time :/ Anyway, here is a brief summary of this paper for noobs.
Paper: Eraser: A Dynamic Data Race Detector for Multithreaded Programs
Authors: Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, Thomas Anderson
Date: 1997
What is it about? The basics….
The paper attacks the problem of data race in multithreaded programs.
Debugging multithreaded programs can be difficult. Basic errors in synchronization can produce timing-dependent data races that can take “several weeks” to debug.
Previous work done by Lamport’s ‘happens-before’ relation checks that conflicting memory accesses from different threads are separated by synchronization events.
The work in this paper improves upon the ‘happens-before’ algorithm by simply checking that all shared memory accesses follow a consistent locking discipline.
The paper argues that for many programs, its approach is “simpler, more efficient, and more thorough” at catching races than the ‘happens-before’ algorithm.
This work also improves upon the work done by Hoare (Monitors) in that they don’t protect against data races in programs with dynamically allocated shared variables.
Some key things….
The paper proposes a new algorithm to detect data races called “The Lockset Algorithm”.
“The Lockset Algorithm” enforces a locking discipline that some lock protects every shared variable. This means that any thread holds the lock whenever it accesses the variable.
It improves upon the locking discipline and extends “The Lockset Algorithm” to accommodate initialization, read-shared data, and read-write locks.
Developed a program annotation to allow users to eliminate false reports.
Experimental stuff
Eraser was implemented for the DIGITAL Unix operating system on the Alpha processor using the ATOM binary modification system.
Applications typically slowed down by a factor of 10 to 20 while using Eraser.
While programming some tests that contained common synchronization errors, Eraser detected the data race errors correctly.
Eraser also successfully tackled large multithreaded servers written by researchers at DEC SRC: the HTTP server and indexing engine from AltaVista, the Vesta cache server, and the Petal distributed disk system.
Eraser found undesirable race conditions in three of the four server programs, and in many of the undergraduate homework problems. It also produced false alarms that were suppressed using annotations.
The authors mention, “Eraser’s testing is not very sensitive to the scheduler interleaving.” To prove this, they did two experiments on Ni2 and Vesta.
The above is only a summary, and doesn’t contain all the details mentioned in the paper. For a thorough review, please go through the entire paper. If you have some questions, feel free to email me at fabidi89[at]vt[dot] edu.