List of important publications in concurrent, parallel, and distributed computing

Synchronising concurrent processes. Achieving consensus in a distributed system in the presence of faulty nodes, or in a wait-free manner. Mutual exclusion in concurrent systems.

Dijkstra: “Solution of a problem in concurrent programming control”

.

This paper presented the first solution to the mutual exclusion problem. Leslie Lamport writes that this work “started the field of concurrent and distributed algorithms”.

Pease, Shostak, Lamport: “Reaching agreement in the presence of faults”
Lamport, Shostak, Pease: “The Byzantine generals problem”

.

.

These two papers introduced and studied the problem that is nowadays known as Byzantine fault tolerance. The 1980 paper presented the classical lower bound that agreement is impossible if at least 1/3 of the nodes are faulty; it received the Edsger W. Dijkstra Prize in Distributed Computing in 2005. The highly-cited 1982 paper gave the problem its present name, and also presented algorithms for solving the problem.

Fischer, Lynch, Paterson: “Impossibility of distributed consensus with one faulty process”

.

This work shows the surprising result that it is impossible to achieve consensus in asynchronous systems even if only one process may be faulty. This paper received the PODC Influential-Paper Award in 2001.

Dwork, Lynch, Stockmeyer: “Consensus in the presence of partial synchrony”

.

This paper managed to circumvent the lower bound of by studying systems that lie between synchronous and asynchronous systems. The paper received the Edsger W. Dijkstra Prize in Distributed Computing in 2007.

Mellor-Crummey, Scott: “Algorithms for scalable synchronization on shared-memory multiprocessors”

.

This paper received the Edsger W. Dijkstra Prize in Distributed Computing in 2006 for presenting “probably the most influential practical mutual exclusion algorithm of all time”.

Herlihy: “Wait-free synchronization”

.

This work introduced a way to organise different kinds of concurrent objects into a hierarchy. The hierarchy was formed by using the “consensus number” of the object, i.e., its ability to solve the consensus problem in a wait-free manner. In this hierarchy, registers which support atomic reads and writes are on the lowest level, with consensus number 1. Examples of concurrent objects with consensus number 2 include test-and-set instructions and fetch-and-add instructions – the higher consensus number indicates that both of these are strictly stronger than atomic read/write registers. Compare-and-swap instructions are an example of a concurrent object with infinite consensus number. The paper received the Edsger W. Dijkstra Prize in Distributed Computing in 2003.

Herlihy, Shavit: “The topological structure of asynchronous computation”
Saks, Zaharoglou: “Wait-free k-set agreement is impossible …”

. Gödel prize lecture.

.

These two papers study wait-free algorithms for generalisations of the consensus problem, and showed that these problems can be analysed by using topological properties and arguments. Both papers received the Gödel Prize in 2004.

Fault-tolerance

Dijkstra: “Self-stabilizing systems in spite of distributed control”

.

This work introduced the concept of self-stabilization, creating a whole new sub-field of distributed computing. The paper received the PODC Influential-Paper Award in 2002.

Distributed graph algorithms

Distributed algorithms for solving graph problems.

Gallager, Humblet, Spira: “A distributed algorithm for minimum-weight spanning trees”

.

This paper presented an efficient distributed algorithm for finding a minimum spanning tree. The paper received the Edsger W. Dijkstra Prize in Distributed Computing in 2004.

Awerbuch, Peleg: “Sparse partitions”

.

This paper received the Edsger W. Dijkstra Prize in Distributed Computing in 2008.

Foundations of distributed systems

Fundamental concepts such as time and knowledge in distributed systems.

Lamport: “Time, clocks, and the ordering of events in a distributed system”

.

This paper studied causality in distributed systems, introduced Lamport timestamps which is a mechanism for generating logical timestamps, and also presented a general mechanism for managing replication. The paper received the PODC Influential-Paper Award in 2000.

Halpern, Moses: “Knowledge and common knowledge in a distributed environment”

.

This paper formalised the notion of “knowledge” in distributed systems, demonstrated the importance of the concept of “common knowledge” in distributed systems, and also proved that common knowledge cannot be achieved if communication is not guaranteed. The paper received the Gödel Prize in 1997 and the Edsger W. Dijkstra Prize in Distributed Computing in 2009.

See also

  • List of important publications in computer science
  • List of important publications in theoretical computer science