Consensus (computer science)

Consensus (computer science)

Consensus is a problem in distributed computing that encapsulates the task of group agreement in the presence of faults.[1]

In particular, any process in the group may fail at any time. Consensus is fundamental to core techniques in fault tolerance, such as state machine replication.

Contents

Problem description

A process is called "correct" if it does not fail at any point during its execution. Unlike Terminating Reliable Broadcast, the typical consensus problem does not label any single process as a "sender". Every process "proposes" a value; the goal of the protocol is for all correct processes to choose a single value from among those proposed. Valid consensus protocols must provide important guarantees to all processes involved. All correct processes must eventually decide the same value, for example, and that value must be one of those proposed. A correct process is therefore guaranteed that the value it decides was also decided by all other correct processes, and can act on that value accordingly.

More precisely, a consensus protocol must satisfy the four formal properties below.

Termination
every correct process decides some value.
Validity
if all processes propose the same value v, then every correct process decides v.
Integrity
every correct process decides at most one value, and if it decides some value v, then v must have been proposed by some process.
Agreement
if a correct process decides v, then every correct process decides v.

The possibility of faults in the system makes these properties more difficult to satisfy. A simple but invalid consensus protocol might have every process broadcast its proposal to all others, and have a process decide on the smallest value received. Such a protocol, as described, does not satisfy Agreement if faults can occur: if a process crashes after sending its proposal to some processes, but before sending it to others, then the two sets of processes may decide different values.

Failure model

The consensus problem is challenging primarily because one or more of the processes involved may fail at any time. Consensus protocols typically make one of two assumptions about how processes may fail:

  • In a fail-stop or fail-crash model, a failed process simply stops participating in the protocol. This models a communication failure (e.g., network partition) or a system crash. A variant is the crash-recovery model, where halted processes can restart and restore their state previous to the crash, through stable storage.
  • In a Byzantine failure model, a failed process may behave in an arbitrary fashion. In particular, a failed process may collaborate with other failed processes in order to deliberately subvert the operations of the consensus protocol. Because failed processes are allowed to behave arbitrarily and to collaborate, this models both software bugs and the possibility of attack by a hostile adversary.

Impossibility

Consensus has been shown to be impossible to solve in several models of distributed computing.

In an asynchronous system, where processes have no common clock and run at arbitrarily varying speeds, the problem is impossible to solve if one process may crash and processes communicate by sending messages to one another.[2] The technique used to prove this result is sometimes called an FLP impossibility proof, named after its creators, Michael J. Fischer, Nancy A. Lynch and Michael S. Paterson, who won the Dijkstra Prize for this result. The technique has been widely used to prove other impossibility results. For example, a similar proof can be used to show that consensus is also impossible in asynchronous systems where processes communicate by reading and writing shared variables if one process may crash.[3]

A notable exception to the FLP impossibility proof is in quantum computing, where it has been shown that asynchronous consensus can always be achieved even in the presence of faults, crashes, or deliberate efforts to undermine the consensus by participating processes.[4]

The FLP result does not state that consensus can never be reached: merely that under the model's assumptions, no algorithm can always reach consensus in bounded time. There exist algorithms, even under the asynchronous model, that can reach consensus with probability one. The FLP proof hinges on demonstrating the existence of an order of message receipts that causes the system to never reach consensus. This "bad" input however may be vanishingly unlikely in practice.

In a synchronous system, where all processes run at the same speed, consensus is impossible if processes communicate by sending messages to one another and one third of the processes can experience Byzantine failures.[5]

Important consensus protocols

Google has implemented a distributed lock service library called Chubby.[6] Chubby maintains lock information in small files which are stored in a replicated database to achieve high availability in the face of failures. The database is implemented on top of a fault-tolerant log layer which is based on the Paxos consensus algorithm. In this scheme, Chubby clients communicate with the Paxos master in order to access/update the replicated log; i.e., read/write to the files.[7]

Bitcoin uses proof of work to maintain consensus in its peer-to-peer network. Nodes in the bitcoin network attempt to solve a cryptographic proof-of-work problem, where probability of finding the solution is proportional to the computational effort, in hashes per second, expended, and the node that solves the problem has their version of the block of transactions added to the peer-to-peer distributed timestamp server accepted by all of the other nodes. As any node in the network can attempt to solve the proof-of-work problem, a Sybil attack becomes unfeasible unless the attacker has over 50% of the computational resources of the network.

  • Chandra-Toueg consensus algorithm
  • Randomized consensus

References

  1. ^ Lamport, Leslie; Marshall Pease and Robert Shostak (April 1980). "Reaching Agreement in the Presence of Faults". Journal of the ACM 27 (2): 228–234. doi:10.1145/322186.322188. 10.1145/322186.322188. http://research.microsoft.com/users/lamport/pubs/reaching.pdf. Retrieved 2007-07-25. 
  2. ^ Fischer, Michael J.  ; Nancy A. Lynch; Michael S. Paterson (April 1985). "Impossibility of Distributed Consensus with One Faulty Process". Journal of the ACM 32 (2): 374–382. doi:10.1145/3149.214121. http://portal.acm.org/citation.cfm?doid=3149.214121. Retrieved 2007-04-29  . 
  3. ^ Loui, M. C.; Abu-Amara, H. H. (1987). "Memory requirements for agreement among unreliable asynchronous processes". In Preparata, F. P.. Advances in Computing Research. 4. Greenwich, Connecticut: JAI Press. pp. 163–183. 
  4. ^ Helm, Louis K. (2008). "Quantum distributed consensus". Proceedings of the twenty-seventh ACM symposium on Principles of distributed computing (ACM) 1 (1): 445. doi:10.1145/1400751.1400841. ISBN 9781595939890. http://portal.acm.org/citation.cfm?id=1400841. 
  5. ^ Fischer, Michael J.; Nancy A. Lynch; Michael Merritt (1986). "Easy impossibility proofs for distributed consensus problems". Distributed Computing (Springer) 1 (1): 26–39. doi:10.1007/BF01843568. 
  6. ^ Burrows, M. (2006). "The Chubby lock service for loosely-coupled distributed systems". Proceedings of the 7th Symposium on Operating Systems Design and Implementation. USENIX Association Berkeley, CA, USA. pp. 335–350. http://labs.google.com/papers/chubby.html. 
  7. ^ C., Tushar; Griesemer, R; Redstone J. (2007). "Paxos Made Live - An Engineering Perspective". Proceedings of the Twenty-sixth Annual ACM Symposium on Principles of Distributed Computing. Portland, Oregon, USA: ACM Press New York, NY, USA. pp. 398–407. doi:http://doi.acm.org/10.1145/1281100.1281103. http://delivery.acm.org/10.1145/1290000/1281103/p398-chandra.pdf?key1=1281103&key2=4382532021&coll=GUIDE&dl=GUIDE&CFID=15324100&CFTOKEN=95510390. Retrieved 2008-02-06. 

Further reading

  • Herlihy, M.; Shavit, N. (1999). "The topological structure of asynchronous computability". Journal of the ACM 46 (6): 858. doi:10.1145/331524.331529.  edit
  • Saks, M.; Zaharoglou, F. (2000). "Wait-Free k-Set Agreement is Impossible: The Topology of Public Knowledge". SIAM Journal on Computing 29 (5): 1449. doi:10.1137/S0097539796307698.  edit

Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • List of important publications in computer science — This is a list of important publications in computer science, organized by field. Some reasons why a particular publication might be regarded as important: Topic creator – A publication that created a new topic Breakthrough – A publication that… …   Wikipedia

  • Consensus dynamics — or agreement dynamics is an area of research lying at the intersection of systems theory and graph theory. A major topic of investigation is the agreement or consensus problem[1] in multi agent systems that concerns processes by which a… …   Wikipedia

  • Replication (computer science) — Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault tolerance, or accessibility. It could be data replication if the… …   Wikipedia

  • consensus — noun ADJECTIVE ▪ broad, clear, common, general, prevailing, rough, widespread ▪ That seems to be the prevailing consensus. ▪ …   Collocations dictionary

  • Consensus (disambiguation) — For the Wikipedia policy, see Wikipedia:Consensus. The word consensus may mean: 1992 Consensus Consensual nonconsent Consensus (computer science) achieving coherence, or quorum, among nodes of a distributed computer system. Consensus (medical)… …   Wikipedia

  • Consensus — has two common meanings. One is a general among the members of a given group or community, each of which exercises some discretion in decision making and follow up action. The other is as a theory and practice of getting such agreements (for… …   Wikipedia

  • Computer-supported collaboration — (CSC) research focuses on technology that affects groups, organizations, communities and societies, e.g., voice mail and text chat. It grew from cooperative work study of supporting people s work activities and working relationships. As net… …   Wikipedia

  • Consensus algorithm — may refer to one of several proposed protocols for solving the consensus problem in the field of Computer Science. Some of these include: Paxos (computer science) Chandra Toueg consensus algorithm This disambiguation page lists articles… …   Wikipedia

  • Science — This article is about the general term, particularly as it refers to experimental sciences. For the specific topics of study by scientists, see Natural science. For other uses, see Science (disambiguation) …   Wikipedia

  • Science de la cognition — Dans l article des sciences cognitives, il est exprimé que les cogniticiens appartiennent à la science de la cognition. Nous aborderons donc, dans cet article, le principe même qui a fondé les sciences cognitives : la fusion des savoirs des… …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”