Chord (peer-to-peer)

Chord (peer-to-peer)

In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A distributed hash table stores key-value pairs by assigning keys to different computers (known as "nodes"); a node will store the values for all the keys for which it is responsible. Chord specifies how keys are assigned to nodes, and how a node can discover the value for a given key by first locating the node responsible for that key.[1]

Chord is one of the four original distributed hash table protocols, along with CAN, Tapestry, and Pastry. It was introduced in 2001 by Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan, and was developed at MIT.[2]

Chord project.png

Contents

Overview

Using the Chord lookup protocol, node keys are arranged in a circle that has at most 2m nodes. The circle can have IDs/keys ranging from 0 to 2m − 1.

IDs and keys are assigned an m-bit identifier using consistent hashing. The SHA-1 algorithm is the base hashing function for consistent hashing. Consistent hashing is integral to the robustness and performance of Chord because both keys and IDs (IP addresses) are uniformly distributed and in the same identifier space. Consistent hashing is also necessary to let nodes join and leave the network without disruption.

Each node has a successor and a predecessor. The successor to a node (or key) is the next node (key) in the identifier circle in a clockwise direction. The predecessor is counter-clockwise. If there is a node for each possible ID, the successor of node 2 is node 3, and the predecessor of node 1 is node 0; however, normally there are holes in the sequence. For example, the successor of node 153 may be node 167 (and nodes from 154 to 166 will not exist); in this case, the predecessor of node 167 will be node 153.

Since the successor (or predecessor) node may disappear from the network (because of failure or departure), each node records a whole segment of the circle adjacent to it, i.e. the r nodes preceding it and the r nodes following it. This list results in a high probability that a node is able to correctly locate its successor or predecessor, even if the network in question suffers from a high failure rate.

Chord protocol

In a 16-node Chord network, the nodes are arranged in a circle. Each node is connected to other nodes at distances 1, 2, 4, and 8 away.
A 16-node Chord network. The "fingers" for one of the nodes are highlighted.

The Chord protocol is one solution for connecting the peers of a P2P network. Chord consistently maps a key onto a node. Both keys and nodes are assigned an m-bit identifier. For nodes, this identifier is a hash of the node's IP address. For keys, this identifier is a hash of a keyword, such as a file name. It is not uncommon to use the words "nodes" and "keys" to refer to these identifiers, rather than actual nodes or keys. There are many other algorithms in use by P2P, but this is a simple and common approach.

A logical ring with positions numbered 0 to 2m − 1 is formed among nodes. Key k is assigned to node successor(k), which is the node whose identifier is equal to or follows the identifier of k. If there are N nodes and K keys, then each node is responsible for roughly K / N keys.

When a new node joins or leaves the network, responsibility for O(K / N) keys changes hands.

If each node knows only the location of its successor, a linear search over the network could locate a particular key. This is a naive method for searching the network, since any given message could potentially have to be relayed through most of the network. Chord implements a faster search method.

Chord requires each node to keep a "finger table" containing up to m entries. The ith entry of node n will contain the address of successor((n + 2i − 1) mod 2m).

With such a finger table, the number of nodes that must be contacted to find a successor in an N-node network is O(log N). (See proof below.)

Potential uses

  • Cooperative Mirroring: A load balancing mechanism by a local network hosting information available to computers outside of the local network. This scheme could allow developers to balance the load between many computers instead of a central server to ensure availability of their product.
  • Time-shared storage: In a network, once a computer joins the network its available data is distributed throughout the network for retrieval when that computer disconnects from the network. As well as other computers' data is sent to the computer in question for offline retrieval when they are no longer connected to the network. Mainly for nodes without the ability to connect full time to the network.
  • Distributed Indices: Retrieval of files over the network within a searchable database. e.g. P2P file transfer clients.
  • Large scale combinatorial searches: Keys being candidate solutions to a problem and each key mapping to the node, or computer, that is responsible for evaluating them as a solution or not. e.g. Code Breaking

Proof sketches

If two nodes are at a distance 11 apart along the ring (i.e., there are 10 nodes between them), it takes three hops to send a message from one to the other. The first hop covers a distance of 8 units, the second 2 units, and the final hop 1 unit.
The routing path between nodes A and B. Each hop cuts the remaining distance in half (or better).

With high probability, Chord contacts O(log N) nodes to find a successor in an N-node network.

Suppose node n wishes to find the successor of key k. Let p be the predecessor of k. We wish to find an upper bound for the number of steps it takes for a message to be routed from n to p. Node n will examine its finger table and route the request to the closest predecessor of k that it has. Call this node f. If f is the ith entry n's finger table, then both f and p are at distances between 2i − 1 and 2i from n along the identifier circle. Hence, the distance between f and p along this circle is at most 2i − 1. Thus the distance from f to p is less than the distance from n to f: the new distance to p is at most half the initial distance.

This process of halving the remaining distance repeats itself, so after t steps, the distance remaining to p is at most 2m / 2t; in particular, after log N steps, the remaining distance is at most 2m / N. Because nodes are distributed uniformly at random along the identifier circle, the expected number of nodes falling within an interval of this length is 1, and with high probability, there are fewer than log N such nodes. Because the message always advances by at least one node, it takes at most logN steps for a message to traverse this remaining distance. The total expected routing time is thus O(log N).

If Chord keeps track of r = O(log N) predecessors/successors, then with high probability, if each node has probability of 1/4 of failing, find_successor (see below) and find_predecessor (see below) will return the correct nodes

Simply, the probability that all r nodes fail is \left({{1}\over{4}}\right)^r = O\left({{1}\over{N}}\right), which is a low probability; so with high probability at least one of them is alive and the node will have the correct pointer.

Pseudocode

Definitions for pseudocode:

  • finger[k]: first node that succeeds (n+2^{k-1})  \mbox{ mod } 2^m, 1 \leq k \leq m
  • successor: the next node from the node in question on the identifier ring
  • predecessor: the previous node from the node in question on the identifier ring

The pseudocode to find the successor node of an id is given below:

 // ask node n to find the successor of id
 n.find_successor(id)
   if (id \in (n, successor] ) // Yes, that should be a closing square bracket to match the opening parenthesis. It is a half closed interval.
     return successor;
   else
     // forward the query around the circle
     n0 = closest_preceding_node(id);
     return n0.find_successor(id);
 // search the local table for the highest predecessor of id
 n.closest_preceding_node(id)
   for i = m downto 1
     if (finger[i]\in(n,id))
       return finger[i];
   return n;

The pseudocode to stabilize the chord ring/circle after node joins and departures is as follows:

 // create a new Chord ring.
 n.create()
   predecessor = nil;
   successor = n;
 // join a Chord ring containing node n'.
 n.join(n')
   predecessor = nil;
   successor = n'.find_successor(n);
 // called periodically. n asks the successor
 // about its predecessor, verifies if n's immediate
 // successor is consistent, and tells the successor about n
 n.stabilize()
   x = successor.predecessor;
   if (x\in(n, successor))
     successor = x;
   successor.notify(n);
 // n' thinks it might be our predecessor.
 n.notify(n')
   if (predecessor is nil or n'\in(predecessor, n))
     predecessor = n';
 // called periodically. refreshes finger table entries.
 // next stores the index of the finger to fix
 n.fix_fingers()
   next = next + 1;
   if (next > m)
     next = 1;
   finger[next] = find_successor(n+2next − 1);
 // called periodically. checks whether predecessor has failed.
 n.check_predecessor()
   if (predecessor has failed)
     predecessor = nil;

See also

  • OverSim - the overlay simulation framework
  • SimGrid - a toolkit for the simulation of distributed applications -

References

  1. ^ "Chord - frequently asked questions". Chord Project. http://pdos.csail.mit.edu/chord/faq.html. 
  2. ^ Stoica, Ion et al. (2001). "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications". Proceedings of SIGCOMM'01 (ACM Press New York, NY, USA). 

External links


Wikimedia Foundation. 2010.

Игры ⚽ Поможем решить контрольную работу

Look at other dictionaries:

  • Chord — may refer to: Chord (music), an aggregate of musical pitches sounded simultaneously. Chord (guitar) an aggregate of musical pitches played simultaneously on a guitar Chord (geometry), a line segment joining two points on a curve Chord (astronomy) …   Wikipedia

  • Chord — ist ein strukturiertes Peer to Peer System, welches im Gegensatz zu den meisten unstrukturierten Systemen eine effiziente Suche nach Inhalten ermöglicht. Wie auch Gnutella ist Chord keine konkrete Implementierung, sondern ein Protokollsystem. Es… …   Deutsch Wikipedia

  • Chord (distributed hash table) — Chord is one of the original distributed hash table protocols. Chord is being developed at MIT and the current Chord source code can be downloaded and used under the MIT License.Overview Using the Chord lookup protocol, node keys are arranged in… …   Wikipedia

  • Chord (Begriffsklärung) — Chord (von chorda, lat. Sehne, Saite, engl. für Akkord) bezeichnet ein Protokollsystem in Peer to Peer Netzen, siehe Chord eine historische Winkelfunktion, siehe Chord (Mathematik) Chords steht für Twisted Chords, Plattenlabel Chords Five, Band… …   Deutsch Wikipedia

  • Peer-to-peer — Not to be confused with point to point. This article is about peer to peer computing. For other uses, see Peer to peer (disambiguation). A peer to peer system of nodes without central infrastructure …   Wikipedia

  • Peer-to-Peer — Pair à pair Pour les articles homonymes, voir P2P (homonymie). Le pair à pair (traduction de l anglicisme peer to peer, souvent abrégé « P2P »), est un modèle de réseau informatique. Il s oppose au modèle client serveur. Le partage de… …   Wikipédia en Français

  • Peer-to-peer — Pair à pair Pour les articles homonymes, voir P2P (homonymie). Le pair à pair (traduction de l anglicisme peer to peer, souvent abrégé « P2P »), est un modèle de réseau informatique. Il s oppose au modèle client serveur. Le partage de… …   Wikipédia en Français

  • Peer 2 peer — Pair à pair Pour les articles homonymes, voir P2P (homonymie). Le pair à pair (traduction de l anglicisme peer to peer, souvent abrégé « P2P »), est un modèle de réseau informatique. Il s oppose au modèle client serveur. Le partage de… …   Wikipédia en Français

  • Peer to Peer — Pair à pair Pour les articles homonymes, voir P2P (homonymie). Le pair à pair (traduction de l anglicisme peer to peer, souvent abrégé « P2P »), est un modèle de réseau informatique. Il s oppose au modèle client serveur. Le partage de… …   Wikipédia en Français

  • Peer to peer — Pair à pair Pour les articles homonymes, voir P2P (homonymie). Le pair à pair (traduction de l anglicisme peer to peer, souvent abrégé « P2P »), est un modèle de réseau informatique. Il s oppose au modèle client serveur. Le partage de… …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”