Distributed revision control

A distributed revision control system (DRCS), distributed version control or decentralized version control (DVCS) keeps track of software revisions and allows many developers to work on a given project without necessarily being connected to a common network.

Distributed vs. centralized

Distributed revision control takes a peer-to-peer approach, as opposed to the client-server approach of centralized systems. Rather than a single, central repository with which clients synchronize, each peer's working copy of the codebase is a bona fide repository.[1] Synchronization is carried out by exchanging patches (change-sets) from peer to peer. This results in some important differences from a centralized system:

  • No canonical, reference copy of the codebase exists by default; only working copies.
  • Common operations (such as commits, viewing history, and reverting changes) are fast, because there is no need to communicate with a central server.[2] Rather, communication is only necessary when pushing or pulling changes to or from other peers.
  • Each working copy effectively functions as a remote backup of the codebase and of its change-history, providing natural protection against data loss.[2]

Other differences are as follows:

  • There may be many "central" repositories.
  • Code from disparate repositories is merged based on a web of trust, i.e., historical merit or quality of changes.
  • Numerous different development models are possible, such as development / release branches or a Commander / Lieutenant model, allowing for efficient delegation of topical developments in very large projects.[3]
  • Lieutenants are project members who have the power to dynamically decide which branches to merge.
  • The network is not involved in most operations.
  • A separate set of "sync" operations is available for sending changes to, or receiving changes from, remote repositories.
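
The peer-to-peer model described above can be illustrated with a toy sketch. It is not how any real DVCS is implemented: the `Repository` class, its `commit` and `pull` methods, and the way change-set identifiers are derived are all assumptions made for illustration. The sketch only shows that each peer holds a complete history, that committing is a local operation, and that synchronization is a separate, explicit step.

```python
import hashlib


class Repository:
    """Toy peer (hypothetical): holds its own full history, no central server."""

    def __init__(self, name):
        self.name = name
        self.changesets = {}  # change-set id -> description
        self.order = []       # ids in the order this peer recorded them

    def commit(self, description):
        # Purely local: nothing is sent over the network.
        seed = f"{self.name}:{len(self.order)}:{description}"
        cid = hashlib.sha1(seed.encode()).hexdigest()[:8]
        self.changesets[cid] = description
        self.order.append(cid)
        return cid

    def pull(self, peer):
        # Explicit "sync" step: copy only the change-sets this peer lacks.
        received = [cid for cid in peer.order if cid not in self.changesets]
        for cid in received:
            self.changesets[cid] = peer.changesets[cid]
            self.order.append(cid)
        return received


alice = Repository("alice")
bob = Repository("bob")
a1 = alice.commit("add parser")   # local to alice
b1 = bob.commit("fix typo")       # local to bob

new_for_bob = bob.pull(alice)     # bob receives alice's change-set
new_for_alice = alice.pull(bob)   # alice receives bob's change-set
```

After the two pulls both peers hold the same set of change-sets, and either one could serve as a backup of the other. Real systems additionally have to merge concurrent changes to the same files, which this sketch ignores.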

DVCS proponents point to several advantages of distributed version control systems over the traditional centralized model:

  • Allows users to work productively even when not connected to a network
  • Makes most operations much faster since no network is involved
  • Allows participation in projects without requiring permissions from project authorities, and thus arguably better fosters a culture of meritocracy[citation needed] instead of requiring "committer" status
  • Allows private work, so users can use their revision control system even for early drafts they do not want to publish
  • Avoids relying on a single physical machine as a single point of failure.
  • Still permits centralized control of the "release version" of the project
  • For FLOSS projects, makes it much easier to fork a project that has stalled because of leadership conflicts or design disagreements

Software development author Joel Spolsky describes distributed version control as "possibly the biggest advance in software development technology in the [past] ten years."[4]

One disadvantage of DVCS is that the initial clone of a repository is slower than a centralized checkout, because all branches and the full revision history are copied. This matters mainly when the connection is slow and the project is large. For instance, the size of the cloned git repository (all history, branches, tags, etc.) for the Linux kernel is approximately the size of the checked-out uncompressed HEAD, whereas the equivalent checkout of a single branch from a centralized system would be the compressed size of the contents of HEAD (without any history, branches, tags, etc.). Another drawback of DVCS is the lack of the locking mechanisms that most centralized VCSs provide, which still play an important role for non-mergeable binary files such as graphic assets.

Systems

Open systems

An "open system" of distributed revision control is characterized by its support for independent branches, and its heavy reliance on merge operations. Its general characteristics include:

  • Every working copy is effectively a fork.
  • The system implements each branch as a working copy, with merges conducted by ordinary patch exchange, from branch to branch.
  • Code forking therefore occurs more readily, where desired, because every working copy is a potential fork. (By the same token, undesirable forks are easier to mend because, if the dispute can be resolved, re-merging the code is easy.)
  • It may be possible to "cherry-pick" single changes, selectively pulling them from peer to peer.
  • New peers can freely join, without applying for access to a server.
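
Cherry-picking, as mentioned in the list above, can be sketched in a few lines. This is a minimal illustration under strong assumptions: the `Patch` and `WorkingCopy` types are hypothetical, a patch simply replaces the whole content of one file, and dependencies between patches and merge conflicts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Hypothetical patch: sets the full content of a single file."""
    patch_id: str
    path: str
    content: str


class WorkingCopy:
    """Toy working copy that is also a potential fork."""

    def __init__(self):
        self.files = {}    # path -> content
        self.applied = []  # patches in the order they were applied

    def record(self, patch):
        self.files[patch.path] = patch.content
        self.applied.append(patch)

    def cherry_pick(self, peer, patch_id):
        # Pull exactly one change from the peer, leaving the rest behind.
        for patch in peer.applied:
            if patch.patch_id == patch_id:
                if patch not in self.applied:
                    self.record(patch)
                return patch
        raise KeyError(patch_id)


upstream = WorkingCopy()
upstream.record(Patch("p1", "README", "hello"))
upstream.record(Patch("p2", "main.py", "print('hi')"))

fork = WorkingCopy()
fork.cherry_pick(upstream, "p2")  # takes p2 only; p1 is not pulled
```

Here `fork` ends up with `main.py` but not `README`, because only the selected patch was transferred. No access to a server had to be requested for the fork to exist.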

One of the first open systems, BitKeeper, served in the development of the Linux kernel. When the makers of BitKeeper decided in 2005 to restrict its licensing,[5] Linus Torvalds, looking for a free alternative, started developing his own distributed source control management software, Git.

For a list of distributed revision control systems, see the comparison of revision control software.

Replicated systems

A replicated system of distributed revision control depends on a replicated database. A check-in is equivalent to a distributed commit. Successful commits create a single baseline, which reduces the need for merges. An example of a replicated distributed system is Code Co-op.

Work model

The distributed model is generally better suited to large projects with partly independent developers, such as the Linux kernel project, because developers can work independently and submit their changes for merge (or rejection). The distributed model allows projects to adopt custom workflows for source code contribution, with the integrator workflow being the most widely used one.

In the centralized model, developers have to serialize their work against the single repository; otherwise they may run into problems with conflicting versions.

History

First-generation open-source DVCSs include Arch and Monotone. The second generation was initiated by the arrival of Darcs, followed by a host of others. Among them, Mercurial and Git were created as potential replacements for BitKeeper when its publisher withdrew it from free use by the Linux kernel project. Bazaar followed not long after.

Before these, closed-source DVCSs such as Sun WorkShop TeamWare (which inspired BitKeeper) were widely used in enterprise settings.

Future

Some natively centralized systems are starting to grow distributed features. For example, Subversion is able to do many operations with no network.[6] It may become more difficult to distinguish natively distributed systems from centralized ones.

There are many tools that rely on version control, such as wikis, file systems, and text editors. Some are starting to adopt DVCS features, and even to integrate with DVCSs, for example the Gazest wiki and ikiwiki.


Wikimedia Foundation. 2010.
