Distributed revision control
A distributed revision control system (DRCS), also called a distributed or decentralized version control system (DVCS), keeps track of software revisions and allows many developers to work on a given project without necessarily being connected to a common network.
Distributed vs. centralized
Distributed revision control (DRCS) takes a peer-to-peer approach, as opposed to the client-server approach of centralized systems. Rather than a single, central repository with which clients synchronize, each peer's working copy of the codebase is a bona fide repository.[1] Distributed revision control synchronizes by exchanging patches (change-sets) from peer to peer. This results in some important differences from a centralized system:
- No canonical, reference copy of the codebase exists by default; only working copies.
- Common operations (such as commits, viewing history, and reverting changes) are fast, because there is no need to communicate with a central server.[2] Communication is only necessary when pushing changes to or pulling changes from other peers.
- Each working copy effectively functions as a remote backup of the codebase and of its change-history, providing natural protection against data loss.[2]
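The points above can be illustrated with a minimal Python sketch. It is a toy model, not how any particular DVCS is implemented: the `Repo` and `Changeset` classes and the change ids are hypothetical, and the model ignores conflicts, parent relationships, and merges. It only shows that committing and viewing history are local, while `pull` is the one step that touches another peer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Changeset:
    """One recorded change: a hypothetical id, the file touched, its new content."""
    cid: str
    path: str
    content: str


@dataclass
class Repo:
    """A peer's repository: it holds the full history, not just a snapshot."""
    name: str
    history: list = field(default_factory=list)

    def commit(self, cid, path, content):
        # Purely local: no other peer is contacted.
        self.history.append(Changeset(cid, path, content))

    def log(self):
        # Viewing history is local as well.
        return [c.cid for c in self.history]

    def pull(self, other):
        # The only step that needs communication: fetch changesets we lack.
        known = {c.cid for c in self.history}
        missing = [c for c in other.history if c.cid not in known]
        self.history.extend(missing)
        return len(missing)

    def files(self):
        # Replay the history to reconstruct the working files.
        state = {}
        for c in self.history:
            state[c.path] = c.content
        return state


alice = Repo("alice")
bob = Repo("bob")
alice.commit("a1", "README", "hello")
bob.commit("b1", "main.py", "print('hi')")

bob.pull(alice)    # bob receives a1
alice.pull(bob)    # alice receives b1
```

After the two pulls, both peers hold both changesets, so either one could serve as a backup of the other.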
Other differences are as follows:
- There may be many "central" repositories.
- Code from disparate repositories is merged based on a web of trust, i.e., historical merit or quality of changes.
- Numerous different development models are possible, such as development / release branches or a Commander / Lieutenant model, allowing for efficient delegation of topical developments in very large projects.[3]
- Lieutenants are project members who have the power to dynamically decide which branches to merge.
- The network is not involved in most operations.
- A separate set of "sync" operations is available for sending changes to or receiving changes from remote repositories.
DVCS proponents point to several advantages of distributed version control systems over the traditional centralized model:
- Allows users to work productively even when not connected to a network.
- Makes most operations much faster, since no network is involved.
- Allows participation in projects without requiring permission from project authorities, and thus arguably better fosters a culture of meritocracy[citation needed] instead of requiring "committer" status.
- Allows private work, so users can use their revision control system even for early drafts they do not want to publish.
- Avoids relying on a single physical machine as a single point of failure.
- Still permits centralized control of the "release version" of the project.
- For FLOSS projects, makes it much easier to fork a project that has stalled because of leadership conflicts or design disagreements.
Software development author Joel Spolsky describes distributed version control as "possibly the biggest advance in software development technology in the [past] ten years."[4]
One disadvantage of DVCS is that the initial cloning of a repository is slower than a centralized checkout, because all branches and the full revision history are copied. This may matter if the connection is slow and the project is large. For instance, the size of the cloned Git repository (all history, branches, tags, etc.) for the Linux kernel is approximately the size of the checked-out uncompressed HEAD, whereas the equivalent checkout of a single branch in a centralized system would be the compressed size of the contents of HEAD (without any history, branches, tags, etc.). Another problem with DVCS is the lack of locking mechanisms, which are part of most centralized VCSs and still play an important role for non-mergeable binary files such as graphic assets.
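To make the locking point concrete, here is a minimal Python sketch of the kind of lock table a centralized server could keep. The `LockTable` class, the file name, and the user names are hypothetical and not taken from any real system. The sketch also hints at why locking is hard to offer without a central authority: every client must consult the same table.

```python
class LockTable:
    """Hypothetical central lock table: at most one holder per file."""

    def __init__(self):
        self._owners = {}  # path -> user currently holding the lock

    def acquire(self, path, user):
        # Grant the lock only if nobody else already holds it.
        holder = self._owners.setdefault(path, user)
        return holder == user

    def release(self, path, user):
        # Only the current holder may release the lock.
        if self._owners.get(path) == user:
            del self._owners[path]
            return True
        return False


locks = LockTable()
got_alice = locks.acquire("logo.psd", "alice")    # granted
got_bob = locks.acquire("logo.psd", "bob")        # refused: alice holds the lock
locks.release("logo.psd", "alice")
got_bob_later = locks.acquire("logo.psd", "bob")  # granted after the release
```

Because only one user can edit the binary file at a time, no merge of its contents is ever needed.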
Systems
Open systems
An "open system" of distributed revision control is characterized by its support for independent branches, and its heavy reliance on merge operations. Its general characteristics include:
- Every working copy is effectively a fork.
- The system implements each branch as a working copy, with merges conducted by ordinary patch exchange, from branch to branch.
- Code forking therefore occurs more readily, where desired, because every working copy is a potential fork. (By the same token, undesirable forks are easier to mend because, if the dispute can be resolved, re-merging the code is easy.)
- It may be possible to "cherry-pick" single changes, selectively pulling them from peer to peer.
- New peers can freely join, without applying for access to a server.
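Cherry-picking can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: the `Change` type, the `cherry_pick` helper, and the change ids are hypothetical, and changes are treated as independent of one another, whereas real systems must deal with dependencies between changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    cid: str       # hypothetical change identifier
    path: str
    content: str


def cherry_pick(own, peer, cid):
    """Return own history plus the single change `cid` taken from the peer."""
    if any(c.cid == cid for c in own):
        return list(own)  # already applied; nothing to do
    for change in peer:
        if change.cid == cid:
            return list(own) + [change]
    raise KeyError(f"peer has no change {cid!r}")


peer_history = [
    Change("p1", "parser.py", "fix off-by-one"),
    Change("p2", "ui.py", "experimental redesign"),
]
my_history = [Change("m1", "README", "intro")]

# Take only the bug fix; the experimental redesign stays with the peer.
my_history = cherry_pick(my_history, peer_history, "p1")
picked_ids = [c.cid for c in my_history]
```

Picking the same change a second time leaves the history unchanged, so the operation is safe to repeat.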
One of the first open systems, BitKeeper, was used in the development of the Linux kernel. When the makers of BitKeeper decided in 2005 to restrict its licensing,[5] Linus Torvalds, looking for a free alternative, started developing his own distributed source control management software, Git.
For a list of distributed revision control systems, see the comparison of revision control software.
Replicated systems
A replicated system of distributed revision control depends on a replicated database. A check-in is equivalent to a distributed commit. Successful commits create a single baseline, which reduces the need for merges. An example of a replicated distributed system is Code Co-op.
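The idea of a check-in as a distributed commit can be sketched as follows. This is a simplified all-or-nothing model written for illustration; it is not the protocol used by Code Co-op or any other product, and the `Replica` class and `check_in` function are hypothetical. It ignores network failures and crash recovery.

```python
class Replica:
    """One copy of the replicated database."""

    def __init__(self, name):
        self.name = name
        self.baseline = []  # ordered list of accepted check-ins

    def can_accept(self, base_len):
        # Vote yes only if the check-in was made against our current baseline.
        return len(self.baseline) == base_len

    def apply(self, change):
        self.baseline.append(change)


def check_in(replicas, base_len, change):
    """Apply the change on every replica, or on none of them."""
    if all(r.can_accept(base_len) for r in replicas):
        for r in replicas:
            r.apply(change)
        return True
    return False


replicas = [Replica("a"), Replica("b"), Replica("c")]
first = check_in(replicas, 0, "add parser")      # accepted everywhere
stale = check_in(replicas, 0, "rename module")   # rejected: baseline has moved on
retry = check_in(replicas, 1, "rename module")   # accepted against the new baseline
```

Since a check-in either lands on every replica or on none, all replicas share one linear baseline and no merge between replicas is needed in this model.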
Work model
The distributed model is generally better suited for large projects with partly independent developers, such as the Linux kernel project, because developers can work independently and submit their changes for merge (or rejection). The distributed model flexibly allows adopting custom source code contribution workflows, with the integrator workflow being the most widely used one.
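As a rough illustration of the integrator workflow, the Python sketch below models contributors submitting changes and an integrator deciding which ones enter the shared history. The `integrate` function, the names, and the review policy are all hypothetical; real projects apply their own review criteria.

```python
def integrate(blessed, submissions, accept):
    """Merge accepted submissions into the blessed history; return the rejects."""
    rejected = []
    for author, change in submissions:
        if accept(author, change):
            blessed.append((author, change))
        else:
            rejected.append((author, change))
    return rejected


blessed = []
submissions = [
    ("alice", "fix memory leak"),
    ("bob", "WIP: rewrite scheduler"),
    ("carol", "add tests"),
]

# Hypothetical review policy: refuse anything marked as work in progress.
rejected = integrate(blessed, submissions, lambda a, c: not c.startswith("WIP"))
```

Contributors never write to the blessed history directly; they only submit, and the integrator decides what is merged.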
In the centralized model, developers should serialize their work; otherwise they may run into conflicts between diverging versions.
History
First-generation open-source DVCSs include Arch and Monotone. The second generation was initiated by the arrival of Darcs, followed by a host of others. Among them, Mercurial and Git were created as potential replacements for BitKeeper when its publisher withdrew it from free use by the Linux kernel project. Bazaar followed not long after.
Before these, closed-source DVCSs such as Sun WorkShop TeamWare (which inspired BitKeeper) were widely used in enterprise settings.
Future
Some natively centralized systems are starting to grow distributed features. For example, Subversion is able to do many operations with no network.[6] It may become more difficult to distinguish natively distributed systems from centralized ones.
There are many tools that rely on version control, such as wikis, file systems, and text editors. Some are starting to adopt DVCS features, and even to integrate with DVCSs; examples include the Gazest wiki and ikiwiki.
See also
- Revision control
- List of revision control software
- Comparison of revision control software
- Category:Software using distributed revision control
- Repository clone
- Git, an open-source DVCS developed for Linux kernel development
- Mercurial, a cross-platform system similar to Git, considered by some to be easier to use
- BitKeeper
- Bazaar (software)
- Concurrent Versions System, a predecessor of distributed version control systems
- TortoiseHg, a graphical interface for Mercurial
References
- [1] Wheeler, David. "Comments on Open Source Software / Free Software (OSS/FS) Software Configuration Management (SCM) Systems". http://www.dwheeler.com/essays/scm.html. Retrieved May 8, 2007.
- [2] O'Sullivan, Bryan. "Distributed revision control with Mercurial". http://hgbook.red-bean.com/hgbook.html. Retrieved July 13, 2007.
- [3] http://mercurial.selenic.com/wiki/Workflows#Overview_and_plan
- [4] Spolsky, Joel (2010-03-17). "Distributed Version Control is here to stay, baby". Joel on Software. http://joelonsoftware.com/items/2010/03/17.html. Retrieved 2010-06-18.
- [5] "Bitmover ends free Bitkeeper, replacement sought for managing Linux kernel code". Wikinews. April 7, 2005. http://en.wikinews.org/wiki/Bitmover_ends_free_Bitkeeper%2C_replacement_sought_for_managing_Linux_kernel_code.
- [6] http://osdir.com/Article203.phtml
External links
- Essay on various revision control systems, especially the section "Centralized vs. Decentralized SCM"
- Introduction to distributed version control systems - IBM Developer Works article
Wikimedia Foundation. 2010.