Copy-on-write

Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, they can all be given pointers to the same resource. This fiction can be maintained until a caller tries to modify its "copy" of the resource, at which point a true private copy is created to prevent the changes from becoming visible to everyone else. All of this happens transparently to the callers. The primary advantage is that if a caller never makes any modifications, no private copy need ever be created.

Copy-on-write in virtual memory

Copy-on-write finds its main use in operating systems with virtual memory. When a process creates a copy of itself (for example, with fork on Unix-like systems), the pages in memory that might be modified by either the process or its copy are marked copy-on-write. When one process modifies the memory, the operating system's kernel intercepts the operation and copies the affected page so that changes in one process's memory are not visible to the other.
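The visible effect of this behaviour can be sketched in a few lines of C++ on a POSIX system. The function name parent_value_after_child_write is hypothetical and chosen only for illustration. The sketch shows the isolation that the paragraph above describes; it cannot show from user space whether the kernel copied pages lazily or eagerly, so the comments about lazy copying are an assumption about the kernel.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Forks, lets the child overwrite a buffer, and returns the value the parent
// still sees in its own buffer afterwards. Returns -1 if fork or waitpid fails.
int parent_value_after_child_write() {
    std::vector<int> buffer(1 << 20, 7);  // about 4 MiB, inherited by the child

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: on a kernel that implements fork with copy-on-write
        // (an assumption here), the first write to each page makes the
        // kernel give the child a private copy of that page.
        for (int& v : buffer) v = 42;
        _exit(buffer[0] == 42 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;

    return buffer[0];  // the parent's view of the buffer is unchanged
}
```

Calling parent_value_after_child_write() in the parent yields the original fill value, because the child's writes went to its own copies of the pages.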

Another use involves the calloc function, which returns zero-filled memory. It can be implemented by keeping a single page of physical memory filled with zeros. When memory is allocated, the pages returned all refer to the page of zeros and are all marked copy-on-write. This way, the amount of physical memory allocated for the process does not increase until data is written. This is typically only done for larger allocations.

Copy-on-write can be implemented by notifying the MMU that certain pages in the process's address space are read-only. When data is written to these pages, the MMU raises an exception which is handled by the kernel, which allocates new space in physical memory and makes the page being written to correspond to that new location in physical memory.
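The mechanism can be modelled in user space with a toy page table. This is a simulation under stated assumptions, not kernel code: the names AddressSpace, PageEntry, Frame and handle_write_fault are hypothetical, a shared_ptr stands in for a physical frame and its reference count, and a boolean stands in for the MMU's write-permission bit.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

constexpr std::size_t kPageSize = 4096;

struct Frame { unsigned char bytes[kPageSize]; };  // one "physical" page

struct PageEntry {
    std::shared_ptr<Frame> frame;  // stands in for a physical frame number
    bool writable = false;         // stands in for the MMU write-permission bit
};

class AddressSpace {
public:
    explicit AddressSpace(std::size_t pages) : table_(pages) {
        for (auto& e : table_) {
            e.frame = std::make_shared<Frame>();  // value-initialized: all zeros
            e.writable = true;
        }
    }

    // Models a process copying itself: both address spaces end up pointing
    // at the same frames, and every page is marked read-only.
    AddressSpace clone() {
        for (auto& e : table_) e.writable = false;
        AddressSpace child(*this);
        child.faults_ = 0;
        return child;
    }

    unsigned char read(std::size_t addr) const {
        return table_[addr / kPageSize].frame->bytes[addr % kPageSize];
    }

    void write(std::size_t addr, unsigned char value) {
        PageEntry& e = table_[addr / kPageSize];
        if (!e.writable) handle_write_fault(e);  // the "MMU exception"
        e.frame->bytes[addr % kPageSize] = value;
    }

    std::size_t faults() const { return faults_; }

    bool shares_frame_with(const AddressSpace& other, std::size_t page) const {
        return table_[page].frame == other.table_[page].frame;
    }

private:
    // Models the kernel's fault handler: copy the frame if it is still
    // shared, then make the page writable again.
    void handle_write_fault(PageEntry& e) {
        ++faults_;
        if (e.frame.use_count() > 1) {
            e.frame = std::make_shared<Frame>(*e.frame);
        }
        e.writable = true;
    }

    std::vector<PageEntry> table_;
    std::size_t faults_ = 0;
};
```

In this model, a write after clone() faults once per page; the frame is copied only if another address space still refers to it, and untouched pages stay shared.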

One major advantage of COW is the ability to use memory sparsely. Because the usage of physical memory only increases as data is stored in it, very efficient hash tables can be implemented which use only a little more physical memory than is necessary to store the objects they contain. However, such programs run the risk of running out of virtual address space, because virtual pages reserved but unused by the hash table cannot be used by other parts of the program. The main problem with COW at the kernel level is the complexity it adds, although the concerns are similar to those raised by more basic virtual-memory mechanisms such as swapping pages to disk: when the kernel itself writes to pages, it must first copy any of them that are marked copy-on-write.

Other applications of copy-on-write

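COW is also used outside the kernel, in library, application and system code. The string class provided by the C++ standard library, for example, was specified in C++98 in a way that allows copy-on-write implementations (C++11 later tightened the requirements on std::string so that conforming implementations can no longer share buffers this way):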

#include <string>

int main() {
    std::string x("Hello");
    std::string y = x;  // under a COW implementation, x and y share one buffer
    y += ", World!";    // the write makes y switch to its own buffer;
                        // x keeps the original buffer
}
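How such a string might work internally can be sketched with a small wrapper class. CowString and its members are hypothetical names, not part of any standard library, and the sketch is single-threaded only: it relies on use_count(), which is not a reliable test when other threads may copy the object concurrently.

```cpp
#include <memory>
#include <string>
#include <utility>

class CowString {
public:
    explicit CowString(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // The implicitly generated copy constructor copies only the shared_ptr,
    // so a copy shares the buffer with the original.

    const std::string& view() const { return *data_; }

    void append(const std::string& suffix) {
        detach();               // make the buffer private before writing
        data_->append(suffix);
    }

    bool shares_buffer_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    // Copy the buffer only if someone else still refers to it.
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};
```

With this class, CowString y = x; leaves both objects pointing at one buffer, and the first y.append(...) is the point at which y receives its own copy.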

In multithreaded systems, COW can be implemented without traditional locking by using compare-and-swap to increment or decrement the internal reference counter. Since the original resource is never altered, it can safely be copied by multiple threads (after the reference count has been increased) without expensive locking such as mutexes. If a decrement brings the reference counter to 0, the thread that performed it held the last reference, so the resource can safely be deallocated, again without expensive locking. The benefit of not having to copy the resource (and the resulting performance gain over traditional deep copying) therefore holds in both single-threaded and multithreaded systems.
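A minimal sketch of such a lock-free reference counter follows, using std::atomic and a compare-and-swap loop as the paragraph describes. SharedBuffer, acquire and release are hypothetical names. The sketch assumes that a caller of acquire already holds a reference, so the count cannot reach zero while it runs.

```cpp
#include <atomic>
#include <string>
#include <utility>

struct SharedBuffer {
    std::string data;
    std::atomic<int> refs{1};  // the creator holds the first reference
    explicit SharedBuffer(std::string d) : data(std::move(d)) {}
};

// Takes one more reference with a compare-and-swap loop.
// Assumption: the caller already holds a reference to b.
inline SharedBuffer* acquire(SharedBuffer* b) {
    int seen = b->refs.load(std::memory_order_relaxed);
    while (!b->refs.compare_exchange_weak(seen, seen + 1,
                                          std::memory_order_relaxed)) {
        // On failure, seen has been reloaded with the current count; retry.
    }
    return b;
}

// Drops one reference. Returns true if this call released the last
// reference and therefore freed the buffer.
inline bool release(SharedBuffer* b) {
    int seen = b->refs.load(std::memory_order_relaxed);
    while (!b->refs.compare_exchange_weak(seen, seen - 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
        // Retry with the refreshed value of seen.
    }
    if (seen == 1) {   // the count went from 1 to 0: no other holders remain
        delete b;
        return true;
    }
    return false;
}
```

In practice fetch_add and fetch_sub would do the same job with less code; the explicit loop is used here only because the text names compare-and-swap.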

The COW concept is also used for virtual disk storage in virtualization and emulation software such as VMware, Bochs, QEMU, Linux-VServer, User-mode Linux (UML) and VirtualBox. When multiple VMs are based on the same hard disk image, this greatly reduces the disk space required. It can also improve performance, because reads from the shared image can be cached in RAM and subsequent reads by other VMs served from the cache.
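The idea of several VMs sharing one base image can be sketched as a read-only image plus a per-VM overlay of modified blocks. OverlayDisk, BaseImage and the 512-byte block size are assumptions made for illustration and do not correspond to the on-disk format of any of the products named above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

constexpr std::size_t kBlockSize = 512;
using Block = std::array<std::uint8_t, kBlockSize>;
using BaseImage = std::vector<Block>;

class OverlayDisk {
public:
    explicit OverlayDisk(std::shared_ptr<const BaseImage> base)
        : base_(std::move(base)) {}

    // Reads come from the overlay if the block was ever written,
    // otherwise from the shared base image.
    const Block& read_block(std::size_t index) const {
        auto it = overlay_.find(index);
        return it != overlay_.end() ? it->second : base_->at(index);
    }

    // The first write to a block copies it from the base image into the
    // overlay; later writes modify the private copy.
    void write_byte(std::size_t index, std::size_t offset, std::uint8_t value) {
        auto it = overlay_.find(index);
        if (it == overlay_.end()) {
            it = overlay_.emplace(index, base_->at(index)).first;
        }
        it->second.at(offset) = value;
    }

    std::size_t private_blocks() const { return overlay_.size(); }

private:
    std::shared_ptr<const BaseImage> base_;            // shared, never modified
    std::unordered_map<std::size_t, Block> overlay_;   // this VM's changes
};
```

Each VM pays storage only for the blocks it has written, which is where the disk-space saving described above comes from.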

The COW concept is also used to maintain instant snapshots on database servers such as Microsoft SQL Server 2005. Instant snapshots preserve a static view of a database by storing a pre-modification copy of data when the underlying data are updated. Instant snapshots are used for testing or for reports that depend on a particular moment in time, and should not be used to replace backups.
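The snapshot behaviour described above can be sketched as follows. Database, create_snapshot and the page-as-string representation are hypothetical and greatly simplified; the sketch is not a description of how SQL Server stores snapshots. It shows only the rule that a page's old contents are saved the first time the page is updated after the snapshot was taken.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

class Database {
public:
    explicit Database(std::vector<std::string> pages)
        : pages_(std::move(pages)) {}

    // Creating a snapshot copies nothing; it only starts tracking updates.
    void create_snapshot() {
        saved_.clear();
        snapshot_active_ = true;
    }

    void update(std::size_t page, std::string value) {
        // Save the pre-modification contents once, on the first update
        // of this page after the snapshot was created.
        if (snapshot_active_ && saved_.find(page) == saved_.end()) {
            saved_.emplace(page, pages_.at(page));
        }
        pages_.at(page) = std::move(value);
    }

    const std::string& read_current(std::size_t page) const {
        return pages_.at(page);
    }

    // The snapshot view uses the saved copy if the page has changed,
    // and the live page otherwise.
    const std::string& read_snapshot(std::size_t page) const {
        auto it = saved_.find(page);
        return it != saved_.end() ? it->second : pages_.at(page);
    }

    std::size_t saved_pages() const { return saved_.size(); }

private:
    std::vector<std::string> pages_;
    std::map<std::size_t, std::string> saved_;  // pre-modification copies
    bool snapshot_active_ = false;
};
```

Because unchanged pages are read from the live database, such a snapshot depends on the database it was taken from, which is one reason it cannot replace a backup.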

COW may also be used as the underlying mechanism for snapshots provided by logical volume management and Microsoft Volume Shadow Copy Service.

The copy-on-write technique can be used to emulate read-write storage on media that require wear levelling or are physically write once, read many (WORM).

Wikimedia Foundation. 2010.
