- Remote Differential Compression
Remote Differential Compression (RDC) is a client-server synchronization algorithm that allows the contents of two files to be synchronized by communicating only the differences between them. It was introduced with Windows Server 2003 R2 and is included with later Windows client and server operating systems. Unlike Binary Delta Compression (BDC), which is designed to operate only on known versions of a single file, RDC does not make assumptions about file similarity or versioning. The differences between files are computed on the fly, so RDC is suitable for efficient synchronization of files that have been updated independently, for links where network bandwidth is limited, and for scenarios where the files are large but the differences between them are small.
The algorithm is based on fingerprinting blocks of each file locally at both ends of the replication pair. Many types of file changes cause the file contents to move: for example, a small insertion or deletion at the beginning of a file can leave the rest of the file misaligned with the original content. For this reason the blocks used for comparison are not based on static, arbitrary cut points but on cut points defined by the contents of each file segment. If part of a file changes in length, or blocks of content are moved to other parts of the file, the block boundaries for the unchanged parts remain fixed relative to the contents, so the fingerprints for those blocks do not change either; they only change position. By comparing all hashes in a file to the hashes for the same file at the other end of the replication pair, RDC is able to identify which blocks of the file have changed and which have not, even if the contents of the file have been significantly reshuffled. Since comparing large files could require a large number of signature comparisons, the algorithm is applied recursively to the hash sets to detect which blocks of hashes have changed or moved, significantly reducing the amount of data that needs to be transmitted to compare the files.
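
The idea of content-defined cut points can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the rolling hash, the cut rule (hash divisible by `DIVISOR`), the truncated SHA-256 fingerprint and all names and constants below are assumptions chosen for illustration, and RDC's actual cut-point function and signature format differ. The sketch only shows why a small insertion at the start of a file leaves most block fingerprints unchanged.

```python
import hashlib
import random

# Illustrative parameters, not taken from RDC itself.
WINDOW = 16            # number of trailing bytes that decide a cut point
DIVISOR = 64           # a cut occurs when the window hash is divisible by this
BASE = 257
MOD = (1 << 31) - 1


def chunk_boundaries(data: bytes) -> list[int]:
    """Return chunk end offsets; each cut depends only on the last WINDOW bytes."""
    cuts = []
    h = 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    for i, b in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + b) % MOD                    # add the newest byte
        if i + 1 >= WINDOW and h % DIVISOR == 0:
            cuts.append(i + 1)
    if data and (not cuts or cuts[-1] != len(data)):
        cuts.append(len(data))                      # final partial chunk
    return cuts


def signatures(data: bytes) -> list[tuple[bytes, bytes]]:
    """Split data at content-defined cut points and fingerprint each chunk."""
    out = []
    start = 0
    for end in chunk_boundaries(data):
        chunk = data[start:end]
        out.append((hashlib.sha256(chunk).digest()[:8], chunk))
        start = end
    return out


def chunks_to_send(source: bytes, target: bytes) -> list[bytes]:
    """Chunks of source whose fingerprints do not occur in target."""
    known = {sig for sig, _ in signatures(target)}
    return [chunk for sig, chunk in signatures(source) if sig not in known]


# A small insertion at the front shifts every later byte of the file.
random.seed(0)
old = bytes(random.randrange(256) for _ in range(4096))
new = b"INSERTED" + old
sent = sum(len(c) for c in chunks_to_send(new, old))
print(f"{sent} of {len(new)} bytes need to be transferred")
```

With fixed-size blocks, the eight inserted bytes would shift every block boundary and change every fingerprint. Here only the chunks up to the first unchanged cut point differ. The recursive step described above, in which the list of signatures is itself chunked and compared, is not shown.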
The Client Side Caching (CSC) feature in Windows Vista makes use of the technology for the first time, allowing file types such as Microsoft Outlook personal folders (*.pst) to be made available offline. Previously, Windows XP used only file metadata to test whether a file such as a .pst had changed. When an application "touches" a .pst file's date, even without making any changes, this triggers an update of the file in Windows XP, causing CSC to recopy these large files unnecessarily. In Windows Vista the file is updated only if it has actually been modified, and only the parts of the file that have changed are transmitted.
External links
* [http://technet2.microsoft.com/windowsserver/en/library/8c4cf2e7-0b92-4643-acbd-abfa9f189d031033.mspx?mfr=true Introduction to DFS replication]
* [http://msdn2.microsoft.com/en-us/library/aa372948.aspx About Remote Differential Compression]
* [ftp://ftp.research.microsoft.com/pub/tr/TR-2006-157.pdf Optimizing File Replication over Limited-Bandwidth Networks using Remote Differential Compression]
Wikimedia Foundation. 2010.