- Capacity optimization
Capacity optimization technologies are similar to data compression technologies, but they look for redundancy among very large sequences of bytes across very large comparison windows. Typically using cryptographic hash functions as identifiers of unique sequences, long (8 KB or more) sequences are compared against the history of other such sequences; where possible, the first uniquely stored version of a sequence is referenced rather than stored again. Capacity optimization generally refers to the use of this kind of technology in a storage system. An example of such a system is the Venti file system [ [http://cm.bell-labs.com/who/seanq/venti-fast02-talk.pdf Venti filesystem] ] in the Plan 9 open source operating system (source code is available [http://v9fs.sourceforge.net/ here]). There are also implementations in networking (especially wide area networking), where they are sometimes called bandwidth optimization technologies. [ [http://www.cs.washington.edu/homes/djw/papers/spring-sigcomm00.pdf Spring and Wetherall, "A Protocol Independent Technique for Eliminating Redundant Network Traffic"] ]
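The scheme described above can be sketched in a few lines of Python. This is a minimal, illustrative model, not the design of Venti or any commercial product: it splits data into fixed-size 8 KB chunks, uses the SHA-256 hash of each chunk as its identifier, and stores any given chunk only once, recording references for repeats. The class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Illustrative capacity-optimized (deduplicating) store.

    Each fixed-size chunk is identified by the SHA-256 hash of its
    contents; a chunk seen before is stored only once, and later
    writes keep just a reference (the hash) to the existing copy.
    """

    CHUNK_SIZE = 8192  # 8 KB sequences, as mentioned in the text

    def __init__(self):
        # hash -> chunk bytes; each unique chunk is stored exactly once
        self.chunks = {}

    def write(self, data: bytes) -> list:
        """Store data, returning a 'recipe' of chunk hashes."""
        recipe = []
        for i in range(0, len(data), self.CHUNK_SIZE):
            chunk = data[i:i + self.CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first occurrence of a chunk consumes space.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Reassemble the original data from a recipe of hashes."""
        return b"".join(self.chunks[d] for d in recipe)
```

Writing the same data twice (as a daily backup cycle would) adds no new chunks the second time; only the small recipe of hashes is new, which is the source of the space savings in backup storage.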
Commercial implementations of capacity optimization are most often found in backup/recovery storage, where storing successive versions of backups from day to day creates an opportunity to reduce space using this approach. The term was first used widely in 2005. [ [http://searchstorage.techtarget.com/sDefinition/0,290660,sid5_gci1103991,00.html "Capacity optimization" defined by searchstorage.com] ]
Other, non-vendor terms used to describe this technology include deduplication and factoring.