- Soft heap
In
computer science , the soft heap, designed byBernard Chazelle in 2000, is a variant on the simple heapdata structure . By carefully "corrupting" (increasing) the keys of at most a certain fixed percentage of values in the heap, it is able to achieve amortized constant-time bounds for all five of its operations:
* create("S"): Create a new soft heap
* insert("S", "x"): Insert an element into a soft heap
* meld("S", " S' "): Combine the contents of two soft heaps into one, destroying both
* delete("S", "x"): Delete an element from a soft heap
* findmin("S"): Get the element with minimum key in the soft heapThe term "corruption" here is the result of what Chazelle called "carpooling" in a soft heap. Each node in the soft heap contains a linked-list of keys and one common key. The common key is an upper bound on the values of the keys in the linked-list. Once a key is added to the linked-list, it is considered corrupted because its value is never again relevant in any of the soft heap operations: only the common keys are compared. It is unpredictable which keys will be corrupted in this manner; it is only known that at most a fixed percentage will be corrupted. This is what makes soft heaps "soft"; you can't be sure whether or not any particular value you put into it will be corrupted. The purpose of these corruptions is effectively to lower theinformation entropy of the data, enabling the data structure to break through information-theoretic barriers regarding heaps.Other heaps such as
fibonacci heap s achieve most of these bounds without any corruption, but cannot provide a constant-time bound on the critical "delete" operation. The percentage of values which are corrupted can be chosen freely, but the lower this is set, the more time insertions require (O(log 1/ε) for an error rate of ε).Applications
Surprisingly, soft heaps are useful in the design of deterministic algorithms, despite their unpredictable nature. They were the key in creating the best-known algorithm for finding a
minimum spanning tree to date. They can also be used to easily build an optimalselection algorithm , as well as "near-sorting" algorithms, which are algorithms that place every element near its final position, a situation in whichinsertion sort is fast.One of the simplest examples is the selection algorithm. Say we want to find the "k"th largest of a group of "n" numbers. First, we choose an error rate of 1/4; that is, at most 25% of the keys we insert will be corrupted. Now, we insert all "n" elements into the heap — at this point, at most "n"/4 keys are corrupted. Next, we delete the minimum element from the heap "n"/2 times. Because this is decreasing the size of the heap, it cannot increase the number of corrupted elements. The result is that, still, at most "n"/4 keys are corrupted.
Thus, at least "n"/2 − "n"/4 = "n"/4 of the remaining keys are not corrupted, and each must be larger than every element we removed. Moreover, because keys are only increased by corruption, the last and largest element "L" we removed must exceed the original keys of the "n"/4 other elements we removed. In other words, "L" divides the elements somewhere between 25%/75% and 75%/25%. We then partition the set about "L" using the "partition" algorithm from
quicksort and apply the same algorithm again to either the set of numbers less than "L" or the set of numbers greater than "L", neither of which can exceed (3/4)"n" elements. Since each insertion and deletion requires O(1) amortized time, the total deterministic time is bounded by a multiple of:T("n") = (5/4)"n" + (5/4)(3/4)"n" + (5/4)(3/4)2"n" + ... = 5n.
The final algorithm looks like this:
function softHeapSelect(a [1..n] , k) if k = 1 then return minimum(a [1..n] ) create(S) for i from 1 to n insert(S, a [i] ) for i from 1 to n/4 x := findmin(S) delete(S, x) xIndex := partition(a, x) "// Returns new index of pivot x" if k < xIndex softHeapSelect(a [1..xIndex-1] , k) else softHeapSelect(a [xIndex..n] , k-xIndex+1)
External links
* Chazelle's original paper [http://www.cs.princeton.edu/~chazelle/pubs/sheap.ps (ps)] [http://www.cs.princeton.edu/~chazelle/pubs/sheap.pdf (pdf)] directly from the author's website, including C code.
Wikimedia Foundation. 2010.