- Set (computer science)
In
computer science , a set is a collection (container) of certain values, without any particular order, and no repeated values. It corresponds with afinite set in mathematics. Disregardingsequence , and the fact that there are no repeated values, it is the same as a list. A set can be seen as anassociative array (partial mapping) in which the value of each key-value pair is ignored.Implementations
Sets can be implemented using various
data structure s. Ideal set data structures make it efficient to check if an object is in the set, as well as enabling other useful operations such as iterating through all the objects in the set, performing a union or intersection of two sets, or taking the complement of a set in some limited domain. Anyassociative array data structure can be used to implement a set by letting the set of keys be the elements of the set and ignoring the values. Because of the similarity to associative arrays, sets are commonly implemented in the same ways, namely, aself-balancing binary search tree for sorted sets (which has O(log n) for most operations), or ahash table for unsorted sets (which has O(1) average-case, but O(n) worst-case, for most operations). A sorted linear hash table [ Citation | title=Sorted Linear Hash Table | first1=Thomas | last1=Wang |url=http://www.concentric.net/~Ttwang/tech/sorthash.htm | year=1997 ] may be used to provide deterministically ordered sets.Other popular methods include
array s. In particular a subset of the integers 1.."n" can be implemented efficiently as an "n"-bitbit array , which also support very efficient union and intersection operations. ABloom map implements a set probabilistically, using a very compact representation but risking a small chance of false positives on queries.However, very few of these data structures support set operations such as union or intersection efficiently. For these operations, more specialized set data structures exist.
Language support
One of the earliest languages to support sets was Pascal; many languages now include it, whether in the core language or in a
standard library . The Java programming language offers the Javadoc:SE|java/util|Set interface to support sets (with the Javadoc:SE|java/util|HashSet class implementing it using a hash table), and the Javadoc:SE|java/util|SortedSet sub-interface to support sorted sets (with the Javadoc:SE|java/util|TreeSet class implementing it using a binary search tree). InC++ , STL provides the "set
" template class, which implements a sorted set using a binary search tree; and SGI's STL provides the "hash_set
" class, which implements a set using a hash table. Python has a built-in set type, but no set literal.Multiset
A variation of the set is the
multiset or bag, which is the same as a set data structures, but allows repeated values. Formally, a multiset can be thought of as an associative array that maps unique elements topositive integer s, indicating the mulplicity of the element, although actual implementation may vary. C++'s Standard Template Library provides the "multiset
" class for the sorted multiset, and SGI's STL provides the "hash_multiset
" class, which implements a set using a hash table.Apache Commons Collections provides [http://commons.apache.org/collections/api-release/org/apache/commons/collections/Bag.html Bag] and SortedBag interface for Java; along with implementing classes like HashBag and TreeBag, analogous to similarly-named Set implementations.References
ee also
*
Bloom map
*Disjoint-set data structure
*Standard Template Library - contains a C++ set class
Wikimedia Foundation. 2010.