- SUHA
In computer science, SUHA (Simple Uniform Hashing Assumption) or the Uniform Hashing Assumption is a basic assumption that facilitates the mathematical analysis of hash tables. The assumption states that a hypothetical
hashing function will evenly distribute items into the slots of a hash table. Moreover, each item to be hashed has an equalprobability of being placed into a slot, regardless of the other elements already placed. This assumption generalizes the details of the hash function and allows for certain assumptions about the stochastic system.Applications
SUHA is most commonly used as a foundation for mathematical proofs describing the properties and behavior of hash tables in
theoretical computer science . Minimizing hashing collisions can be achieved with a uniform hashing function. These functions often rely on the specific input data set and can be quite difficult to implement. Assuming uniform hashing allows hash table analysis to be made without exact knowledge of the input or the hash function used.Mathematical Implications
Certain properties of a hash tables can be derived once uniform hashing is assumed.
Uniform Distribution
Under the assumption of uniform hashing, given a hash function "h", and a hash table of size "m". The probability that 2 non-equal elements will hash to the same slot is:
Collision Chain Length
Under the assumption of uniform hashing, the
load factor and the average chain length of a hash table of size "m" with "n" elements will be:uccessful Lookup
Under the assumption of uniform hashing, the average time (in big-O notation) to successfully find an element in a hash table using chaining is:
Unsuccessful Lookup
Under the assumption of uniform hashing, the average time (in big-O notation) to unsuccessfully find an element in a hash table using chaining is:
Example
A simple example of using SUHA can be seen while observing an arbitrary hash table of size 10 and a data set of 30 unique elements. If chaining is used to deal with collisions, the average chain length of this hash table may be a desirable value. Without any assumptions and with no more additional information about the data or hash function, the chain length cannot be estimated. With SUHA however, we can state that because of an assumed uniform hashing, each element has an equal probability of mapping to a slot. Since no particular slot should be favored over another, the 30 elements should hash into the 10 slots uniformly. This will produce a hash table with, on average, 10 chains each of length 3:
:
:
ee also
*Hash Table
*Hash Collision
*Perfect HashingReferences
General
* cite book
last = Collins
first = William
title = Data Structures and the Java Collections Framework
publisher = McGraw-Hill
date= 2004
chapter = Section 14.3.2: The Uniform Hashing Assumption
pages = 608
id = ISBN 0072823798
* cite book
last = Cormen
first = Thomas H.
authorlink = Thomas H. Cormen
coauthors =Charles E. Leiserson ,Ronald L. Rivest ,Clifford Stein
title =Introduction to Algorithms
publisher = MIT Press and McGraw-Hill
date= 2001
chapter = Section 11.2: Hash Tables
pages = 226-228
id = ISBN 0-262-03293-7
Wikimedia Foundation. 2010.