- 2-3-4 tree
A 2-3-4 tree (also called a 2-4 tree), in computer science, is a self-balancing data structure that is commonly used to implement dictionaries. 2-3-4 trees are B-trees of order 4; like B-trees in general, they can search, insert and delete in O(log n) time.
Each datum in a 2-3-4 tree is called an element. These are grouped into nodes, which may be:
* A 2-node containing 1 element and 2 children, or
* A 3-node containing 2 elements and 3 children, or
* A 4-node containing 3 elements and 4 children
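The three node kinds can be captured in one small class. The following is only a minimal sketch: the class name `Node` and its methods are made up for illustration, and it assumes a leaf is represented by an empty child list.

```python
class Node:
    """One node of a 2-3-4 tree (hypothetical illustration class).

    Holds 1 to 3 sorted elements. A leaf has no children; an internal
    node has exactly one more child than it has elements.
    """

    def __init__(self, elements, children=None):
        self.elements = list(elements)
        self.children = list(children) if children else []
        if not 1 <= len(self.elements) <= 3:
            raise ValueError("a node holds 1, 2 or 3 elements")
        if self.children and len(self.children) != len(self.elements) + 1:
            raise ValueError("an internal node needs one more child than elements")

    def kind(self):
        # 1 element -> "2-node", 2 elements -> "3-node", 3 elements -> "4-node"
        return f"{len(self.elements) + 1}-node"

    def is_leaf(self):
        return not self.children


print(Node([10, 20]).kind())  # 3-node
```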
Each child ("p", "q", "r" and "s" in the diagrams) is a possibly empty 2-3-4 subtree. The root node is the topmost node with no parent; it serves as a starting point when walking through the tree because every other node can be reached from it. A leaf node is a node with no children.
Like B-trees, 2-3-4 trees are ordered: each element must be greater than or equal to every element to its left and in its left subtree. Each child therefore covers an interval bracketed by the elements to its left and right. (In a 4-node with elements a, b and c, the children "p", "q", "r" and "s" would have intervals (-∞, a), (a, b), (b, c) and (c, ∞).)
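The interval rule is what drives a search: at each node, either the value is one of the elements, or exactly one child interval can contain it. A minimal sketch follows, assuming each node is a plain `(elements, children)` pair with an empty child list for a leaf. The function names and the sample values 5 and 15 are made up for illustration.

```python
from bisect import bisect_left


def child_index(elements, value):
    """Index of the child whose interval contains value.

    For elements (a, b, c) this yields 0 for (-inf, a), 1 for (a, b),
    2 for (b, c) and 3 for (c, inf).
    """
    return bisect_left(elements, value)


def contains(node, value):
    """Search a tree of (elements, children) pairs for value."""
    while node is not None:
        elements, children = node
        i = bisect_left(elements, value)
        if i < len(elements) and elements[i] == value:
            return True
        # Descend into the matching interval, or stop at a leaf.
        node = children[i] if children else None
    return False


tree = ([10, 20], [([5], []), ([15], []), ([22, 24, 29], [])])
print(child_index([10, 20], 25))  # 2
print(contains(tree, 24))         # True
print(contains(tree, 25))         # False
```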
2-3-4 trees are an isometry of red-black trees, meaning that they are equivalent data structures. In other words, for every 2-3-4 tree, there exists at least one red-black tree with data elements in the same order. Moreover, insertion and deletion operations on 2-3-4 trees that cause node expansions, splits and merges are equivalent to the color-flipping and rotations in red-black trees. Introductions to red-black trees usually introduce 2-3-4 trees first, because they are conceptually simpler. 2-3-4 trees, however, can be difficult to implement in most programming languages because of the large number of special cases involved in operations on the tree. Red-black trees are simpler to implement, so tend to be used instead.
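One way to see the correspondence is to expand each 2-3-4 node into a small cluster of red-black nodes: a 2-node becomes a black node, a 3-node a black node with one red child, and a 4-node a black node with two red children. The sketch below assumes nodes are `(elements, children)` pairs and red-black nodes are `(key, color, left, right)` tuples; both layouts and the function names are made up for illustration. For a 3-node it arbitrarily puts the red child on the left. The mirror image would be equally valid, which is why a 2-3-4 tree can correspond to more than one red-black tree.

```python
def to_red_black(node):
    """Convert an (elements, children) 2-3-4 tree into nested
    (key, color, left, right) tuples; None stands for an empty subtree."""
    if node is None:
        return None
    elements, children = node
    if children:
        kids = [to_red_black(c) for c in children]
    else:
        kids = [None] * (len(elements) + 1)
    if len(elements) == 1:                      # 2-node: single black node
        return (elements[0], "black", kids[0], kids[1])
    if len(elements) == 2:                      # 3-node: black node + one red child
        a, b = elements
        return (b, "black", (a, "red", kids[0], kids[1]), kids[2])
    a, b, c = elements                          # 4-node: black node + two red children
    return (b, "black",
            (a, "red", kids[0], kids[1]),
            (c, "red", kids[2], kids[3]))


def in_order(rb):
    """Keys of a red-black tuple tree in sorted (in-order) sequence."""
    if rb is None:
        return []
    key, _color, left, right = rb
    return in_order(left) + [key] + in_order(right)


tree = ([10, 20], [([5], []), ([15], []), ([22, 24, 29], [])])
rb = to_red_black(tree)
print(rb[0], rb[1])   # 20 black
print(in_order(rb))   # [5, 10, 15, 20, 22, 24, 29]
```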
To insert a value, we start at the root of the 2-3-4 tree:
# If the current node is a 4-node:
#* Remove and save the middle element of the 4-node, leaving a 3-node.
#* Split the remaining 3-node up into a pair of 2-nodes.
#* If this is the root node (which thus has no parent), the middle value becomes the new root 2-node and the tree height increases by 1. Ascend into the root.
#* Otherwise, push the middle value up into the parent node. Ascend into the parent node.
# Find the child whose interval contains the value to be inserted.
# If the child is empty, insert the value into the current node and finish.
#* Otherwise, descend into the child and repeat from step 1. [Ford, William; Topp, William (2002). Data Structures with C++ Using STL (2nd ed.). New Jersey: Prentice Hall. p. 683. ISBN 0-13-085850-1.] [Goodrich, Michael T.; Tamassia, Roberto; Mount, David M. (2002). Data Structures and Algorithms in C++. Wiley. ISBN 0-471-20208-8. http://cpp.datastructures.net/presentations/24Trees.pdf]
To insert the value 25 into a 2-3-4 tree whose root is (10, 20) and whose rightmost child is (22, 24, 29):
* Begin at the root (10, 20) and descend towards the rightmost child (22, 24, 29). (Its interval (20, ∞) contains 25.)
* Node (22, 24, 29) is a 4-node, so its middle element 24 is pushed up into the parent node.
* The remaining 3-node (22, 29) is split into a pair of 2-nodes (22) and (29). Ascend back into the new parent (10, 20, 24).
* Descend towards the rightmost child (29). (Its interval (24, ∞) contains 25.)
* Node (29) has no leftmost child. (The child for interval (24, 29) is empty.) Stop here and insert value 25 into this node, giving (25, 29).
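The insertion steps and the worked example above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Node` class and `insert` function are made-up names, a leaf is assumed to be a node with an empty child list, and the children (5) and (15) in the usage example are invented, since the example only names the root and its rightmost child.

```python
from bisect import bisect_left


class Node:
    """Hypothetical 2-3-4 node: sorted elements, children == [] for a leaf."""

    def __init__(self, elements, children=None):
        self.elements = list(elements)
        self.children = list(children or [])


def insert(root, value):
    """Insert value, splitting 4-nodes on the way down; returns the root."""
    if root is None:
        return Node([value])
    parent, index, node = None, 0, root
    while True:
        if len(node.elements) == 3:                 # step 1: split a 4-node
            a, middle, c = node.elements
            left = Node([a], node.children[:2])
            right = Node([c], node.children[2:])
            if parent is None:                      # root split: tree grows by one level
                root = parent = Node([middle], [left, right])
            else:                                   # middle element moves into the parent
                parent.elements.insert(index, middle)
                parent.children[index:index + 1] = [left, right]
            node = parent                           # ascend
        i = bisect_left(node.elements, value)       # step 2: pick the interval
        if not node.children:                       # step 3: child is empty
            node.elements.insert(i, value)
            return root
        parent, index, node = node, i, node.children[i]


root = Node([10, 20], [Node([5]), Node([15]), Node([22, 24, 29])])
root = insert(root, 25)
print(root.elements)                        # [10, 20, 24]
print([c.elements for c in root.children])  # [[5], [15], [22], [25, 29]]
```

Because every 4-node is split before it is entered, the parent of the current node always has room for a pushed-up element, so the loop only ever needs to remember one level of ancestry.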
Deletion is the more complex operation and involves many special cases.
First the element to be deleted needs to be found. The element must be in a node at the bottom of the tree; otherwise, it must be swapped with another element which precedes it in in-order traversal (which must be in a bottom node) and that element removed instead.
If the element is to be removed from a 2-node, then a node with no elements would result. This is called underflow. To solve underflow, an element is pulled from the parent node into the node where the element is being removed, and the vacancy created in the parent node is replaced with an element from a sibling node. (Sibling nodes are those which share the same parent node.) This is called transfer.
If the siblings are 2-nodes themselves, underflow still occurs, because now the sibling has no elements. To solve this, two sibling nodes are fused together (after pulling an element from the parent node).
If the parent is a 2-node, the underflow moves to the parent node, where it is resolved by the same methods. As deletions and replacements proceed, successive parent nodes may underflow in turn; this is referred to as underflow cascading.
Deletion in a 2-3-4 tree is O(log n), while transfer and fusion are each constant time, O(1). [Grama, Ananth (2004). "(2,4) Trees". CS251: Data Structures Lecture Notes. Department of Computer Science, Purdue University. http://www.cs.purdue.edu/homes/ayg/CS251/slides/chap13a.pdf. Accessed 2008-04-10.]
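Transfer and fusion can be sketched as one repair step applied to a node that has just lost its only element. The code below is an illustration under stated assumptions, not a full deletion routine: `Node` and `fix_underflow` are made-up names, a leaf has an empty child list, and the caller is assumed to re-check the parent after a fusion, which is where cascading would continue.

```python
class Node:
    """Hypothetical 2-3-4 node: sorted elements, children == [] for a leaf."""

    def __init__(self, elements, children=None):
        self.elements = list(elements)
        self.children = list(children or [])


def fix_underflow(parent, i):
    """Repair parent.children[i], which currently holds no elements.

    Returns "transfer" or "fusion". After a fusion the parent has lost
    one element and may itself underflow.
    """
    node = parent.children[i]
    # Transfer via the left sibling, if it can spare an element.
    if i > 0 and len(parent.children[i - 1].elements) > 1:
        left = parent.children[i - 1]
        node.elements.insert(0, parent.elements[i - 1])   # pull separator down
        parent.elements[i - 1] = left.elements.pop()       # refill from sibling
        if left.children:
            node.children.insert(0, left.children.pop())
        return "transfer"
    # Transfer via the right sibling, if it can spare an element.
    if i + 1 < len(parent.children) and len(parent.children[i + 1].elements) > 1:
        right = parent.children[i + 1]
        node.elements.append(parent.elements[i])
        parent.elements[i] = right.elements.pop(0)
        if right.children:
            node.children.append(right.children.pop(0))
        return "transfer"
    # Fusion: every adjacent sibling is a 2-node, so merge with one of them.
    if i > 0:
        left = parent.children[i - 1]
        left.elements.append(parent.elements.pop(i - 1))
        left.children.extend(node.children)
    else:
        right = parent.children[i + 1]
        right.elements.insert(0, parent.elements.pop(i))
        right.children[:0] = node.children
    del parent.children[i]
    return "fusion"


p = Node([10, 20], [Node([5]), Node([]), Node([22, 24])])
print(fix_underflow(p, 1), p.elements, [c.elements for c in p.children])
# transfer [10, 22] [[5], [20], [24]]

q = Node([10, 20], [Node([5]), Node([]), Node([25])])
print(fix_underflow(q, 1), q.elements, [c.elements for c in q.children])
# fusion [20] [[5, 10], [25]]
```

Each branch touches only the node, one sibling and the parent, which is consistent with the constant-time cost of a single transfer or fusion.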
* [http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml Animation of a 2-3-4 Tree]
* [http://www.cs.unm.edu/~rlpm/499/ttft.html Java Applet showing a 2-3-4 Tree]
Wikimedia Foundation. 2010.