B+ tree

B+ tree

In computer science, a B+ tree is a type of tree which represents sorted data in a way that allows for efficient insertion, retrieval and removal of records, each of which is identified by a "key". It is a dynamic, multilevel index, with maximum and minimum bounds on the number of keys in each index segment (usually called a 'block' or 'node'). In a B+ tree, in contrast to a B-tree, all records are stored at the lowest level of the tree; only keys are stored in interior blocks.

The primary value of a B+ tree is in storing data for efficient retrieval in a block-oriented storage context. Given a storage system with a block size of b, a B+ tree which stores a number of keys equal to a multiple of b will be very efficient when compared to a binary search tree (the corresponding data structure for non-block-oriented storage contexts).

The ReiserFS filesystem (for Unix and Linux), XFS filesystem (for IRIX and Linux), JFS2 filesystem (for AIX, OS/2 and Linux), and NTFS all use this type of tree for block indexing. Relational databases such as PostgreSQL and MySQL also often use this type of tree for table indices.

Details

The order of a B+ tree measures the capacity of nodes in the tree. It is defined as a number "d" such that "d" <= "m" <= 2 "d", where "m" is the number of entries in each node. For example, if the order of a B+ tree is 7, each internal node (except for the root) may have between 7 and 14 keys; the root may have between 1 and 14.

Furthermore, every internal node has at least "d"+1 and at most 2"d"+1 children.

Search

The algorithm to perform a search for a record r follows pointers to the correct child of each node until a leaf is reached. Then, the leaf is scanned until the correct record is found (or until failure).

function search(record r) u := root while (u is not a leaf) do choose the correct pointer in the node move to the first node following the pointer u := current node scan u for r

This pseudocode assumes that no repetition is allowed.

Insertion Into a B+ Tree:

* do a search to determine what bucket the new record should go in * if the bucket is not full, add the record. * otherwise, split the bucket. * allocate new leaf and move half the bucket's elements to the new bucket * insert the new leaf's smallest key and address into the parent. * if the parent is full, split it also * now add the middle key to the parent node * repeat until a parent is found that need not split * if the root splits, create a new root which has one key and two pointers.

Characteristics

For a "b"-order B+ tree with "h" levels of index:
* The maximum number of records stored is n = b^h
* The minimum number of keys is 2(b/2)^{h-1}
* The space required to store the tree is O(n)
* Inserting a record requires O(log_bn) operations in the worst case
* Finding a record requires O(log_bn) operations in the worst case
* Removing a (previously located) record requires O(log_bn) operations in the worst case
* Performing a range query with "k" elements occurring within the range requires O(log_bn+k) operations in the worst case.

Relationship to other data structures

The B+ tree (and most other "B..." trees) are all specializations of the (a,b)-tree, in which the minimum and maximum order is explicitly defined (the "a" and "b", respectively).

The B+ tree is a variant of the B-tree, the latter of which can store both keys and records in its interior nodes; in this sense, the B+ tree is a specialization of the B-tree.

The B# Tree is a specialization of the B+ tree, which adds additional restrictions.

Implementation

The leaves (the bottom-most index blocks) of the B+ tree are often linked to one another in a linked list; this makes range queries simpler and more efficient (though the aforementioned upper bound can be achieved even without this addition). This does not substantially increase space consumption or maintenance on the tree.

If a storage system has a block size of "B" bytes, and the keys to be stored have a size of "k", arguably the most efficient B+ tree is one where b=(B/k)-1. Although theoretically the one-off is unnecessary, in practice there is often a little extra space taken up by the index blocks (for example, the linked list references in the leaf blocks). Having an index block which is slightly larger than the storage system's actual block represents a significant performance decrease; therefore erring on the side of caution is preferable.

If nodes of the B+ tree are organised as arrays of elements, then it may take a considerable time to insert or delete an element as half of the array will need to be shifted on average. To overcome this problem elements inside a node can be organized in a binary tree or a B+ tree instead of an array.

B+ trees can also be used for data stored in RAM. In this case a reasonable choice for block size would be the size of processor's cache line. However some studies have proved that a block size few times larger than processor's cache line can deliver better performance if cache prefetching is used.

Space efficiency of B+ trees can be improved by using some compression techniques. One possibility is to use delta encoding to compress keys stored into each block. For internal blocks, space saving can be achieved by either compressing keys or pointers. For string keys, space can be saved by using the following technique: Normally the "i"th entry of an internal block contains the first key of block i+1. Instead of storing the full key, we could store the shortest prefix of the first key of block i+1 that is strictly greater (in lexicographic order) than last key of block i. There is also a simple way to compress pointers: if we suppose that some consecutive blocks i, i+1...i+k are stored contiguously, then it will suffice to store only a pointer to the first block and the count of consecutive blocks.

All the above compression techniques have some drawbacks. First, a full block must be decompressed to extract a single element. One technique to overcome this problem is to divide each block into sub-blocks and compress them separately. In this case searching or inserting an element will only need to decompress or compress a sub-block instead of a full block. Another drawback of compression techniques is that the number of stored elements may vary considerably from a block to another depending on how well the elements are compressed inside each block.

History

B+ tree was first described in the paper "Organization and Maintenance of Large Ordered Indices. Acta Informatica 1: 173-189 (1972)" by Rudolf Bayer and Edward M. McCreight.

See also

*NTFS
*Databases
*Binary Search Tree
*B# Tree
*B Tree
*Bitmap index
*JFS
*Reiserfs

External links

* [http://www-static.cc.gatech.edu/classes/AY2004/cs4420_spring/project/Bplustree.c B+ tree in C language]
* [http://www.scalingweb.com/opensource.php Open Source C++ B+ Tree Implementation]
* [http://idlebox.net/2007/stx-btree/ B+ tree implementation as C++ template library]
* [http://search.cpan.org/~hanenkamp/Tree-BPTree-1.07 Perl implementation of B+ trees]
* [http://bplusdotnet.sourceforge.net java/C#/python implementations of B+ trees]
* [http://www-users.itlabs.umn.edu/classes/Spring-2006/csci4707/B+Trees.pdf Study notes for B+ Trees - Insertion and Deletion]
* [http://www.cecs.csulb.edu/%7emonge/classes/share/B+TreeIndexes.html Dr. Monge's B+ Tree index notes]
* [http://www.virtualmachinery.com/btreeguide.htm Link to how BTrees work]
* [http://www2.enel.ucalgary.ca/~whassane/papers/CSB+_ccece_2007.pdf Evaluating the performance of CSB+-trees on Mutithreaded Architectures]
* [http://www.eecs.umich.edu/~jignesh/quickstep/publ/cci.pdf Effect of node size on the performance of cache conscious B+-trees]
* [http://www.pittsburgh.intel-research.net/people/gibbons/papers/fpbptrees.pdf Fractal Prefetching B+-trees]
* [http://gemo.futurs.inria.fr/events/EXPDB2006/PAPERS/Jonsson.pdf Towards pB+-trees in the field: implementations Choices and performance]
* [https://oa.doria.fi/bitstream/handle/10024/2906/cachecon.pdf?sequence=1 Cache-Conscious Index Structures for Main-Memory Databases]
* [http://ale3hs.aliagas.gr/content/view/23/1/ B+Tree Java SortedMap Implementation]


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • Tree — /tree/, n. Sir Herbert Beerbohm /bear bohm/, (Herbert Beerbohm), 1853 1917, English actor and theater manager; brother of Max Beerbohm. * * * I Woody perennial plant. Most trees have a single self supporting trunk containing woody tissues, and in …   Universalium

  • Tree — (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree, oak, do ry… …   The Collaborative International Dictionary of English

  • Tree bear — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree beetle — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree bug — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree cat — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree clover — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree crab — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree creeper — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree cricket — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

  • Tree crow — Tree Tree (tr[=e]), n. [OE. tree, tre, treo, AS. tre[ o], tre[ o]w, tree, wood; akin to OFries. tr[=e], OS. treo, trio, Icel. tr[=e], Dan. tr[ae], Sw. tr[ a], tr[ a]d, Goth. triu, Russ. drevo, W. derw an oak, Ir. darag, darog, Gr. dry^s a tree,… …   The Collaborative International Dictionary of English

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”