- Kd-trie
A "k"d-
trie is a spatialdata structure for indexing points in "k"-dimensional space. It is a variation on the "k"d-tree that only stores points in the leaf nodes, sometimes in small collections called bins or buckets. Internal nodes only store the dimension and the value that split. This removes the restriction that the splitting plane pass through one of the points being stored. "k"d-tries are usually static data structures.Most papers refer to "k"d-tries as "k"d-trees because they're extremely similar. The articles are currently separate to avoid confusion about the algorithms that are applied to them. In fact, the original paper by Friedman and Bentley redefine the structure without using a different name for them. Overmars refers to them as "pseudo k-d tree," after his own "pseudo quad-tree." Samet calls it an "adaptive k-d tree" because it uses "adaptive partitioning." However, the original "k"d-tree also uses adaptive partitioning—that is, it adapts to the data set, instead of splitting regularly (in the geometric sense).
Constructing a "k"d-trie
Depending on the application, the choice of splitting rule varies. However, the overall method remains the same:
function split ("list of points" pointList, "int" depth) { "// varies, but should return a discriminator dimension and value" } function kdtrie ("list of points" pointList, "int" depth) { if pointList is empty return nil; else { var discriminator = split(pointList, depth) "// Create node and construct subtrees" var "tree_node" node; node.discriminator := discriminator; node.leftChild := kdtrie(points in pointList less than discriminator, depth+1); node.rightChild := kdtrie(points in pointList greater than or equal to discriminator, depth+1); return node; } }
plitting rules
The original splitting rule is the one presented in Bentley's paper for building a "k"d-tree. It is optimal in the sense of height, of function split ("list of points" pointList, "int" depth) { "// Select axis based on depth so that axis cycles through all valid values" var "int" dimension := depth mod k; "// Sort point list and choose median as pivot element" select median from pointList using predicate: point1 [dimension] < point2 [dimension] ; return dimension, median }
In a later paper, Bentley et al. propose a variation on the above rule. Instead of simply rotating through the dimensions, it selects the dimension with the maximum spread (max-min). This is now referred to as the standard splitting rule. It maximizes the probability of needing to explore only one subtree, while ensuring that each tree is of equal size; therefore the depth is still optimal. However, the aspect ratio of the cells may be very high.
The midpoint splitting rule selects on the middle of the longest axis of the space being searched, regardless of the distribution of points. This guarantees that the aspect ratio will be at most 2:1, but the depth is dependent on the distribution of points. A variation, called sliding-midpoint, only splits on the middle if there are points on both sides of the split. Otherwise, it splits on point nearest to the middle. Maneewongvatana and Mount have shown that this offers "good enough" performance on common data sets.
Other splitting rules exist, but are not as commonly used.
Applications
Approximate nearest neighbor
Using sliding-midpoint, an approximate nearest neighbor query can be answered in
Approximate range counting
Using sliding-midpoint, can be answered in
External links
* [http://www.cs.umd.edu/~mount/ANN/ C++ library that uses "k"d-trees for Approximate Nearest Neighbor Searching]
* The [http://cgal.org/ Computational Geometry Algorithms Library] includes an implementation of "k"d-tries for [http://www.cgal.org/Manual/3.2/doc_html/cgal_manual/Spatial_searching/Chapter_main.html spatial searching]References
* Freidman, J. H., Bentley, J. L., and Finkel, R. A. 1977. [http://portal.acm.org/citation.cfm?id=355745 An Algorithm for Finding Best Matches in Logarithmic Expected Time] . ACM Trans. Math. Softw. 3, 3 (Sep. 1977), 209-226.
* S. Maneewongvatana and D. M. Mount. [http://www.cs.umd.edu/~mount/Papers/cgc99-smpack.pdf It's okay to be skinny, if your friends are fat] . 4th Annual CGC Workshop on Comptutational Geometry, 1999.
Wikimedia Foundation. 2010.