Kd-trie

Kd-trie

A "k"d-trie is a spatial data structure for indexing points in "k"-dimensional space. It is a variation on the "k"d-tree that only stores points in the leaf nodes, sometimes in small collections called bins or buckets. Internal nodes only store the dimension and the value that split. This removes the restriction that the splitting plane pass through one of the points being stored. "k"d-tries are usually static data structures.

Most papers refer to "k"d-tries as "k"d-trees because they're extremely similar. The articles are currently separate to avoid confusion about the algorithms that are applied to them. In fact, the original paper by Friedman and Bentley redefine the structure without using a different name for them. Overmars refers to them as "pseudo k-d tree," after his own "pseudo quad-tree." Samet calls it an "adaptive k-d tree" because it uses "adaptive partitioning." However, the original "k"d-tree also uses adaptive partitioning—that is, it adapts to the data set, instead of splitting regularly (in the geometric sense).

Constructing a "k"d-trie

Depending on the application, the choice of splitting rule varies. However, the overall method remains the same:

function split ("list of points" pointList, "int" depth) { "// varies, but should return a discriminator dimension and value" } function kdtrie ("list of points" pointList, "int" depth) { if pointList is empty return nil; else { var discriminator = split(pointList, depth) "// Create node and construct subtrees" var "tree_node" node; node.discriminator := discriminator; node.leftChild := kdtrie(points in pointList less than discriminator, depth+1); node.rightChild := kdtrie(points in pointList greater than or equal to discriminator, depth+1); return node; } }

plitting rules

The original splitting rule is the one presented in Bentley's paper for building a "k"d-tree. It is optimal in the sense of height, of O left ( log n ight ) function split ("list of points" pointList, "int" depth) { "// Select axis based on depth so that axis cycles through all valid values" var "int" dimension := depth mod k; "// Sort point list and choose median as pivot element" select median from pointList using predicate: point1 [dimension] < point2 [dimension] ; return dimension, median }

In a later paper, Bentley et al. propose a variation on the above rule. Instead of simply rotating through the dimensions, it selects the dimension with the maximum spread (max-min). This is now referred to as the standard splitting rule. It maximizes the probability of needing to explore only one subtree, while ensuring that each tree is of equal size; therefore the depth is still optimal. However, the aspect ratio of the cells may be very high.

The midpoint splitting rule selects on the middle of the longest axis of the space being searched, regardless of the distribution of points. This guarantees that the aspect ratio will be at most 2:1, but the depth is dependent on the distribution of points. A variation, called sliding-midpoint, only splits on the middle if there are points on both sides of the split. Otherwise, it splits on point nearest to the middle. Maneewongvatana and Mount have shown that this offers "good enough" performance on common data sets.

Other splitting rules exist, but are not as commonly used.

Applications

Approximate nearest neighbor

Using sliding-midpoint, an approximate nearest neighbor query can be answered in O left ( frac{ 1 }{ { epsilon }^d } log n ight )

Approximate range counting

Using sliding-midpoint, can be answered in O left ( log n + { left ( frac{1}{ epsilon } ight ) }^d ight )

External links

* [http://www.cs.umd.edu/~mount/ANN/ C++ library that uses "k"d-trees for Approximate Nearest Neighbor Searching]
* The [http://cgal.org/ Computational Geometry Algorithms Library] includes an implementation of "k"d-tries for [http://www.cgal.org/Manual/3.2/doc_html/cgal_manual/Spatial_searching/Chapter_main.html spatial searching]

References

* Freidman, J. H., Bentley, J. L., and Finkel, R. A. 1977. [http://portal.acm.org/citation.cfm?id=355745 An Algorithm for Finding Best Matches in Logarithmic Expected Time] . ACM Trans. Math. Softw. 3, 3 (Sep. 1977), 209-226.
* S. Maneewongvatana and D. M. Mount. [http://www.cs.umd.edu/~mount/Papers/cgc99-smpack.pdf It's okay to be skinny, if your friends are fat] . 4th Annual CGC Workshop on Comptutational Geometry, 1999.


Wikimedia Foundation. 2010.

Игры ⚽ Нужно решить контрольную?

Look at other dictionaries:

  • Trie-Chateau — Trie Château Trie Château Pays …   Wikipédia en Français

  • Trie-château — Pays …   Wikipédia en Français

  • Trie-la-ville — 49° 17′ 26″ N 1° 50′ 00″ E / 49.2905555556, 1.83333333333 …   Wikipédia en Français

  • Trie-sur-Baise — Trie sur Baïse Trie sur Baïse Pays   …   Wikipédia en Français

  • Trie-sur-baïse — Pays   …   Wikipédia en Français

  • Trie sur Baïse — Pays   …   Wikipédia en Français

  • Trie-Château — Trie Château …   Deutsch Wikipedia

  • Trie — Trie, der die Zeichenketten Java, Rad, Rand, Rau, Raum und Rose speichert Ein Trie oder Präfixbaum ist eine Datenstruktur, die in der Informatik zum Suchen nach Zeichenketten verwendet wird. Es handelt sich dabei um einen speziellen Suchbaum zur… …   Deutsch Wikipedia

  • Trie-la-Ville — Trie la Ville …   Deutsch Wikipedia

  • Trie (Begriffsklärung) — Trie bezeichnet: Trie, eine Datenstruktur in der Informatik Trie (Fluss), einen Fluss im französischen Département Somme Trie Château, eine Gemeinde im französischen Département Oise Trie la Ville, eine Gemeinde im französischen Département Oise… …   Deutsch Wikipedia

  • trie — 1. (trie) s. f. Sorte de morue verte. ÉTYMOLOGIE    C est l anc. franç. trie, signifiant triage (voy. tri). SUPPLÉMENT AU DICTIONNAIRE 1. TRIE. Ajoutez : 2°   Action de trier le poisson, pour le vendre. Aucun trieur ni écoreur de service pour la… …   Dictionnaire de la Langue Française d'Émile Littré

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”