Term indexing

Term indexing

In computer science, term indexing is the task of creating an index of terms and clauses in a collection.

Many operations in automatic theorem provers require search in huge collections of terms and clauses. Such operations typically fall intothe following scheme. Given a collection S of terms (clauses) and a queryterm (clause) q, find in S some/all terms t related to q according to acertain retrieval condition. Most interesting retrieval conditionsare formulated as existence of a substitution that relates in a specialway the query and the retrieved objects t. Here is a list of retrievalconditions frequently used in provers:

* term q is unifiable with term t, i.e., there exists a substitution heta , such that q heta = t heta
* term t is an instance of q, i.e., there exists a substitution heta, such that q heta = t
* term t is a generalisation of q, i.e., there exists a substitution heta, such that q = t heta
* clause q subsumes clause t, i.e., there exists a substitution heta, such that q heta is a subset/submultiset of t
* clause q is subsumed by t, i.e., there exists a substitution heta, such that t heta is a subset/submultiset of q More often than not, we are actually interested in finding the appropriate substitutions explicitly, together with the retrieved terms t,rather than just in establishing existence of such substitutions.

Very often the sizes of term sets to be searched are large, the retrieval calls are frequent and the retrieval condition testis rather complex. In such situations linear search in S, when the retrievalcondition is tested on every term from S, becomes prohibitively costly. To overcome this problem, special data structures, called "indexes", are designed in order to support fast retrieval. Such data structures, together with the accompanying algorithms for index maintenanceand retrieval, are called "term indexing techniques".

Classic indexing techniques

* discrimination trees
* substitution trees
* path indexing

Modern indexing techniques

* feature vector indexing
* code trees
* context trees
* relational path indexing

Further reading

* P. Graf, Term Indexing, Lecture Notes in Computer Science 1053, 1996 (slightly outdated overview)
* R. Sekar and I.V. Ramakrishnan and A. Voronkov, Term Indexing, in A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning, volume 2, 2001 (recent overview)
* W. W. McCune, Experiments with Discrimination-Tree Indexing and Path Indexing for Term Retrieval, Journal of Automated Reasoning, 9(2), 1992
* P. Graf, Substitution Tree Indexing, Proc. of RTA, Lecture Notes in Computer Science 914, 1995
* M. Stickel, The Path Indexing Method for Indexing Terms, Tech. Rep. 473, Artificial Intelligence Center, SRI International, 1989
* S. Schulz, Simple and Efficient Clause Subsumption with Feature Vector Indexing, Proc. of IJCAR-2004 workshop ESFOR, 2004
* A. Riazanov and A. Voronkov, Partially Adaptive Code Trees, Proc. JELIA, Lecture Notes in Artificial Intelligence 1919, 2000
* H. Ganzinger and R. Nieuwenhuis and P. Nivela, Fast Term Indexing with Coded Context Trees, Journal of Automated Reasoning, 32(2), 2004
* A. Riazanov and A. Voronkov, Efficient Instance Retrieval with Standard and Relational Path Indexing, Information and Computation, 199(1-2), 2005


Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • Term Discrimination — is a way to rank keywords in how useful they are for Information Retrieval. Overview This is a method similar to tf idf but it deals with finding keywords suitable for information retrieval and ones that are not. Please refer to Vector Space… …   Wikipedia

  • Latent Semantic Indexing — (kurz LSI, englisch für schwache Bedeutungseinordnung) ist ein (patentgeschütztes) Verfahren des Information Retrieval, das 1990 zuerst von Deerwester et al.[1] erwähnt wurde. Verfahren wie das LSI sind insbesondere für die Suche auf großen… …   Deutsch Wikipedia

  • Compound term processing — is the name that is used for a category of techniques in Information retrieval applications that performs matching on the basis of compound terms. Compound terms are built by combining two (or more) simple terms, for example triple is a single… …   Wikipedia

  • Web indexing — (or Internet indexing ) includes back of book style indexes to individual websites or an intranet, and the creation of keyword metadata to provide a more useful vocabulary for Internet or onsite search engines. With the increase in the number of… …   Wikipedia

  • Enhanced indexing — In finance, enhanced indexing is a catch all term that describes strategies employed to outperform traditional indexing. Enhanced indexing attempts to generate modest excess returns compared to index funds and other passive management… …   Wikipedia

  • Instant indexing — is a marketing term that denotes a feature offered by select Internet search engines that enables users to submit content for immediate inclusion into the index.Delayed inclusionCertain search engine services may require an extended period of… …   Wikipedia

  • Index term — An index term or descriptor in Information Retrieval is a term that captures the essence of the topic of a document. It is used as keyword to retrieve documents in an information system, for instance a catalog or a search engine. A popular form… …   Wikipedia

  • Mortimer Taube — Born December 6, 1910 Jersey City, New Jersey Died September 3, 1965 Annapolis, Maryland Education University of Chicago, University of California at Berkeley Occupation Information Scientist …   Wikipedia

  • E equational theorem prover — E is a modern, high performance theorem prover for full first order logic with equality. It has been developed primarily in the Automated Reasoning Group at TU Munich.The system is based on the equational superposition calculus. In contrast to… …   Wikipedia

  • LNCS1053 — P. Graf: Term Indexing, Springer Verlag 1996 (Subseries LNAI) …   Acronyms

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”