Signature files
A signature file is an access method used for document retrieval. The idea is to build a "quick and dirty" filter that keeps every document matching the query and, ideally, only a few that do not. To do this, a signature is created for each document, typically a hash-coded version of its contents; one method for building signatures is superimposed coding. Because the filter may let some non-matching documents through, a post-processing step discards these false alarms. Signature files are not widely used, since in most cases they are inferior to inverted files in terms of speed, size and functionality. However, with proper parameters they can beat inverted files in certain environments.
References
* Christos Faloutsos and Stavros Christodoulakis, "Signature files: An access method for documents and its analytical performance evaluation." ACM Transactions on Information Systems (TOIS), Vol. 2, No. 4 (1984), pp. 267-288.
* Justin Zobel, Alistair Moffat and Kotagiri Ramamohanarao, "Inverted files versus signature files for text indexing." ACM Transactions on Database Systems (TODS), Vol. 23, No. 4 (1998), pp. 453-490.
* Ben Carterette and Fazli Can, "Comparing inverted files and signature files for searching a large lexicon." Information Processing and Management, Vol. 41, No. 3 (2005), pp. 613-633.
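Example
As an illustration of the filter-then-verify idea described in the article, the following is a minimal Python sketch of superimposed coding. It is not taken from any of the references: the signature width (SIG_BITS), the number of bits set per word (BITS_PER_WORD), the use of SHA-256 as the hash, whitespace tokenization, and all function names are assumptions chosen for readability. Each word is hashed to a few bit positions, the word patterns are OR-ed together into one document signature, and a query is answered by first testing signatures and then checking the surviving documents to discard false alarms.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed, illustrative)
BITS_PER_WORD = 3   # bit positions set per word (assumed, illustrative)


def word_signature(word: str) -> int:
    """Hash a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        position = int.from_bytes(digest[:4], "big") % SIG_BITS
        sig |= 1 << position
    return sig


def document_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def search(query: str, documents: list, signatures: list) -> list:
    """Return indices of documents that contain every word of the query."""
    query_sig = document_signature(query)
    query_words = set(query.lower().split())
    hits = []
    for idx, doc_sig in enumerate(signatures):
        # Filter step: every bit set in the query signature must also be
        # set in the document signature; otherwise the document cannot match.
        if doc_sig & query_sig != query_sig:
            continue
        # Post-processing step: inspect the actual text to discard
        # false alarms caused by overlapping bit patterns.
        if query_words <= set(documents[idx].lower().split()):
            hits.append(idx)
    return hits


documents = [
    "signature files act as a filter",
    "inverted files map words to documents",
    "a hash coded signature per document",
]
signatures = [document_signature(d) for d in documents]
print(search("signature files", documents, signatures))
```

In this sketch the filter step can never reject a matching document, because a document's signature contains every bit of each of its words. A wider signature or fewer bits per word lowers the false alarm rate at the cost of space, which is the kind of parameter trade-off the article alludes to.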
Wikimedia Foundation. 2010.