Lesk algorithm

The Lesk algorithm is a classical algorithm for word sense disambiguation introduced by Michael E. Lesk in 1986 [Michael Lesk, "Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone", ACM Special Interest Group for Design of Communication, Proceedings of the 5th annual international conference on Systems documentation, pp. 24-26, 1986. ISBN 0897912241. http://portal.acm.org/citation.cfm?id=318728 (full text: http://portal.acm.org/ft_gateway.cfm?id=318728&type=pdf&dl=GUIDE&dl=ACM)].

The Lesk algorithm is based on the assumption that words in a given neighbourhood will tend to share a common topic. A naive implementation of the Lesk algorithm would
# choose pairs of ambiguous words within a neighbourhood
# look up their definitions in a dictionary
# choose the senses that maximise the number of common terms in the definitions of the chosen words.
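
The three steps can be sketched in Python for a single pair of words. This is a minimal illustration, not Lesk's original implementation: the dictionary `TOY_DICTIONARY`, its sense labels, its glosses, the `STOPWORDS` set and all function names are invented for this example, and overlap is counted as the number of shared non-stopword tokens.

```python
from itertools import product

# Hypothetical toy dictionary (invented for illustration):
# word -> {sense label: definition}
TOY_DICTIONARY = {
    "pine": {
        "tree": "kind of evergreen tree with needle-shaped leaves",
        "grieve": "waste away through sorrow or illness",
    },
    "cone": {
        "solid": "solid body which narrows to a point",
        "fruit": "fruit of certain evergreen trees",
        "wafer": "crisp wafer holder for ice cream",
    },
}

# Assumed stopword list; a real system would use a larger one.
STOPWORDS = {"a", "of", "or", "to", "with", "which", "for", "through", "the"}


def content_words(definition):
    """Return the set of lower-cased, non-stopword tokens in a definition."""
    return {w for w in definition.lower().split() if w not in STOPWORDS}


def overlap(def_a, def_b):
    """Count the terms shared by two definitions."""
    return len(content_words(def_a) & content_words(def_b))


def disambiguate_pair(word_a, word_b, dictionary):
    """Pick the pair of senses whose definitions share the most terms."""
    senses_a = dictionary[word_a]
    senses_b = dictionary[word_b]
    # Try every combination of senses; ties go to the first pair found.
    return max(
        product(senses_a, senses_b),
        key=lambda pair: overlap(senses_a[pair[0]], senses_b[pair[1]]),
    )


print(disambiguate_pair("pine", "cone", TOY_DICTIONARY))
```

With these toy glosses the only shared term is "evergreen", so the "tree" sense of "pine" is paired with the "fruit" sense of "cone". Note that "tree" and "trees" do not match, since the sketch does no stemming.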

Lesk reported accuracy in the 50% to 70% range on text from "Pride and Prejudice" and selected papers of the Associated Press.

A simplified version of the Lesk algorithm compares the dictionary definition of each sense of an ambiguous word with the terms contained in its neighbourhood.
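
A minimal Python sketch of this simplified variant follows. It assumes whitespace tokenisation and a small stopword list; the sense inventory `TOY_SENSES`, its glosses and the function name `simplified_lesk` are hypothetical and chosen only for illustration.

```python
# Hypothetical sense inventory (invented for illustration):
# word -> {sense label: definition}
TOY_SENSES = {
    "bank": {
        "financial": "institution that accepts deposits and lends money",
        "river": "sloping land beside a river or stream",
    },
}

# Assumed stopword list; a real system would use a larger one.
STOPWORDS = frozenset({"a", "and", "of", "on", "or", "that", "the"})


def simplified_lesk(word, context, senses, stopwords=STOPWORDS):
    """Return the sense of `word` whose definition shares the most
    terms with the surrounding context."""
    # Terms of the neighbourhood, excluding stopwords and the target word.
    context_words = {
        w for w in context.lower().split()
        if w not in stopwords and w != word
    }
    best_sense, best_score = None, -1
    for sense, definition in senses[word].items():
        definition_words = set(definition.lower().split()) - stopwords
        score = len(definition_words & context_words)
        # Strict comparison: on a tie, the sense listed first is kept.
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


sentence = "she sat on the bank of the river and watched the stream"
print(simplified_lesk("bank", sentence, TOY_SENSES))
```

Unlike the pairwise version, only the target word's definitions are looked up, so the cost grows with the number of senses of one word rather than with the product of the sense counts of several words.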

Versions have been adapted to WordNet [Satanjeev Banerjee and Ted Pedersen, "An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet", Lecture Notes in Computer Science, Vol. 2276, pp. 136-145, 2002. ISBN 3540432191. http://www.cs.cmu.edu/~banerjee/Publications/cicling2002.ps.gz].

Wikimedia Foundation. 2010.
