Distributional hypothesis

In linguistics, the distributional hypothesis is the theory that words that occur in the same contexts tend to have similar meanings.[1] The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth.[2] The distributional hypothesis is the basis for statistical semantics. Although it originated in linguistics, it is now receiving attention in cognitive science, especially regarding the context of word use.[3]
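The hypothesis can be made concrete by representing each word by counts of the words that appear near it and then comparing those count vectors. The following is a minimal sketch under stated assumptions, not a standard implementation: the toy corpus, the window size of 2, and the function names `context_vectors` and `cosine` are all chosen here for illustration.

```python
from collections import Counter, defaultdict
import math


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, invented for this example.
toy_corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

vectors = context_vectors(toy_corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(f"cat~dog: {sim_cat_dog:.3f}  cat~car: {sim_cat_car:.3f}")
```

In this toy corpus "cat" and "dog" share most of their contexts ("the", "drinks", "chases"), so their vectors score as more similar than those of "cat" and "car", which share only "the". Real systems typically use much larger corpora and reweight raw counts, which this sketch omits.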

In recent years, the distributional hypothesis has provided the basis for the theory of similarity-based generalization in language learning: the idea that children can figure out how to use words they have rarely encountered before by generalizing about their use from the distributions of similar words.[4] The distributional hypothesis suggests that the more semantically similar two words are, the more distributionally similar they will be in turn, and thus the more they will tend to occur in similar linguistic contexts. Whether or not this suggestion holds has significant implications both for the data-sparsity problem in computational modeling and for the question of how children are able to learn language so rapidly given relatively impoverished input (also known as the problem of the poverty of the stimulus).
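One way to picture how similarity-based generalization could ease data sparsity is to estimate a rare word's behavior from a similarity-weighted average over other words. The sketch below is a generic illustration of that idea and is not taken from the cited thesis: the counts are invented, and the names `next_word_counts`, `conditional_prob`, and `generalized_prob` are hypothetical.

```python
from collections import Counter
import math

# Hypothetical toy counts: how often each word was followed by each next word.
next_word_counts = {
    "apple": Counter({"juice": 4, "pie": 3, "tree": 3}),
    "pear": Counter({"juice": 3, "tree": 2, "pie": 1}),
    "quince": Counter({"tree": 1}),  # rarely observed word
    "hammer": Counter({"handle": 5, "blow": 2}),
}


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def conditional_prob(word, nxt):
    """Relative-frequency estimate of P(nxt | word) from the raw counts."""
    counts = next_word_counts[word]
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0


def generalized_prob(word, nxt):
    """Estimate P(nxt | word) as a similarity-weighted average over other words."""
    weights = {
        other: cosine(next_word_counts[word], next_word_counts[other])
        for other in next_word_counts
        if other != word
    }
    total_weight = sum(weights.values())
    if total_weight == 0:
        # No distributionally similar word found: fall back to the raw estimate.
        return conditional_prob(word, nxt)
    weighted = sum(w * conditional_prob(other, nxt) for other, w in weights.items())
    return weighted / total_weight


raw = conditional_prob("quince", "juice")
smoothed = generalized_prob("quince", "juice")
print(f"raw: {raw:.3f}  similarity-based: {smoothed:.3f}")
```

Here "quince juice" was never observed, so the raw estimate is zero, but because "quince" shares a context with "apple" and "pear", the similarity-weighted estimate is positive. Whether learners or models should generalize this way depends on whether distributional similarity actually tracks semantic similarity, which is the question the hypothesis raises.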

References

  1. ^ Harris, Z. (1954). "Distributional structure". Word 10 (2–3): 146–162.
  2. ^ Firth, J.R. (1957). "A synopsis of linguistic theory 1930–1955". In Studies in Linguistic Analysis, pp. 1–32. Oxford: Philological Society. Reprinted in F.R. Palmer (ed.), Selected Papers of J.R. Firth 1952–1959. London: Longman (1968).
  3. ^ McDonald, S., and Ramscar, M. (2001). "Testing the distributional hypothesis: The influence of context on judgements of semantic similarity". In Proceedings of the 23rd Annual Conference of the Cognitive Science Society, pp. 611–616.
  4. ^ Yarlett, D. (2008). Language Learning Through Similarity-Based Generalization. PhD thesis, Stanford University.