Version space

Version space

A version space in concept learning or induction is the subset of all hypotheses that are consistent with the observed training examples (Mitchell 1997). This set contains all hypotheses that have not been eliminated as a result of being in conflict with observed data.

The Version Space algorithm

In settings where there is a generality-ordering on hypotheses, it is possible to represent the version space by two sets of hypotheses: (1) the most specific consistent hypotheses, and (2) the most general consistent hypotheses, where "consistent" indicates agreement with observed data. The most specific hypotheses (i.e., the specific boundary SB) are the hypotheses that cover the observed positive training examples, and as little of the remaining feature space as possible. These are hypotheses which if reduced any further would "exclude" a "positive" training example, and hence become inconsistent. These minimal hypotheses essentially constitute a (pessimistic) claim that the true concept is defined just by the "positive" data already observed: Thus, if a novel (never-before-seen) data point is observed, it should be assumed to be negative. (I.e., if data has not previously been ruled in, then it's ruled out.)

The most general hypotheses (i.e., the general boundary GB) are those which cover the observed positive training examples, but also cover as much of the remaining feature space without including any negative training examples. These are hypotheses which if enlarged any further would "include" a "negative" training example, and hence become inconsistent. These maximal hypotheses essentially constitute a (optimistic) claim that the true concept is defined just by the "negative" data already observed: Thus, if a novel (never-before-seen) data point is observed, it should be assumed to be positive. (I.e., if data has not previously been ruled out, then it's ruled in.)

Thus, during the learning process, the version space (which itself is a set – possibly infinite – containing "all" consistent hypotheses) can be represented by just its lower and upper bounds (maximally general and maximally specific hypothesis sets), and learning operations can be performed just on these representative sets.

Historical background

The notion of Version Spaces was introduced by Mitchell as a framework for understanding the basic problem of supervised learning within the context of "solution search". Although the basic "candidate elimination" search method that accompanies the Version Space framework is "not" a popular learning algorithm (for various reasons, including the issue of noise (Russell & Norvig 2002)), there "are" some practical implementations that have been developed (e.g., Sverdlik & Reynolds 1992, Hong & Tsang 1997, Dubois & Quafafou 2002).

ee also

*Rough set. [The rough set framework focuses on the case where ambiguity is introduced by an impoverished feature set. That is, the target concept cannot be decisively described because the available feature set fails to disambiguate objects belonging to different categories. The version space framework focuses on the (classical induction) case where the ambiguity is introduced by an impoverished data set. That is, the target concept cannot be decisively described because the available data fails to uniquely pick out a hypothesis. Naturally, both types of ambiguity can occur in the same learning problem.]
* provides an interesting discussion of the general problem of induction.
* Mill (1843/2002) is the classic source on inductive logic.

References

*cite conference
first = Vincent
last = Dubois
coauthors = Quafafou, Mohamed
title = Concept learning with approximation: Rough version spaces
booktitle = Rough Sets and Current Trends in Computing: Proceedings of the Third International Conference, RSCTC 2002
pages = 239–246
date = 2002
location = Malvern, Pennsylvania

*cite journal
last = Hong
first = Tzung-Pai
coauthors = Shian-Shyong Tsang
title = A generalized version space learning algorithm for noisy and uncertain data
journal = IEEE Transactions on Knowledge and Data Engineering
volume = 9
issue = 2
pages = 336–340
date = 1997
doi = 10.1109/69.591457

*cite book
last = Mill
first = John Stuart
title = A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation
publisher = University Press of the Pacific
date = 1843/2002
location = Honolulu, HI

*cite book
last = Mitchell
first = Tom M.
title = Machine Learning
publisher = McGraw-Hill
date = 1997
location = Boston

*cite journal
last = Rendell
first = Larry
title = A general framework for induction and a study of selective induction
journal = Machine Learning
volume = 1
pages = 177–226
date = 1986
doi = 10.1007/BF00114117

*Russell Norvig 2003
*cite conference
first = W.
last = Sverdlik
coauthors = Reynolds, R.G.
title = Dynamic version spaces in machine learning
booktitle = Proceedings, Fourth International Conference on Tools with Artificial Intelligence (TAI '92)
pages = 308 - 315
date = 1992
location = Arlington, VA


Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • Space Mountain: Mission 2 — Space Mountain Pour les articles homonymes, voir Space Mountain (homonymie). Space Mountain est l une des attractions phares des parcs à thèmes de Disney. Elle a été reproduite dans chacun des Royaumes Enchantés dans le land, Tomorrowland ou… …   Wikipédia en Français

  • Space mountain — Pour les articles homonymes, voir Space Mountain (homonymie). Space Mountain est l une des attractions phares des parcs à thèmes de Disney. Elle a été reproduite dans chacun des Royaumes Enchantés dans le land, Tomorrowland ou Discoveryland.… …   Wikipédia en Français

  • Space Patrol — has been the title of several science fiction works: *Space Patrol (1950s), the United States 1950s TV series with a concurrent radio version *Space Patrol (1962 TV series), a 1962 UK puppet television series, shown in the US under the name… …   Wikipedia

  • Space Mountain — Pour les articles homonymes, voir Space Mountain (homonymie). Space Mountain est l une des attractions phares des parcs à thèmes de Disney. Elle a été reproduite dans chacun des Royaumes Enchantés dans le land, Tomorrowland ou Discoveryland.… …   Wikipédia en Français

  • Space Cowboy (song) — Infobox Single Name = Space Cowboy Artist = Jamiroquai from Album = The Return of the Space Cowboy B side = Journey to Arnhemland/The Kids Released = 26 September 1994 (UK) Format = CD/Cassette/12 Recorded = 1994 Genre = Funk / Acid Jazz Length …   Wikipedia

  • Space-Invader — Space Invaders Space Invaders Éditeur JPN Taito AN …   Wikipédia en Français

  • Space-Marine — (Warhammer 40,000) Dans l univers de fiction Warhammer 40,000 créé par Games Workshop, les Space Marines (ou Adeptus Astartes) sont des soldats d élite humains, modifiés et équipés des derniers raffinements technologiques, et qui combattent au… …   Wikipédia en Français

  • Space-Marines — Space Marine (Warhammer 40,000) Dans l univers de fiction Warhammer 40,000 créé par Games Workshop, les Space Marines (ou Adeptus Astartes) sont des soldats d élite humains, modifiés et équipés des derniers raffinements technologiques, et qui… …   Wikipédia en Français

  • Space Invader — Space Invaders Space Invaders Éditeur JPN Taito AN …   Wikipédia en Français

  • Space Invaders: Fukkatsu No Hi — Space Invaders Space Invaders Éditeur JPN Taito AN …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”