Information Harvesting

Information Harvesting (IH) was an early data mining product from the 1990s. It was invented by Ralphe Wiggins and produced by the Ryan Corp, later Information Harvesting Inc., of Cambridge, Massachusetts. IH sought to infer rules from sets of data. It first classified each input variable into one of a number of bins, thereby imposing some structure on the continuous variables in the input. IH then generated rules that inferred the value of the prediction variable, trading off generalization against memorization and possibly creating many levels of rules in the process. It included strategies for checking whether overfitting had taken place and, if so, correcting for it. Because it corrected for overfitting by considering more data and refining the rules based on that data, IH might also be considered a form of machine learning.
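The general idea of binning a continuous input and then deriving rules from the bins can be illustrated with a small sketch. This is not IH's actual algorithm, which is not documented here. It assumes equal-width bins, a single input variable, and one "majority label" rule per bin, and every function name is hypothetical. Comparing accuracy on training data with accuracy on held-out data is one simple way to check for overfitting.

```python
from collections import Counter, defaultdict

def make_bins(values, n_bins):
    """Return the interior edges of n_bins equal-width bins (an assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all values are equal
    return [lo + i * width for i in range(1, n_bins)]

def bin_index(x, edges):
    """Index of the bin that x falls into, from 0 to len(edges)."""
    return sum(x >= e for e in edges)

def learn_rules(xs, ys, edges):
    """One rule per bin: predict the most common label seen in that bin."""
    by_bin = defaultdict(Counter)
    for x, y in zip(xs, ys):
        by_bin[bin_index(x, edges)][y] += 1
    return {b: counts.most_common(1)[0][0] for b, counts in by_bin.items()}

def accuracy(rules, xs, ys, edges, default):
    """Fraction of rows predicted correctly; bins without a rule use `default`."""
    hits = sum(rules.get(bin_index(x, edges), default) == y
               for x, y in zip(xs, ys))
    return hits / len(xs)

# Made-up illustration data, not taken from any IH material.
train_x = [1, 2, 3, 4, 10, 11, 12, 13]
train_y = ["low"] * 4 + ["high"] * 4
hold_x = [2.5, 11.5]
hold_y = ["low", "high"]

edges = make_bins(train_x, n_bins=2)
rules = learn_rules(train_x, train_y, edges)
train_acc = accuracy(rules, train_x, train_y, edges, default="low")
hold_acc = accuracy(rules, hold_x, hold_y, edges, default="low")
# A training accuracy far above the held-out accuracy would suggest overfitting.
gap = train_acc - hold_acc
```

Using more bins makes the rules fit the training data more closely (memorization), while fewer bins give coarser rules that may generalize better. The gap between the two accuracies is one rough signal for choosing between them.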

The advantage of IH, as compared with other data mining products of its time and even later, was that it provided a mechanism for finding multiple rules that would classify the data and for determining, according to set criteria, the best rules to use.
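Choosing the best of several candidate rules "according to set criteria" might look like the following sketch. The criteria used here (a minimum number of covered rows, then ranking by the fraction of covered rows predicted correctly) are assumptions chosen for illustration, not the criteria IH used, and the `Rule` type and its fields are hypothetical.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    covered: int  # rows that the rule's condition matches
    correct: int  # matched rows where the rule's prediction is right

def confidence(rule):
    """Fraction of covered rows that the rule predicts correctly."""
    return rule.correct / rule.covered if rule.covered else 0.0

def best_rules(rules, min_covered, top_k):
    """Drop rules with too little coverage, then keep the top_k by confidence."""
    eligible = [r for r in rules if r.covered >= min_covered]
    ranked = sorted(eligible,
                    key=lambda r: (confidence(r), r.covered),
                    reverse=True)
    return ranked[:top_k]

# Made-up candidate rules for illustration.
candidates = [
    Rule("A", covered=50, correct=40),
    Rule("B", covered=3, correct=3),
    Rule("C", covered=40, correct=36),
    Rule("D", covered=100, correct=70),
]
selected = best_rules(candidates, min_covered=10, top_k=2)
selected_names = [r.name for r in selected]
```

Rule B is perfectly accurate but covers only three rows, so the coverage threshold excludes it. This is one way to keep a rule that merely memorizes a few cases from outranking broader rules.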

In addition, the term "Information Harvesting" has occasionally been used as a generic term for any data mining product, so the product has had an impact on the field that outlasted the company that designed it.


Wikimedia Foundation. 2010.
