ID3 algorithm

ID3 algorithm

ID3 (Iterative Dichotomiser 3) is an algorithm used to generate a decision tree invented by Ross Quinlan.

The algorithm is based on Occam's razor: it prefers smaller decision trees (simpler theories) over larger ones. However, it does not always produce the smallest tree, and is therefore a heuristic. Occam's razor is formalized using the concept of information entropy:

: I_{E}(i) = - sum^{m}_{j=1} f (i,j) log_{2} f (i, j).

The ID3 algorithm can be summarized as follows:

# Take all unused attributes and count their entropy concerning test samples
# Choose attribute for which entropy is minimum
# Make node containing that attribute

An explanation of the implementation of ID3 can be found at C4.5 algorithm, which is an extended version of ID3.

Algorithm

The actual algorithm is as follows:

ID3 (Examples, Target_Attribute, Attributes)
*Create a root node for the tree
*If all examples are positive, Return the single-node tree Root, with label = +.
*If all examples are negative, Return the single-node tree Root, with label = -.
*If number of predicting attributes is empty, then Return the single node tree Root, with label = most common value of the target attribute in the examples.
*Otherwise Begin
** A = The Attribute that best classifies examples.
** Decision Tree attribute for Root = A.
** For each possible value, v_i, of A,
*** Add a new tree branch below Root, corresponding to the test A = v_i.
*** Let Examples(v_i), be the subset of examples that have the value v_i for A
*** If Examples(v_i) is empty
**** Then below this new branch add a leaf node with label = most common target value in the examples
*** Else below this new branch add the subtree ID3 (Examples(v_i), Target_Attribute, Attributes – {A})
*End
*Return Root

See also

* C4.5 algorithm

References

* Mitchell, Tom M. "Machine Learning". McGraw-Hill, 1997.

External links

* Seminars - [http://www2.cs.uregina.ca/~hamilton/courses/831/notes/ml/dtrees/4_dtrees1.html http://www2.cs.uregina.ca/]
* Description and examples - [http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm http://www.cise.ufl.edu/]
* Description and examples - [http://www.cis.temple.edu/~ingargio/cis587/readings/id3-c45.html http://www.cis.temple.edu/]
* [http://www.onlamp.com/pub/a/python/2006/02/09/ai_decision_trees.html An implementation of ID3 in Python]
* [http://ai4r.rubyforge.org/machineLearning.html An implementation of ID3 in Ruby]
* [http://www.pvv.ntnu.no/~oyvinht/static/OSS/cl-id3/ An implementation of ID3 in Common Lisp]
* Implementation of ID3 algorithm in C# - [http://www.codeproject.com/cs/algorithms/id3.asp http://www.codeproject.com/cs/algorithms/id3.asp]
* Decision tree Perl Module - [http://search.cpan.org/~kwilliams/AI-DecisionTree An implementation of ID3 in Perl] atomic resolution.


Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • ID3 — is a metadata container most often used in conjunction with the MP3 audio file format. It allows information such as the title, artist, album, track number, or other information about the file to be stored in the file itself.There are two… …   Wikipedia

  • C4.5 algorithm — C4.5 is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension of Quinlan s earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often… …   Wikipedia

  • Algoritmo ID3 — El algoritmo ID3 es utilizado dentro del ámbito de la inteligencia artificial. Su uso se engloba en la búsqueda de hipótesis o reglas en él, dado un conjunto de ejemplos. El conjunto de ejemplos deberá estar conformado por una serie de tuplas de… …   Wikipedia Español

  • Ross Quinlan — John Ross Quinlan is a pioneering computer science researcher in data mining and decision theory. He has contributed extensively to the development of decision tree algorithms, including inventing the canonical C4.5 and ID3 algorithms. He is… …   Wikipedia

  • Decision tree learning — This article is about decision trees in machine learning. For the use of the term in decision analysis, see Decision tree. Decision tree learning, used in statistics, data mining and machine learning, uses a decision tree as a predictive model… …   Wikipedia

  • Corner detection — Feature detection Output of a typical corner detection algorithm …   Wikipedia

  • Structure mining — or structured data mining is the process of finding and extracting useful information from semi structured data sets. Graph mining is a special case of structured data mining[citation needed]. Contents 1 Description 2 See also …   Wikipedia

  • Glossaire du data mining — Exploration de données Articles principaux Exploration de données Fouille de données spatiales Fouille du web Fouille de flots de données Fouille de textes …   Wikipédia en Français

  • Operator-precedence grammar — An operator precedence grammar is a kind of grammar for formal languages. Technically, an operator precedence grammar is a context free grammar that has the property (among others[1]) that no production has either an empty right hand side or two… …   Wikipedia

  • Vorbis — This article is about the audio compression codec. For the Discworld character, see Discworld characters. Vorbis Xiph.org Logo Filename extension .ogg .oga …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”