Ensembles of classifiers

Ensembles of classifiers

Recently in the area of Machine Learning the concept of combining classifiers is proposed as a new direction for the improvement of the performance of individual classifiers. These classifiers could be based on a variety of classification methodologies, and could achieve different rate of correctly classified individuals. The goal of classification result integration algorithms is to generate more certain, precise and accurate system results. Dietterich (2001) provides an accessible and informal reasoning, from statistical, computational and representational viewpoints, of why ensembles can improve results.

Methods

Numerous methods have been suggested for the creation of ensemble of classifiers.
* Using different subset of training data with a single learning method
* Using different training parameters with a single training method (e.g. using different initial weights for each neural network in an ensemble)
* Using different learning methods.

Weaknesses

* Increased storage
* Increased computation
* Decreased comprehensibility

The first weakness, increased storage, is a direct consequence of the requirement that all component classifiers, instead of a single classifier, need to be stored after training. The total storage depends on the size of each component classifier itself and the size of the ensemble (number of classifiers in the ensemble). The second weakness is increased computation: to classify an input query, all component classifiers (instead of a single classifier) must be processed, and thus it requires more execution time. The last weakness is decreased comprehensibility. With involvement of multiple classifiers in decision-making, it is more difficult for users to perceive the underlying reasoning process leading to a decision.dubious

Bagging

Bagging is a method of the first category (Breiman, 1996). If there is a training set of size t, then it is possible to draw t random instances from it with replacement (i.e. using a uniform distribution), these t instances can be learned, and this process can be repeated several times. Since the draw is with replacement, usually the instances drawn will contain some duplicates and some omissions as compared to the original training set. Each cycle through the process results in one classifier. After the construction of several classifiers, taking a vote of the predictions of each classifier performs the final prediction.

Boosting

Another method of the first category is called boosting. AdaBoost is a practical version of the boosting approach (Freund and Schapire, 1996). Boosting is similar in overall structure to bagging, except that one keeps track of the performance of the learning algorithm and forces it to concentrate its efforts on instances that have not been correctly learned. Instead of choosing the t training instances randomly using a uniform distribution, one chooses the training instances in such a manner as to favour the instances that have not been accurately learned. After several cycles, the prediction is performed by taking a weighted vote of the predictions of each classifier, with the weights being proportional to each classifier’s accuracy on its training set.

Boosting algorithms are considered stronger than bagging on noise free data. However, there are strong empirical indications that bagging is much more robust than boosting in noisy settings. For this reason, Kotsiantis and Pintelas (2004) built an ensemble using a voting methodology of bagging and boosting ensembles that give better classification accuracy.

References

Breiman L. (1996): Bagging Predictors. Machine Learning, 24(3), 123--140. Kluwer Academic Publishers.
Dietterich, T.G. (2001): Ensemble methods in machine learning. In Kittler, J., Roli, F., eds.: Multiple Classifier Systems. LNCS Vol. 1857, Springer (2001) 1–15
Yoav Freund and Robert E. Schapire, Experiments with a New Boosting Algorithm, Proceedings: ICML’96, p. 148-156, 1996
S. Kotsiantis, P. Pintelas, Combining Bagging and Boosting, International Journal of Computational Intelligence, Vol. 1, No. 4 (324-333), 2004.


Wikimedia Foundation. 2010.

Игры ⚽ Поможем сделать НИР

Look at other dictionaries:

  • Supervised learning — is a machine learning technique for learning a function from training data. The training data consist of pairs of input objects (typically vectors), and desired outputs. The output of the functioncan be a continuous value (called regression), or… …   Wikipedia

  • Committee machine — A committee machine is a type of neural network using a divide and conquer strategy in which the responses of multiple neural networks (experts) are combined into a single response. The combined response of the committee machine is supposed to be …   Wikipedia

  • Concept drift — In predictive analytics and machine learning, the concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions… …   Wikipedia

  • Classification naïve bayesienne — La classification naïve bayésienne est un type de classification Bayésienne probabiliste simple basée sur le théorème de Bayes avec une forte indépendance (dite naïve) des hypothèses. Elle met en œuvre un classifieur bayésien naïf, ou classifieur …   Wikipédia en Français

  • Boosting — Le boosting est un domaine de l apprentissage automatique (branche de l intelligence artificielle). C est un principe qui regroupe de nombreux algorithmes qui s appuient sur des ensembles de classifieurs binaires : le boosting optimise leurs …   Wikipédia en Français

  • Dérive conceptuelle — En analyse prédictive et en apprentissage automatique, on parle de dérive conceptuelle lorsque les propriétés statistiques de la variable cible, que le modèle essaie de prédire, évoluent au cours du temps d une manière imprévue[1]. Ceci pose des… …   Wikipédia en Français

  • Glossaire du data mining — Exploration de données Articles principaux Exploration de données Fouille de données spatiales Fouille du web Fouille de flots de données Fouille de textes …   Wikipédia en Français

  • Machine a vecteurs de support — Machine à vecteurs de support Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de… …   Wikipédia en Français

  • Machine À Vecteurs De Support — Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de discrimination[1] et de régression. Les SVM… …   Wikipédia en Français

  • Machine à vecteur de support — Machine à vecteurs de support Les machines à vecteurs de support ou séparateurs à vaste marge (en anglais Support Vector Machine, SVM) sont un ensemble de techniques d apprentissage supervisé destinées à résoudre des problèmes de… …   Wikipédia en Français

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”