AdaBoost, short for Adaptive Boosting, is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire. It can be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive in the sense that subsequent classifiers are tweaked in favor of the instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers; otherwise, it is less susceptible to overfitting than most learning algorithms.

AdaBoost calls a weak classifier repeatedly in a series of rounds $t = 1,\ldots,T$. For each call a distribution of weights $D_t$ is updated that indicates the importance of examples in the data set for the classification. On each round, the weights of each incorrectly classified example are increased (or alternatively, the weights of each correctly classified example are decreased), so that the new classifier focuses more on those examples.

The algorithm for the binary classification task

Given: $(x_1,y_1),\ldots,(x_m,y_m)$ where $x_i \in X$, $y_i \in Y = \{-1, +1\}$

Initialise $D_1(i) = \frac{1}{m},\quad i=1,\ldots,m.$

For $t = 1,\ldots,T$:

* Find the classifier $h_t : X \to \{-1,+1\}$ that minimizes the error with respect to the distribution $D_t$:
$h_t = \arg\min_{h_j \in \mathcal{H}} \epsilon_j$, where $\epsilon_j = \sum_{i=1}^{m} D_t(i)\,[y_i \ne h_j(x_i)]$
* Prerequisite: $\epsilon_t < 0.5$, otherwise stop.
* Choose $\alpha_t \in \mathbf{R}$, typically $\alpha_t = \frac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}$, where $\epsilon_t$ is the weighted error rate of classifier $h_t$.
* Update:
$D_{t+1}(i) = \frac{D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}}{Z_t}$
where $Z_t$ is a normalization factor (chosen so that $D_{t+1}$ will be a probability distribution, i.e. sums to one over all $i$).

Output the final classifier:

$H(x) = \textrm{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right)$
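The procedure above can be sketched in Python. Here one-level decision stumps (a threshold on a single feature) stand in for the weak learners; all names (`Stump`, `train_adaboost`, etc.) are illustrative for this sketch, not from any particular library.

```python
import math

class Stump:
    """Weak classifier: predicts direction if x[feature] >= threshold, else -direction."""
    def __init__(self, feature, threshold, direction):
        self.feature, self.threshold, self.direction = feature, threshold, direction

    def predict(self, x):
        return self.direction if x[self.feature] >= self.threshold else -self.direction

def best_stump(X, y, D):
    """Exhaustively pick h_t minimizing the weighted error eps under distribution D."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for direction in (1, -1):
                h = Stump(f, thr, direction)
                err = sum(D[i] for i in range(len(X)) if h.predict(X[i]) != y[i])
                if err < best_err:
                    best, best_err = h, err
    return best, best_err

def train_adaboost(X, y, T):
    m = len(X)
    D = [1.0 / m] * m                      # D_1(i) = 1/m
    ensemble = []                          # list of (alpha_t, h_t)
    for _ in range(T):
        h, eps = best_stump(X, y, D)
        if eps >= 0.5:                     # prerequisite: eps_t < 0.5
            break
        eps = max(eps, 1e-12)              # clamp to keep the log finite when eps == 0
        alpha = 0.5 * math.log((1 - eps) / eps)
        # Reweight: misclassified examples up, correctly classified ones down
        D = [D[i] * math.exp(-alpha * y[i] * h.predict(X[i])) for i in range(m)]
        Z = sum(D)                         # normalization factor Z_t
        D = [d / Z for d in D]
        ensemble.append((alpha, h))
    return ensemble

def predict(ensemble, x):
    """Final classifier H(x) = sign(sum_t alpha_t h_t(x))."""
    return 1 if sum(a * h.predict(x) for a, h in ensemble) >= 0 else -1
```

The exhaustive stump search is quadratic in the data and only meant to make the weighted-error step explicit; a practical implementation would use a real weak learner.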

The equation to update the distribution $D_t$ is constructed so that (since $\epsilon_t < 0.5$ implies $\alpha_t > 0$):

$e^{-\alpha_t y_i h_t(x_i)} \begin{cases} < 1, & y_i = h_t(x_i) \\ > 1, & y_i \ne h_t(x_i) \end{cases}$

Thus, after selecting an optimal classifier $h_t$ for the distribution $D_t$, the examples $x_i$ that the classifier $h_t$ identified correctly are weighted less and those that it identified incorrectly are weighted more. Therefore, when the algorithm is testing the classifiers on the distribution $D_{t+1}$, it will select a classifier that better identifies those examples that the previous classifier missed.
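A hypothetical numeric illustration of this reweighting (the numbers are chosen for this sketch, not taken from the text): five examples with uniform weights, and a weak classifier that misclassifies only the fifth.

```python
import math

m = 5
D = [1.0 / m] * m                          # D_t(i) = 1/5 for all i
correct = [True, True, True, True, False]  # h_t is wrong only on example 5

eps = sum(D[i] for i in range(m) if not correct[i])   # weighted error = 0.2
alpha = 0.5 * math.log((1 - eps) / eps)               # = 0.5 * ln(4) = ln(2)

# y_i h_t(x_i) is +1 when correct, -1 when wrong
D_new = [D[i] * math.exp(-alpha if correct[i] else alpha) for i in range(m)]
Z = sum(D_new)
D_new = [d / Z for d in D_new]             # [0.125, 0.125, 0.125, 0.125, 0.5]
```

After normalization the misclassified example carries half the total weight, so under $D_{t+1}$ the old classifier's weighted error is exactly $0.5$ and it can never be selected again on the next round.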

Statistical Understanding of Boosting

Boosting can be seen as minimization of a convex loss function over a convex set of functions. [T. Zhang, "Convex Risk Minimization", Annals of Statistics, 2004.] Specifically, the loss being minimized is the exponential loss

:$\sum_i e^{-y_i f(x_i)}$

and we are seeking a function

:$f = \sum_t \alpha_t h_t$
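A short, standard derivation (not in the source text) shows that the choice of $\alpha_t$ in the algorithm above is exactly the greedy minimizer of this loss at round $t$: holding the previous rounds fixed and splitting the sum into correctly and incorrectly classified examples,

```latex
\begin{aligned}
L(\alpha) &= \sum_i D_t(i)\, e^{-\alpha y_i h_t(x_i)}
           = (1-\epsilon_t)\, e^{-\alpha} + \epsilon_t\, e^{\alpha},\\
\frac{dL}{d\alpha} &= -(1-\epsilon_t)\, e^{-\alpha} + \epsilon_t\, e^{\alpha} = 0
\;\Longrightarrow\; e^{2\alpha} = \frac{1-\epsilon_t}{\epsilon_t}
\;\Longrightarrow\; \alpha_t = \tfrac{1}{2}\ln\frac{1-\epsilon_t}{\epsilon_t}.
\end{aligned}
```

so minimizing the exponential loss one weak classifier at a time recovers the weight formula used in the update step.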

See also

* Bootstrap aggregating
* LPBoost
* GentleBoost

References

* [http://www.site.uottawa.ca/~stan/csi5387/boost-tut-ppr.pdf A Short Introduction to Boosting] Introduction to AdaBoost by Freund and Schapire from 1999.
* [http://citeseer.ist.psu.edu/cache/papers/cs/2215/http:zSzzSzwww.first.gmd.dezSzpersonszSzMueller.Klaus-RobertzSzseminarzSzFreundSc95.pdf/freund95decisiontheoretic.pdf A decision-theoretic generalization of on-line learning and an application to boosting] "Journal of Computer and System Sciences", no. 55, 1997. (Original paper of Yoav Freund and Robert E. Schapire in which AdaBoost is first introduced.)
* [http://engineering.rowan.edu/~polikar/RESEARCH/PUBLICATIONS/csm06.pdf Ensemble Based Systems in Decision Making], R. Polikar, IEEE Circuits and Systems Magazine, vol. 6, no. 3, pp. 21-45, 2006. A tutorial article on ensemble systems including pseudocode, block diagrams and implementation issues for AdaBoost and other ensemble learning algorithms.
* [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.51.9525 Additive Logistic Regression: a Statistical View of Boosting] by Jerome Friedman, Trevor Hastie, Robert Tibshirani. Paper introducing probabilistic theory for AdaBoost, and introducing GentleBoost.
