Bootstrap aggregating
Bootstrap aggregating (bagging) is a meta-algorithm that improves the stability and accuracy of machine learning models for classification and regression. It also reduces variance and helps to avoid overfitting. Although it is usually applied to decision tree models, it can be used with any type of model. Bagging is a special case of the model averaging approach.
Given a standard training set D of size n, bagging generates m new training sets D_i, each of size n' ≤ n, by sampling examples from D uniformly and with replacement. Because the sampling is done with replacement, some examples are likely to be repeated in each D_i. If n' = n, then for large n the set D_i is expected to contain about 63.2% (that is, 1 − 1/e) of the unique examples of D, the rest being duplicates. This kind of sample is known as a bootstrap sample. The m models are fitted using the m bootstrap samples and combined by averaging the output (for regression) or voting (for classification).
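The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `bootstrap_sample`, `bag`, and `bagged_predict` are hypothetical, and a fitted "model" is assumed to be any callable that maps an input to a prediction. The last lines check the 63.2% figure empirically. A given example is missed by all n draws with probability (1 − 1/n)^n, which tends to 1/e ≈ 0.368, so about 0.632 of the examples should appear at least once.

```python
import random
from collections import Counter

def bootstrap_sample(data, size=None, rng=random):
    """Draw `size` items from `data` uniformly and with replacement."""
    size = len(data) if size is None else size
    return [rng.choice(data) for _ in range(size)]

def bag(data, fit, m, rng=random):
    """Fit m models, each on its own bootstrap sample of `data`.

    `fit` is assumed to take a training set and return a callable model.
    """
    return [fit(bootstrap_sample(data, rng=rng)) for _ in range(m)]

def bagged_predict(models, x, task="regression"):
    """Combine the models: average for regression, majority vote otherwise."""
    outputs = [model(x) for model in models]
    if task == "regression":
        return sum(outputs) / len(outputs)
    return Counter(outputs).most_common(1)[0][0]

# Empirical check of the unique-example fraction for n' = n.
rng = random.Random(0)
D = list(range(10_000))
sample = bootstrap_sample(D, rng=rng)
unique_fraction = len(set(sample)) / len(D)  # close to 1 - 1/e
```

Any base learner can be plugged in through `fit`, for example a function that trains a decision tree and returns its prediction function.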
Since the method averages several predictors, and the average of several linear predictors is itself a linear predictor, it is not useful for improving linear models.
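A small experiment illustrates the point about linear models. The sketch below assumes synthetic data generated from a straight line with Gaussian noise (not data from the article) and uses a hypothetical helper `fit_line` for ordinary least squares. It averages the lines fitted to 200 bootstrap samples and compares the result with the line fitted once to the full data.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(1)

# Synthetic data: y = 2x + 1 plus noise (an assumption for illustration).
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2.0 * x + 1.0 + rng.gauss(0, 1)))

# Single fit on the full training set.
full_slope, full_intercept = fit_line(data)

# Bagged fit: average the coefficients of 200 bootstrap fits.
# Averaging linear predictions is the same as averaging their coefficients.
fits = [fit_line([rng.choice(data) for _ in data]) for _ in range(200)]
bagged_slope = sum(s for s, _ in fits) / len(fits)
bagged_intercept = sum(b for _, b in fits) / len(fits)
```

Under these assumptions the bagged coefficients land very close to the full-data coefficients, so the extra fits buy essentially nothing.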
Example: Ozone data
This example is rather artificial, but illustrates the basic principles of bagging.
Rousseeuw and Leroy (1986) describe a data set concerning ozone levels. The data are available via the classic data sets page. All computations were performed in R.
A scatter plot reveals an apparently non-linear relationship between temperature and ozone. One way to model the relationship is to use a LOESS smoother, which requires choosing a span parameter. In this example, a span of 0.5 was used.
One hundred bootstrap samples of the data were taken, and the LOESS smoother was fit to each sample. Predictions from these 100 smoothers were then made across the range of the data. The first 10 predicted smooth fits appear as grey lines in the figure below. The lines are clearly very "wiggly" and they overfit the data, a result of the span being too low.
The red line on the plot below represents the mean of the 100 smoothers. Clearly, the mean is more stable and overfits less. This is the bagged predictor.
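The experiment can be approximated in Python as a rough sketch. Several things here are assumptions rather than the original computation: the data are a synthetic stand-in for the temperature and ozone measurements, `local_linear_smoother` is a simplified tricube-weighted local linear fit written for this illustration (it is not R's `loess`), and `roughness` is a hypothetical wiggliness measure based on squared second differences.

```python
import math
import random

def local_linear_smoother(points, span=0.5):
    """Return x0 -> fitted value from a tricube-weighted local linear fit
    over the nearest `span` fraction of the points (a LOESS-like sketch)."""
    k = max(2, math.ceil(span * len(points)))

    def predict(x0):
        nearest = sorted(points, key=lambda p: abs(p[0] - x0))[:k]
        h = max(abs(x - x0) for x, _ in nearest) or 1.0
        w = [(1 - (abs(x - x0) / h) ** 3) ** 3 for x, _ in nearest]
        sw = sum(w)
        if sw == 0:  # all neighbours at the window edge: plain mean
            return sum(y for _, y in nearest) / len(nearest)
        mx = sum(wi * x for wi, (x, _) in zip(w, nearest)) / sw
        my = sum(wi * y for wi, (_, y) in zip(w, nearest)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, (x, _) in zip(w, nearest))
        if sxx < 1e-12:  # no spread in x: fall back to the weighted mean
            return my
        sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y) in zip(w, nearest))
        return my + sxy / sxx * (x0 - mx)

    return predict

rng = random.Random(0)

# Synthetic stand-in for the data (NOT the Rousseeuw and Leroy data set).
data = []
for _ in range(100):
    temp = rng.uniform(60, 95)
    data.append((temp, 0.05 * (temp - 60) ** 2 + rng.gauss(0, 8)))

grid = [60 + 35 * i / 24 for i in range(25)]

# Fit one smoother per bootstrap sample and predict across the grid.
curves = []
for _ in range(100):
    sample = [rng.choice(data) for _ in range(len(data))]
    smoother = local_linear_smoother(sample, span=0.5)
    curves.append([smoother(x) for x in grid])

# The bagged predictor is the pointwise mean of the 100 curves.
bagged = [sum(c[j] for c in curves) / len(curves) for j in range(len(grid))]

def roughness(curve):
    """Sum of squared second differences; larger means more wiggly."""
    return sum((curve[i - 1] - 2 * curve[i] + curve[i + 1]) ** 2
               for i in range(1, len(curve) - 1))

mean_individual_roughness = sum(roughness(c) for c in curves) / len(curves)
```

Because this roughness measure is convex, the roughness of the averaged curve can never exceed the average roughness of the individual curves, which matches the visual impression that the mean is more stable than the separate fits.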
Bagging (bootstrap aggregating) was proposed by Leo Breiman in 1994 to improve classification by combining classifications of randomly generated training sets. See Breiman, 1994, Technical Report No. 421.
* Cite journal
author = Leo Breiman
title = Bagging predictors
journal = Machine Learning
volume = 24
issue = 2
pages = 123–140
year = 1996
url = http://citeseer.ist.psu.edu/breiman96bagging.html
doi = 10.1007/BF00058655
* Cite journal
author = S. Kotsiantis, P. Pintelas
title = Combining Bagging and Boosting
journal = International Journal of Computational Intelligence
volume = 1
issue = 4
pages = 324–333
year = 2004
url = http://www.math.upatras.gr/~esdlab/en/members/kotsiantis/ijci%20paper%20kotsiantis.pdf