Winnow (algorithm)
The Winnow algorithm [Littlestone, N. (1988). "Learning Quickly When Irrelevant Attributes Abound: A New Linear-threshold Algorithm". Machine Learning, 2(4), 285–318] is a technique from machine learning. It is closely related to the perceptron, but it uses a multiplicative weight-update scheme that allows it to perform much better than the perceptron when many dimensions are irrelevant (hence its name). It is not a sophisticated algorithm, but it scales well to high-dimensional spaces. During training, Winnow is shown a sequence of positive and negative examples, from which it learns a decision hyperplane. It can also be used in the online learning setting, where the learning phase is not clearly separated from the prediction phase.

The algorithm
The basic algorithm, Winnow1, is as follows. The instance space is $X = \{0,1\}^n$. The algorithm maintains non-negative weights $w_i$ for $i \in \{1, \ldots, n\}$, which are initially set to 1. When the learner is given an example $(x_1, \ldots, x_n)$, it applies the following prediction rule:
* If $\sum_{i=1}^n w_i x_i > \Theta$, then it predicts 1
* Otherwise it predicts 0

Here $\Theta$ is a real number called the threshold. Good bounds are obtained if $\Theta = n/2$.
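To make the rule concrete, here is a minimal Python sketch of the prediction step; the function name predict, the parameter names, and the example data are illustrative, not from the source:

```python
# A minimal sketch of the Winnow prediction rule, assuming a Boolean
# example is given as a list of 0/1 ints and weights as a list of floats.

def predict(weights, x, theta):
    """Predict 1 if the weighted sum of the features exceeds theta."""
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > theta else 0

# Example: n = 4 features, all weights start at 1, theta = n / 2 = 2.
weights = [1.0, 1.0, 1.0, 1.0]
print(predict(weights, [1, 0, 1, 1], theta=2.0))  # 3 > 2, predicts 1
print(predict(weights, [0, 1, 0, 0], theta=2.0))  # 1 <= 2, predicts 0
```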
The update rule is (loosely):
* If an example is correctly classified, do nothing.
* If an example is predicted to be 1 but the correct result was 0, all of the weights of the active features (those with $x_i = 1$) are set to zero (demotion step).
* If an example is predicted to be 0 but the correct result was 1, the weights of the active features are multiplied by $\alpha$ (promotion step).

A good value for $\alpha$ is 2.
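Putting the prediction and update rules together, the following is a minimal Winnow1 training sketch in Python; it assumes examples arrive as (vector, label) pairs, and all names and the toy data are illustrative:

```python
def winnow1(examples, n, alpha=2.0, theta=None):
    """Train Winnow1 on a sequence of (x, label) pairs, where each x is
    an n-dimensional 0/1 vector and label is 0 or 1."""
    if theta is None:
        theta = n / 2.0          # the threshold suggested above
    w = [1.0] * n                # all weights start at 1

    for x, label in examples:
        prediction = 1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else 0
        if prediction == label:
            continue             # correctly classified: do nothing
        if prediction == 1 and label == 0:
            # Demotion: set the weights of the active features to zero.
            w = [0.0 if xi == 1 else wi for wi, xi in zip(w, x)]
        else:
            # Promotion: multiply the weights of the active features by alpha.
            w = [wi * alpha if xi == 1 else wi for wi, xi in zip(w, x)]
    return w

# Toy target concept: x_1 OR x_3 (a 2-literal monotone disjunction, n = 5).
data = [
    ([1, 0, 0, 0, 0], 1),
    ([0, 1, 0, 1, 0], 0),
    ([0, 0, 1, 0, 1], 1),
    ([0, 0, 0, 0, 1], 0),
]
print(winnow1(data, n=5))
```

Because mistakes on positive examples multiply the relevant weights, the weights of the target disjunction's literals grow quickly, while demotions drive irrelevant weights down.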
Variations are also used. For example, Winnow2 is the same as Winnow1 except that in the demotion step the weights are multiplied by $\frac{1}{\alpha}$ instead of being set to 0.
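In terms of the sketch above, a hypothetical Winnow2-style demotion step would divide the active weights by $\alpha$ rather than zeroing them; the function name demote is illustrative:

```python
# Winnow2-style demotion: shrink the active weights instead of zeroing them.
def demote(w, x, alpha=2.0):
    return [wi / alpha if xi == 1 else wi for wi, xi in zip(w, x)]

print(demote([2.0, 1.0, 2.0], [1, 0, 1]))  # [1.0, 1.0, 1.0]
```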
Mistake bounds
If Winnow1 is run with $\alpha > 1$ and $\Theta \geq 1/\alpha$ on a target function that is a $k$-literal monotone disjunction given by $f(x_1, \ldots, x_n) = x_{i_1} \cup \cdots \cup x_{i_k}$, then for any sequence of instances the total number of mistakes is bounded by $\alpha k (\log_\alpha \Theta + 1) + \frac{n}{\Theta}$.
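For instance, with the parameter choices suggested above, $\alpha = 2$ and $\Theta = n/2$, the bound evaluates to $2k(\log_2(n/2) + 1) + \frac{n}{n/2} = 2k \log_2 n + 2$: the number of mistakes grows linearly in the number $k$ of relevant attributes but only logarithmically in the total number of attributes $n$, which is what makes Winnow robust to many irrelevant dimensions.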