MATLAB File Help: cv.Boost
Boosting
Boosted tree classifier derived from cv.DTrees.
A common machine learning task is supervised learning. In supervised
learning, the goal is to learn the functional relationship F: y = F(x)
between the input x and the output y. Predicting the qualitative output
is called classification, while predicting the quantitative output is
called regression.
Boosting is a powerful learning concept that provides a solution to the supervised classification learning task. It combines the performance of many weak classifiers to produce a powerful committee [HTF01]. A weak classifier is only required to be better than chance, and thus can be very simple and computationally inexpensive. Yet many of them, smartly combined, yield a strong classifier that often outperforms most monolithic strong classifiers such as SVMs and neural networks.
Decision trees are the most popular weak classifiers used in boosting schemes. Often the simplest decision trees with only a single split node per tree (called stumps) are sufficient.
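For instance, a stump-based ensemble can be requested simply by limiting the tree depth. The following is a minimal sketch; the specific property values are illustrative assumptions, not recommendations:

```matlab
% Request decision stumps as weak classifiers: a single split per tree.
% The property values here are illustrative choices only.
model = cv.Boost();
model.MaxDepth = 1;     % one split node per tree, i.e. stumps
model.WeakCount = 100;  % size of the boosted committee
```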
The boosted model is based on N training examples (x_i, y_i), i = 1..N,
with x_i in R^K and y_i in {-1,+1}. Each x_i is a K-component vector whose
components encode features relevant to the learning task at hand. The
desired two-class output is encoded as -1 and +1.
Different variants of boosting are known as Discrete AdaBoost, Real
AdaBoost, LogitBoost, and Gentle AdaBoost [FHT98]. All of them are very
similar in their overall structure. Therefore, this section focuses only
on the standard two-class Discrete AdaBoost algorithm, outlined below.
Initially the same weight is assigned to each sample (step 2). Then, a
weak classifier f_m(x) is trained on the weighted training data
(step 3a). Its weighted training error and scaling factor c_m are
computed (step 3b). The weights are increased for training samples that
have been misclassified (step 3c). All weights are then normalized, and
the process of finding the next weak classifier continues for another
M-1 iterations. The final classifier F(x) is the sign of the weighted sum
over the individual weak classifiers (step 4).
Two-class Discrete AdaBoost algorithm:

1. Given N examples (x_i, y_i), i = 1..N, with x_i in R^K and y_i in {-1,+1}.
2. Assign the weights w_i = 1/N, i = 1..N.
3. Repeat for m = 1,2,...,M:
   a. Fit the classifier f_m(x) in {-1,+1}, using the weights w_i on the
      training data.
   b. Compute err_m = E_w[1_(y != f_m(x))] and c_m = log((1 - err_m)/err_m).
   c. Set w_i = w_i * exp[c_m * 1_(y_i != f_m(x_i))], i = 1,2,...,N, and
      renormalize so that sum_i w_i = 1.
4. Classify new samples x using the formula: sign(sum_{m=1..M} c_m * f_m(x)).

Note: Similar to the classical boosting methods, the current
implementation supports two-class classifiers only. For M > 2 classes,
there is the AdaBoost.MH algorithm (described in [FHT98]) that reduces
the problem to the two-class problem, yet with a much larger training
set.
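To make the steps above concrete, here is a minimal, self-contained MATLAB sketch of the Discrete AdaBoost loop with decision stumps. It is purely illustrative and is not the cv.Boost implementation; all function and variable names (adaboostTrain, adaboostPredict, fitStump, X, y, M) are assumptions made for this example.

```matlab
% Illustrative Discrete AdaBoost with decision stumps (not cv.Boost itself).
% X is an N-by-K matrix of samples, y an N-by-1 vector in {-1,+1},
% and M is the number of boosting rounds.
function [stumps, c] = adaboostTrain(X, y, M)
    N = size(X, 1);
    w = ones(N, 1) / N;                      % step 2: equal initial weights
    stumps = struct('dim', {}, 'thr', {}, 'sgn', {});
    c = zeros(M, 1);
    for m = 1:M                              % step 3
        [stump, pred] = fitStump(X, y, w);   % step 3a: weighted weak learner
        err = sum(w .* (pred ~= y));         % step 3b: weighted error
        err = min(max(err, eps), 1 - eps);   % guard against log(0)
        c(m) = log((1 - err) / err);         % step 3b: scaling factor
        w = w .* exp(c(m) * (pred ~= y));    % step 3c: up-weight mistakes
        w = w / sum(w);                      % step 3c: renormalize
        stumps(m) = stump;
    end
end

function yhat = adaboostPredict(stumps, c, X)
    % step 4: sign of the weighted sum of the weak classifiers
    F = zeros(size(X, 1), 1);
    for m = 1:numel(stumps)
        F = F + c(m) * stumpEval(stumps(m), X);
    end
    yhat = sign(F);
    yhat(yhat == 0) = 1;
end

function p = stumpEval(s, X)
    % Evaluate a single-split stump on all samples.
    p = s.sgn * sign(X(:, s.dim) - s.thr);
    p(p == 0) = s.sgn;
end

function [best, pred] = fitStump(X, y, w)
    % Exhaustive search for the single split minimizing the weighted error.
    bestErr = inf;
    for dim = 1:size(X, 2)
        for thr = unique(X(:, dim))'
            for sgn = [-1, 1]
                s = struct('dim', dim, 'thr', thr, 'sgn', sgn);
                p = stumpEval(s, X);
                e = sum(w .* (p ~= y));
                if e < bestErr
                    bestErr = e;
                    best = s;
                    pred = p;
                end
            end
        end
    end
end
```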
To reduce computation time for boosted models without substantially
losing accuracy, the influence trimming technique can be employed. As
the training algorithm proceeds and the number of trees in the ensemble
grows, a larger number of the training samples are classified correctly
and with increasing confidence, and those samples therefore receive
smaller weights on subsequent iterations. Examples with a very low
relative weight have a small impact on weak classifier training, so
such examples may be excluded during the weak classifier training
without much effect on the induced classifier. This process is
controlled with the WeightTrimRate parameter: only the examples whose
summed weights make up the fraction WeightTrimRate of the total weight
mass are used in the weak classifier training. Note that the weights for
all training examples are recomputed at each training iteration, so
examples excluded at a particular iteration may be used again when
learning further weak classifiers [FHT98].
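As a small illustration, trimming is controlled entirely through this property. The value 0.95 is the commonly documented default, and a value of 0 is assumed here (per the underlying OpenCV documentation) to disable trimming:

```matlab
model = cv.Boost();
model.WeightTrimRate = 0.95;  % train on samples covering 95% of the weight mass
% model.WeightTrimRate = 0;   % a value of 0 turns influence trimming off
```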
To classify a new sample, use the cv.Boost.predict method. Pass
RawOutput=true to get the raw sum from the Boost classifier instead of
the class label.
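A minimal end-to-end sketch follows, using only the members documented below (train, predict, BoostType, WeakCount); the toy data and the chosen option values are assumptions for illustration:

```matlab
% Train a boosted tree classifier on a toy two-class problem.
X = [randn(50,2) - 1; randn(50,2) + 1];   % 100 samples, 2 features
Y = [-ones(50,1); ones(50,1)];            % two-class labels

model = cv.Boost();
model.BoostType = 'Real';                 % boosting variant
model.WeakCount = 50;                     % number of weak classifiers
model.train(X, Y);

labels = model.predict(X);                    % predicted class labels
sums   = model.predict(X, 'RawOutput',true);  % raw sum from the classifier
```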
[HTF01]:
Trevor Hastie, Robert Tibshirani, and Jerome Friedman. "The Elements of Statistical Learning: Data Mining, Inference, and Prediction". Springer-Verlag, New York, 2001.
[FHT98]:
Jerome Friedman, Trevor Hastie, and Robert Tibshirani. "Additive Logistic Regression: a Statistical View of Boosting". Technical Report, Dept. of Statistics, Stanford University, 1998.
Superclasses | handle |
Sealed | false |
Construct on load | false |
Boost | Creates/trains a new Boost model |
BoostType | Type of the boosting algorithm. |
CVFolds | If `CVFolds > 1` then the algorithm prunes the built decision tree using K-fold cross-validation. |
MaxCategories | Cluster possible values of a categorical variable into `K <= MaxCategories` clusters to find a suboptimal split. |
MaxDepth | The maximum possible depth of the tree. |
MinSampleCount | If the number of samples in a node is less than this parameter then the node will not be split. |
Priors | The array of a priori class probabilities, sorted by the class label value. |
RegressionAccuracy | Termination criteria for regression trees. |
TruncatePrunedTree | If true then pruned branches are physically removed from the tree. |
Use1SERule | If true then a pruning will be harsher. |
UseSurrogates | If true then surrogate splits will be built. |
WeakCount | The number of weak classifiers. |
WeightTrimRate | A threshold between 0 and 1 used to save computational time. |
id | Object ID |
addlistener | Add listener for event. | |
calcError | Computes error on the training or test dataset | |
clear | Clears the algorithm state | |
delete | Destructor | |
empty | Returns true if the algorithm is empty | |
eq | == (EQ) Test handle equality. | |
findobj | Find objects matching specified conditions. | |
findprop | Find property of MATLAB handle object. | |
ge | >= (GE) Greater than or equal relation for handles. | |
getDefaultName | Returns the algorithm string identifier | |
getNodes | Returns all the nodes | |
getRoots | Returns indices of root nodes | |
getSplits | Returns all the splits | |
getSubsets | Returns all the bitsets for categorical splits | |
getVarCount | Returns the number of variables in training samples | |
gt | > (GT) Greater than relation for handles. | |
isClassifier | Returns true if the model is a classifier | |
isTrained | Returns true if the model is trained | |
Sealed | isvalid | Test handle validity. |
le | <= (LE) Less than or equal relation for handles. | |
load | Loads algorithm from a file or a string | |
lt | < (LT) Less than relation for handles. | |
ne | ~= (NE) Not equal relation for handles. | |
notify | Notify listeners of event. | |
predict | Predicts response(s) for the provided sample(s) | |
save | Saves the algorithm parameters to a file or a string | |
train | Trains a boosted tree classifier |