MATLAB File Help: cv.RTrees
Random Trees
The class implements the random forest predictor.
Random trees were introduced by [BreimanCutler]. The algorithm can deal with both classification and regression problems. Random trees are a collection (ensemble) of tree predictors that is called a forest further in this section (the term was also introduced by L. Breiman). Classification works as follows: the random trees classifier takes the input feature vector, classifies it with every tree in the forest, and outputs the class label that received the majority of votes. In case of regression, the classifier response is the average of the responses over all the trees in the forest.
All the trees are trained with the same parameters but on different
training sets. These sets are generated from the original training set
using the bootstrap procedure: for each training set, you randomly
select the same number of vectors as in the original set (= N). The
vectors are chosen with replacement. That is, some vectors will occur
more than once and some will be absent. At each node of each trained
tree, not all the variables are used to find the best split, but only a
random subset of them. A new subset is generated for each node; however,
its size is fixed for all the nodes and all the trees. It is a training
parameter, set to sqrt(#variables) by default. None of the built trees
are pruned.
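For illustration, here is a minimal training sketch using the mexopencv class documented on this page. The data is synthetic, the exact train/predict argument layout is assumed, and setting ActiveVarCount to 0 is assumed to select the sqrt(#variables) default following the OpenCV convention:

    % synthetic data for illustration only: 150 samples, 4 features, 3 classes
    samples = single(rand(150,4));
    labels  = int32(randi(3, [150 1]));

    model = cv.RTrees();
    model.ActiveVarCount = 0;      % assumed: 0 means the default sqrt(#variables)
    model.MaxDepth = 10;
    model.train(samples, labels);  % each tree is grown on its own bootstrap sample

    % classification by majority vote over all trees in the forest
    pred = model.predict(samples(1:5,:));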
In random trees there is no need for any accuracy estimation procedures,
such as cross-validation or bootstrap, or a separate test set to get an
estimate of the training error. The error is estimated internally during
the training. When the training set for the current tree is drawn by
sampling with replacement, some vectors are left out (so-called oob, or
out-of-bag, data). The size of the oob data is about N/3. The
classification error is estimated using this oob data: each oob vector
is classified only by the trees that did not use it for training, and
the error estimate is the proportion of oob vectors that are
misclassified (for regression, the averaged squared oob error is used
instead).
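Besides the internal oob estimate, the calcError method listed below computes the error on a dataset you supply. A hedged sketch, assuming the (samples, responses, test-flag) argument order of the wrapper; testSamples and testLabels are placeholders for your own held-out data:

    % error over the supplied data (test flag = false evaluates the
    % training subset of the dataset constructed from these arguments)
    trainErr = model.calcError(samples, labels, false);
    % the same call on held-out data gives a test-set estimate
    testErr  = model.calcError(testSamples, testLabels, false);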
[BreimanCutler]:
Leo Breiman and Adele Cutler: http://www.stat.berkeley.edu/users/breiman/RandomForests/
[1]:
Machine Learning, Wald I, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-1.pdf
[2]:
Looking Inside the Black Box, Wald II, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-2.pdf
[3]:
Software for the Masses, Wald III, July 2002. http://stat-www.berkeley.edu/users/breiman/wald2002-3.pdf
[4]:
And other articles from the web site http://www.stat.berkeley.edu/users/breiman/RandomForests/cc_home.htm
Superclasses | handle |
Sealed | false |
Construct on load | false |
RTrees | Creates/trains a new Random Trees model |
ActiveVarCount | The size of the randomly selected subset of features at each tree node. |
CVFolds | If `CVFolds > 1` then the algorithm prunes the built decision tree using K-fold cross-validation. |
CalculateVarImportance | Whether to compute the variable importance. |
MaxCategories | Cluster possible values of a categorical variable into `K <= MaxCategories` clusters to find a suboptimal split. |
MaxDepth | The maximum possible depth of the tree. |
MinSampleCount | If the number of samples in a node is less than this parameter then the node will not be split. |
Priors | The array of a priori class probabilities, sorted by the class label value. |
RegressionAccuracy | Termination criteria for regression trees. |
TermCriteria | The termination criteria that specify when the training algorithm stops. |
TruncatePrunedTree | If true then pruned branches are physically removed from the tree. |
Use1SERule | If true then pruning will be harsher. |
UseSurrogates | If true then surrogate splits will be built. |
id | Object ID |
addlistener | Add listener for event. | |
calcError | Computes error on the training or test dataset | |
clear | Clears the algorithm state | |
delete | Destructor | |
empty | Returns true if the algorithm is empty | |
eq | == (EQ) Test handle equality. | |
findobj | Find objects matching specified conditions. | |
findprop | Find property of MATLAB handle object. | |
ge | >= (GE) Greater than or equal relation for handles. | |
getDefaultName | Returns the algorithm string identifier | |
getNodes | Returns all the nodes | |
getRoots | Returns indices of root nodes | |
getSplits | Returns all the splits | |
getSubsets | Returns all the bitsets for categorical splits | |
getVarCount | Returns the number of variables in training samples | |
getVarImportance | Returns the variable importance array | |
gt | > (GT) Greater than relation for handles. | |
isClassifier | Returns true if the model is a classifier | |
isTrained | Returns true if the model is trained | |
isvalid | Test handle validity. (Sealed) | |
le | <= (LE) Less than or equal relation for handles. | |
load | Loads algorithm from a file or a string | |
lt | < (LT) Less than relation for handles. | |
ne | ~= (NE) Not equal relation for handles. | |
notify | Notify listeners of event. | |
predict | Predicts response(s) for the provided sample(s) | |
save | Saves the algorithm parameters to a file or a string | |
train | Trains the Random Trees model |
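Putting the pieces together, a hedged end-to-end sketch that uses only the properties and methods listed above; the synthetic data, the file name, and the TermCriteria struct layout are assumptions made for illustration:

    % synthetic two-class problem: 100 samples, 5 features
    X = single(rand(100,5));
    y = int32(X(:,1) > 0.5);

    model = cv.RTrees();
    model.MaxDepth = 8;
    model.CalculateVarImportance = true;   % must be set before training so that
                                           % getVarImportance returns values
    % assumed struct layout: stop after 50 trees or a small change in error
    model.TermCriteria = struct('type','Count+EPS', 'maxCount',50, 'epsilon',0.01);
    model.train(X, y);

    imp  = model.getVarImportance();       % per-feature importance scores
    pred = model.predict(X);               % majority-vote class labels

    model.save('rtrees_model.xml');        % persist the trained forest...
    model2 = cv.RTrees();
    model2.load('rtrees_model.xml');       % ...and reload it later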