### Training Algorithm Details
FastTree is an efficient implementation of the
[MART](https://arxiv.org/abs/1505.01866) gradient boosting algorithm. Gradient
boosting is a machine learning technique for regression problems. It builds each
regression tree in a step-wise fashion, using a predefined loss function to
measure the error at each step and correct for it in the next step. The
resulting prediction model is therefore an ensemble of weaker prediction models.
In regression problems, boosting builds a series of such trees in a step-wise
fashion and then selects the optimal tree using an arbitrary differentiable loss
function.

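Below is a minimal sketch of this step-wise procedure for the common case of
squared loss, where the negative gradient is simply the residual of the current
prediction. It uses scikit-learn's `DecisionTreeRegressor` as the weak learner
purely for illustration; it is not the FastTree implementation, and the function
names and parameters are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_fit(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Fit a gradient-boosted ensemble for regression with squared loss."""
    pred = np.zeros(len(y), dtype=float)
    trees = []
    for _ in range(n_trees):
        residual = y - pred                      # negative gradient of squared loss
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        pred += learning_rate * tree.predict(X)  # correct the error in the next step
        trees.append(tree)
    return trees

def boosted_predict(trees, X, learning_rate=0.1):
    # The ensemble prediction is the (scaled) sum of the individual tree outputs.
    return learning_rate * sum(tree.predict(X) for tree in trees)
```
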
MART learns an ensemble of regression trees, each of which is a decision tree
with scalar values in its leaves. A decision (or regression) tree is a binary
tree-like flow chart, where at each interior node one decides which of the two
child nodes to continue to based on one of the feature values from the input. At
each leaf node, a value is returned. In the interior nodes, the decision is
based on the test `x <= v`, where `x` is the value of a feature in the input
sample and `v` is one of the possible values of this feature. The functions that
can be produced by a regression tree are all piece-wise constant functions.

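The following is a small illustration of such a tree: interior nodes hold a
feature index and a threshold `v`, leaves hold scalar values, and evaluating the
tree walks the `x <= v` tests down to a leaf, yielding a piece-wise constant
function. The types and names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TreeNode:
    feature: Optional[int] = None       # index of the feature tested at this node
    threshold: float = 0.0              # the value v in the test x <= v
    left: Optional["TreeNode"] = None   # child taken when x[feature] <= threshold
    right: Optional["TreeNode"] = None  # child taken otherwise
    value: float = 0.0                  # scalar output when this node is a leaf

def tree_output(node: TreeNode, x) -> float:
    # Walk from the root to a leaf, following the x <= v decisions.
    while node.left is not None and node.right is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value

# Example: a one-split tree returning -0.5 when x[0] <= 1.0 and 0.5 otherwise.
stump = TreeNode(feature=0, threshold=1.0,
                 left=TreeNode(value=-0.5), right=TreeNode(value=0.5))
print(tree_output(stump, [0.3]))  # -0.5
```
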
The ensemble of trees is produced by computing, at each step, a regression tree
that approximates the negative gradient of the loss function, and adding it to
the existing ensemble with a coefficient chosen to minimize the loss of the
updated ensemble. The output of the ensemble produced by MART on a given
instance is the sum of the tree outputs.

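Written out, this is the standard gradient tree boosting update described in the
references below; the notation here is generic and not specific to FastTree.
With `F_m` the ensemble after `m` trees and `h_m` the regression tree fitted to
the pseudo-residuals `(x_i, r_i)`:

```math
\begin{aligned}
r_i &= -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}},\\
\gamma_m &= \arg\min_{\gamma} \sum_i L\big(y_i,\; F_{m-1}(x_i) + \gamma\, h_m(x_i)\big),\\
F_m(x) &= F_{m-1}(x) + \gamma_m\, h_m(x).
\end{aligned}
```
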
* In the case of a binary classification problem, the output is converted to a
  probability by using some form of calibration (see the sketch after this
  list).
* In the case of a regression problem, the output is the predicted value of the
  function.
* In the case of a ranking problem, the instances are ordered by the output
  value of the ensemble.

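Below is a minimal sketch of how the raw ensemble score could be consumed for
each of these tasks. The logistic (Platt-style) calibrator and its parameters
`a` and `b` are hypothetical placeholders, not FastTree's actual calibration.

```python
import numpy as np

def to_probability(score, a=-1.0, b=0.0):
    # Map a raw ensemble score to a probability with a logistic (Platt-style)
    # calibrator; a and b are hypothetical calibration parameters.
    return 1.0 / (1.0 + np.exp(a * score + b))

def rank_instances(scores):
    # Ranking: order instance indices by descending ensemble score.
    return np.argsort(-np.asarray(scores))

# Regression: the raw ensemble score itself is the predicted value.
```
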
For more information see:
* [Wikipedia: Gradient boosting (Gradient tree
boosting).](https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting)
* [Greedy function approximation: A gradient boosting
machine.](https://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aos/1013203451)