XML documentation for FastForest binary classification. #3399

Merged
merged 7 commits into from
Apr 19, 2019
Changes from 4 commits
31 changes: 31 additions & 0 deletions docs/api-reference/fastforest.md
@@ -0,0 +1,31 @@
### Training Algorithm Details
Decision trees are non-parametric models that perform a sequence of simple tests
on inputs. This decision procedure maps them to outputs found in the training
dataset whose inputs were similar to the instance being processed. A decision is
made at each node of the binary tree data structure based on a measure of
similarity that maps each instance recursively through the branches of the tree
until the appropriate leaf node is reached and the output decision returned.

Decision trees have several advantages:
* They are efficient in both computation and memory usage during training and
prediction.
* They can represent non-linear decision boundaries.
* They perform integrated feature selection and classification.
* They are resilient in the presence of noisy features.

Fast forest is a random forest implementation. The model consists of an
ensemble of decision trees. Each tree in the forest outputs a Gaussian
distribution by way of prediction. An aggregation is performed over the
ensemble of trees to find a Gaussian distribution closest to the combined
distribution for all trees in the model.

Generally, ensemble models provide better coverage and accuracy than single
decision trees.

For more information, see:
* [Wikipedia: Random forest](https://en.wikipedia.org/wiki/Random_forest)
* [Quantile regression
forest](http://jmlr.org/papers/volume7/meinshausen06a/meinshausen06a.pdf)
* [From Stumps to Trees to
Forests](https://blogs.technet.microsoft.com/machinelearning/2014/09/10/from-stumps-to-trees-to-forests/)
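
The prediction procedure described in this file can be sketched in a few lines of plain C#. This is a minimal illustration, not the FastForest implementation: the `Node` type, the helper names, and the choice of moment matching as the way to find the "closest" Gaussian to an equally weighted combination of per-tree Gaussians are all assumptions made for the example.

```csharp
using System;
using System.Linq;

public static class ForestSketch
{
    // Hypothetical node type: an internal node tests one feature against a
    // threshold; a leaf (Feature == -1) carries a Gaussian output.
    public sealed class Node
    {
        public int Feature = -1;
        public float Threshold;
        public Node Left, Right;       // Left is taken when x[Feature] <= Threshold
        public double Mean, Variance;  // Only meaningful on leaves
    }

    public static Node Leaf(double mean, double variance) =>
        new Node { Mean = mean, Variance = variance };

    public static Node Split(int feature, float threshold, Node left, Node right) =>
        new Node { Feature = feature, Threshold = threshold, Left = left, Right = right };

    // Apply the sequence of simple tests until a leaf is reached.
    public static Node Route(Node root, float[] x)
    {
        var node = root;
        while (node.Feature >= 0)
            node = x[node.Feature] <= node.Threshold ? node.Left : node.Right;
        return node;
    }

    // Assumption: combine the per-tree Gaussians as an equally weighted
    // mixture and return the single Gaussian with the same mean and variance.
    public static (double Mean, double Variance) Aggregate(Node[] trees, float[] x)
    {
        var leaves = trees.Select(t => Route(t, x)).ToArray();
        double mean = leaves.Average(l => l.Mean);
        double secondMoment = leaves.Average(l => l.Variance + l.Mean * l.Mean);
        return (mean, secondMoment - mean * mean);
    }

    public static void Main()
    {
        var trees = new[]
        {
            Split(0, 0.5f, Leaf(0.0, 1.0), Leaf(2.0, 1.0)),
            Split(1, 0.5f, Leaf(1.0, 1.0), Leaf(4.0, 1.0)),
        };
        var (mean, variance) = Aggregate(trees, new[] { 0.9f, 0.1f });
        Console.WriteLine($"mean={mean}, variance={variance}");
    }
}
```

Matching the first two moments is the standard way to pick the Gaussian closest to a mixture in the Kullback-Leibler sense; whether the trainer uses exactly this rule is not stated in the text above.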
18 changes: 16 additions & 2 deletions src/Microsoft.ML.FastTree/RandomForestClassification.cs
@@ -113,12 +113,26 @@ private static IPredictorProducing<float> Create(IHostEnvironment env, ModelLoad
/// <summary>
/// The <see cref="IEstimator{TTransformer}"/> for training a decision tree binary classification model using Fast Forest.
/// </summary>
-/// <include file='doc.xml' path='doc/members/member[@name="FastForest_remarks"]/*' />
+/// <remarks>
+/// <format type="text/markdown"><![CDATA[
+/// To create this trainer, use [FastForest](xref:Microsoft.ML.TreeExtensions.FastForest(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,System.String,System.String,System.String,System.Int32,System.Int32,System.Int32))
+/// or [FastForest(Options)](xref:Microsoft.ML.TreeExtensions.FastForest(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.Options)).
+///
+/// [!include[io](~/../docs/samples/docs/api-reference/io-columns-binary-classification.md)]
+///
+/// [!include[algorithm](~/../docs/samples/docs/api-reference/fastforest.md)]
@shmoradims shmoradims Apr 19, 2019

On `fastforest`: use the 'algo-details-xxx' pattern; in this case, 'algo-details-fastforest.md'. #Resolved

+/// ]]>
+/// </format>
+/// </remarks>
+/// <seealso cref="Microsoft.ML.TreeExtensions.FastForest(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,System.String,System.String,System.String,System.Int32,System.Int32,System.Int32)"/>
+/// <seealso cref="Microsoft.ML.TreeExtensions.FastForest(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.Options)"/>
+/// <seealso cref="Options"/>
public sealed partial class FastForestBinaryTrainer :
RandomForestTrainerBase<FastForestBinaryTrainer.Options, BinaryPredictionTransformer<FastForestBinaryModelParameters>, FastForestBinaryModelParameters>
{
/// <summary>
-/// Options for the <see cref="FastForestBinaryTrainer"/>.
+/// Options for the <see cref="FastForestBinaryTrainer"/> as used in
+/// [FastForest(Options)](xref:Microsoft.ML.TreeExtensions.FastForest(Microsoft.ML.BinaryClassificationCatalog.BinaryClassificationTrainers,Microsoft.ML.Trainers.FastTree.FastForestBinaryTrainer.Options)).
/// </summary>
public sealed class Options : FastForestOptionsBase
{
8 changes: 4 additions & 4 deletions src/Microsoft.ML.FastTree/TreeTrainersCatalog.cs
@@ -384,11 +384,11 @@ public static FastForestRegressionTrainer FastForest(this RegressionCatalog.Regr
}

/// <summary>
-/// Predict a target using a decision tree regression model trained with the <see cref="FastForestBinaryTrainer"/>.
+/// Creates <see cref="FastForestBinaryTrainer"/>, which predicts a target using a decision tree regression model.
/// </summary>
/// <param name="catalog">The <see cref="BinaryClassificationCatalog"/>.</param>
-/// <param name="labelColumnName">The name of the label column.</param>
-/// <param name="featureColumnName">The name of the feature column.</param>
+/// <param name="labelColumnName">The name of the label column. The column data must be <see cref="System.Boolean"/>.</param>
+/// <param name="featureColumnName">The name of the feature column. The column data must be a known-sized vector of <see cref="System.Single"/>.</param>
/// <param name="exampleWeightColumnName">The name of the example weight column (optional).</param>
/// <param name="numberOfTrees">Total number of decision trees to create in the ensemble.</param>
/// <param name="numberOfLeaves">The maximum number of leaves per decision tree.</param>
@@ -414,7 +414,7 @@ public static FastForestBinaryTrainer FastForest(this BinaryClassificationCatalo
}

/// <summary>
-/// Predict a target using a decision tree regression model trained with the <see cref="FastForestBinaryTrainer"/> and advanced options.
+/// Create <see cref="FastForestBinaryTrainer"/> with <see cref="FastForestBinaryTrainer.Options"/>, which predicts a target using a decision tree regression model.
@shmoradims shmoradims Apr 19, 2019

On `<see cref="FastForestBinaryTrainer.Options"/>`: just say "with advanced options" to be consistent with the template. #Closed

Member Author

Done.

/// </summary>
/// <param name="catalog">The <see cref="BinaryClassificationCatalog"/>.</param>
/// <param name="options">Trainer options.</param>
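
For orientation, the simple overload documented in this file might be called as follows. This is a hedged sketch that assumes a released ML.NET 1.x with the `Microsoft.ML` and `Microsoft.ML.FastTree` packages referenced; the `DataPoint` class, the synthetic data, and the hyperparameter values are hypothetical and exist only for illustration. Named arguments are used so the call does not depend on parameter order.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class FastForestUsageSketch
{
    // Hypothetical schema: a Boolean label and a known-sized vector of Single,
    // matching the column requirements stated in the parameter docs above.
    private class DataPoint
    {
        public bool Label { get; set; }

        [VectorType(2)]
        public float[] Features { get; set; }
    }

    public static double TrainAndEvaluate()
    {
        var mlContext = new MLContext(seed: 0);

        // Synthetic data: the label is true when the two features sum above 1.
        var rng = new Random(0);
        var samples = new List<DataPoint>();
        for (int i = 0; i < 200; i++)
        {
            float a = (float)rng.NextDouble();
            float b = (float)rng.NextDouble();
            samples.Add(new DataPoint { Label = a + b > 1f, Features = new[] { a, b } });
        }
        IDataView data = mlContext.Data.LoadFromEnumerable(samples);

        var trainer = mlContext.BinaryClassification.Trainers.FastForest(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfTrees: 50,
            numberOfLeaves: 10);

        var model = trainer.Fit(data);
        IDataView scored = model.Transform(data);

        // The trainer's output is not calibrated, so the non-calibrated
        // evaluator is used here.
        var metrics = mlContext.BinaryClassification.EvaluateNonCalibrated(scored);
        return metrics.Accuracy;
    }

    public static void Main()
    {
        Console.WriteLine($"Training-set accuracy: {TrainAndEvaluate():F2}");
    }
}
```

Evaluating on the training data only shows that the pipeline runs end to end; a held-out split would be needed for a meaningful accuracy estimate.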