Towards #3204 - documentation for FeatureContributionCalculatingEstimator #3384

Merged · 3 commits · Apr 20, 2019

Changes from 1 commit
12 changes: 6 additions & 6 deletions src/Microsoft.ML.Data/Transforms/ExplainabilityCatalog.cs
@@ -14,10 +14,10 @@ namespace Microsoft.ML
public static class ExplainabilityCatalog
{
/// <summary>
-/// Feature Contribution Calculation computes model-specific contribution scores for each feature.
-/// Note that this functionality is not supported by all the models. See <see cref="FeatureContributionCalculatingTransformer"/> for a list of the suported models.
+/// Create a <see cref="FeatureContributionCalculatingEstimator"/> that computes model-specific contribution scores for
+/// each feature of the input vector.
/// </summary>
/// <param name="catalog">The model explainability operations catalog.</param>
/// <param name="catalog">The transforms catalog.</param>
/// <param name="predictionTransformer">A <see cref="ISingleFeaturePredictionTransformer{TModel}"/> that supports Feature Contribution Calculation,
/// and which will also be used for scoring.</param>
/// <param name="numberOfPositiveContributions">The number of positive contributions to report, sorted from highest magnitude to lowest magnitude.
@@ -40,10 +40,10 @@ public static FeatureContributionCalculatingEstimator CalculateFeatureContributi
=> new FeatureContributionCalculatingEstimator(CatalogUtils.GetEnvironment(catalog), predictionTransformer.Model, numberOfPositiveContributions, numberOfNegativeContributions, predictionTransformer.FeatureColumnName, normalize);

/// <summary>
-/// Feature Contribution Calculation computes model-specific contribution scores for each feature.
-/// Note that this functionality is not supported by all the models. See <see cref="FeatureContributionCalculatingTransformer"/> for a list of the suported models.
+/// Create a <see cref="FeatureContributionCalculatingEstimator"/> that computes model-specific contribution scores for
+/// each feature of the input vector.
/// </summary>
/// <param name="catalog">The model explainability operations catalog.</param>
/// <param name="catalog">The transforms catalog.</param>
/// <param name="predictionTransformer">A <see cref="ISingleFeaturePredictionTransformer{TModel}"/> that supports Feature Contribution Calculation,
/// and which will also be used for scoring.</param>
/// <param name="numberOfPositiveContributions">The number of positive contributions to report, sorted from highest magnitude to lowest magnitude.
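For orientation, a minimal usage sketch of the extension method being documented here; this is an editor's illustration, not part of the diff, and the data, column names, and choice of trainer are assumptions.

```csharp
using Microsoft.ML;

// A minimal sketch, assuming an IDataView `data` that already has a "Label"
// column and a numeric "Features" vector column.
var mlContext = new MLContext();

// Train a model that supports Feature Contribution Calculation
// (here, SDCA regression; see the transformer remarks for the full list).
var model = mlContext.Regression.Trainers.Sdca(
        labelColumnName: "Label",
        featureColumnName: "Features")
    .Fit(data);

// Create the estimator; the prediction transformer supplies both the model
// whose contributions are computed and the feature column to read.
var fcc = mlContext.Transforms.CalculateFeatureContribution(
    model,
    numberOfPositiveContributions: 5,
    numberOfNegativeContributions: 5);

// Score the data, then append a "FeatureContributions" vector column.
var scored = model.Transform(data);
var withContributions = fcc.Fit(scored).Transform(scored);
```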
@@ -26,47 +26,8 @@
namespace Microsoft.ML.Transforms
{
/// <summary>
-/// The FeatureContributionCalculationTransformer computes model-specific per-feature contributions to the score of each example.
-/// See the list of currently supported models below.
+/// <see cref="ITransformer"/> resulting from fitting a <see cref="FeatureContributionCalculatingEstimator"/>.
/// </summary>
-/// <remarks>
-/// <para>
-/// Scoring a dataset with a trained model produces a score, or prediction, for each example. To understand and explain these predictions
-/// it can be useful to inspect which features influenced them most significantly. FeatureContributionCalculationTransformer computes a model-specific
-/// list of per-feature contributions to the score for each example. These contributions can be positive (they make the score higher) or negative
-/// (they make the score lower).
-/// </para>
-/// <para>
-/// Feature Contribution Calculation is currently supported for the following models:
-/// Regression:
-/// OrdinaryLeastSquares, StochasticDualCoordinateAscent (SDCA), OnlineGradientDescent, PoissonRegression,
-/// GeneralizedAdditiveModels (GAM), LightGbm, FastTree, FastForest, FastTreeTweedie
-/// Binary Classification:
-/// AveragedPerceptron, LinearSupportVectorMachines, LogisticRegression, StochasticDualCoordinateAscent (SDCA),
-/// StochasticGradientDescent (SGD), SymbolicStochasticGradientDescent, GeneralizedAdditiveModels (GAM),
-/// FastForest, FastTree, LightGbm
-/// Ranking:
-/// FastTree, LightGbm
-/// </para>
-/// <para>
-/// For linear models, the contribution of a given feature is equal to the product of feature value times the corresponding weight. Similarly,
-/// for Generalized Additive Models (GAM), the contribution of a feature is equal to the shape function for the given feature evaluated at
-/// the feature value.
-/// </para>
-/// <para>
-/// For tree-based models, the calculation of feature contribution essentially consists in determining which splits in the tree have the most impact
-/// on the final score and assigning the value of the impact to the features determining the split. More precisely, the contribution of a feature
-/// is equal to the change in score produced by exploring the opposite sub-tree every time a decision node for the given feature is encountered.
-/// Consider a simple case with a single decision tree that has a decision node for the binary feature F1. Given an example that has feature F1
-/// equal to true, we can calculate the score it would have obtained if we chose the subtree corresponding to the feature F1 being equal to false
-/// while keeping the other features constant. The contribution of feature F1 for the given example is the difference between the original score
-/// and the score obtained by taking the opposite decision at the node corresponding to feature F1. This algorithm extends naturally to models with
-/// many decision trees.
-/// </para>
-/// <para>
-/// See the sample below for an example of how to compute feature importance using the FeatureContributionCalculatingTransformer.
-/// </para>
-/// </remarks>
public sealed class FeatureContributionCalculatingTransformer : OneToOneTransformerBase
{
internal sealed class Options : TransformInputBase
@@ -266,9 +227,57 @@ private Delegate GetValueGetter<TSrc>(DataViewRow input, int colSrc)
}

/// <summary>
-/// Estimator producing a FeatureContributionCalculatingTransformer which scores the model on an input dataset and
-/// computes model-specific contribution scores for each feature.
+/// Computes model-specific per-feature contributions to the score of each input vector.
+/// See the list of currently supported models below.
@sfilipi (Member) · Apr 18, 2019:

> See the list of currently supported models below.

I would remove this, because this line displays on the IntelliSense, and there won't be an option to see the remarks section there. #Resolved

+/// </summary>
+/// <remarks>
+/// <format type="text/markdown"><![CDATA[
+///
+/// ### Estimator Characteristics
+/// | | |
+/// | -- | -- |
+/// | Does this estimator need to look at the data to train its parameters? | No |
+/// | Input column data type | Vector of floats |
@natke (Contributor) · Apr 17, 2019:

@sfilipi @shmoradims were we going to use Single instead of float? #Resolved

@shmoradims · Apr 17, 2019 (in reply to 276422052):

Yes. Use System.Single instead of 'float'. 'float' is a C# keyword, not a .NET type, and F# uses different terminology: System.Single in xml, xref:System.Single in markdown. Same as above for 'double'.

(Member, in reply to 276470283): yep, that's my recollection too.
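To make the suggested convention concrete, a small sketch of how the type reference could be written in each flavor; this is my illustration, not text from the thread:

```csharp
// Sketch: refer to the .NET type name rather than the C# keyword.

/// In a plain XML doc comment:
/// <summary>The input column holds a vector of <see cref="System.Single"/>.</summary>

/// Inside a <format type="text/markdown"><![CDATA[ ... ]]></format> block,
/// the markdown cross-reference syntax applies instead:
/// | Input column data type | Vector of <xref:System.Single> |
```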

+/// | Output column data type | Vector of floats |
+///
+/// <para>
@natke (Contributor) · Apr 17, 2019:

This won't be processed or will cause an error, as you are inside a markdown block here. #Resolved

+/// Scoring a dataset with a trained model produces a score, or prediction, for each example. To understand and explain these predictions
+/// it can be useful to inspect which features influenced them most significantly. This transformer computes a model-specific
+/// list of per-feature contributions to the score for each example. These contributions can be positive (they make the score higher) or negative
+/// (they make the score lower).
+/// </para>
@natke (Contributor) · Apr 17, 2019:

Remove #Resolved
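To ground the paragraph above, a sketch of reading the per-example contributions back out, continuing the earlier usage sketch; the `ScoredRow` class is my own illustration, and "FeatureContributions" is the transformer's default output column name:

```csharp
using System;
using Microsoft.ML;

// Hypothetical row type matching the scored output.
public class ScoredRow
{
    public float Score { get; set; }
    public float[] FeatureContributions { get; set; }
}

// Positive entries pushed an example's score up; negative entries pushed it down.
foreach (var row in mlContext.Data.CreateEnumerable<ScoredRow>(
    withContributions, reuseRowObject: false))
{
    Console.WriteLine(
        $"Score {row.Score}: [{string.Join(", ", row.FeatureContributions)}]");
}
```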

+/// <para>
@natke (Contributor) · Apr 17, 2019:

Same #Resolved

+/// Feature Contribution Calculation is currently supported for the following models:
@natke (Contributor) · Apr 17, 2019:

You could make this a bulleted list in markdown:

Regression
  • OrdinaryLeastSquares
  • etc

Would also be good to get the algorithm names to be exactly the same as the name of the trainer classes #Resolved

+/// Regression:
+/// OrdinaryLeastSquares, StochasticDualCoordinateAscent (SDCA), OnlineGradientDescent, PoissonRegression,
+/// GeneralizedAdditiveModels (GAM), LightGbm, FastTree, FastForest, FastTreeTweedie
+/// Binary Classification:
+/// AveragedPerceptron, LinearSupportVectorMachines, LogisticRegression, StochasticDualCoordinateAscent (SDCA),
+/// StochasticGradientDescent (SGD), SymbolicStochasticGradientDescent, GeneralizedAdditiveModels (GAM),
+/// FastForest, FastTree, LightGbm
+/// Ranking:
+/// FastTree, LightGbm
@artidoro (Contributor) · Apr 17, 2019:

Unfortunately this does not seem to look good in the xml doc:
https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transforms.featurecontributioncalculatingtransformer?view=ml-dotnet

Could you find a way to itemize these points? Or another approach that will make it easier to read? #Resolved
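For illustration, one itemized shape that would render as a markdown list inside the CDATA block; the grouping is mine, and the trainer names are kept exactly as in the current text:

```csharp
/// Feature Contribution Calculation is currently supported for the following models:
///
/// - Regression: OrdinaryLeastSquares, StochasticDualCoordinateAscent (SDCA),
///   OnlineGradientDescent, PoissonRegression, GeneralizedAdditiveModels (GAM),
///   LightGbm, FastTree, FastForest, FastTreeTweedie
/// - Binary classification: AveragedPerceptron, LinearSupportVectorMachines,
///   LogisticRegression, StochasticDualCoordinateAscent (SDCA),
///   StochasticGradientDescent (SGD), SymbolicStochasticGradientDescent,
///   GeneralizedAdditiveModels (GAM), FastForest, FastTree, LightGbm
/// - Ranking: FastTree, LightGbm
```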

+/// </para>
+/// <para>
+/// For linear models, the contribution of a given feature is equal to the product of feature value times the corresponding weight. Similarly,
+/// for Generalized Additive Models (GAM), the contribution of a feature is equal to the shape function for the given feature evaluated at
+/// the feature value.
+/// </para>
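A worked toy instance of the linear-model rule just stated (weights, bias, and feature values invented):

```csharp
// For a linear model, contribution_i = weight_i * featureValue_i.
float[] weights  = { 0.5f, -2.0f, 1.5f };  // invented model weights
float[] features = { 4.0f,  1.0f, 0.0f };  // one example's feature values
float bias = 0.25f;

var contributions = new float[weights.Length];
float score = bias;
for (int i = 0; i < weights.Length; i++)
{
    contributions[i] = weights[i] * features[i];  // 2.0f, -2.0f, 0.0f
    score += contributions[i];
}
// score == 0.25f: feature 0 raised the score by 2.0, feature 1 lowered it
// by 2.0, and feature 2 had no effect.
```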
+/// <para>
+/// For tree-based models, the calculation of feature contribution essentially consists in determining which splits in the tree have the most impact
(Contributor):

This gives a good description of tree based models. Worth mentioning how it works for the other models?

(Author, in reply to 276423670):

The description for linear models and GAM is above. Do you think it should be more detailed?

+/// on the final score and assigning the value of the impact to the features determining the split. More precisely, the contribution of a feature
+/// is equal to the change in score produced by exploring the opposite sub-tree every time a decision node for the given feature is encountered.
+/// Consider a simple case with a single decision tree that has a decision node for the binary feature F1. Given an example that has feature F1
+/// equal to true, we can calculate the score it would have obtained if we chose the subtree corresponding to the feature F1 being equal to false
+/// while keeping the other features constant. The contribution of feature F1 for the given example is the difference between the original score
+/// and the score obtained by taking the opposite decision at the node corresponding to feature F1. This algorithm extends naturally to models with
+/// many decision trees.
+/// </para>
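A toy sketch of the opposite-subtree rule for the single-node case described above (leaf values invented):

```csharp
// One decision node testing the binary feature F1, with two leaf scores.
float leafIfF1True  = 3.0f;  // invented leaf values
float leafIfF1False = 1.0f;

bool f1 = true;  // the example has F1 == true
float originalScore = f1 ? leafIfF1True : leafIfF1False;  // 3.0f
float oppositeScore = f1 ? leafIfF1False : leafIfF1True;  // 1.0f

// The contribution of F1 is the original score minus the score obtained by
// taking the opposite decision at the F1 node, other features held constant.
float contributionF1 = originalScore - oppositeScore;     // +2.0f
// With many trees, sum this delta over every decision node that tests F1.
```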
+/// See the See Also section for links to examples of the usage.
@natke (Contributor) · Apr 17, 2019:

Not sure this line is necessary #Resolved

+/// ]]></format>
+/// </remarks>
+/// <seealso cref="ExplainabilityCatalog.CalculateFeatureContribution(TransformsCatalog, ISingleFeaturePredictionTransformer{ICalculateFeatureContribution}, int, int, bool)"/>
+/// <seealso cref="ExplainabilityCatalog.CalculateFeatureContribution{TModelParameters, TCalibrator}(TransformsCatalog, ISingleFeaturePredictionTransformer{Calibrators.CalibratedModelParametersBase{TModelParameters, TCalibrator}}, int, int, bool)"/>
public sealed class FeatureContributionCalculatingEstimator : TrivialEstimator<FeatureContributionCalculatingTransformer>
{
private readonly string _featureColumn;