Towards #3204 - documentation for FeatureContributionCalculatingEstimator #3384
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -26,47 +26,8 @@ | |
namespace Microsoft.ML.Transforms | ||
{ | ||
/// <summary> | ||
/// The FeatureContributionCalculationTransformer computes model-specific per-feature contributions to the score of each example. | ||
/// See the list of currently supported models below. | ||
/// <see cref="ITransformer"/> resulting from fitting a <see cref="FeatureContributionCalculatingEstimator"/>. | ||
/// </summary> | ||
/// <remarks> | ||
/// <para> | ||
/// Scoring a dataset with a trained model produces a score, or prediction, for each example. To understand and explain these predictions | ||
/// it can be useful to inspect which features influenced them most significantly. FeatureContributionCalculationTransformer computes a model-specific | ||
/// list of per-feature contributions to the score for each example. These contributions can be positive (they make the score higher) or negative | ||
/// (they make the score lower). | ||
/// </para> | ||
/// <para> | ||
/// Feature Contribution Calculation is currently supported for the following models: | ||
/// Regression: | ||
/// OrdinaryLeastSquares, StochasticDualCoordinateAscent (SDCA), OnlineGradientDescent, PoissonRegression, | ||
/// GeneralizedAdditiveModels (GAM), LightGbm, FastTree, FastForest, FastTreeTweedie | ||
/// Binary Classification: | ||
/// AveragedPerceptron, LinearSupportVectorMachines, LogisticRegression, StochasticDualCoordinateAscent (SDCA), | ||
/// StochasticGradientDescent (SGD), SymbolicStochasticGradientDescent, GeneralizedAdditiveModels (GAM), | ||
/// FastForest, FastTree, LightGbm | ||
/// Ranking: | ||
/// FastTree, LightGbm | ||
/// </para> | ||
/// <para> | ||
/// For linear models, the contribution of a given feature is equal to the product of feature value times the corresponding weight. Similarly, | ||
/// for Generalized Additive Models (GAM), the contribution of a feature is equal to the shape function for the given feature evaluated at | ||
/// the feature value. | ||
/// </para> | ||
/// <para> | ||
/// For tree-based models, the calculation of feature contribution essentially consists in determining which splits in the tree have the most impact | ||
/// on the final score and assigning the value of the impact to the features determining the split. More precisely, the contribution of a feature | ||
/// is equal to the change in score produced by exploring the opposite sub-tree every time a decision node for the given feature is encountered. | ||
/// Consider a simple case with a single decision tree that has a decision node for the binary feature F1. Given an example that has feature F1 | ||
/// equal to true, we can calculate the score it would have obtained if we chose the subtree corresponding to the feature F1 being equal to false | ||
/// while keeping the other features constant. The contribution of feature F1 for the given example is the difference between the original score | ||
/// and the score obtained by taking the opposite decision at the node corresponding to feature F1. This algorithm extends naturally to models with | ||
/// many decision trees. | ||
/// </para> | ||
/// <para> | ||
/// See the sample below for an example of how to compute feature importance using the FeatureContributionCalculatingTransformer. | ||
/// </para> | ||
/// </remarks> | ||
public sealed class FeatureContributionCalculatingTransformer : OneToOneTransformerBase | ||
{ | ||
internal sealed class Options : TransformInputBase | ||
|
@@ -266,9 +227,57 @@ private Delegate GetValueGetter<TSrc>(DataViewRow input, int colSrc) | |
} | ||
|
||
/// <summary> | ||
/// Estimator producing a FeatureContributionCalculatingTransformer which scores the model on an input dataset and | ||
/// computes model-specific contribution scores for each feature. | ||
/// Computes model-specific per-feature contributions to the score of each input vector. | ||
/// See the list of currently supported models below. | ||
/// </summary> | ||
/// <remarks> | ||
/// <format type="text/markdown"><![CDATA[ | ||
||
/// | Output column data type | Vector of floats | | ||
/// | ||
/// <para> | ||
This won't be processed or will cause an error, as you are inside a markdown block here. #Resolved |
||
/// Scoring a dataset with a trained model produces a score, or prediction, for each example. To understand and explain these predictions | ||
/// it can be useful to inspect which features influenced them most significantly. This transformer computes a model-specific | ||
/// list of per-feature contributions to the score for each example. These contributions can be positive (they make the score higher) or negative | ||
/// (they make the score lower). | ||
/// </para> | ||
Remove #Resolved |
||
/// <para> | ||
Same #Resolved |
||
/// Feature Contribution Calculation is currently supported for the following models: | ||
You could make this a bulleted list in markdown
Would also be good to get the algorithm names to be exactly the same as the name of the trainer classes #Resolved |
||
/// Regression: | ||
/// OrdinaryLeastSquares, StochasticDualCoordinateAscent (SDCA), OnlineGradientDescent, PoissonRegression, | ||
/// GeneralizedAdditiveModels (GAM), LightGbm, FastTree, FastForest, FastTreeTweedie | ||
/// Binary Classification: | ||
/// AveragedPerceptron, LinearSupportVectorMachines, LogisticRegression, StochasticDualCoordinateAscent (SDCA), | ||
/// StochasticGradientDescent (SGD), SymbolicStochasticGradientDescent, GeneralizedAdditiveModels (GAM), | ||
/// FastForest, FastTree, LightGbm | ||
/// Ranking: | ||
/// FastTree, LightGbm | ||
Unfortunately, this does not look good in the XML doc. Could you find a way to itemize these points, or another approach that will make it easier to read? #Resolved |
||
/// </para> | ||
/// <para> | ||
/// For linear models, the contribution of a given feature is equal to the product of feature value times the corresponding weight. Similarly, | ||
/// for Generalized Additive Models (GAM), the contribution of a feature is equal to the shape function for the given feature evaluated at | ||
/// the feature value. | ||
/// </para> | ||
/// <para> | ||
/// For tree-based models, the calculation of feature contribution essentially consists in determining which splits in the tree have the most impact | ||
This gives a good description of tree-based models. Worth mentioning how it works for the other models?
The description for linear models and GAM is above. Do you think it should be more detailed? In reply to: 276423670 |
||
/// on the final score and assigning the value of the impact to the features determining the split. More precisely, the contribution of a feature | ||
/// is equal to the change in score produced by exploring the opposite sub-tree every time a decision node for the given feature is encountered. | ||
/// Consider a simple case with a single decision tree that has a decision node for the binary feature F1. Given an example that has feature F1 | ||
/// equal to true, we can calculate the score it would have obtained if we chose the subtree corresponding to the feature F1 being equal to false | ||
/// while keeping the other features constant. The contribution of feature F1 for the given example is the difference between the original score | ||
/// and the score obtained by taking the opposite decision at the node corresponding to feature F1. This algorithm extends naturally to models with | ||
/// many decision trees. | ||
/// </para> | ||
/// See the See Also section for links to examples of the usage. | ||
Not sure this line is necessary #Resolved |
||
/// ]]></format> | ||
/// </remarks> | ||
/// <seealso cref="ExplainabilityCatalog.CalculateFeatureContribution(TransformsCatalog, ISingleFeaturePredictionTransformer{ICalculateFeatureContribution}, int, int, bool)"/> | ||
/// <seealso cref="ExplainabilityCatalog.CalculateFeatureContribution{TModelParameters, TCalibrator}(TransformsCatalog, ISingleFeaturePredictionTransformer{Calibrators.CalibratedModelParametersBase{TModelParameters, TCalibrator}}, int, int, bool)"/> | ||
public sealed class FeatureContributionCalculatingEstimator : TrivialEstimator<FeatureContributionCalculatingTransformer> | ||
{ | ||
private readonly string _featureColumn; | ||
|
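For readers of the remarks in the diff, the linear-model rule (contribution equals feature value times the corresponding weight) can be illustrated with a minimal, self-contained sketch. This is not the ML.NET implementation; `LinearContributionSketch`, `Contributions`, and `Score` are hypothetical names introduced only for illustration, and the bias term is an assumption about how the score is assembled.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of ML.NET: illustrates the rule stated in the
// remarks that, for a linear model, each feature's contribution is the
// product of its value and its weight.
public static class LinearContributionSketch
{
    // Returns one contribution per feature: weights[i] * features[i].
    public static float[] Contributions(float[] weights, float[] features)
    {
        if (weights.Length != features.Length)
            throw new ArgumentException("weights and features must have the same length");
        return weights.Zip(features, (w, x) => w * x).ToArray();
    }

    // Assumed scoring rule for the sketch: bias plus the sum of contributions.
    public static float Score(float[] weights, float bias, float[] features)
        => bias + Contributions(weights, features).Sum();
}
```

With weights `{ 2, -1, 0.5 }` and feature values `{ 3, 4, 2 }`, the contributions are `{ 6, -4, 1 }`: the first and third features raise the score and the second lowers it. The catalog method referenced in the `seealso` tags also takes two `int` arguments and a `bool`, which this sketch does not model.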
I would remove this, because this line displays on the IntelliSense, and there won't be an option to see the remarks section there. #Resolved
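The tree-based description in the remarks (take the opposite branch at each decision node for a feature and measure the change in score) can be sketched for a single tree as follows. This is one reading of that paragraph, not the library's code: `Node`, `TreeContributionSketch`, and the "go left when the value is at or below the threshold" convention are assumptions made for the example.

```csharp
using System;

// Hypothetical single-tree representation used only for this sketch.
public sealed class Node
{
    public int Feature;       // index of the feature tested at this node (unused for leaves)
    public float Threshold;   // assumed convention: go left when x[Feature] <= Threshold
    public Node Left, Right;  // both null for a leaf
    public float Value;       // output of a leaf
    public bool IsLeaf => Left == null;
}

public static class TreeContributionSketch
{
    // Follows the example's own decisions from 'node' down to a leaf.
    public static float Score(Node node, float[] x)
    {
        while (!node.IsLeaf)
            node = x[node.Feature] <= node.Threshold ? node.Left : node.Right;
        return node.Value;
    }

    // For every decision node on the example's path, score the opposite
    // subtree (other features unchanged) and credit the difference between
    // the original score and that alternative score to the node's feature.
    public static float[] Contributions(Node root, float[] x)
    {
        var result = new float[x.Length];
        float original = Score(root, x);
        for (var node = root; !node.IsLeaf; )
        {
            bool goLeft = x[node.Feature] <= node.Threshold;
            Node taken = goLeft ? node.Left : node.Right;
            Node opposite = goLeft ? node.Right : node.Left;
            result[node.Feature] += original - Score(opposite, x);
            node = taken;
        }
        return result;
    }
}
```

For a tree whose root tests feature 0 at threshold 0.5 (left leaf 1, right subtree testing feature 1 at threshold 10 with leaves 3 and 5), the example `{ 1, 20 }` scores 5; flipping the root decision gives 1 and flipping the second decision gives 3, so the contributions are `{ 4, 2 }`. For an ensemble, one plausible extension is to sum these per-tree contributions, which is how the sketch interprets "extends naturally to models with many decision trees".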