-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Adding a binary classification PFI Example #1793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding a binary classification PFI Example #1793
Conversation
docs/samples/Microsoft.ML.Samples/Dynamic/PermutationFeatureImportance.cs
Outdated
Show resolved
Hide resolved
@@ -5,57 +5,22 @@ | |||
|
|||
namespace Microsoft.ML.Samples.Dynamic | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should leave the samples in this project small and simple; they are meant to illustrate the API usage; not give a complete overview/tutorial on how to solve a problem.
Longer, more complete samples can go in the machinelarning-samples repo.
It might be overwhelming, and hard to digest, to see a long, long example in a page that is supposed to just tell you how to call the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this conceptual separation - that short code snippets to illustrate the API show up on the docs site, and that end-to-end samples go in the separate repo. Is there a way that we can add links to relevant end-to-end samples in the docs site though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this. However, PFI
is a bit challenging because the return values also need to be explained. What I've done here is try to make these short and easy, while showing how to use the API and interpret the results.
The latest commits (with rebase) refactor each task (e.g. Regression
, BinaryClassification
) into its own short file, adding a PfiHelper
class to hold common things like data loading and linear-model-weight extraction.
One thing that adds to the length is a discussion of how to interpret the results. I believe these are appropriate for the API because these are the first sorts of questions people ask when you present PFI to them.
What do you think @sfilipi @montebhoover and @glebuk ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this - the separation of the files makes each sample more manageable. I'd be curious if the prose explanation at the end of the sample would be better suited to show up in the docs outside of the example code, like in some sort of remarks section, but that is a minor improvement in my mind and we could make that later if desired.
docs/samples/Microsoft.ML.Samples/Dynamic/PermutationFeatureImportance.cs
Outdated
Show resolved
Hide resolved
@@ -5,57 +5,22 @@ | |||
|
|||
namespace Microsoft.ML.Samples.Dynamic | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this conceptual separation - that short code snippets to illustrate the API show up on the docs site, and that end-to-end samples go in the separate repo. Is there a way that we can add links to relevant end-to-end samples in the docs site though?
docs/samples/Microsoft.ML.Samples/Dynamic/PermutationFeatureImportance.cs
Outdated
Show resolved
Hide resolved
1dc2423
to
9a2ff1a
Compare
Note: Rebased to bring in changes from |
using System; | ||
using System.Linq; | ||
|
||
namespace Microsoft.ML.Samples.Dynamic.PermutationFeatureImportance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you need to merge with master anyway...
You already inside PermutationFeatureImportance namespace, I don't see much reason behind adding Pfi prefix to names, classes and methods.
|
||
namespace Microsoft.ML.Samples.Dynamic.PermutationFeatureImportance | ||
{ | ||
public class PfiHelper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PfiHelper [](start = 17, length = 9)
make it internal as well?
|
||
namespace Microsoft.ML.Samples.Dynamic.PermutationFeatureImportance | ||
{ | ||
public class PfiHelper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you add BinaryClassificationExample and Regression to our documentation.
Is it ok to have reference to helper class in documentation?
I doubt content of this file would be visible to user.
@rogancarr Is this still relevant and you just don't have time to update it, or this is obsolete now? |
This is relevant, but I'm waiting for the next PFI PR (#1844 ) to go through before I do merge fixes, since that will cause more merge issues. |
458c44d
to
c471bbb
Compare
This PR adds a binary classification example. I added it to the same file as the regression example, and refactored out the file loading so that the examples focus more on the technique and less on file-loading. I also added a discussion on random fluctuations in PFI values.
Fixes #1766