Add pb method for PR curves #633

Conversation
This change adds a `pb` implementation for the PR curves summary, which, like all `pb` implementations, lets users generate summaries without having to use TensorFlow. Also modified the test for PR curve summaries to use small test cases that are easy for a developer to reason through instead of using the demo data. This allows us to use the compute_and_check_summary_pb paradigm for the PR curves summary test, just like for other plugins. Updated the summary module to include `pr_curve_pb`. Fixes tensorflow#445.
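To make the discussion below concrete, here is a framework-free sketch of the kind of computation a PR-curve `pb` path performs: bucket predictions into `num_thresholds` bins, take cumulative counts from the top bin down, and clamp denominators to avoid division by zero. This is an illustration only, not the plugin's actual code; the function name and signature are hypothetical.

```python
import numpy as np

def pr_curve_points(labels, predictions, num_thresholds=5):
    """Illustrative PR-curve computation over equal-width threshold bins.

    Entry i of each returned array corresponds to the threshold
    i / (num_thresholds - 1).
    """
    labels = np.asarray(labels, dtype=bool)
    predictions = np.asarray(predictions, dtype=np.float64)
    # Map each prediction in [0, 1] to a bucket index.
    bucket = np.minimum((predictions * (num_thresholds - 1)).astype(int),
                        num_thresholds - 1)
    tp_buckets = np.bincount(bucket[labels], minlength=num_thresholds)
    fp_buckets = np.bincount(bucket[~labels], minlength=num_thresholds)
    # Cumulative sums from the highest threshold downward: tp[i] counts
    # positives whose prediction lands in bucket i or above.
    tp = np.cumsum(tp_buckets[::-1])[::-1]
    fp = np.cumsum(fp_buckets[::-1])[::-1]
    fn = tp[0] - tp  # positives missed at each threshold
    # Clamping the denominator avoids 0/0 in empty bins.
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    return precision, recall
```

Because everything here is NumPy, no TensorFlow session is needed, which is the whole point of a `pb`-style implementation.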
Excellent. Thanks for writing this. Very glad that plugins now consistently implement `pb`, and that we can use `compute_and_check_summary_pb` here.
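The `compute_and_check_summary_pb` paradigm mentioned above amounts to computing the same summary two independent ways (directly via `pb`, and by evaluating the graph `op`) and asserting the results match. A minimal framework-free sketch of that testing pattern, with hypothetical stand-in functions rather than TensorBoard's actual API:

```python
def summary_direct(values):
    # Stand-in for a pb() function: direct, pure-Python computation.
    return {"max": max(values), "count": len(values)}

def summary_via_op(values):
    # Stand-in for evaluating the op version of the same summary,
    # deliberately implemented a second, independent way.
    result = {"max": values[0], "count": 0}
    for v in values:
        if v > result["max"]:
            result["max"] = v
        result["count"] += 1
    return result

def compute_and_check(values):
    direct = summary_direct(values)
    via_op = summary_via_op(values)
    # Analogous to self.assertProtoEquals(pb, pb_via_op) in the test.
    assert direct == via_op, (direct, via_op)
    return direct
```

The value of the pattern is that a bug would have to appear identically in both independent implementations to slip through.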
    self.assertProtoEquals(pb, pb_via_op)
    return pb

  def verify_float_arrays_are_equal(self, expected, gotten):
nit: Can we rename `gotten` to the more conventional term `actual`?
Done.
# graph. This set should ideally be empty; any entries here should be
# considered temporary.
PLUGINS_WITHOUT_PB_FUNCTIONS = frozenset([
    'pr_curve',  # TODO(@chihuahua, #445): Fix this.
Up to you whether to delete this set (changing the code below) or simply make it empty and leave the check in place. Either is fine with me.
Hmm ... I suppose if we now have a chance to pursue the ideal, how about we remove this structure, so that anyone who wishes to make an exception would have to reintroduce a similar structure ... more impetus for implementing a `pb` method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me!
- # division by 0. 1 suffices because counts of course must be whole numbers.
- _MINIMUM_COUNT = 1.0
+ # division by 0.
+ _MINIMUM_COUNT = 1e-7
I don't understand the reason for this change. Why does `1` no longer suffice?
Ah, I added a comment.
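The clamp under discussion guards divisions like precision = TP / (TP + FP) against thresholds where no example is predicted positive. A minimal sketch, assuming unweighted (whole-number) counts, where a floor of 1 is safe because any nonzero denominator is at least 1:

```python
import numpy as np

_MINIMUM_COUNT = 1.0  # safe floor when counts are whole numbers

tp = np.array([3.0, 1.0, 0.0])  # true positives per threshold
fp = np.array([1.0, 0.0, 0.0])  # false positives per threshold

# Without the clamp, the last entry would be 0/0 (NaN). With it,
# precision degrades gracefully to 0 at empty thresholds.
precision = tp / np.maximum(tp + fp, _MINIMUM_COUNT)
```

With fractional sample weights, counts can be positive but smaller than 1, which is presumably why a tiny epsilon like `1e-7` was considered; that change was reverted from this PR and deferred, as the thread below records.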
  """Creates a PR curves summary protobuf

  Arguments:
    tag: A name for the generated node. Will also serve as a series name in
Looks like other summaries call this `name` instead of `tag`. Can we stay consistent with them?
I renamed the `tag` parameter to `name` for summaries.
    weights=None,
    display_name=None,
    description=None):
  """Creates a PR curves summary protobuf
For consistency with other `pb` functions: "Create a PR curves summary protobuf." (imperative mood, full stop at end)
Done.
Fine with me once you fix failures (see Jenkins).

s/Jenkins/Travis

Ah done.
import tensorflow as tf

from tensorboard.plugins.pr_curve import metadata

  # A value that we use as the minimum value during division of counts to prevent
- # division by 0. 1 suffices because counts of course must be whole numbers.
- _MINIMUM_COUNT = 1.0
+ # division by 0. 1.0 does not suffice here because certain weights could push
Okay, this comment helps, but I still don't understand why the addition of the `pb` function forced this change. Or was `op` broken previously (and if so, would that not merit a separate change with regression tests)?
I reverted this - I'll send out a separate PR where this belongs.
Hmm, it doesn't look reverted in this code? I still see `_MINIMUM_COUNT = 1e-7` and the comment about 1.0 not sufficing. (commit=2e7f5a2f3776768075ca499635e0eb73b4c30653)
Oh you're right. Should be fixed now. Thanks!
    description=None):
  """Create a PR curves summary protobuf.

  Arguments:
Mentions of "python ints" and "constant `str`s" in the `pb` function seem potentially confusing; this is not in a TensorFlow context, so the distinction that you intend doesn't exist, and instead folks might wonder why they can't pass a string from a variable or something (they of course can). If you look at another summary's `pb` function, you'll see that the documentation is changed appropriately.

Suggested changes:

- `num_thresholds`: Optional […] metrics for. When provided, should be an `int` of value at least `2`. Defaults to `200`.
- `weights`: Optional `float` or float32 numpy array. […] This value must be […].
- `display_name`: […] as a `str`. […]
- `description`: […] as a `str`. […]
Done. Indeed, that seems much clearer, and using `python` could be confusing here because `pb` methods inherently don't rely on TensorFlow.
@@ -263,7 +336,7 @@ def compute_summary(tp, fp, tn, fn, collections):


def raw_data_op(
I think that we probably want to provide a `raw_data_pb`; do you agree? (Could be a separate PR if you want.)
SG. Yeah, maybe a separate PR? Just because this one's getting big, even though that work is related.
Sure, fine with me.
    # Test the output for the red classifier. The red classifier has the
    # narrowest standard deviation.
    tensor_events = accumulator.Tensors('red/pr_curves')
    self.validateTensorEvent(0, [
It looks like these extensive tests with nontrivial data actually went away. That worries me, especially because some of them are regression tests against previously-introduced numerical errors.

I understand that you removed these because the `pb` test can't directly rely on the output of `pr_curve_demo.run_all()`. But you could still get the data some other way: for instance, expose a function in `pr_curve_demo.py` that creates the salient ops (`color` and `labels`, it looks like), so that you can directly run them in this test and pass the results to `compute_and_check_summary_pb`.

Or get some nontrivial data in a way that does not involve using the demo. That'd be fine, too.
I added `test_exhaustive_random_values`; thoughts? I like how the small, crafted test cases are easy to reason through and explicitly cover certain (often corner) cases, though as you noted they are less realistic.
It's fine with me as long as this case suffices to cover regressions like the one fixed in #316.
Yes, this should cover that! That regression was overlooked because no test case exhibited a bin having more than one value within it.
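A randomized test in the spirit of `test_exhaustive_random_values` (a sketch with hypothetical names, not the PR's actual test) checks a vectorized binning implementation against a brute-force loop on enough random data that every bin holds many values, which is exactly the situation the earlier regression missed:

```python
import numpy as np

def count_per_bucket(predictions, num_buckets):
    """Vectorized: histogram of predictions over equal-width buckets."""
    indices = np.minimum(
        (predictions * num_buckets).astype(int), num_buckets - 1)
    return np.bincount(indices, minlength=num_buckets)

def count_per_bucket_slow(predictions, num_buckets):
    """Brute-force reference implementation, used only for testing."""
    counts = [0] * num_buckets
    for p in predictions:
        counts[min(int(p * num_buckets), num_buckets - 1)] += 1
    return counts

# Fixed seed keeps the "random" test deterministic and reproducible.
rng = np.random.RandomState(42)
predictions = rng.uniform(0.0, 1.0, size=1000)  # many values per bin
fast = count_per_bucket(predictions, 10)
slow = count_per_bucket_slow(predictions, 10)
assert list(fast) == slow
```

Crafted small cases and a randomized cross-check like this are complementary: the former document intended corner-case behavior, the latter catches bugs no one thought to craft a case for.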
Approval once the `1.0` → `1e-7` change is actually reverted.
This change adds a `pb` implementation for the PR curves summary, which, like all `pb` implementations, lets users generate summaries without having to use TensorFlow.

Also modified the test for PR curve summaries to use small test cases that are easy for a developer to reason through instead of using the demo data. This allows us to use the `compute_and_check_summary_pb` paradigm for the PR curves summary test, just like for other plugins.

Updated the summary module to include `pr_curve_pb`. Fixes #445.

As part of this change, renamed the `tag` parameter of summary ops to `name` to be consistent with other summaries.