Add pb method for PR curves #633

Conversation
This change adds a `pb` implementation for the PR curves summary, which, like all `pb` implementations, lets users generate summaries without having to use TensorFlow. Also modified the test for PR curve summaries to use small test cases that are easy for a developer to reason through instead of using the demo data. This allows us to use the compute_and_check_summary_pb paradigm for the PR curves summary test, just like for other plugins. Updated the summary module to include `pr_curve_pb`. Fixes tensorflow#445.
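To make the discussion below concrete, here is a framework-free sketch of the kind of computation a PR-curve `pb` path performs: bucket predictions into `num_thresholds` bins, take cumulative counts from the top bin down, and clamp denominators to avoid division by zero. This is an illustration only, not the plugin's actual code; the function name and signature are hypothetical.

```python
import numpy as np

def pr_curve_points(labels, predictions, num_thresholds=5):
    """Illustrative PR-curve computation over equal-width threshold bins.

    Entry i of each returned array corresponds to the threshold
    i / (num_thresholds - 1).
    """
    labels = np.asarray(labels, dtype=bool)
    predictions = np.asarray(predictions, dtype=np.float64)
    # Map each prediction in [0, 1] to a bucket index.
    bucket = np.minimum((predictions * (num_thresholds - 1)).astype(int),
                        num_thresholds - 1)
    tp_buckets = np.bincount(bucket[labels], minlength=num_thresholds)
    fp_buckets = np.bincount(bucket[~labels], minlength=num_thresholds)
    # Cumulative sums from the highest threshold downward: tp[i] counts
    # positives whose prediction lands in bucket i or above.
    tp = np.cumsum(tp_buckets[::-1])[::-1]
    fp = np.cumsum(fp_buckets[::-1])[::-1]
    fn = tp[0] - tp  # positives missed at each threshold
    # Clamping the denominator avoids 0/0 in empty bins.
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    return precision, recall
```

Because everything here is NumPy, no TensorFlow session is needed, which is the whole point of a `pb`-style implementation.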
Excellent. Thanks for writing this. Very glad that plugins now consistently implement `pb`, and that we can use `compute_and_check_summary_pb` here.
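The `compute_and_check_summary_pb` paradigm mentioned above amounts to computing the same summary two independent ways (directly via `pb`, and by evaluating the graph `op`) and asserting the results match. A minimal framework-free sketch of that testing pattern, with hypothetical stand-in functions rather than TensorBoard's actual API:

```python
def summary_direct(values):
    # Stand-in for a pb() function: direct, pure-Python computation.
    return {"max": max(values), "count": len(values)}

def summary_via_op(values):
    # Stand-in for evaluating the op version of the same summary,
    # deliberately implemented a second, independent way.
    result = {"max": values[0], "count": 0}
    for v in values:
        if v > result["max"]:
            result["max"] = v
        result["count"] += 1
    return result

def compute_and_check(values):
    direct = summary_direct(values)
    via_op = summary_via_op(values)
    # Analogous to self.assertProtoEquals(pb, pb_via_op) in the test.
    assert direct == via_op, (direct, via_op)
    return direct
```

The value of the pattern is that a bug would have to appear identically in both independent implementations to slip through.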
    self.assertProtoEquals(pb, pb_via_op)
    return pb

  def verify_float_arrays_are_equal(self, expected, gotten):
nit: Can we rename `gotten` to the more conventional term `actual`?
Done.
# graph. This set should ideally be empty; any entries here should be
# considered temporary.
PLUGINS_WITHOUT_PB_FUNCTIONS = frozenset([
    'pr_curve',  # TODO(@chihuahua, #445): Fix this.
Up to you whether to delete this set (changing the code below) or simply make it empty and leave the check in place. Either is fine with me.
Hmm ... I suppose if we now have a chance to pursue the ideal, how about we remove this structure, so that anyone who wishes to make an exception would have to reintroduce a similar structure ... more impetus for implementing a `pb` method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me!
- # division by 0. 1 suffices because counts of course must be whole numbers.
- _MINIMUM_COUNT = 1.0
+ # division by 0.
+ _MINIMUM_COUNT = 1e-7
I don't understand the reason for this change. Why does `1` no longer suffice?
Ah, I added a comment.
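The clamp under discussion guards divisions like precision = TP / (TP + FP) against thresholds where no example is predicted positive. A minimal sketch, assuming unweighted (whole-number) counts, where a floor of 1 is safe because any nonzero denominator is at least 1:

```python
import numpy as np

_MINIMUM_COUNT = 1.0  # safe floor when counts are whole numbers

tp = np.array([3.0, 1.0, 0.0])  # true positives per threshold
fp = np.array([1.0, 0.0, 0.0])  # false positives per threshold

# Without the clamp, the last entry would be 0/0 (NaN). With it,
# precision degrades gracefully to 0 at empty thresholds.
precision = tp / np.maximum(tp + fp, _MINIMUM_COUNT)
```

With fractional sample weights, counts can be positive but smaller than 1, which is presumably why a tiny epsilon like `1e-7` was considered; that change was reverted from this PR and deferred, as the thread below records.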
  """Creates a PR curves summary protobuf

  Arguments:
    tag: A name for the generated node. Will also serve as a series name in
Looks like other summaries call this `name` instead of `tag`. Can we stay consistent with them?
I renamed the `tag` parameter to `name` for summaries.
    weights=None,
    display_name=None,
    description=None):
  """Creates a PR curves summary protobuf
For consistency with other `pb` functions: "Create a PR curves summary protobuf." (imperative mood, full stop at end)
Done.
Fine with me once you fix failures (see Jenkins).

s/Jenkins/Travis

Ah done.
import tensorflow as tf

from tensorboard.plugins.pr_curve import metadata

  # A value that we use as the minimum value during division of counts to prevent
- # division by 0. 1 suffices because counts of course must be whole numbers.
- _MINIMUM_COUNT = 1.0
+ # division by 0. 1.0 does not suffice here because certain weights could push
Okay, this comment helps, but I still don't understand why the addition of the `pb` function forced this change. Or was `op` broken previously (and if so, would that not merit a separate change with regression tests)?
I reverted this - I'll send out a separate PR where this belongs.
Hmm, it doesn't look reverted in this code? I still see `_MINIMUM_COUNT = 1e-7` and the comment about 1.0 not sufficing. (commit=2e7f5a2f3776768075ca499635e0eb73b4c30653)
Oh you're right. Should be fixed now. Thanks!
    description=None):
  """Create a PR curves summary protobuf.

  Arguments:
Mentions of "python ints" and "constant `str`s" in the `pb` function seem potentially confusing; this is not in a TensorFlow context, so the distinction that you intend doesn't exist, and instead folks might wonder why they can't pass a string from a variable or something (they of course can). If you look at another summary's `pb` function, you'll see that the documentation is changed appropriately.

Suggested changes:

- `num_thresholds`: Optional […] metrics for. When provided, should be an `int` of value at least `2`. Defaults to `200`.
- `weights`: Optional `float` or float32 numpy array. […] This value must be […].
- `display_name`: […] as a `str`. […]
- `description`: […] as a `str`. […]
Done. Indeed, that seems much clearer, and using `python` could be confusing here because `pb` methods inherently don't rely on TensorFlow.
@@ -263,7 +336,7 @@ def compute_summary(tp, fp, tn, fn, collections):


def raw_data_op(
I think that we probably want to provide a `raw_data_pb`; do you agree? (Could be a separate PR if you want.)
SG. Yeah, maybe a separate PR? Just because this one's getting big, even though that work is related.
Sure, fine with me.
    # Test the output for the red classifier. The red classifier has the
    # narrowest standard deviation.
    tensor_events = accumulator.Tensors('red/pr_curves')
    self.validateTensorEvent(0, [
It looks like these extensive tests with nontrivial data actually went away. That worries me, especially because some of them are regression tests against previously-introduced numerical errors.

I understand that you removed these because the `pb` test can't directly rely on the output of `pr_curve_demo.run_all()`. But you could still get the data some other way: for instance, expose a function in `pr_curve_demo.py` that creates the salient ops (`color` and `labels`, it looks like), so that you can directly run them in this test and pass the results to `compute_and_check_summary_pb`.

Or get some nontrivial data in a way that does not involve using the demo. That'd be fine, too.
I added `test_exhaustive_random_values`; thoughts? I like how the small, crafted test cases are easy to reason through and explicitly cover certain (often corner) cases, though as you noted they are less realistic.
It's fine with me as long as this case suffices to cover regressions like the one fixed in #316.
Yes, this should cover that! That regression was overlooked because no test case exhibited a bin having more than one value within it.
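A randomized test in the spirit of `test_exhaustive_random_values` (a sketch with hypothetical names, not the PR's actual test) checks a vectorized binning implementation against a brute-force loop on enough random data that every bin holds many values, which is exactly the situation the earlier regression missed:

```python
import numpy as np

def count_per_bucket(predictions, num_buckets):
    """Vectorized: histogram of predictions over equal-width buckets."""
    indices = np.minimum(
        (predictions * num_buckets).astype(int), num_buckets - 1)
    return np.bincount(indices, minlength=num_buckets)

def count_per_bucket_slow(predictions, num_buckets):
    """Brute-force reference implementation, used only for testing."""
    counts = [0] * num_buckets
    for p in predictions:
        counts[min(int(p * num_buckets), num_buckets - 1)] += 1
    return counts

# Fixed seed keeps the "random" test deterministic and reproducible.
rng = np.random.RandomState(42)
predictions = rng.uniform(0.0, 1.0, size=1000)  # many values per bin
fast = count_per_bucket(predictions, 10)
slow = count_per_bucket_slow(predictions, 10)
assert list(fast) == slow
```

Crafted small cases and a randomized cross-check like this are complementary: the former document intended corner-case behavior, the latter catches bugs no one thought to craft a case for.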
Approval once the `1.0` → `1e-7` change is actually reverted.
This change adds a `pb` implementation for the PR curves summary, which, like all `pb` implementations, lets users generate summaries without having to use TensorFlow.

Also modified the test for PR curve summaries to use small test cases that are easy for a developer to reason through instead of using the demo data. This allows us to use the `compute_and_check_summary_pb` paradigm for the PR curves summary test, just like for other plugins.

Updated the summary module to include `pr_curve_pb`. Fixes #445.

As part of this change, renamed the `tag` parameter of summary ops to `name` to be consistent with other summaries.