
[WIP] Proposed revision to sample_posterior_predictive() #3468


Merged: 3 commits into pymc-devs:master on May 21, 2019

Conversation

rpgoldman (Contributor)

Most other sampling functions take varnames, rather than var objects. Extend the set of parameters to accept varnames as an alternative to vars, preserving backwards compatibility.
Also revise the docstring to clarify the return type and add type comments.

@lucianopaz left a comment (Contributor)

Overall, I like that this unifies the API consistently. IMHO, I would have left vars and deprecated varnames. However, I would like to hear what others think about this. Maybe varnames is safer because you ensure that the tensors and RVs are part of the model, whereas vars could hold anything outside of the model's context.

Finally, if this goes through, you should add a line at the end of the deprecations section of the release notes.

@lucianopaz (Contributor)

I think this change is nice, but we should discuss two things:

  1. Whether we want to change the function parameters to make the API more consistent.
  2. Whether to add a very small test environment for Python 3.5, so we ensure that we build and at least sample correctly in that version of Python. After #3465 (Python 3.5 support?) and the variable annotations problems, I think we need this.

@twiecki, @junpenglao, @ericmjl, @ColCarroll

@rpgoldman (Contributor, Author)

The deprecation warning was too aggressive. Fixing that now.

            raise Exception("Should not specify both vars and varnames arguments.")
        vars = (model[x] for x in varnames)
    else:
        raise DeprecationWarning("vars argument is deprecated in favor of varnames.")
Contributor

Use the warnings package instead of raising the DeprecationWarning. Also, this warning should be raised if varnames is None and vars is not None.
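
For reference, the suggested shape of that block would be roughly the following (a sketch built from the diff above, not necessarily the exact code that was committed):

    import warnings

    if varnames is not None:
        if vars is not None:
            raise Exception("Should not specify both vars and varnames arguments.")
        vars = (model[x] for x in varnames)
    elif vars is not None:
        # Warn rather than raise, and only when vars was actually passed.
        warnings.warn("vars argument is deprecated in favor of varnames.",
                      DeprecationWarning)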

Contributor Author

Got this now, I believe. Waiting for the tests to complete.
There's enough uninteresting history on this topic branch that it's probably worth squash-merging it, or I could clean up the history. Need to rebase it, anyway.

@fonnesbeck (Member)

It's not clear (to me) that these need to be unified. Sampling takes place in a model context, so it makes sense to pass the PyMC variables directly, whereas plotting occurs outside the model context, so the variables are not generally available. To me, there is a clear line between the two, because they are associated with distinct tasks in different phases of the modeling workflow.

@rpgoldman (Contributor, Author)

It's not clear (to me) that these need to be unified. Sampling takes place in a model context, so it makes sense to pass the PyMC variables directly, whereas plotting occurs outside the model context, so the variables are not generally available. To me, there is a clear line between the two, because they are associated with distinct tasks in different phases of the modeling workflow.

That may be true, but in addition to being confusing and having the potential for error that @lucianopaz points out, there's also the issue that, since the user is typically working with variable names, they just end up having to write code that looks like:

vars=(model[x] for x in name_list)

...at least that has been my experience using this API (more accurately, I do that after I have encountered the wrong argument error! 😜). This is inconvenient and potentially error-prone.

Typically I write a function that builds the model, and I don't "hang onto" the variable objects, so it's easier to work with the variable names.
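
To make the difference concrete, here is a sketch of the two call styles (model, trace, and name_list are placeholders; the new keyword ended up being spelled var_names):

    import pymc3 as pm

    # Before: look up the RV objects by name inside the model.
    with model:
        ppc = pm.sample_posterior_predictive(trace, vars=[model[x] for x in name_list])

    # After this PR: pass the variable names directly.
    with model:
        ppc = pm.sample_posterior_predictive(trace, var_names=name_list)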

@@ -24,6 +27,8 @@
import sys
sys.setrecursionlimit(10000)

assert Any # ugly way to make pyflakes ignore "unused" Any import
Contributor

Better to just not import Any; it's only used in the commented variable annotation.

Contributor Author

I have found a way to import this symbol only during type checking, which seems to make pylint and mypy both happy.
Rebased and force-pushed this change for review.
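
The usual pattern for this (a sketch of the standard idiom; the actual commit may differ in detail) is to guard the import with typing.TYPE_CHECKING:

    from typing import TYPE_CHECKING

    if TYPE_CHECKING:
        # Executed only by type checkers (mypy treats TYPE_CHECKING as True),
        # so there is no unused import at runtime.
        from typing import Any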

@lucianopaz (Contributor)

@fonnesbeck, sampling from the posterior predictive should also be done within the model's context. Once we get the sample, we can plot outside the context. I'm also hesitant about making this change, so I labeled it for discussion.

I now think that using names is safer because that way we ensure that the variables are defined in the model. If we accept tensors, users could mix up tensors that come from two separate models and make a strange mess.

@junpenglao (Member)

I agree with @lucianopaz, and this sounds like a good idea to me in general. However, we might want to separate the type hint part from the actual PR? Or is it the plan now to do this for each PR?

You should also make the same change for sample_posterior_predictive_w:
https://github.com/pymc-devs/pymc3/blob/master/pymc3/sampling.py#L1114

@rpgoldman force-pushed the ppc-params branch 2 times, most recently from a98054f to 7effb76 on May 8, 2019 at 13:56
@rpgoldman (Contributor, Author) commented May 8, 2019

I agree with @lucianopaz, and this sounds like a good idea to me in general. However, we might want to separate the type hint part from the actual PR? Or is it the plan now to do this for each PR?

I see your point about the type hint part, but separating out the type hints will be quite burdensome (it's not something diff is good at). Perhaps someone who has a better diff tool could split out the type hints as part of squash-merging this PR?

I use type hints both to provide valuable checking, and as a tool in figuring out PyMC3, which is a very complex code base; they have become an integral part of my Python coding process.

You should also do the same change for sample_posterior_predictive_w

But this function doesn't have a vars argument at all, so I'm not sure what you would like to have done to it. I'm happy to make whatever changes, but I need some more direction.

@junpenglao (Member)

Oh right, sample_posterior_predictive_w only works for observed variables - never mind :-)

@rpgoldman (Contributor, Author)

Integrated the history, with its blind alleys, into a single commit.
Retained the full history with a tag.

@lucianopaz left a comment (Contributor)

You still need to add this change to the bottom of the deprecations section of the release notes

@lucianopaz (Contributor)

Once #3470 gets sorted out and merged, you'll have to rebase this PR to get the minimum syntax check for Python 3.5 working.

@rpgoldman (Contributor, Author)

Once #3470 gets sorted out and merged, you'll have to rebase this PR to get the minimum syntax check for Python 3.5 working.

I subscribed to that PR. When I hear that it's merged, I will rebase and someone can merge this and close it out.

@rpgoldman (Contributor, Author)

You still need to add this change to the bottom of the deprecations section of the release notes

Just did that and pushed.

@lucianopaz left a comment (Contributor)

The PR looks fine now and could be merged. Do most of us agree that it's better to use varnames vs vars? I am in favor of varnames, but I would like to know that the rest of us approve this change. @twiecki, @fonnesbeck, @junpenglao, @ColCarroll, and anyone else: care to give your final opinion?

@twiecki (Member) commented May 11, 2019 via email

@lucianopaz (Contributor)

For now it's still mostly inconsistent, but we made the switch to var_names in the plots. We can do it here too. Also, I just checked sample_prior_predictive, and that uses vars too. It should be dealt with just like in sample_posterior_predictive.

@twiecki (Member) commented May 11, 2019 via email

@rpgoldman (Contributor, Author)

For now it's still mostly inconsistent, but we made the switch to var_names in the plots. We can do it here too. Also, I just checked sample_prior_predictive, and that uses vars too. It should be dealt with just like in sample_posterior_predictive.

I have pushed the changes to sample_prior_predictive(). I am waiting to make sure they pass the tests. If so, I will do the varnames -> var_names change, and then I hope we can do the merge.

@rpgoldman (Contributor, Author)

OK, I think that this is all ready to go now. varnames have become var_names. If the tests all pass, I will make sure it's properly rebased and squash it down to a single commit for merging, unless there are any further issues?

@ColCarroll left a comment (Member)

Looks generally good! Can you tighten up the exceptions before merge, though?

pymc3/model.py Outdated
@@ -3,6 +3,7 @@
import itertools
import threading
import warnings
from typing import Optional, Dict, Any
Member

only Optional used here?

Contributor Author

See 087ea2d8ebfaa2e34474eba0ce79f3420b78a4b4

@@ -1070,6 +1083,13 @@ def sample_posterior_predictive(trace, samples=None, model=None, vars=None, size

model = modelcontext(model)

    if var_names is not None:
        if vars is not None:
            raise Exception("Should not specify both vars and var_names arguments.")
Member

ValueError for a bad argument here (Exception is too broad, I think)
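
A sketch of the tightened check being requested here (illustrative; the actual change is in the commit referenced in the reply below):

    if var_names is not None:
        if vars is not None:
            # ValueError is the conventional exception for a bad argument.
            raise ValueError("Should not specify both vars and var_names arguments.")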

Contributor Author

See 5e546b252ec20b5b202037de99f377da91b3356c

@rpgoldman (Contributor, Author)

@ColCarroll @lucianopaz @twiecki I think I've fixed everything requested. If the tests go fine, and you like the current version, I will squash this down (or you can do a squash merge), and I hope it will be done.

@ColCarroll left a comment (Member)

Looks good to me once the tests pass! Thanks for sticking with this, @rpgoldman!

@@ -1283,8 +1322,9 @@ def sample_prior_predictive(samples=500, model=None, vars=None, random_seed=None
    values = draw_values([model[name] for name in names], size=samples)

    data = {k: v for k, v in zip(names, values)}
    assert data is not None
Member

Raw asserts are considered bad form (they are silently skipped if Python is run with a flag that no one actually uses) -- use if data is None: raise AssertionError instead.
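
A sketch of the swap being asked for:

    # Before
    assert data is not None
    # After: not silently skipped when Python runs with the -O flag
    if data is None:
        raise AssertionError()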

Contributor Author

Fixed in 65439179b1abe7d8e14b2dbf01d257034f669e7e

@rpgoldman (Contributor, Author) commented May 20, 2019

@ColCarroll --
OK, I have no idea why this happened: when I replaced that line as suggested above, I now get a bunch of test failures involving a failure to find scipy.misc.factorial that look like this:

pymc3/tests/test_distributions.py:500: in check_dlogp
    assert_almost_equal(dlogp(pt), ndlogp(pt), decimal=decimals, err_msg=str(pt))
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:723: in __call__
    *args, **kwds).squeeze()
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:666: in __call__
    return super(Jacobian, self).__call__(np.atleast_1d(x), *args, **kwds)
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:297: in __call__
    results = self._derivative(xi, args, kwds)
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:663: in _derivative_nonzero_order
    return self._apply_fd_rule(step_ratio, results, steps2)
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:394: in _apply_fd_rule
    fd_rule = self._get_finite_difference_rule(step_ratio)
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:378: in _get_finite_difference_rule
    fd_mat = self._fd_matrix(step_ratio, parity, num_terms)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
step_ratio = 2.0, parity = 1, nterms = 1
    @staticmethod
    def _fd_matrix(step_ratio, parity, nterms):
        """
        Return matrix for finite difference and complex step derivation.
    
        Parameters
        ----------
        step_ratio : real scalar
            ratio between steps in unequally spaced difference rule.
        parity : scalar, integer
            0 (one sided, all terms included but zeroth order)
            1 (only odd terms included)
            2 (only even terms included)
            3 (only every 4'th order terms included starting from order 2)
            4 (only every 4'th order terms included starting from order 4)
            5 (only every 4'th order terms included starting from order 1)
            6 (only every 4'th order terms included starting from order 3)
        nterms : scalar, integer
            number of terms
        """
        _assert(0 <= parity <= 6,
                'Parity must be 0, 1, 2, 3, 4, 5 or 6! ({0:d})'.format(parity))
        step = [1, 2, 2, 4, 4, 4, 4][parity]
        inv_sr = 1.0 / step_ratio
        offset = [1, 1, 2, 2, 4, 1, 3][parity]
        c0 = [1.0, 1.0, 1.0, 2.0, 24.0, 1.0, 6.0][parity]
        c = c0 / \
>           misc.factorial(np.arange(offset, step * nterms + offset, step))
E       AttributeError: module 'scipy.misc' has no attribute 'factorial'
../../../miniconda3/envs/testenv/lib/python3.6/site-packages/numdifftools/core.py:330: AttributeError

I do not see these on my laptop -- all the tests pass there. This does not seem to have anything to do with my change to the assert, but what do I know?

Member

My guess is that a version of something (numdifftools?) changed under this diff. Comparing versions from the previous Travis build and this one should let us know (and figure out if there's a good change). I'll dig into it after work.

Contributor Author

That looks right -- I don't think it's anything I did. Is there some way to force a re-run of Travis on master, so that we can see if this failure also happens there (and my little tweak is innocent)?

@rpgoldman (Contributor, Author) commented May 20, 2019

@ColCarroll numdifftools didn't change, but scipy 1.3.0 was released three days ago (17 May). This looks like an upstream problem with numdifftools.

And yes, factorial was removed from scipy.misc: scipy/scipy@1cde305#diff-0763dd201fcbf9b088f9eb502b4f3d42

AFAICT this is fixed in the version of numdifftools on GitHub (or at least it is different -- factorial comes from scipy.special instead of scipy.misc now) but numdifftools has not seen a release in two years.
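
For reference, the change on the numdifftools side amounts to importing factorial from its new home (a small sketch, runnable with any recent SciPy):

    import numpy as np
    from scipy.special import factorial  # removed from scipy.misc in SciPy 1.3.0

    print(factorial(np.arange(1, 5)))  # [ 1.  2.  6. 24.]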

When you've got the factorial problem fixed, please squash merge this PR, or LMK if you would prefer me to smash it down to one commit for you to merge. Thanks!

@ColCarroll mentioned this pull request on May 21, 2019
@twiecki (Member) commented May 21, 2019

@rpgoldman What's your email?

@rpgoldman (Contributor, Author)

Rebased onto the latest master. If the tests pass now, I will squash this down to one commit so there's a clean history.

@twiecki (Member) commented May 21, 2019 via email

rpgoldman and others added 3 commits on May 21, 2019 at 13:18:
Add the option to take varnames (`var_names`), rather than var objects
as parameters.

Extend the set of parameters to accept varnames as an alternative to
vars, preserving backwards compatibility.

Also revise the docstring, to clarify the return type and add type comments.
@rpgoldman (Contributor, Author)

Pushed a cleaned-up version of the history. Waiting for the checks; then I hope we can merge and close.

@lucianopaz merged commit 78ce11c into pymc-devs:master on May 21, 2019
@lucianopaz (Contributor)

Thanks @rpgoldman!
