Skip to content

[Merged by Bors] - Simplification of tilde-callstack #252

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 18 commits into from

Conversation

torfjelde
Copy link
Member

This PR introduces the simplification of the tilde-callstack as discussed in #249.

Copy-pasted from there:

  • Remove unnecessary complexity in ~ implementation.
    • Current calling hierarchy for a ~ statement is:
      • tilde_assume -> tilde(rng, ...) -> _tilde(rng, ...) -> assume
      • tilde_observe -> tilde(...) -> _tilde(...) -> observe
      • Similarly for dot_tilde_assume and dot_tilde_observe.
    • This is super-confusing and difficult to debug.
    • _tilde is currently only used for NamedDist to allow overriding the variable-name used for a particular ~ statement.
    • Propose the following changes:
      • Remove _tilde and handle NamedDist before calling tilde_assume, etc. by using a unpack_right_vns (and unpack_right_left_vns for dot-statements) (thanks to @devmotion)
      • Rename tilde_assume (tilde_observe) and to tilde_assume! (tilde_observe!), and tilde(rng, ...) (tilde(...)) to tilde_assume(rng, ...) (tilde_observe(...)).
        • tilde_assume! simply calls tilde_assume followed by acclogp(varinfo, result_from_tilde_assume), so the ! here is to indicate that it's mutating the logp field in VarInfo.

@torfjelde torfjelde mentioned this pull request May 31, 2021
1 task
right::MultivariateDistribution, left::AbstractMatrix, vn::VarName
)
vns = map(axes(left, 2)) do i
return VarName(vn, (vn.indexing..., Tuple(i)))
Copy link
Member

@yebai yebai Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly confused here - is the argument Tuple(i) redundant with vn.indexing?

EDIT: Got the answer after reading the previous code. Maybe consider adding a comment here?

@yebai
Copy link
Member

yebai commented Jun 1, 2021

bors try

bors bot added a commit that referenced this pull request Jun 1, 2021
@bors
Copy link
Contributor

bors bot commented Jun 1, 2021

try

Build failed:

Copy link
Member

@devmotion devmotion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add deprecations to make it non-breaking?

@torfjelde
Copy link
Member Author

Can you add deprecations to make it non-breaking?

Actually, how would I do this? E.g. there's no "equivalent" for _tilde anymore, right

@devmotion
Copy link
Member

E.g. there's no "equivalent" for _tilde anymore, right

Luckily _tilde is not exported and hence technically this change is not breaking 🙂

Part of the API are only

assume,
dot_assume,
observer,
dot_observe,
tilde,
dot_tilde,

(BTW it seems there is a typo?! I am quite certain we intended to export observe), so if possible, it would be good to add deprecation warnings to these, if needed. E.g., tilde should be forwarded to tilde_observe and tilde_assume it seems.

@torfjelde
Copy link
Member Author

Ooooh I didn't think we exported any of these! But cool, then I'll do that 👍

@torfjelde
Copy link
Member Author

torfjelde commented Jun 2, 2021

Wait, isn't guaranteed to be breaking since tilde_assume has been renamed to tilde_assume!, etc. for which deprecation wouldn't be sufficient?

@devmotion
Copy link
Member

AFAICT tilde_assume etc was not exported 🙂

@torfjelde
Copy link
Member Author

Oh yeah true, haha. Weird that we even export tilde though, but cool 👍

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@devmotion
Copy link
Member

Looks good but it would be good to wait until Turing is compatible with DynamicPPL 0.11 since otherwise we can't run the integration tests properly.

@torfjelde
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Jun 2, 2021
@bors
Copy link
Contributor

bors bot commented Jun 2, 2021

try

Build failed:

@torfjelde
Copy link
Member Author

Btw, @devmotion how do you think we should implement Turing.Modes.OptimizationContext in this PR? IMO it's very natural as a primitive context, but AFAIK we need to wrap other contexts to ensure that we respect prior, likelihood, etc.

@torfjelde
Copy link
Member Author

Yes, that annoyed me as well. I guess the "correct" way would be to just keep using f in package B until the deprecation is removed. Deprecation warnings are disabled by default, so users won't notice it even though we keep using the deprecated method.

Aaah they're disabled by default? I was thinking of the same but didn't bother suggesting it since I thought this would mean that people just using DPPL would see the dep-warnings. Hmm, kind of annoying though 😕

Maybe just better to make a breaking release together with #253 ?

@devmotion
Copy link
Member

Personally I think it would be better to make a non-breaking release and add the deprecation warnings, and also to keep the PRs separate. We can keep the deprecations only for a short time but it would keep DynamicPPL and Turing better synced and it would give us a bit more time to fix the deprecations in Turing. It just feels more relaxed.

@torfjelde
Copy link
Member Author

torfjelde commented Jun 4, 2021

Personally I think it would be better to make a non-breaking release and add the deprecation warnings, and also to keep the PRs separate. We can keep the deprecations only for a short time but it would keep DynamicPPL and Turing better synced and it would give us a bit more time to fix the deprecations in Turing. It just feels more relaxed.

But aren't we going to make a new breaking release of DPPL very soon anyways, following merging of #253 ? So the process is:

  1. Add calls to tilde rather than *_tilde_* and tilde_observe in every *_tilde_* statement in this PR.
  2. We get deprecations-galore, i.e. it's not helpful for identifying where things go wrong.
  3. We make new release after Introduction of SamplingContext #253, and then everything breaks anyways.

I'll do it though, but it just seems unnecessary since this PR itself adds no new functionality and so the result for the user is the same whether this PR makes it in or not. And so we might as well just wait with merging this PR until we've merged #253 into it.
I.e. merge #253 into this PR once that's approved, merge this into master, and then make a breaking release. This is why I made the target of #253 this branch rather than master btw.

@devmotion
Copy link
Member

I don't have a very strong opinion but my feeling is that deprecations and non-breaking releases would be helpful, despite the additional effort. A non-breaking release is picked up by Turing automatically and does not require any fixes which might create delays and cause problems if we want to propagate other non-breaking changes or bugfixes to Turing while doing the refactoring. It requires additional commitment and effort if we want to minimize this delay since we basically have to fix the breaking changes immediately in Turing whereas deprecations don't have to be fixed right away.

BTW is #253 actually breaking? The model API is only extended and we just wrap and unwrap SamplingContext internally, don't we?

@torfjelde
Copy link
Member Author

torfjelde commented Jun 4, 2021

A non-breaking release is picked up by Turing automatically and does not require any fixes which might create delays and cause problems if we want to propagate other non-breaking changes or bugfixes to Turing while doing the refactoring.

That's actually a good point, but if we deprecate the code for wrapper-contexts, e.g. MiniBatchContext, ends up looking like:

function tilde_assume(...)
    # do stuff
    ...
    
    # Here we're currently calling `tilde_assume` again, which makes sense.
    # But now we have to call `tilde`, which is then forwarded to `tilde_assume`.
    return tilde(...)
end

I'm not sure this makes our lives easier when making changes 😅

BTW is #253 actually breaking? The model API is only extended and we just wrap and unwrap SamplingContext internally, don't we?

Yes because you have stuff like https://github.com/TuringLang/Turing.jl/blob/c79dce028b3edd0be284bf28b13d83956606b786/src/inference/ess.jl#L138-L144 and https://github.com/TuringLang/Turing.jl/blob/c79dce028b3edd0be284bf28b13d83956606b786/src/modes/ModeEstimation.jl#L27-L39. These will break.

@devmotion
Copy link
Member

These will break.

Only if we go with my suggestion I assume? If (probably only initially) we just unwrap SamplingContext immediately by defining

function tilde_assume(c::SamplingContext, ...)
    return tilde(c.rng, c.sampler, ....)
end

etc., then these definitions would still be reached it seems. This would also avoid any issues regarding propagating it downwards, unwrapping etc since the logic would be exactly the same. Actually, that's what I assumed we would do as a start and hence I became a bit confused by all our discussions lately 😄

Then we could first fix the tilde pipeline deprecations in Turing and then start to modify and improve the pipeline in DynamicPPL in a second step.

But ultimately I think you should decide if you want to aim for a non-breaking release in this PR and include the deprecations since you are working on it and it would be additional work for you.

@torfjelde
Copy link
Member Author

torfjelde commented Jun 5, 2021

Only if we go with my suggestion I assume? If (probably only initially) we just unwrap SamplingContext immediately by defining

function tilde_assume(c::SamplingContext, ...)
    return tilde(c.rng, c.sampler, ....)
end

etc., then these definitions would still be reached it seems. This would also avoid any issues regarding propagating it downwards, unwrapping etc since the logic would be exactly the same. Actually, that's what I assumed we would do as a start and hence I became a bit confused by all our discussions lately

Ughh yeah true. I've gotten a bit too tunnel-visioned on the separation between sampling and evaluation.
It should make it non-breaking, but it means that:

  1. We have "inconsistent" signatures which isn't preferrable.
  2. We can't also make the changes to assume(rng, ...) that makes it so that it always samples new values, since this will also break what the implementations I referred to above.

I.e. if we're making that change and want it to be non-breaking, it's literally just a refactoring that the users never see 😕

But ultimately I think you should decide if you want to aim for a non-breaking release in this PR and include the deprecations since you are working on it and it would be additional work for you.

I'm strongly leaning towards just breaking 😕 It's just because we're going to need to do some signficant work on downstream packages to accomodate the changes anyways, and the non-breaking changes we can make are both somewhat annoying (e.g. having to call tilde from tilde_assume) and has the potential of requiring changes down-stream (it shouldn't, but we're touching internal methods in downstream packages in ways we maybe shouldn't). All this leads to both delays and more work in total.

I really prefer just merging into this PR, and then making sure that integration-tests pass, before making one big breaking release. From an end-user perspective, there will be no difference between this and making the non-breaking releases first.

@yebai
Copy link
Member

yebai commented Jun 5, 2021

I think it's ok to make a breaking release if we can

  1. Identify all breaking changes and put them in the DynamicPPL's readme file.
  2. Fix all downstream uses/tests in Turing.

It is worth noting that there are several non-Turing packages (indirectly) depending on DynamicPPL, e.g. MCMCTempering and AnnealedIS, and DiffEqBayes.

@torfjelde
Copy link
Member Author

It is worth noting that there are several non-Turing packages (indirectly) depending on DynamicPPL, e.g. MCMCTempering and AnnealedIS, and DiffEqBayes.

Just checked real quick, and none of those touch the internals of DPPL, e.g. overloading tilde, so they should be completely unaffected:)

end
function tilde(rng, ctx::PriorContext, sampler, right, vn::VarName, inds, vi)
function tilde_assume(rng, ctx::PriorContext, sampler, right, vn::VarName, inds, vi)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we get rid of inds at this place, too? The value of it is (kind of by implicit contract) supposed to be the same as vn.indexing anyway. The only reason for its existence I can remember is that is seemed convenient to me for AutoGibbs...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also of the opinion that we should do stuff like that, but I figured I'd leave it out of this particular PR and remove such redundancies later.

Btw, would love to hear your thoughts on #253 :)

torfjelde and others added 3 commits June 7, 2021 11:31
This is #253 but the only motivation here is to get `SamplingContext` in, nothing relating to interactions with other contexts, etc.

Co-authored-by: Hong Ge <[email protected]>
@codecov
Copy link

codecov bot commented Jun 9, 2021

Codecov Report

Merging #252 (06d319c) into master (5f61d88) will decrease coverage by 11.60%.
The diff coverage is 48.35%.

❗ Current head 06d319c differs from pull request most recent head 650c69a. Consider uploading reports for the commit 650c69a to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##           master     #252       +/-   ##
===========================================
- Coverage   74.50%   62.90%   -11.61%     
===========================================
  Files          15       15               
  Lines        1216     1294       +78     
===========================================
- Hits          906      814       -92     
- Misses        310      480      +170     
Impacted Files Coverage Δ
src/varname.jl 85.71% <0.00%> (-14.29%) ⬇️
src/context_implementations.jl 40.63% <41.05%> (-7.52%) ⬇️
src/compiler.jl 80.00% <60.00%> (-3.19%) ⬇️
src/loglikelihoods.jl 44.89% <60.00%> (-20.32%) ⬇️
src/contexts.jl 87.50% <66.66%> (+0.83%) ⬆️
src/model.jl 76.59% <85.71%> (-14.32%) ⬇️
src/submodel_macro.jl 100.00% <100.00%> (ø)
src/varinfo.jl 66.38% <0.00%> (-18.22%) ⬇️
src/selector.jl 85.71% <0.00%> (-14.29%) ⬇️
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c95ccfa...650c69a. Read the comment docs.

Co-authored-by: David Widmann <[email protected]>
@torfjelde
Copy link
Member Author

Looks like we're ready to go?

@torfjelde
Copy link
Member Author

Oh actually, I guess we should wait for #261.

@torfjelde
Copy link
Member Author

Are we ready to give bors the 👍 ? This won't be affected by the libtask issue since we're not testing Turing in the integration tests.

@yebai
Copy link
Member

yebai commented Jun 9, 2021

bors try

bors bot added a commit that referenced this pull request Jun 9, 2021
@yebai
Copy link
Member

yebai commented Jun 9, 2021

bors r+

bors bot pushed a commit that referenced this pull request Jun 9, 2021
This PR introduces the simplification of the tilde-callstack as discussed in #249.

Copy-pasted from there:
- Remove unnecessary complexity in `~` implementation.
  - Current calling hierarchy for a `~` statement is:
    - `tilde_assume` -> `tilde(rng, ...)` -> `_tilde(rng, ...)` -> `assume`
    - `tilde_observe` -> `tilde(...)` -> `_tilde(...)` -> `observe`
    - Similarly for `dot_tilde_assume` and `dot_tilde_observe`.
  - This is super-confusing and difficult to debug.
  - `_tilde` is currently only used for `NamedDist` to allow overriding the variable-name used for a particular `~` statement.
  - Propose the following changes:
    - Remove `_tilde` and handle `NamedDist` _before_ calling `tilde_assume`, etc. by using a `unpack_right_vns` (and `unpack_right_left_vns` for dot-statements) (thanks to @devmotion)
    - Rename `tilde_assume` (`tilde_observe`) and to `tilde_assume!` (`tilde_observe!`), and `tilde(rng, ...)` (`tilde(...)`) to `tilde_assume(rng, ...)` (`tilde_observe(...)`).
      - `tilde_assume!` simply calls `tilde_assume` followed by `acclogp(varinfo, result_from_tilde_assume)`, so the `!` here is to indicate that it's mutating the `logp` field in `VarInfo`.


Co-authored-by: Hong Ge <[email protected]>
@bors bors bot changed the title Simplification of tilde-callstack [Merged by Bors] - Simplification of tilde-callstack Jun 9, 2021
@bors bors bot closed this Jun 9, 2021
@bors bors bot deleted the tor/tilde-simplification branch June 9, 2021 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants