Fix condition and fix in submodels #892


Merged: 17 commits merged into breaking, Apr 23, 2025

Conversation


@penelopeysm penelopeysm commented Apr 16, 2025

Summary

This PR closes #857, i.e., gives users sensible ways of conditioning and fixing variables in submodels. To aid in understanding this, it also adds a long piece of documentation (PREVIEW HERE) explaining the design.
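As an illustrative sketch of the feature this PR fixes (the model and variable names here are hypothetical, and the exact `condition` call syntax is an assumption based on the linked documentation preview):

```julia
using DynamicPPL, Distributions

@model function inner()
    x ~ Normal(0, 1)
end

@model function outer()
    # Inside the submodel, `x` becomes addressable as `a.x` from outside.
    a ~ to_submodel(inner())
end

# Condition the submodel's variable from the outer model. After this PR,
# the prefixed varname `a.x` should be respected during model evaluation.
conditioned_model = condition(outer(), @varname(a.x) => 1.0)
```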

Performance

In the table below I compare the time taken for _evaluate!!(model, ...) on breaking (the base branch) and py/submodel-cond (the HEAD of this PR). For good measure, the follow-up PR #896 is also benchmarked.

The model tested comprises m submodels, all of which contain n assumed variables. All times are in µs.

Profiling code:
using DynamicPPL, Distributions, Chairmarks
using Plots

@model function inner(n)
    xs = Vector{Float64}(undef, n)
    for i in eachindex(xs)
        xs[i] ~ Normal(0, 1)
    end
end
@model function outer1(n)
    a ~ to_submodel(inner(n))
end
@model function outer10(n)
    a1 ~ to_submodel(inner(n))
    a2 ~ to_submodel(inner(n))
    a3 ~ to_submodel(inner(n))
    a4 ~ to_submodel(inner(n))
    a5 ~ to_submodel(inner(n))
    a6 ~ to_submodel(inner(n))
    a7 ~ to_submodel(inner(n))
    a8 ~ to_submodel(inner(n))
    a9 ~ to_submodel(inner(n))
    a10 ~ to_submodel(inner(n))
end
@model function outer20(n)
    a1 ~ to_submodel(inner(n))
    a2 ~ to_submodel(inner(n))
    a3 ~ to_submodel(inner(n))
    a4 ~ to_submodel(inner(n))
    a5 ~ to_submodel(inner(n))
    a6 ~ to_submodel(inner(n))
    a7 ~ to_submodel(inner(n))
    a8 ~ to_submodel(inner(n))
    a9 ~ to_submodel(inner(n))
    a10 ~ to_submodel(inner(n))
    a11 ~ to_submodel(inner(n))
    a12 ~ to_submodel(inner(n))
    a13 ~ to_submodel(inner(n))
    a14 ~ to_submodel(inner(n))
    a15 ~ to_submodel(inner(n))
    a16 ~ to_submodel(inner(n))
    a17 ~ to_submodel(inner(n))
    a18 ~ to_submodel(inner(n))
    a19 ~ to_submodel(inner(n))
    a20 ~ to_submodel(inner(n))
end

function profile(m, n)
    @info "Profiling with $m submodel(s) and $n inner model size"
    if m == 1
        model = outer1(n)
    elseif m == 10
        model = outer10(n)
    elseif m == 20
        model = outer20(n)
    else
        error("Invalid value for m")
    end
    v = VarInfo(model); c = DefaultContext();
    b = @be DynamicPPL._evaluate!!(model, v, c)
    @info "... got $(median(b).time)"
    return median(b).time
end
ms = [1, 10, 20]
ns = [1, 10, 25, 50, 100, 200]
# Collect the median evaluation time for every (m, n) combination.
times = [profile(m, n) for n in ns, m in ms]
| m  | n   | breaking (base) | py/submodel-cond (this PR, #892) | py/submodel-prefix (#896) |
|----|-----|-----------------|----------------------------------|---------------------------|
| 1  | 1   | 0.665634146     | 0.77410526                       | 0.403409091               |
| 1  | 10  | 1.45625         | 1.60305263                       | 1.16836                   |
| 1  | 25  | 2.9             | 3.0375                           | 2.549272727               |
| 1  | 50  | 4.9834          | 5.0418                           | 4.645833333               |
| 1  | 100 | 9.416666667     | 9.43066667                       | 9.3125                    |
| 1  | 200 | 18.125          | 18.042                           | 17.709                    |
| 10 | 1   | 44.333          | 49                               | 39.5                      |
| 10 | 10  | 61.7085         | 65.458                           | 54.625                    |
| 10 | 25  | 85.875          | 92.771                           | 81.458                    |
| 10 | 50  | 136.458         | 140.812                          | 129.042                   |
| 10 | 100 | 235.708         | 232.6875                         | 225.125                   |
| 10 | 200 | 411.875         | 403.125                          | 399.291                   |
| 20 | 1   | 594.708         | 605.167                          | 579.9585                  |
| 20 | 10  | 650.9995        | 671.271                          | 627.7915                  |
| 20 | 25  | 734.417         | 759.937                          | 708.375                   |
| 20 | 50  | 866.292         | 876.521                          | 839.708                   |
| 20 | 100 | 1135.1665       | 1143.646                         | 1110.6875                 |
| 20 | 200 | 1702.291        | 1701.792                         | 1644.166                  |

TODO

  • Check performance
    • Understand why performance is improved
  • Apply the same fixes for fix
  • Check whether the behaviour of decondition and unfix is still sensible (it should be, but you never know...)
  • Document the approach taken in managing stacks of PrefixContext and (ConditionContext | FixedContext)
    • Correct the blurb about tilde_assume (it's no longer accurate after this PR)
    • Add a section on getting nested submodels right (i.e. PrefixContexts of PrefixContexts)
  • Add changelog entry
  • Add high-level tests for behaviour at the model level
    • paying special attention to nested submodels, because that's what broke #815 ([wip] submodel explorations);
    • also make sure to check manual prefixing and no prefixing;
    • also make sure to check the old-style conditioning via model arguments (the code I modified should not affect this, because it only touches the contextual_isassumption checks, not the inargnames checks; but it's probably good to test anyway);
    • also make sure to test with non-identity-lens varnames on the LHS of submodel tilde statements
  • Add unit tests for new functions
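The nested-submodel case called out in the TODO list can be sketched as follows (a hypothetical example, assuming nested variables are addressed by the doubly prefixed varname a.b.x; the exact syntax is not confirmed by this thread):

```julia
using DynamicPPL, Distributions

@model function innermost()
    x ~ Normal(0, 1)
end

@model function middle()
    # `x` is prefixed once here, becoming `b.x`.
    b ~ to_submodel(innermost())
end

@model function outermost()
    # ...and prefixed again here, becoming `a.b.x`.
    a ~ to_submodel(middle())
end

# Conditioning must traverse both PrefixContexts to reach the inner variable;
# getting this stack of prefixes right is one of the cases this PR targets.
model = condition(outermost(), @varname(a.b.x) => 0.5)
```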

@penelopeysm penelopeysm changed the base branch from main to breaking April 16, 2025 22:41
codecov bot commented Apr 16, 2025

Codecov Report

Attention: Patch coverage is 97.61905% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.04%. Comparing base (be27636) to head (fcb44e5).
Report is 3 commits behind head on breaking.

Files with missing lines Patch % Lines
src/contexts.jl 98.30% 1 Missing ⚠️
src/utils.jl 80.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           breaking     #892      +/-   ##
============================================
+ Coverage     84.76%   85.04%   +0.28%     
============================================
  Files            35       35              
  Lines          3879     3919      +40     
============================================
+ Hits           3288     3333      +45     
+ Misses          591      586       -5     


penelopeysm commented Apr 16, 2025

Mildly amazed that it passed all tests

I'm silly, I deleted the test suite, of course it passed


github-actions bot commented Apr 17, 2025

Benchmark Report for Commit fcb44e5

Computer Information

Julia Version 1.11.5
Commit 760b2e5b739 (2025-04-14 06:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 4 × AMD EPYC 7763 64-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)

Benchmark Results

|                 Model | Dimension |  AD Backend |      VarInfo Type | Linked | Eval Time / Ref Time | AD Time / Eval Time |
|-----------------------|-----------|-------------|-------------------|--------|----------------------|---------------------|
| Simple assume observe |         1 | forwarddiff |             typed |  false |                  9.1 |                 1.6 |
|           Smorgasbord |       201 | forwarddiff |             typed |  false |                712.4 |                33.4 |
|           Smorgasbord |       201 | forwarddiff | simple_namedtuple |   true |                401.0 |                46.8 |
|           Smorgasbord |       201 | forwarddiff |           untyped |   true |               1187.2 |                26.7 |
|           Smorgasbord |       201 | forwarddiff |       simple_dict |   true |               3305.2 |                22.5 |
|           Smorgasbord |       201 | reversediff |             typed |   true |               1410.5 |                29.2 |
|           Smorgasbord |       201 |    mooncake |             typed |   true |                924.6 |                 5.1 |
|    Loop univariate 1k |      1000 |    mooncake |             typed |   true |               5241.9 |                 4.2 |
|       Multivariate 1k |      1000 |    mooncake |             typed |   true |                976.0 |                 8.9 |
|   Loop univariate 10k |     10000 |    mooncake |             typed |   true |              58686.3 |                 3.7 |
|      Multivariate 10k |     10000 |    mooncake |             typed |   true |               8484.6 |                 9.8 |
|               Dynamic |        10 |    mooncake |             typed |   true |                125.0 |                11.9 |
|              Submodel |         1 |    mooncake |             typed |   true |                 13.1 |                 6.6 |
|                   LDA |        12 | reversediff |             typed |   true |                452.6 |                 5.3 |


penelopeysm commented Apr 17, 2025

~~There's a nasty 4x regression on the submodel benchmark...~~

Submodels now actually run faster. I was prepared to take a slight hit on performance, but this has exceeded my expectations.

@penelopeysm penelopeysm force-pushed the py/submodel-cond branch 3 times, most recently from 44d75fe to 478c879 Compare April 17, 2025 12:14
@penelopeysm penelopeysm force-pushed the py/submodel-cond branch 2 times, most recently from 8abd633 to fd0ee1d Compare April 17, 2025 18:21
@penelopeysm

@yebai Oh, I like the newsletter board idea!

@penelopeysm penelopeysm marked this pull request as ready for review April 18, 2025 00:23
@penelopeysm penelopeysm changed the title Fix condition and fix in submodels Fix condition, fix, and variable prefixing in submodels Apr 19, 2025
@penelopeysm penelopeysm force-pushed the py/submodel-cond branch 2 times, most recently from 4dd93b5 to 8c3bff4 Compare April 21, 2025 17:07
@penelopeysm penelopeysm changed the title Fix condition, fix, and variable prefixing in submodels Fix condition and fix in submodels Apr 21, 2025
@penelopeysm penelopeysm requested a review from mhauru April 22, 2025 12:58

@mhauru mhauru left a comment


Looks really good, thanks for doing this. I only had tiny comments and questions.

In Turing.jl, GibbsContext might benefit from a similar treatment. Or replacing it with use of ConditionContext. There was some reason why we didn't do that originally, I forget now what it was.

@penelopeysm penelopeysm requested a review from mhauru April 22, 2025 17:00
@penelopeysm

I have never taken a look at GibbsContext, but I reckon I can do so after this DPPL release.


@mhauru mhauru left a comment


Thanks!

@penelopeysm penelopeysm merged commit ff5f2cb into breaking Apr 23, 2025
4 of 16 checks passed
@penelopeysm penelopeysm deleted the py/submodel-cond branch April 23, 2025 11:09
@penelopeysm penelopeysm mentioned this pull request Apr 23, 2025
8 tasks
github-merge-queue bot pushed a commit that referenced this pull request Apr 24, 2025
* Release 0.36

* AbstractPPL 0.11 + change prefixing behaviour (#830)

* AbstractPPL 0.11; change prefixing behaviour

* Use DynamicPPL.prefix rather than overloading

* Remove VarInfo(VarInfo, params) (#870)

* Unify `{untyped,typed}_{vector_,}varinfo` constructor functions (#879)

* Unify {Untyped,Typed}{Vector,}VarInfo constructors

* Update invocations

* NTVarInfo

* Fix tests

* More fixes

* Fixes

* Fixes

* Fixes

* Use lowercase functions, don't deprecate VarInfo

* Rewrite VarInfo docstring

* Fix methods

* Fix methods (really)

* Link varinfo by default in AD testing utilities; make test suite run on linked varinfos (#890)

* Link VarInfo by default

* Tweak interface

* Fix tests

* Fix interface so that callers can inspect results

* Document

* Fix tests

* Fix changelog

* Test linked varinfos

Closes #891

* Fix docstring + use AbstractFloat

* Fix `condition` and `fix` in submodels (#892)

* Fix conditioning in submodels

* Simplify contextual_isassumption

* Add documentation

* Fix some tests

* Add tests; fix a bunch of nested submodel issues

* Fix fix as well

* Fix doctests

* Add unit tests for new functions

* Add changelog entry

* Update changelog

Co-authored-by: Hong Ge <[email protected]>

* Finish docs

* Add a test for conditioning submodel via arguments

* Clean new tests up a bit

* Fix for VarNames with non-identity lenses

* Apply suggestions from code review

Co-authored-by: Markus Hauru <[email protected]>

* Apply suggestions from code review

* Make PrefixContext contain a varname rather than symbol (#896)

---------

Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: Markus Hauru <[email protected]>

---------

Co-authored-by: Markus Hauru <[email protected]>
Co-authored-by: Hong Ge <[email protected]>
Co-authored-by: Markus Hauru <[email protected]>
Closes issue: Conditioning on submodel variables (#857)