Implement AD testing and benchmarking (with DITest) #883
Codecov Report

```diff
@@ Coverage Diff @@
##             main     #883      +/-   ##
==========================================
+ Coverage   84.87%   84.89%   +0.01%
==========================================
  Files          34       35       +1
  Lines        3815     3833      +18
==========================================
+ Hits         3238     3254      +16
- Misses        577      579       +2
```
The reasons for preference are super valid. I also think that, since the hand-rolled version is not too complicated, it's worth maintaining it ourselves. Otherwise, for new contributors to be able to contribute to this, they would need to know what a DIT test scenario is.
The amount of (duplicated) code is quite minimal. It might make sense to merge this PR together with #882.
Review comment on the definition of `DynamicPPL.TestUtils.AD.run_ad`:
One alternative is to overload `DI.test_differentiation` so we can tweak the `adtype` internally.
`DIT.Scenario` doesn't contain a type parameter that we can dispatch on, though. (While putting the ADTests bit together, I also found out that Scenarios can't be prepared with a specific value of `x`: JuliaDiff/DifferentiationInterface.jl#771)
I guess we could do something like `test_differentiation(..., ::DynamicPPL.Model, ::AbstractADType, ...)` and have that construct the scenario.
> I guess we could do something like `test_differentiation(..., ::DynamicPPL.Model, ::AbstractADType, ...)` and have that construct the scenario.

I like this idea. One could have both

- `test_differentiation(..., ::DynamicPPL.Model, ::AbstractADType, ...)`
- `benchmark_differentiation(..., ::DynamicPPL.Model, ::AbstractADType, ...)`

Having unified interfaces across DI and Turing for these autodiff tests would be nice.
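A minimal sketch of what such Model-specific methods might look like. Everything here is hypothetical: `model_gradient_scenario` is an invented helper, a real version would also apply DynamicPPL's internal `adtype` modifications, and DIT's `Scenario{:gradient,:out}(f, x; res1)` constructor and `test_differentiation(backend, scenarios)` signature are assumed as of the current DIT release:

```julia
using ADTypes: AbstractADType, AutoForwardDiff
using Distributions
using DynamicPPL
import DifferentiationInterfaceTest as DIT
import ForwardDiff, LogDensityProblems

# Hypothetical helper: wrap the model's log density as a plain function of the
# parameter vector and build a gradient scenario with a reference result.
function model_gradient_scenario(model::DynamicPPL.Model)
    ldf = DynamicPPL.LogDensityFunction(model)
    f = Base.Fix1(LogDensityProblems.logdensity, ldf)
    x = DynamicPPL.VarInfo(model)[:]
    return DIT.Scenario{:gradient,:out}(f, x; res1=ForwardDiff.gradient(f, x))
end

# Hypothetical Model-specific method that delegates to DIT's generic one.
function DIT.test_differentiation(model::DynamicPPL.Model, adtype::AbstractADType; kwargs...)
    return DIT.test_differentiation(adtype, [model_gradient_scenario(model)]; kwargs...)
end

@model demo() = x ~ Normal()
DIT.test_differentiation(demo(), AutoForwardDiff())
```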
Please keep in mind that DIT was designed mostly to test DI itself, so its interface is still rather dirty and unstable. Also, `DIT.test_differentiation` does way more than you probably need here. But if there is interest in standardization, we can take a look.
Thanks, @gdalle, for the context.

It is an excellent idea to improve DIT so it can become a community resource like DI. It would be very helpful to have a standard interface where

- packages (like DynamicPPL, Bijectors, Distributions) can register test scenarios for AD backends, and
- AD backends can run registered test scenarios easily.

It would help the autodiff dev community discover bugs more quickly. It would also inform general users about which AD backend is likely to be compatible with the library (e.g. Lux, Turing) they want to use (see, e.g., https://turinglang.org/ADTests/).
DIT is in the weird position where it simultaneously does much more than what we need and also doesn't do some of the things we need. I've said this elsewhere (in meetings etc) but this isn't a criticism of DIT, it's just about choosing the right tool for the job IMO.
Closing for now (with a note to keep an eye on DIT in case we can use it for future AD testing work).
Part 2 of two options. The other one is at #882.
Closes #869
Why am I not in favour of this one?
I think some exposition is required here, and I didn't have time to explain this super clearly during the meeting.
The API of DITest is like this:

1. You construct a scenario, which includes the function `f`, the value at which to evaluate it / the gradient `x`, and a bunch of other things. Crucially, the scenario does not include the `adtype`.
2. You then run the scenario with an `adtype` (or an array thereof).

From the perspective of generic functions `f`, this is quite a nice interface.
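To make the two steps concrete, here is a minimal sketch with a plain function (assuming DIT's `Scenario{:gradient,:out}(f, x; res1)` constructor and `test_differentiation(backend, scenarios)` signature):

```julia
using ADTypes: AutoForwardDiff
import DifferentiationInterfaceTest as DIT
import ForwardDiff

# Step 1: the scenario bundles f, x, and a reference result -- but no adtype.
f(x) = sum(abs2, x)
x = [1.0, 2.0]
scen = DIT.Scenario{:gradient,:out}(f, x; res1=2 .* x)

# Step 2: the adtype only enters when the scenario is run.
DIT.test_differentiation(AutoForwardDiff(), [scen])
```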
The tricky bit with DynamicPPL, as I briefly mentioned, is that when you pass `LogDensityFunction` a model, varinfo, etc., it does a bunch of things that not only change the function `f` being differentiated, but also potentially modify the `adtype` that is actually used. See, especially, this constructor: `DynamicPPL.jl/src/logdensityfunction.jl`, lines 110 to 115 at commit `019e41b`.

(Note that LogDensityFunctionsAD.jl used to do this stuff for us; #806 effectively removed it and inlined its optimisations into that inner constructor.)
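As a toy illustration of that coupling (these are stand-ins, not DynamicPPL's actual logic):

```julia
using ADTypes: AbstractADType, AutoForwardDiff, AutoReverseDiff

toy_logdensity(x) = -sum(abs2, x) / 2

# Toy analogue of the constructor: both the function to differentiate and the
# adtype that is actually used can change depending on the requested adtype.
function prepare(adtype::AbstractADType)
    if adtype isa AutoReverseDiff
        # e.g. a tape-friendly variant of f, plus a modified adtype
        return (x -> toy_logdensity(identity.(x))), AutoReverseDiff(; compile=true)
    else
        return toy_logdensity, adtype
    end
end

f1, ad1 = prepare(AutoForwardDiff())   # unchanged
f2, ad2 = prepare(AutoReverseDiff())   # both f and adtype differ
```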
What this means is that, to be completely consistent with the way DynamicPPL behaves, one has to:

- expose the part of `src/logdensityfunctions.jl` that generates the function `f`, so that the scenario can use the correct `f`;
- because the generated `f` depends on the `adtype`, make sure that scenarios generated with one `adtype` are later run with the same `adtype`;
- remember that the constructor doesn't just read the `adtype`; it potentially also modifies the `adtype`.

This is why this PR includes `make_scenario`; it also includes a `run_ad` function below, which ensures that the scenario is run with the appropriately modified `adtype` (a toy sketch of this pairing follows this list).
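A self-contained toy sketch of that pairing (the names `make_scenario` and `run_ad` come from this PR, but the bodies here are illustrative only, and DIT's API is assumed as above):

```julia
using ADTypes: AbstractADType, AutoForwardDiff
import DifferentiationInterfaceTest as DIT
import ForwardDiff

toy_logdensity(x) = -sum(abs2, x) / 2

# Build the scenario *and* return the (possibly modified) adtype together,
# so the two cannot drift apart.
function make_scenario(adtype::AbstractADType, x::Vector{Float64})
    scen = DIT.Scenario{:gradient,:out}(toy_logdensity, x; res1=-x)
    return scen, adtype  # a real version would return the tweaked adtype
end

# Run the scenario with exactly the adtype that make_scenario settled on.
function run_ad(adtype::AbstractADType, x::Vector{Float64})
    scen, tweaked_adtype = make_scenario(adtype, x)
    return DIT.test_differentiation(tweaked_adtype, [scen])
end

run_ad(AutoForwardDiff(), [0.5, -1.0])
```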
If we adopt this PR, then we have to choose between either:

1. exposing the internals of `src/logdensityfunctions.jl`, as I've done in this PR; or
2. duplicating the relevant code from `src/logdensityfunctions.jl`; or
3. refactoring `src/logdensityfunctions.jl` so that the function being differentiated no longer depends on the `adtype`.
(3) is a no-go as it would have noticeable impacts on performance, and even though I think it'd be very nice if we could just export a list of scenarios, I'm not really comfortable with either (1) or (2), and I don't think it's a good enough reason to do either.
The alternative to this, #882, already makes the API very straightforward (it's just one function with a very thorough docstring) and so I don't think it's unfair to define that as our interface - especially considering that it's most likely that we will actually be the ones writing the integration tests for other people.
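For comparison, the #882-style interface reads roughly like this (assuming `run_ad(model, adtype)` with the signature from #882; keyword options omitted):

```julia
using ADTypes: AutoForwardDiff
using Distributions
using DynamicPPL
import ForwardDiff

@model function demo()
    x ~ Normal()
    2.0 ~ Normal(x)
end

# Tests (and optionally benchmarks) the gradient of demo()'s log density.
DynamicPPL.TestUtils.AD.run_ad(demo(), AutoForwardDiff())
```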