Document the role of transforms #7040
Comments
Transforms have no role in forward sampling methods (prior/posterior predictive), nor in observed variables during MCMC (pm.sample), since observed variables are fixed then. I think you're misunderstanding their use. Most users shouldn't bother with transforms: they exist so that MCMC can be done in an unconstrained space, and they are not meant to change the meaning of the model (there are some exceptions, like Ordered, but that's never a default). It sounds like you want a TruncatedNormal instead, if the bounds are part of your generative model.
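The suggestion above can be sketched outside PyMC with SciPy's `truncnorm`, which shows how truncation (unlike a transform on an observed variable) actually keeps draws inside the bounds; the parameters and bounds here are hypothetical, and PyMC's `pm.TruncatedNormal` plays the analogous role inside a model:

```python
import numpy as np
from scipy.stats import truncnorm

mu, sigma = 0.0, 1.0
low, high = 0.0, 2.0  # hypothetical bounds of the generative model

# scipy parameterizes the truncation bounds in standardized (z-score) units
a, b = (low - mu) / sigma, (high - mu) / sigma
samples = truncnorm.rvs(a, b, loc=mu, scale=sigma, size=1000, random_state=42)

# Every draw respects the bounds, by construction
assert samples.min() >= low and samples.max() <= high
```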
Ok, thank you for the quick reply. I think a truncated normal should solve my actual use case (technically, I'm using a GMM, but switching it to Mixed(TruncatedGaussian) should do the trick). Would it maybe make sense to update the transformations documentation with a paragraph about when they should or should not be used, to avoid future confused developers like me?
That being said: the example I demonstrated should not be too hard to make work, by just backward transforming the observations and then forward transforming the model outputs for prior/posterior predictive. It would add some interesting flexibility.
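The round trip being proposed could be sketched as below (hypothetical helper names; the sigmoid/logit pair mirrors the form an interval transform takes, per the later comments in this thread):

```python
import numpy as np

def backward(x, a, b):
    # unconstrained -> (a, b), via a scaled sigmoid
    return a + (b - a) / (1.0 + np.exp(-x))

def forward(y, a, b):
    # (a, b) -> unconstrained, via the logit (inverse of backward)
    u = (y - a) / (b - a)
    return np.log(u) - np.log(1.0 - u)

obs = np.array([0.2, 0.5, 0.9])
# Backward-transforming the forward-transformed observations recovers them
assert np.allclose(backward(forward(obs, 0.0, 1.0), 0.0, 1.0), obs)
```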
The transformations are not just deterministic transformations; they are accompanied by a Jacobian so that, when doing MCMC sampling, the prior specified by the RV is respected. There's no principled way of doing that with forward sampling. The forward pass of an interval-transformed Normal would look like neither a Normal nor a TruncatedNormal. If you want a deterministic transformation that is part of the generative graph, you should write it explicitly.
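The distributional point can be checked numerically. The sketch below (a hypothetical standalone example, not PyMC code) pushes Normal draws through a sigmoid-style interval map without the Jacobian correction MCMC applies, and compares the result to the matching truncated normal:

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)
low, high = 0.0, 1.0

# Deterministic forward pass of an interval transform (scaled sigmoid)
x = rng.normal(size=100_000)
y = low + (high - low) / (1.0 + np.exp(-x))

# A standard Normal truncated to (0, 1), for comparison
t = truncnorm.rvs(0.0, 1.0, loc=0.0, scale=1.0, size=100_000, random_state=0)

# The squashed Normal is symmetric around 0.5; the truncated Normal is not,
# so the two distributions have visibly different means
assert abs(y.mean() - 0.5) < 0.01
assert abs(y.mean() - t.mean()) > 0.02
```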
Ok. That would explain the use of the exponential and sigmoid I noticed in the interval transformation code :) Thank you again for clarifying. Maybe it still makes sense to add a "mostly for internal use" warning to https://www.pymc.io/projects/docs/en/stable/api/distributions/transforms.html ?
Regarding them not working in prior/posterior predictive, and this being a surprise to users: there's a proposal to distinguish between default automatic transforms and user-defined ones: #5674. We could then issue a warning that they have no effect in those cases, like we do for models with Potentials.
I think a note in the docs could be super valuable: explain the normal use cases, and what transforms do and do not do.
Hello, first-time contributor here. I'd struggled to wrap my head around the transforms and figured that working on this issue would help me learn about them. In writing the docs I came across some things that are still unclear to me; I thought I'd mention them here, either to get feedback on incorporating them into the current PR, or perhaps they might become separate issues if need be: Is there any intended convention for the naming of Transform subclasses? In some cases, such as … Is there a principled distinction between the Transforms defined in …
I don't think so. What you describe looks like a difference in emphasis between 1) transforms used to unconstrain sampling and 2) transforms used to distort sampling.
This is partly a legacy issue, partly a partial (no pun intended) overlap issue. The transforms in distribution.transforms are the user-facing ones. Some of these correspond to a subset of the transforms from the logprob module, which are used in more internal contexts (such as logprob inference). There are also some transforms that are only useful for users, like Ordered, which are defined only in distribution.transforms. TLDR: transforms needed for internal consumption are defined in the logprob module; transforms needed for user consumption are defined or made available in the distributions module. Users shouldn't have to know about those in the logprob module, but it won't hurt them if they stumble across them accidentally either.
Some transforms have to be operationalized (e.g., ZeroSumAxes requires knowing the number of axes of summation). Other transforms can always be used the same way, so we instantiate them once for users. There was some back and forth recently, so there may be cases that are not up to date.
Describe the issue:
If a transformation is applied to an observed variable, it seems to be flat-out ignored.
In the example below, the observations are outside the interval given to pm.Normal, but it still fits as if the transformation were not there. Sampling the posterior predictive also ignores the transform, giving values way out of range.
Transformations are currently not very well documented, so I might be misunderstanding something, but having an interval transform keep values within the interval, even on observations, seems like a sensible expectation, so I'm filing this as a bug.
Reproducible code example:
Error message:
No response
PyMC version information:
PyMC 5.9.2
Context for the issue:
No response