Skip to content

Variables converted to shared variables by assigning step methods do not have broadcastable dimensions identified #3337

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ctm22396 opened this issue Jan 11, 2019 · 1 comment

Comments

@ctm22396
Copy link
Contributor

ctm22396 commented Jan 11, 2019

I was running a model that required broadcasting a variable along a length 1 dimension. For example:

import numpy as np
import pymc3 as pm

y = np.random.choice([0, 1], size=(100, 40))

with pm.Model() as model:
    a = pm.Lognormal('a', shape=(1, 40))
    b = pm.Normal('b', shape=(1, 40))
    
    theta = pm.Normal('theta', shape=(100, 1))
    logit_p = pm.math.dot(theta, a) + b

    yhat = pm.Bernoulli('yhat', logit_p=logit_p, observed=y)

    trace = pm.sample(50, tune=0, chains=1,
              step = pm.Metropolis(vars=[theta]))

Assigning a separate step method to a, b and theta (or rather setting a step method to theta and letting a, b be auto-assigned) would yield the following error:

TypeError: Type TensorType(float64, matrix) (of Variable theta_shared__) \
into Type TensorType(float64, col) ...[truncated]

The issue comes from when a, b, and theta are transformed into a shared variable when either initializing the step method (in the call to make_shared_replacements made by Metropolis) or initializing the ValueGradFunction object (called by NUTS), the broadcastable argument is not provided. By default theano shared variables are not broadcastable. So without setting the broadcastable argument, the dot on theta and a fails.

This can be solved by setting the theano.shared broadcastable argument equal to var.broadcastable. That way, the shared replacement for the has the same broadcasting rules as the pymc variable it is standing in for. If the variable is scalar, the broadcastable argument cannot be set, so we must split the cases as follows:

Example from theanof.py make_shared_replacements: Instead of

return {var: theano.shared(var.tag.test_value, var.name + '_shared') for var in othervars}

I propose

out = {}
for var in othervars:
    broadcast = var.broadcastable
    if len(broadcast):
        out[var] = theano.shared(var.tag.test_value, var.name + '_shared',
                                 broadcastable=broadcast) # <- this is the solution
    else:
        out[var] = theano.shared(var.tag.test_value, var.name + '_shared')
return out

and in model.py __init__ of ValueGradFunction: Instead of

for var in extra_vars:
    shared = theano.shared(var.tag.test_value, var.name + '_shared__')
    self._extra_vars_shared[var.name] = shared

I propose

for var in extra_vars:
    broadcast = var.broadcastable
    if len(broadcast):
        shared = theano.shared(var.tag.test_value,
                               var.name + '_shared__',
                               broadcastable=broadcast)
    else:
        shared = theano.shared(var.tag.test_value,
                               var.name + '_shared__')

    self._extra_vars_shared[var.name] = shared

And likewise wherever else this problem arises.

Here is additional information about my environment:

  • PyMC3 Version: 3.6
  • Theano Version: 1.0.2
  • Python Version: 3.7.1
  • Operating system: linux
  • How did you install PyMC3: conda
michaelosthege pushed a commit that referenced this issue Mar 10, 2021
It seems like broadcasting information gets lost when applying
`pm.make_shared_replacements`, leading to problems with the metropolis
sampler. Potentially related issues below:
 - #1083
 - #1304
 - #1983

This fix was previously suggested in the following issue:
 - #3337

It could be that further adaptations are necessary as indicated in the
issue. Strangely, this does not seem to lead to problems when using
NUTS.
@michaelosthege
Copy link
Member

closed by #4492

michaelosthege pushed a commit to michaelosthege/pymc that referenced this issue Mar 10, 2021
It seems like broadcasting information gets lost when applying
`pm.make_shared_replacements`, leading to problems with the metropolis
sampler. Potentially related issues below:
 - pymc-devs#1083
 - pymc-devs#1304
 - pymc-devs#1983

This fix was previously suggested in the following issue:
 - pymc-devs#3337

It could be that further adaptations are necessary as indicated in the
issue. Strangely, this does not seem to lead to problems when using
NUTS.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants