
Prior predictive sampling does not work with Dirichlet Process example #3789

Closed
rpgoldman opened this issue Jan 27, 2020 · 7 comments

@rpgoldman
Contributor

Description of the problem

I copied the dp_mix.ipynb notebook (Austin Rochford's Dirichlet Process example) to a working directory. Then I started it up, and built his model as follows:

with pm.Model() as model:
    alpha = pm.Gamma('alpha', 1., 1.)
    beta = pm.Beta('beta', 1., alpha, shape=K)
    w = pm.Deterministic('w', stick_breaking(beta))

    tau = pm.Gamma('tau', 1., 1., shape=K)
    lambda_ = pm.Gamma('lambda_', 10., 1., shape=K)
    mu = pm.Normal('mu', 0, tau=lambda_ * tau, shape=K)
    obs = pm.NormalMixture('obs', w, mu, tau=lambda_ * tau,
                           observed=old_faithful_df.std_waiting.values)
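For context, stick_breaking here is the truncated stick-breaking transform from the notebook. Roughly, in NumPy terms (my sketch of the weight computation, not the notebook's exact Theano code; the function name is assumed):

```python
import numpy as np

def stick_breaking_np(beta):
    # Truncated stick-breaking: w_k = beta_k * prod_{j<k} (1 - beta_j)
    portion_remaining = np.concatenate([[1.0], np.cumprod(1.0 - beta)[:-1]])
    return beta * portion_remaining

# With beta = [0.5, 0.5, 0.5] the weights come out as [0.5, 0.25, 0.125]
w = stick_breaking_np(np.array([0.5, 0.5, 0.5]))
```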

When I try to use pm.sample_prior_predictive() on this model, I get the following error:

Full traceback:

ValueError                                Traceback (most recent call last)
~/src/pymc3/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
    812                 try:
--> 813                     return dist_tmp.random(point=point, size=size)
    814                 except (ValueError, TypeError):

~/src/pymc3/pymc3/distributions/mixture.py in random(self, point, size)
    442             comp_dist_shapes, broadcast_shape = (
--> 443                 self.infer_comp_dist_shapes(point=point)
    444             )

~/src/pymc3/pymc3/distributions/mixture.py in infer_comp_dist_shapes(self, point)
    365                     test_sample = self._comp_dists.random(point=point,
--> 366                                                           size=None)
    367                     comp_dist_shapes = test_sample.shape

~/src/pymc3/pymc3/distributions/continuous.py in random(self, point, size)
    494         mu, tau, _ = draw_values([self.mu, self.tau, self.sigma],
--> 495                                  point=point, size=size)
    496         return generate_samples(stats.norm.rvs, loc=mu, scale=tau**-0.5,

~/src/pymc3/pymc3/distributions/distribution.py in draw_values(params, point, size)
    649             if to_eval == missing_inputs:
--> 650                 raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
    651             to_eval = set(missing_inputs)

ValueError: Cannot resolve inputs for ['Elemwise{mul,no_inplace}.0', 'Elemwise{mul,no_inplace}.0']

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-22-1660b0c992d3> in <module>
      1 with model:
----> 2     samples = pm.sample_prior_predictive()

~/src/pymc3/pymc3/sampling.py in sample_prior_predictive(samples, model, vars, var_names, random_seed)
   1733     names = get_default_varnames(vars_, include_transformed=False)
   1734     # draw_values fails with auto-transformed variables. transform them later!
-> 1735     values = draw_values([model[name] for name in names], size=samples)
   1736 
   1737     data = {k: v for k, v in zip(names, values)}

~/src/pymc3/pymc3/distributions/distribution.py in draw_values(params, point, size)
    630                                         point=point,
    631                                         givens=temp_givens,
--> 632                                         size=size)
    633                     givens[next_.name] = (next_, value)
    634                     drawn[(next_, size)] = value

~/src/pymc3/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
    820                     with _DrawValuesContextBlocker():
    821                         val = np.atleast_1d(dist_tmp.random(point=point,
--> 822                                                             size=None))
    823                     # Sometimes point may change the size of val but not the
    824                     # distribution's shape

~/src/pymc3/pymc3/distributions/mixture.py in random(self, point, size)
    441             w = draw_values([self.w], point=point, size=size)[0]
    442             comp_dist_shapes, broadcast_shape = (
--> 443                 self.infer_comp_dist_shapes(point=point)
    444             )
    445 

~/src/pymc3/pymc3/distributions/mixture.py in infer_comp_dist_shapes(self, point)
    364                 with _DrawValuesContextBlocker():
    365                     test_sample = self._comp_dists.random(point=point,
--> 366                                                           size=None)
    367                     comp_dist_shapes = test_sample.shape
    368             broadcast_shape = comp_dist_shapes

~/src/pymc3/pymc3/distributions/continuous.py in random(self, point, size)
    493         """
    494         mu, tau, _ = draw_values([self.mu, self.tau, self.sigma],
--> 495                                  point=point, size=size)
    496         return generate_samples(stats.norm.rvs, loc=mu, scale=tau**-0.5,
    497                                 dist_shape=self.shape,

~/src/pymc3/pymc3/distributions/distribution.py in draw_values(params, point, size)
    648         while to_eval or missing_inputs:
    649             if to_eval == missing_inputs:
--> 650                 raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
    651             to_eval = set(missing_inputs)
    652             missing_inputs = set()

ValueError: Cannot resolve inputs for ['Elemwise{mul,no_inplace}.0', 'Elemwise{mul,no_inplace}.0']

I am able to successfully sample all the other variables in the model, including the parents of the obs random variable, but if I try to sample obs, I get that error.

Versions and main components

  • PyMC3 Version: Head
  • Theano Version: 1.0.4
  • Python Version: 3.6.8
  • Operating system: Mac OS
  • How did you install PyMC3: pip install -e from my git repo
@rpgoldman
Contributor Author

This trivial NormalMixture seems to sample correctly:

with pm.Model() as test_model:
    pm.NormalMixture('obs', [0.1, 0.8, 0.1], [-1, 0, 1], tau=5)
    samples = pm.sample_prior_predictive()

It seems like there's some problem with the need to infer the shape of the component normals in this case. But shouldn't the shape just be the shape of the parent mixture variable?

@lucianopaz
Contributor

Could you try explicitly providing comp_shape=(K,) to the NormalMixture? The error you're seeing happens because the mixture class can't infer the shape of the mixture components correctly.
Another thing: the traceback you copied has different line numbering from the files on the master branch. Are you working ahead of master? That shouldn't be a problem as long as only docstrings were changed, but make sure you aren't behind it for some reason.

@rpgoldman
Contributor Author

@lucianopaz

  1. I'm working on my vectorize-posterior-predictive branch, because speedy prediction is critical to my project. But that does not touch the infrastructure for prior predictive sampling: I inject a stub into sampling.py that redirects draw_values when my vectorized sampler is active.

  2. I'm afraid that changing the model to the following does not solve the problem:

    with pm.Model() as model:
        alpha = pm.Gamma('alpha', 1., 1.)
        beta = pm.Beta('beta', 1., alpha, shape=K)
        w = pm.Deterministic('w', stick_breaking(beta))
    
        tau = pm.Gamma('tau', 1., 1., shape=K)
        lambda_ = pm.Gamma('lambda_', 10., 1., shape=K)
        mu = pm.Normal('mu', 0, tau=lambda_ * tau, shape=K)
        obs = pm.NormalMixture('obs', w, mu, tau=lambda_ * tau, comp_shape=(K,),
                               observed=old_faithful_df.std_waiting.values)
    

    ... does not solve the problem, although the error is now different: the probabilities don't sum to 1:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    ~/src/pymc3/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
        812                 try:
    --> 813                     return dist_tmp.random(point=point, size=size)
        814                 except (ValueError, TypeError):
    
    ~/src/pymc3/pymc3/distributions/mixture.py in random(self, point, size)
        496         w_samples = random_choice(p=w_,
    --> 497                                   size=None)  # w's shape already includes size
        498         # Now we broadcast the chosen components to the dist_shape
    
    ~/src/pymc3/pymc3/distributions/dist_math.py in random_choice(*args, **kwargs)
        337         p = np.reshape(p, (-1, p.shape[-1]))
    --> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
        339         # We reshape to the desired output shape
    
    ~/src/pymc3/pymc3/distributions/dist_math.py in <listcomp>(.0)
        337         p = np.reshape(p, (-1, p.shape[-1]))
    --> 338         samples = np.array([np.random.choice(k, p=p_) for p_ in p])
        339         # We reshape to the desired output shape
    
    mtrand.pyx in numpy.random.mtrand.RandomState.choice()
    
    ValueError: probabilities do not sum to 1
    
    During handling of the above exception, another exception occurred:
    
    ValueError                                Traceback (most recent call last)
    <ipython-input-26-1660b0c992d3> in <module>
          1 with model:
    ----> 2     samples = pm.sample_prior_predictive()
    
    ~/src/pymc3/pymc3/sampling.py in sample_prior_predictive(samples, model, vars, var_names, random_seed)
       1733     names = get_default_varnames(vars_, include_transformed=False)
       1734     # draw_values fails with auto-transformed variables. transform them later!
    -> 1735     values = draw_values([model[name] for name in names], size=samples)
       1736 
       1737     data = {k: v for k, v in zip(names, values)}
    
    ~/src/pymc3/pymc3/distributions/distribution.py in draw_values(params, point, size)
        630                                         point=point,
        631                                         givens=temp_givens,
    --> 632                                         size=size)
        633                     givens[next_.name] = (next_, value)
        634                     drawn[(next_, size)] = value
    
    ~/src/pymc3/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
        820                     with _DrawValuesContextBlocker():
        821                         val = np.atleast_1d(dist_tmp.random(point=point,
    --> 822                                                             size=None))
        823                     # Sometimes point may change the size of val but not the
        824                     # distribution's shape
    
    ~/src/pymc3/pymc3/distributions/mixture.py in random(self, point, size)
        537                 size=mixture_size,
        538                 broadcast_shape=broadcast_shape,
    --> 539                 comp_dist_shapes=comp_dist_shapes,
        540             )
        541         # Test that the mixture has the same number of "samples" as w
    
    ~/src/pymc3/pymc3/distributions/mixture.py in _comp_samples(self, point, size, comp_dist_shapes, broadcast_shape)
        298                       broadcast_shape=None):
        299         if self.comp_is_distribution:
    --> 300             samples = self._comp_dists.random(point=point, size=size)
        301         else:
        302             if comp_dist_shapes is None:
    
    ~/src/pymc3/pymc3/distributions/continuous.py in random(self, point, size)
        493         """
        494         mu, tau, _ = draw_values([self.mu, self.tau, self.sigma],
    --> 495                                  point=point, size=size)
        496         return generate_samples(stats.norm.rvs, loc=mu, scale=tau**-0.5,
        497                                 dist_shape=self.shape,
    
    ~/src/pymc3/pymc3/distributions/distribution.py in draw_values(params, point, size)
        648         while to_eval or missing_inputs:
        649             if to_eval == missing_inputs:
    --> 650                 raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
        651             to_eval = set(missing_inputs)
        652             missing_inputs = set()
    
    ValueError: Cannot resolve inputs for ['Elemwise{mul,no_inplace}.0', 'Elemwise{mul,no_inplace}.0']
    
  3. The docstring for comp_shape is not at all clear to me:

        comp_shape : shape of the Normal component
            notice that it should be different than the shape
            of the mixture distribution, with one axis being
            the number of components.
    

    In this test case, there are K normal components in the mix. So does that mean that the comp_shape should be (K,) or does it mean that there are K different components each of shape (1,)? The docstring can be read either way.

    If you can answer that question, I will make an MR that puts an example into the docstring.

@lucianopaz
Contributor

About point 3: yes, the comp_shape should be (K,). It should be something that can broadcast with dist_shape + (K,); that's why it's ambiguous. For example, if your mixture had shape (3,), then the mixture components could have shape (3, K), (1, K), or (K,).
About point 2: something is definitely broken. _comp_dists.random(point=point, size=size) shouldn't raise an error like that. I'll check it out.
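In NumPy broadcasting terms, the rule is (my sketch of the constraint, not PyMC3 code):

```python
import numpy as np

K = 5
dist_shape = (3,)           # shape of the mixture variable
target = dist_shape + (K,)  # (3, 5): comp_shape must broadcast against this
for comp_shape in [(3, K), (1, K), (K,)]:
    # each candidate broadcasts to the target without changing it
    assert np.broadcast_shapes(comp_shape, target) == target
```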

@lucianopaz
Contributor

I found the bug in draw_values and will open a PR with a fix shortly. It's linked to an old issue that I had patched up in b9f960a. I now realize that patch was a bit wrong, and I've figured out a more general way to solve both problems.

@michaelosthege
Member

This was possibly fixed by v4, where the entire random sampling procedure works completely differently.

I'm going to mark this as "needs info" instead of closing, though. We should take the example from above and check whether it works in v4.

@michaelosthege michaelosthege added bug needs info Additional information required labels Jun 5, 2021
@ricardoV94
Member

I confirmed this works in v4. The only issue is that the weights sometimes overflow above 1, leading to an error in predictive sampling of the internal Mixture categorical variable. Setting w = w/w.sum() fixes it.
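A NumPy illustration of why renormalizing helps: truncated stick-breaking weights never sum to exactly 1, and floating-point error can push the sum past 1 (my sketch, with assumed names, not the v4 model code):

```python
import numpy as np

rng = np.random.default_rng(1)
K = 30
beta = rng.beta(1.0, 1.0, size=K)
# Truncated stick-breaking weights
portion_remaining = np.concatenate([[1.0], np.cumprod(1.0 - beta)[:-1]])
w = beta * portion_remaining
# Truncation leaves an unassigned remainder, and rounding can overshoot,
# so w.sum() is only approximately 1. Renormalize before using the weights
# as categorical probabilities:
w = w / w.sum()
```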

@ricardoV94 ricardoV94 added wontfix and removed needs info Additional information required labels Nov 3, 2022
@ricardoV94 ricardoV94 closed this as not planned Won't fix, can't repro, duplicate, stale Nov 3, 2022