Preregister shapes of sampler stats #6517

michaelosthege · 2023-02-11T21:12:09Z

As described in #6503, this adds a stats_dtypes_shapes attribute to BlockedStep to replace BlockedStep.stats_dtypes.

It is implemented in a backwards-compatible manner with a deprecation warning for samplers that are not yet updated.
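To illustrate what a backwards-compatible fill-in with a deprecation warning can look like, here is a minimal sketch. The helper name `fill_stats_info`, its signature, and the use of `None` for an unknown shape are assumptions made for this example; it is not the code from this PR.

```python
import warnings


def fill_stats_info(stats_dtypes, stats_dtypes_shapes, stepname):
    """Return both stats attributes, inferring the missing one (hypothetical helper)."""
    # Copy so that the caller's class attributes are not mutated.
    stats_dtypes = [dict(d) for d in stats_dtypes]
    stats_dtypes_shapes = dict(stats_dtypes_shapes)

    if stats_dtypes and stats_dtypes_shapes:
        raise TypeError(f"{stepname} sets both attributes; set only stats_dtypes_shapes.")

    if stats_dtypes:
        # Legacy path: the step method only declared dtypes.
        warnings.warn(
            f"{stepname}.stats_dtypes is deprecated; use stats_dtypes_shapes instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if len(stats_dtypes) > 1:
            raise TypeError(f"{stepname} declares more than one stats dict.")
        # Legacy declarations carry no shape information (assumption: None = unknown).
        stats_dtypes_shapes = {
            name: (dtype, None) for name, dtype in stats_dtypes[0].items()
        }
    elif stats_dtypes_shapes:
        # New path: derive the legacy attribute from the new one.
        stats_dtypes = [
            {name: dtype for name, (dtype, _) in stats_dtypes_shapes.items()}
        ]
    return stats_dtypes, stats_dtypes_shapes


# A step method that already uses the new attribute triggers no warning.
legacy_view, new_view = fill_stats_info([], {"tune": (bool, [])}, "NewStep")
```

Samplers that only define the old attribute keep working under this scheme, but their authors see a warning pointing at the replacement.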

I also specified all stat shapes I was confident about. If someone with NUTS stat experience could comment on the remaining ones, that'd be great!

Related issues

Checklist

Explain important implementation details 👆
Make sure that the pre-commit linting/style checks pass.
Link relevant issues (preferably in nice commit messages)
Are the changes covered by tests and docstrings?
Fill out the short summary sections 👇

New features

Step methods can now pre-register the shape of emitted sampler stats using the stats_dtypes_shapes class attribute. The stats_dtypes attribute is being deprecated.
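As a rough illustration of why pre-registered shapes are useful, the sketch below declares stats on a toy class and preallocates storage for them before sampling. `ToyStep` and `allocate` are hypothetical names for this example, and plain nested lists stand in for arrays to keep it dependency-free.

```python
class ToyStep:
    """Hypothetical step method; not a real PyMC class."""

    # stat name -> (dtype, shape); [] is a scalar, [3] is a length-3 vector
    stats_dtypes_shapes = {
        "accepted": (bool, []),
        "scaling": (float, [3]),
    }


def allocate(stats_dtypes_shapes, draws):
    """Preallocate one buffer per stat with shape (draws, *shape)."""

    def zeros(dims, dtype):
        # No dimensions left: return the dtype's default value (False, 0.0, ...).
        if not dims:
            return dtype()
        return [zeros(dims[1:], dtype) for _ in range(dims[0])]

    return {
        name: zeros([draws, *shape], dtype)
        for name, (dtype, shape) in stats_dtypes_shapes.items()
    }


buffers = allocate(ToyStep.stats_dtypes_shapes, draws=10)
```

Because every shape is known up front, a storage layer can size its buffers once instead of inferring them from the first emitted stats.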

codecov · 2023-02-11T21:56:01Z

Codecov Report

Merging #6517 (b14897e) into main (31c30dc) will decrease coverage by 71.05%.
The diff coverage is 35.89%.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #6517       +/-   ##
===========================================
- Coverage   94.73%   23.69%   -71.05%     
===========================================
  Files         147      147               
  Lines       27864    27913       +49     
===========================================
- Hits        26398     6613    -19785     
- Misses       1466    21300    +19834

| Impacted Files | Coverage Δ | |
|---|---|---|
| pymc/tests/sampling/test_mcmc.py | `0.00% <0.00%> (-98.61%)` | ⬇️ |
| pymc/tests/step_methods/test_compound.py | `0.00% <0.00%> (-100.00%)` | ⬇️ |
| pymc/step_methods/compound.py | `54.66% <52.94%> (-42.80%)` | ⬇️ |
| pymc/blocking.py | `87.50% <100.00%> (-8.16%)` | ⬇️ |
| pymc/step_methods/hmc/hmc.py | `35.71% <100.00%> (-57.15%)` | ⬇️ |
| pymc/step_methods/hmc/nuts.py | `95.27% <100.00%> (-2.03%)` | ⬇️ |
| pymc/step_methods/metropolis.py | `20.46% <100.00%> (-63.12%)` | ⬇️ |
| pymc/step_methods/slicer.py | `27.50% <100.00%> (-68.75%)` | ⬇️ |
| pymc/tests/gp/test_gp.py | `0.00% <0.00%> (-100.00%)` | ⬇️ |
| pymc/tests/gp/test_mean.py | `0.00% <0.00%> (-100.00%)` | ⬇️ |

... and 125 more

michaelosthege · 2023-02-11T22:20:18Z

@covertg want to take a shot at reviewing here?

covertg · 2023-02-14T20:34:30Z

Sure, I'll do my best! First time for pymc so obviously please take it with some grains of salt, but hope this is helpful. I don't have the experience with NUTS to confirm the shapes and dtypes so I won't comment there.

covertg

Main comment is regarding whether we really want to fix stats_dtypes and stats_dtypes_shapes in the constructor __new__, or whether it would be better to allow the step method to modify them later.

covertg · 2023-02-14T20:21:24Z

pymc/step_methods/compound.py

@@ -77,12 +131,21 @@ def __new__(cls, *args, **kwargs):
        if len(vars) == 0:
            raise ValueError("No free random variables to sample.")

+        # Auto-fill stats metadata attributes from whichever was given.
+        stats_dtypes, stats_dtypes_shapes = infer_warn_stats_info(


Assigning stats_dtypes and stats_dtypes_shapes here in the constructor means that they could fall out of sync later, if only one were to get modified. Is this a case we should consider? I could imagine this happening if a step method wanted to determine stat shape at initialization, for example perhaps for a stat shape that varies with the number of variables passed to the step method.

I'm not sure if that is a compelling case or not. But if it is, perhaps these two attributes would be better exposed via @property with getters and setters to ensure they stay in sync?

michaelosthege · 2023-02-15

I agree that we should get the flexibility to have samplers initialize stats on instantiation instead of specifying them as class attributes.

I also considered properties, but this wouldn't have worked nicely with the assignment of these fields in the class definition.

I don't think that sync is a problem because we can remove the old attribute.

I will think about what a refactor to definition at initialization would look like.
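For reference, one way the two ideas in this thread could combine is sketched below: the shapes are set at initialization, where they can depend on the number of variables, and the legacy attribute is a read-only property derived from them, so the two cannot drift apart. `VarSizedStep` and its `n_vars` argument are hypothetical and not part of this PR.

```python
class VarSizedStep:
    """Hypothetical step method whose stat shape depends on the number of variables."""

    def __init__(self, n_vars):
        # Single source of truth, decided at instantiation time.
        self.stats_dtypes_shapes = {
            "accepted": (bool, []),
            "scaling": (float, [n_vars]),
        }

    @property
    def stats_dtypes(self):
        # Derived on every access, so it always reflects stats_dtypes_shapes.
        return [
            {name: dtype for name, (dtype, _) in self.stats_dtypes_shapes.items()}
        ]


step = VarSizedStep(n_vars=4)
step.stats_dtypes_shapes["tune"] = (bool, [])  # later modification
legacy = step.stats_dtypes  # includes "tune" without any extra bookkeeping
```

This sidesteps the class-definition problem mentioned above only because nothing is assigned in the class body; a sampler that still assigns `stats_dtypes` there would shadow the property.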

covertg · 2023-02-14T20:29:16Z

pymc/step_methods/compound.py

+    result = {}
+    for s, step in enumerate(steps):
+        for sname, (dtype, shape) in step.stats_dtypes_shapes.items():
+            result[f"sampler_{s}__{sname}"] = (dtype, shape)


Currently these dictionary keys are not passed to arviz when creating InferenceData objects. But if/when they are, we'll probably want a way for the user to map back from <step name> to <list of variables that were sampled by that stepper>

michaelosthege · 2023-02-15

Agreed. ArviZ should add such a field, but even before that gets done we can add this to mcbackend.RunMeta.
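A minimal sketch of the mapping discussed in this thread: alongside the flattened `sampler_{s}__{sname}` keys, record which variables each step sampled. The `var_names` attribute and the `flatten_stats` function are assumptions for illustration only.

```python
from types import SimpleNamespace


def flatten_stats(steps):
    """Flatten per-step stats and remember which variables each key belongs to."""
    result = {}
    owners = {}
    for s, step in enumerate(steps):
        for sname, (dtype, shape) in step.stats_dtypes_shapes.items():
            key = f"sampler_{s}__{sname}"
            result[key] = (dtype, shape)
            # Hypothetical attribute: names of the variables this step samples.
            owners[key] = list(step.var_names)
    return result, owners


# Toy stand-ins for two step methods in a compound step.
steps = [
    SimpleNamespace(stats_dtypes_shapes={"accepted": (bool, [])}, var_names=["mu"]),
    SimpleNamespace(stats_dtypes_shapes={"nstep_out": (int, [])}, var_names=["sigma", "tau"]),
]
flat, owners = flatten_stats(steps)
```

With such a side table, a user looking at `sampler_1__nstep_out` can look up that it came from the step that sampled `sigma` and `tau`.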

ricardoV94

Looks good, except for a nitpick

ricardoV94 · 2023-02-20T10:11:46Z

pymc/step_methods/compound.py

+    Shapes are interpreted in the following ways:
+    - `[]` is a scalar.
+    - `[3,]` is a length-3 vector.
+    - `[4, -1]` is a matrix with 4 rows and a dynamic number of columns.


I would vote to use None at the PyMC level as that maps directly to the convention used in PyTensor. When we use these to create a McBackend trace then we can map None to -1.

Other backends can use None to specify unknown shape if they want.

Suggested change

-    - `[4, -1]` is a matrix with 4 rows and a dynamic number of columns.
+    - `[4, None]` is a matrix with 4 rows and a dynamic number of columns.
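The conversion mentioned in this suggestion, mapping `None` to `-1` when handing shapes to a backend that uses `-1` for dynamic dimensions, can be sketched as follows. The function name `to_backend_shape` is hypothetical.

```python
def to_backend_shape(shape):
    """Replace None (dynamic dimension) with -1, leaving fixed sizes untouched."""
    return [-1 if dim is None else dim for dim in shape]


scalar = to_backend_shape([])          # scalar stays an empty shape
vector = to_backend_shape([3])         # fixed sizes pass through
matrix = to_backend_shape([4, None])   # dynamic number of columns
```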

Closes pymc-devs#6503

michaelosthege added the enhancements and trace-backend (Traces and ArviZ stuff) labels Feb 11, 2023

michaelosthege force-pushed the stats-refactoring branch from 168e19c to 6d2bf43 February 11, 2023 21:44

michaelosthege marked this pull request as ready for review February 11, 2023 22:14

michaelosthege requested a review from aseyboldt February 11, 2023 22:17

michaelosthege mentioned this pull request Feb 11, 2023: Optional McBackend support #6510 (merged)

michaelosthege requested a review from ferrine February 12, 2023 14:11

covertg reviewed Feb 14, 2023

Add StatsBijection.n_samplers property

2ad7835

michaelosthege force-pushed the stats-refactoring branch from 6d2bf43 to cb90744 February 18, 2023 10:56

michaelosthege requested review from ricardoV94, aloctavodia and twiecki February 18, 2023 11:00

ricardoV94 reviewed Feb 20, 2023

michaelosthege added 2 commits February 20, 2023 11:50

Introduce stats_dtypes_shape attribute

6297352

Closes pymc-devs#6503

Specify stats_dtypes_shapes for all samplers

b14897e

michaelosthege force-pushed the stats-refactoring branch from cb90744 to b14897e February 20, 2023 10:51

michaelosthege merged commit 7e9e17d into pymc-devs:main Feb 20, 2023

michaelosthege deleted the stats-refactoring branch February 20, 2023 11:08
