Sensitivity analysis and marginal effects #1673

drbenvincent · 2025-05-06T13:29:12Z

Description

This PR adds functionality for extracting insights from MMMs. It adds a new class that evaluates the MMM at a grid of parameter values (in what I'm calling a sweep). We can use this to plot marginal effects (where the sweep sets the absolute value of a driver) or to run a counterfactual sweep. A counterfactual sweep can be either multiplicative or additive: it retains the time-varying driver values but scales or shifts them.

Sweep type

Absolute - Sets all the values of the target driver to the sweep value(s). This would be used in a 'classic' marginal effects approach.
Multiplicative - Rather than setting all the values of a driver to a given sweep value, this retains the original time-varying nature of the driver but simply multiplies the values by a factor.
Additive - Similar to the multiplicative type, additive retains the time-varying driver values but shifts them up or down additively.
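
To make the three sweep types concrete, here is a minimal, dependency-free sketch of how a single sweep value might transform a driver's time series. The function name apply_sweep and the spend numbers are hypothetical and are not taken from this PR's implementation.

```python
from typing import Literal

SweepType = Literal["absolute", "multiplicative", "additive"]


def apply_sweep(driver: list[float], value: float, sweep_type: SweepType) -> list[float]:
    """Return a counterfactual copy of `driver` for one sweep value (hypothetical helper)."""
    if sweep_type == "absolute":
        # Every time step is set to the same value; the time course is discarded.
        return [value for _ in driver]
    if sweep_type == "multiplicative":
        # The time course is kept and scaled by a factor.
        return [x * value for x in driver]
    if sweep_type == "additive":
        # The time course is kept and shifted up or down.
        return [x + value for x in driver]
    raise ValueError(f"Unknown sweep_type: {sweep_type!r}")


influencer_spend = [1.0, 3.0, 2.0]  # made-up weekly spend
absolute = apply_sweep(influencer_spend, 2.0, "absolute")
multiplicative = apply_sweep(influencer_spend, 2.0, "multiplicative")
additive = apply_sweep(influencer_spend, 2.0, "additive")
```

Note that a multiplicative value of 1 and an additive value of 0 both reproduce the actual scenario, whereas an absolute sweep has no such neutral value in general.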

Plot type

plot_uplift - plots the change in total outcome variable (e.g. sales) as a function of the sweep value
plot_marginal_effects - plots the marginal effects as a function of the sweep value.
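
As a rough illustration of what the two plots show, the sketch below computes an uplift curve over a multiplicative sweep and then its finite-difference slope, which plays the role of the marginal effect. The response function total_outcome is a made-up stand-in for the fitted MMM, and the gradient helper is an assumption for illustration, not the code in this PR.

```python
def total_outcome(spend: list[float]) -> float:
    """Hypothetical saturating response summed over time (stand-in for the MMM)."""
    return sum(100.0 * x / (1.0 + x) for x in spend)


def gradient(y: list[float], x: list[float]) -> list[float]:
    """Finite differences: central in the interior, one-sided at the two ends."""
    n = len(x)
    grad = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grad.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return grad


actual_spend = [1.0, 3.0, 2.0]  # made-up driver values
baseline = total_outcome(actual_spend)
sweep_values = [0.0, 0.5, 1.0, 1.5, 2.0]  # multiplicative factors

# Uplift: change in total outcome relative to the actual scenario.
uplift = [
    total_outcome([x * factor for x in actual_spend]) - baseline
    for factor in sweep_values
]

# Marginal effect: slope of the uplift curve with respect to the sweep value.
marginal_effect = gradient(uplift, sweep_values)
```

With a saturating response like this one, the uplift is negative below a factor of 1 and the marginal effect shrinks as the sweep value grows.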

TODO

Checklist

Checked that the pre-commit linting/style checks pass. Feel free to comment pre-commit.ci autofix to auto-fix.
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks) using numpydoc format.
If you are a pro: each commit corresponds to a relevant logical change

📚 Documentation preview 📚: https://pymc-marketing--1673.org.readthedocs.build/en/1673/

review-notebook-app · 2025-05-06T13:29:17Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

codecov · 2025-05-06T13:32:23Z

Codecov Report

Attention: Patch coverage is 14.89362% with 80 lines in your changes missing coverage. Please review.

Project coverage is 90.54%. Comparing base (4c4f251) to head (4dc83bd).

Files with missing lines	Patch %	Lines
pymc_marketing/mmm/plot.py	2.22%	44 Missing ⚠️
pymc_marketing/mmm/sensitivity_analysis.py	23.80%	32 Missing ⚠️
pymc_marketing/mmm/multidimensional.py	33.33%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1673      +/-   ##
==========================================
- Coverage   91.59%   90.54%   -1.05%     
==========================================
  Files          60       61       +1     
  Lines        6778     6872      +94     
==========================================
+ Hits         6208     6222      +14     
- Misses        570      650      +80


cetagostini · 2025-05-09T07:52:44Z

Hey @drbenvincent, very nice work, I really like the functionality here. After a first quick scan, a few suggestions with the future in mind:

It would be nice to keep all work related to estimating marginal effects in the new .py file, but all related plots would be better placed in the plot suite. It's going slowly, but we are trying to move all plots there. (Bear in mind that the plot suite must be pymc.model agnostic; take as reference other plots where we infer dimensions or shapes automatically.)
I'd use the multidimensional class for the example, as we are slowly moving away from the previous class. You can create the same model; all you need to do is from pymc_marketing.mmm.multidimensional import MMM. The signature is the same, except that you need to define the target_column. That's it :)

Doing this, we can hopefully aim for a signature like:

sweep_values = mmm.counterfactual_sweep(
    predictors=["influencer_spend"],
    sweep_values=np.linspace(0, 2, 12),
    sweep_type="absolute",
)

optimizable_model.plot.marginal_effects(
    samples=sweep_values,
);

This will be consistent with how other plots and methods are used; for example, the optimizer works in a very similar way. How does that sound?

On the other hand, I left a few minor questions!

Note

I saw an issue with drivers being scaled (make sure the sweep values are applied in the original space). If you are using the new MMM class, you can put any variable in the graph on the original scale. That means you can create the original-scale variable and probably change your signature from predictors=["influencer_spend"] to var_names=["influencer_spend_original_scale"]. Users could then use any variable to plot these effects, and you can probably stop it from running on arbitrary variables by raising an error if the var_name is not in the coords for channel data (?) or similar.

review-notebook-app · 2025-05-09T08:03:48Z

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2025-05-09T08:03:47Z
----------------------------------------------------------------

Why does the total uplift start negative? That confuses me a bit. Regarding the second plot, might it be better to just call it the derivative?

drbenvincent commented on 2025-05-09T08:35:50Z
----------------------------------------------------------------

The top plot starts negative on the left because we've gone from some influencer spend to zero influencer spend. The y-axis is change in uplift relative to the actual scenario, so it answers the question of "what would have happened to our total sales if we had spent zero on influencers?"

Second plot - I think it makes sense to keep the "marginal effects" terminology because that is quite popular in stats circles. There's certainly scope for changing the y-axis label, but I think it makes sense to describe the marginal effects as the derivative when I flesh out the notebook text.

review-notebook-app · 2025-05-09T08:03:49Z

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2025-05-09T08:03:48Z
----------------------------------------------------------------

This confuses me a bit more. Why does the derivative look like this?

drbenvincent commented on 2025-05-09T08:37:49Z
----------------------------------------------------------------

Good point. Check the y-axis scale. The top plot is essentially linear, so the derivative is actually a constant value. But when the axes are scaled so tightly around the values it magnifies numerical imprecision. I will see what I can do with y-axis scaling to better visually portray that the marginal effect here is flat as a function of the variable being manipulated on the x-axis

drbenvincent commented on 2025-05-15T11:27:17Z
----------------------------------------------------------------

Resolved in fb51cc4
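
The effect described in this thread can be reproduced with a toy example: the finite-difference slope of a perfectly linear uplift curve is constant up to floating-point error, so an axis that auto-scales to the data range shows only noise. The padded_ylim helper below is one hypothetical way to enforce a minimum axis span; it is an assumption for illustration and not necessarily what fb51cc4 does.

```python
# A linear uplift curve over a multiplicative sweep (made-up slope of 3.7).
sweep_values = [i * 0.2 for i in range(11)]
uplift = [3.7 * v - 3.7 for v in sweep_values]

# Forward differences between neighbouring sweep points.
slopes = [
    (y1 - y0) / (x1 - x0)
    for (x0, y0), (x1, y1) in zip(
        zip(sweep_values, uplift), zip(sweep_values[1:], uplift[1:])
    )
]

mean_slope = sum(slopes) / len(slopes)
# Spread of the slopes relative to their size: pure numerical imprecision.
relative_spread = (max(slopes) - min(slopes)) / abs(mean_slope)


def padded_ylim(values: list[float], min_rel_span: float = 0.1) -> tuple[float, float]:
    """Return y-limits whose span is at least `min_rel_span` times the centre value."""
    centre = (max(values) + min(values)) / 2
    half_span = max((max(values) - min(values)) / 2, min_rel_span * abs(centre) / 2)
    return centre - half_span, centre + half_span


y_low, y_high = padded_ylim(slopes)
```

Passing limits like these to the plotting call would make a flat marginal effect look flat, because the tiny numerical spread no longer fills the axis.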

review-notebook-app · 2025-05-09T08:03:50Z

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2025-05-09T08:03:49Z
----------------------------------------------------------------

What happens when you move to a multiplicative change? Does this mean you change the nature of the model to multiplicative?

drbenvincent commented on 2025-05-09T08:39:43Z
----------------------------------------------------------------

Ah, so this will be better explained as I flesh out the notebook text. The additive, multiplicative, or absolute options are how we are manipulating the target driver. So setting the sweep values to have a multiplicative effect keeps the time course of their values but literally just scales them up or down. Nothing about the model is changed other than how we are manipulating the target driver.


drbenvincent · 2025-05-09T09:14:17Z

All related work to estimation of marginal effects would be nice to have it in the .py file created, but all related plots would be better to have in the plot suite. It's going slowly but we are trying to move all plots there (Consider this plot suite must be pymc.model agnostic, take as reference other plots were we infer dimensions or shapes automatically).

So it's well worth having this discussion now. Bear in mind that the counterfactual sweep returns a more complex idata: a set of uplifts but with an additional sweep dimension. At the moment this is stored as its own thing, but I guess I could store it as a new DataArray in the idata.

Happy to be guided by you guys about the best way to incorporate this into the codebase. I think there are a few options:

1. Make the counterfactual sweep a method of the MMM class
   - Either return a results object with its own plot methods
   - Or add a new sweep DataArray to the MMM's idata and add new plot methods to the MMMPlotSuite
2. Keep plot functions in the CounterfactualSweep, but have these just point to the real plot functionality, which lives in MMMPlotSuite
3. Keep it as it is. Probably not what you want.

I'm guessing (1) might be your favourite? Any thoughts from @williambdean on this? Would be ideal to pin down the API and general approach early, to avoid having to do any reimplementation/restructuring.

pymc_marketing/mmm/marginal_effects.py

Co-authored-by: Will Dean <[email protected]>

drbenvincent · 2025-05-15T11:03:23Z

@cetagostini I've had luck switching to pymc_marketing.mmm.multidimensional.MMM, however my sweep code calls a method

actual = self.mmm._get_group_predictive_data(
    group="posterior_predictive", original_scale=True
)["y"]

which does not seem to have made it from the original MMM class into the new multidimensional one.

pymc-marketing/pymc_marketing/mmm/base.py

Lines 275 to 295 in 6d71a20

    
    def _get_group_predictive_data(
        self,
        group: Literal["prior_predictive", "posterior_predictive"],
        original_scale: bool = False,
    ) -> Dataset:
        """Get the prior or posterior predictive data."""
        try:
            group_data: Dataset = getattr(self, group)
        except Exception as e:
            raise RuntimeError(
                f"Make sure the model has been fitted and the {group} has been sampled!"
            ) from e
        if original_scale:
            group_data = apply_sklearn_transformer_across_dim(
                data=group_data,
                func=self.get_target_transformer().inverse_transform,
                dim_name="date",
            )
        return group_data

Any chance you could add this method in, or otherwise let me know how to easily access the posterior predictions in the original scale? I can see how you do it in the multidimensional example notebook, but it would be useful if a user didn't have to remember to rescale manually.

drbenvincent · 2025-05-27T16:09:06Z

FYI. At this point of my refactor, the API is:

results: xr.Dataset = CounterfactualSweep(
    mmm=mmm,
    predictors=["influencer_spend"],
    sweep_values=np.linspace(0, 2, 12),  # Set spend directly from 0 to $100k
    sweep_type="absolute",
).run_sweep()
CounterfactualSweep.plot_uplift(results);
CounterfactualSweep.plot_marginal_effects(results);

The changes are:

It no longer auto-computes; the sweep needs to be triggered with the run_sweep method.
Results are now self-contained in an xr.Dataset, not stored inside the class.
Plot methods only require this xr.Dataset.
X no longer has to be passed into the CounterfactualSweep class because it is taken directly from mmm.

This is not the end goal. Just keeping track of progress because I'm only able to work on this in bursts.

drbenvincent · 2025-05-28T10:23:18Z

Note to self: next focus should be making the sweep intervention in the original data space. Once this is done, we can verify it is working correctly by checking the uplift for a multiplicative change of 1 is zero (because the counterfactual scenario is equal to the actual scenario).
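
A minimal sketch of that check, under stated assumptions: the response function and the max-abs scale below are hypothetical stand-ins for the fitted MMM and its scaler. The intervention is applied in original units and then mapped into model space with the scale fixed at fit time, so a multiplicative factor of 1 must reproduce the actual scenario exactly.

```python
def response(scaled_spend: list[float]) -> float:
    """Hypothetical saturating response on scaled inputs (stand-in for the MMM)."""
    return sum(x / (0.5 + x) for x in scaled_spend)


spend = [10.0, 40.0, 20.0]  # made-up spend in original units
fitted_scale = max(spend)   # assumed max-abs scale, fixed when the model was fitted


def uplift(factor: float) -> float:
    """Uplift of a multiplicative intervention applied in the original data space."""
    counterfactual = [x * factor for x in spend]             # intervene in original units
    scaled_cf = [x / fitted_scale for x in counterfactual]   # reuse the fitted scale
    scaled_actual = [x / fitted_scale for x in spend]
    return response(scaled_cf) - response(scaled_actual)


# The counterfactual equals the actual scenario at a factor of 1.
assert uplift(1.0) == 0.0
# Removing all spend lowers the outcome; doubling spend raises it.
assert uplift(0.0) < 0.0 < uplift(2.0)
```

If the scale were re-estimated from the counterfactual data instead of reused, every multiplicative factor would map to the same scaled series and the sweep would appear to have no effect, which is the kind of mistake this check helps expose.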

drbenvincent · 2025-05-30T13:38:40Z

Notes for when I pick this up again. Current status...

mmm.sensitivity_analysis(
    predictors=["influencer_spend"],
    sweep_values=np.linspace(0, 2, 12),  # Scale spend by factors from 0 to 2
    sweep_type="multiplicative",
)

This now adds a new group called "sensitivity_analysis" to the mmm.idata.

A by-product of this approach is that only one set of results is available at a time in the idata.

Plotting can be done entirely from this idata group. Right now it's done like this:

SensitivityAnalysis.plot(mmm.idata.sensitivity_analysis)
SensitivityAnalysis.plot(mmm.idata.sensitivity_analysis, marginal=True);

but the next step is to move the plot code into the MMMPlotSuite. That will get us to the desired API.

Plots also appear to be working as intended - see the zero uplift at a multiplicative change of 1.

Multiplicative sweep example

Additive sweep example

drbenvincent · 2025-06-02T09:23:05Z

API is now:

mmm.sensitivity_analysis(
    predictors=["influencer_spend"],
    sweep_values=np.linspace(0, 2, 12),
    sweep_type="multiplicative",
)

mmm.plot.plot_sensitivity_analysis()
mmm.plot.plot_sensitivity_analysis(marginal=True);

drbenvincent · 2025-06-04T12:39:55Z

Just a quick ping about this one @cetagostini - just because I'm a little idle across the board waiting on reviews on multiple projects :)

initial stab at CounterfactualSweep class + associated example notebook

292e19c

drbenvincent marked this pull request as draft May 6, 2025 13:29

github-actions bot added docs Improvements or additions to documentation MMM labels May 6, 2025

drbenvincent requested a review from juanitorduz May 6, 2025 13:42

drbenvincent and others added 5 commits May 6, 2025 14:49

Merge branch 'main' into marginal-effects

3a78f2b

attempt to add the new notebook to the examples gallery

4f8f45d

delete commented code

148335e

fix example in docs and re-run notebook with some hidden inputs/outputs

467e204

Merge branch 'main' into marginal-effects

a04d03e

cetagostini assigned drbenvincent May 9, 2025

add some TODO's to the notebook

bd8f491

williambdean reviewed May 12, 2025


cetagostini and others added 6 commits May 13, 2025 13:45

Merge branch 'main' into marginal-effects

86d536b

Update pymc_marketing/mmm/marginal_effects.py

63a9a8e

Co-authored-by: Will Dean <[email protected]>

improve type hinting

654cbd6

update docstring of plot_marginal_effects method

00e4309

Use Literal in type hint

3d8f89a

Merge branch 'main' into marginal-effects

41b1823

drbenvincent added 2 commits May 27, 2025 16:43

X no longer required as an input to CounterfactualSweep

b408330

remove redundant sweep_values index

1aa9700

Merge branch 'main' into marginal-effects

fc790c6

drbenvincent added 7 commits May 29, 2025 14:20

rename to SensivityAnalysis

cf646e6

Merge branch 'main' into marginal-effects

4f06922

compute gradient with xarray instead of numpy

d9686ff

add MMM.sensitivity_analysis as wrapper to call SensitivityAnalysis

78f6350

formatting

4236bb6

rename notebook

2575edc

remove commented code in notebook

70d41ee

drbenvincent changed the title ~~Marginal effects + counterfactual sweeps for MMM insights~~ Sensitivity analysis and marginal effects May 29, 2025

drbenvincent added 5 commits May 30, 2025 13:43

fix scaling + add crosshairs on plots

db5cfa2

combine into a single plot function

d051cbe

api change, results now stored in idata, and fix crosshairs

64d8084

minor tweaks

4666187

Merge branch 'main' into marginal-effects

aa012b8

drbenvincent added 3 commits May 30, 2025 14:43

better sweep values for additive sweep example

828569e

Merge branch 'main' into marginal-effects

023e6e5

move plot_sensitivity_analysis into MMMPlotSuite

8d24e68

rename example in the gallery view. Docs updated

34bebe5

drbenvincent marked this pull request as ready for review June 2, 2025 14:42

drbenvincent requested review from williambdean and cetagostini June 2, 2025 14:42

add functionality to plot y-axis in percentage terms

4dc83bd

drbenvincent requested a review from JakePiekarski314 June 4, 2025 16:09
