Get plot data for prepostfit experiments #438

Open · lpoug wants to merge 31 commits into main

Conversation


@lpoug lpoug commented Feb 7, 2025


Note: first time doing a proper PR, so I'm not sure if all the prerequisites are here.


📚 Documentation preview 📚: https://causalpy--438.org.readthedocs.build/en/438/

@lpoug lpoug marked this pull request as ready for review February 17, 2025 08:44
@drbenvincent (Collaborator) commented:

Humble apologies for taking so long to get to this PR @lpoug. I've unfortunately not had as much time to spend on CausalPy as I'd have liked, but I'm hoping to catch up with the backlog.

There are currently a couple of issues with the remote checks. I'm hoping to get these resolved in #437, at which point I'll test this out locally and give feedback if necessary before we can merge this :)


codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 24.39024% with 31 lines in your changes missing coverage. Please review.

Project coverage is 92.98%. Comparing base (3a7ddea) to head (6dca884).

Files with missing lines               Patch %   Lines
causalpy/experiments/prepostfit.py     12.00%    22 Missing ⚠️
causalpy/experiments/base.py           46.15%    7 Missing ⚠️
causalpy/plot_utils.py                 33.33%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #438      +/-   ##
==========================================
- Coverage   94.40%   92.98%   -1.43%     
==========================================
  Files          31       31              
  Lines        1985     2025      +40     
==========================================
+ Hits         1874     1883       +9     
- Misses        111      142      +31     


@lpoug (Author) commented Mar 3, 2025:

(Quoting @drbenvincent's comment above.)

Absolutely no problem whatsoever @drbenvincent! Let me know when the time comes, I'll be around 😄

@drbenvincent added the "enhancement" (New feature or request) label Mar 3, 2025
@drbenvincent (Collaborator) commented:

Hi @lpoug. I pushed some changes, can you make sure to pull the latest version?

I'll try to review this in the next few days :)

@lpoug (Author) commented Apr 1, 2025:

Hey there @drbenvincent. Just to be sure, are you waiting on anything on my side?

@drbenvincent (Collaborator) commented:

Apologies for the delay! Just dropping in some review comments now

@drbenvincent (Collaborator) left a review comment:

Sorry about the slow review on this. My bad.

Overall this looks good. I've suggested some minor changes. Other than that the main thing is to update the tests to ensure this functionality remains working into the future.

Could you add new tests to test_integration_pymc_examples.py and test_integration_skl_examples.py? I imagine we can just test that we successfully get back a dataframe from calling result.get_plot_data on the experiments that you've implemented so far. You could optionally test that the contents of that dataframe are as expected, e.g. that it has the desired columns.
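
(A minimal sketch of what such an integration test could look like. The dataset, formula, treatment time, and sampler settings below are illustrative assumptions in the style of the CausalPy quickstart, not code taken from this PR; the HDI column names follow the ones discussed later in this thread.)

```python
import pandas as pd

import causalpy as cp


def test_its_get_plot_data():
    # Fit a small InterruptedTimeSeries experiment on the bundled example data.
    df = (
        cp.load_data("its")
        .assign(date=lambda x: pd.to_datetime(x["date"]))
        .set_index("date")
    )
    result = cp.InterruptedTimeSeries(
        df,
        treatment_time=pd.to_datetime("2017-01-01"),
        formula="y ~ 1 + t + C(month)",
        model=cp.pymc_models.LinearRegression(
            sample_kwargs={"draws": 100, "tune": 100, "progressbar": False}
        ),
    )

    plot_data = result.get_plot_data()

    # Core check: the new method hands back a DataFrame...
    assert isinstance(plot_data, pd.DataFrame)
    # ...optionally, with the expected prediction/HDI columns.
    assert {"prediction", "pred_hdi_lower_94", "pred_hdi_upper_94"}.issubset(
        plot_data.columns
    )
```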

In theory, an ultra pedantic person might want to test that we get an exception when calling get_plot_data on experiments where it isn't implemented.

Because this PR involves additional methods, can you run make uml? This should update the UML diagram that we include in CONTRIBUTING.md.

Sorry again about the latency on this review.

Diff excerpt under review:

```python
raise ValueError("Unsupported model type")

@abstractmethod
def get_plot_data_bayesian(self, *args, **kwargs):
```
@drbenvincent (Collaborator):

Change to _get_plot_data_bayesian to emphasise that the preferred user facing method to call is get_plot_data

@lpoug (Author):

Modifications made in 6a6face

I took the liberty of renaming plot_bayesian and plot_ols to _plot_bayesian and _plot_ols, respectively. I guess this should follow the same logic as for _get_plot_data_bayesian? Let me know if you think differently and I'll reverse these changes.

Diff excerpt under review:

```python
raise NotImplementedError("get_plot_data_bayesian method not yet implemented")

@abstractmethod
def get_plot_data_ols(self, *args, **kwargs):
```
@drbenvincent (Collaborator):

Same comment as before... Change to _get_plot_data_ols to emphasise that the preferred user facing method to call is get_plot_data

@lpoug (Author):

Modifications made in 6a6face as well

Diff excerpt under review:

```python
elif isinstance(self.model, RegressorMixin):
    return self.get_plot_data_ols(*args, **kwargs)
else:
    raise ValueError("Unsupported model type")
```
@drbenvincent (Collaborator):

I don't have a good feel whether this should be a ValueError or a NotImplementedError. Have a think about whatever works best, but I don't have a strong opinion.

@lpoug (Author):

I haven't made any changes here yet. I guess we could argue that the NotImplementedError makes sense within _get_plot_data_bayesian and _get_plot_data_ols, because it tells us that these functionalities could be implemented in the future.
In get_plot_data, the ValueError could make sense since the function itself is technically implemented; the error is about an unsupported model type rather than missing functionality.
Happy to discuss or have your final opinion on this in any case!
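
(To make the naming and error-handling discussion above concrete, here is a minimal self-contained sketch of the dispatch pattern. The class name, the placeholder PyMCModel, and the exact signatures are illustrative stand-ins, not code lifted from the PR.)

```python
from abc import ABC, abstractmethod

from sklearn.base import RegressorMixin


class PyMCModel:
    """Placeholder standing in for CausalPy's Bayesian model base class."""


class BaseExperiment(ABC):
    model = None  # set by the concrete experiment classes

    def get_plot_data(self, *args, **kwargs):
        """User-facing entry point: dispatch on the type of self.model."""
        if isinstance(self.model, PyMCModel):
            return self._get_plot_data_bayesian(*args, **kwargs)
        elif isinstance(self.model, RegressorMixin):
            return self._get_plot_data_ols(*args, **kwargs)
        else:
            # The method itself is implemented; the problem is an unsupported
            # model type, which is the argument for raising ValueError here.
            raise ValueError("Unsupported model type")

    @abstractmethod
    def _get_plot_data_bayesian(self, *args, **kwargs):
        # NotImplementedError flags functionality a subclass may add later.
        raise NotImplementedError("_get_plot_data_bayesian method not yet implemented")

    @abstractmethod
    def _get_plot_data_ols(self, *args, **kwargs):
        raise NotImplementedError("_get_plot_data_ols method not yet implemented")
```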

@lpoug (Author) commented Apr 7, 2025:

(Quoting @drbenvincent's review comment above.)

No problem at all! I was just starting to worry that something still needed to be done on my end 😅

Thank you for the reviews. I've added links to the commits directly in your comments: 6a6face (renaming of functions).

Regarding tests, I have added them in 97f0d79

I have not added anything yet to test that we get an exception when calling the get_plot_data on experiments for which it is not implemented. I'll try to take a moment to think about how to do so precisely.

Finally, I have updated the diagrams in 0edca77

Let me know if these changes look good, or if you had anything else in mind!

@drbenvincent (Collaborator) left a review comment:

Looks good. I think we are very nearly there :) Thanks for adding in the tests.

It could be prudent to rename the *_hdi_lower and *_hdi_upper columns to include the numerical hdi_prob as a percentage. For example, if hdi_prob=0.8 then the columns could be labelled *_hdi_lower_80 and *_hdi_upper_80. That way there is much less scope for a mistake like generating 80% HDIs but then forgetting that and thinking you generated 95% HDIs. I think that should be pretty simple to do in _get_plot_data_bayesian or _get_plot_data_ols.
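
(A small illustrative helper, not from the PR, showing the suggested naming scheme:)

```python
def hdi_column_names(prefix: str, hdi_prob: float) -> tuple[str, str]:
    """Label HDI bounds with the interval width, e.g. pred_hdi_lower_80."""
    pct = round(hdi_prob * 100)
    return f"{prefix}_hdi_lower_{pct}", f"{prefix}_hdi_upper_{pct}"


hdi_column_names("pred", 0.8)   # ("pred_hdi_lower_80", "pred_hdi_upper_80")
hdi_column_names("pred", 0.94)  # ("pred_hdi_lower_94", "pred_hdi_upper_94")
```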

Should also be pretty simple to resolve the merge conflicts - it's just the updated uml images as far as I can see.

Could you also update the docstring of the tests? At the moment they list out what is tested, so you can just flag up that it tests the functionality of plot_data. I'm not sure we'll carry on doing that in the future if the number of tests gets large, but let's keep up with it for the moment.

Diff excerpt under review:

```python
)
expected_columns = [
    "prediction",
    "pred_hdi_lower_94",
```
@lpoug (Author):

Is it OK here to have fixed column names with the default value of hdi_prob (i.e. 0.94)?
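
(One way to avoid hard-coding the 94 would be for the test to derive the suffix from whatever hdi_prob it passes to get_plot_data; the variable names in this sketch are assumptions.)

```python
hdi_prob = 0.94  # ArviZ's default credible-interval width
suffix = round(hdi_prob * 100)
expected_columns = [
    "prediction",
    f"pred_hdi_lower_{suffix}",
    f"pred_hdi_upper_{suffix}",
]
```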

Labels: enhancement (New feature or request)

Successfully merging this pull request may close these issues: Get model results data

2 participants