Plots in Examples, getting started API and Documentation #4561


Closed
mjhajharia opened this issue Mar 23, 2021 · 6 comments

Comments

@mjhajharia
Member

  1. Use more ArviZ plots: While working on PR Replace matplotlib seaborn style plots with Arviz Plots in Documentation (WIP) #4560, I'm trying to remove seaborn as a style from all the plots, because seaborn and matplotlib colors etc. are not updated that regularly. In the process I realised it is not very clear that ArviZ can be used for general plotting as well, without functions from pymc3 or something along those lines. I'm not sure, but I've seen a lot of example documents using seaborn to plot instead of ArviZ. I believe we should be encouraging ArviZ usage along with pymc3; anyway, seaborn is an additional dependency. For example, in https://docs.pymc.io/notebooks/sampler-stats.html cell [7], seaborn's distplot is used. The very same plot is already available in ArviZ, so why not use that?
  2. The plots in the core documentation, in distributions/discrete.py and distributions/continuous.py, are generated mostly with scipy.stats. I think it would make much more sense to generate them from the respective pymc3 models themselves instead of plotting the same model in scipy. I might be entirely wrong, but these felt like slight concerns.
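
To make the first point concrete, here is a minimal sketch of drawing a distribution plot with ArviZ instead of seaborn. The data is synthetic (an assumption, not the values from the linked notebook), and `az.plot_dist` is used as a rough counterpart to seaborn's distplot, not as a guaranteed pixel-for-pixel replacement.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import arviz as az
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the values plotted in the notebook cell (assumption)
values = rng.normal(loc=0.0, scale=1.0, size=1000)

# For continuous data, plot_dist draws a KDE on a matplotlib axes and returns it
ax = az.plot_dist(values, label="synthetic draws")
ax.set_xlabel("value")
```

The returned object is a regular matplotlib axes, so further tweaks can be done with plain matplotlib calls.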
@OriolAbril
Member

First of all, thanks for posting. I'll try to answer with two main goals in mind: first, addressing some of the questions, and second, helping to organize everything so the work gets done as efficiently as possible. PyMC is a relatively large project, with dependencies and implications spread over multiple repositories; I myself still hesitate about where and how some issues should be addressed.

Use more Arviz plots

We are all in favour of this, and there has already been a lot of work on that, but it's true there is still much work to be done. There are two distinctions to keep in mind on that front, though.

The first one is to be clear in separating actual plots from plotting styles. What should be enforced throughout the codebase and documentation is the use of the arviz-darkgrid plotting style, as it is colorblind friendly and, when used everywhere, will give a coherent style to the documentation. This is actually already enforced on all the example notebooks, as it's listed as a requirement in the pymc3 notebook style guide.
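
As a minimal sketch of what "style, not plots" means: the style is switched on once, and the plot itself is still pure matplotlib. This assumes an ArviZ release that exposes matplotlib's style module as `az.style` and registers the `arviz-darkgrid` style on import; the sine curve is only placeholder data.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import arviz as az
import matplotlib.pyplot as plt
import numpy as np

# Importing arviz registers its styles; activate the darkgrid one globally
az.style.use("arviz-darkgrid")

# An ordinary matplotlib plot that now picks up the ArviZ colors and grid
x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.legend()
```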

The second distinction is about examples versus other kinds of documentation. Examples are hosted at https://github.com/pymc-devs/pymc-examples and are in the process of being updated; one of the goals of that update is to show ArviZ usage in them, see for example pymc-devs/pymc-examples#34. There are many places where ArviZ should be used and is not; however, ArviZ should not be used everywhere. There are many plots that should still be done with pure matplotlib or with seaborn. One clear example of that is https://docs.pymc.io/notebooks/multilevel_modeling.html, which uses arviz, xarray, pandas and pure matplotlib plots. All of these options end up using matplotlib, so they can be combined, even using multiple approaches in the same plot. The APIs and functions of each package are designed for specific tasks and goals, and we should try to use the best tool for the job.

Note: If you are interested in this second point and in working on pymc-examples, stay tuned, I will come back with more news soon.

Now, back to the point: I don't know all the documentation by heart, but I don't think there is much (if any) place for ArviZ usage in documentation other than the examples. In that other documentation, we should probably use only the ArviZ plotting style and use matplotlib for everything else.

I believe we should be encouraging ArviZ usage along with pymc3; anyway, seaborn is an additional dependency

We should be encouraging ArviZ usage because ArviZ is designed to explore, analyze and visualize the results of Bayesian models. Our approach as the PyMC project is to use PyMC3 to build and run the models and then ArviZ as the main library to explore the results. I may be repeating myself a bit, but just to be sure: example notebooks are executed independently, so their code is not run every time we generate the docs, and they can have other dependencies. Using seaborn there is perfectly fine.

That being said, the notebook you linked to clearly does need an update to use ArviZ. The energy plot, for example, is a useful and common diagnostic, to the point that ArviZ has a function for it: plot_energy. Again, ArviZ is designed with diagnosing as one of its goals.
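
A rough sketch of how plot_energy could be called. The energy values below are random numbers standing in for real sampler statistics (an assumption made so the snippet is self-contained); in practice the InferenceData would come from an actual sampling run.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import arviz as az
import numpy as np

rng = np.random.default_rng(0)
n_chains, n_draws = 4, 500

# Fake draws and fake energies with shape (chain, draw); placeholders only
idata = az.from_dict(
    posterior={"mu": rng.normal(size=(n_chains, n_draws))},
    sample_stats={"energy": rng.normal(loc=50.0, scale=3.0, size=(n_chains, n_draws))},
)

# Overlays the marginal energy and energy transition distributions
ax = az.plot_energy(idata)

# The related numeric diagnostic, one BFMI value per chain
bfmi = az.bfmi(idata)
```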

The plots in core documentation in distributions/discrete.py or distributions/continuous.py the plots are generated mostly by scipy.stats

I guess that could be changed, but I don't think it's worth it. The plots there are only present to give an intuition about the shapes of the distributions and the roles of their parameters. With only this goal in mind, scipy feels like a better match: we don't want to build a model and fit it to data, we only want to quickly visualize how the different pdfs look.
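
For illustration, a minimal sketch of this kind of docstring plot: a few normal pdfs evaluated directly with scipy.stats, with no model or sampling involved. The parameter values are arbitrary choices for the example, not the ones used in the pymc3 docstrings.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

x = np.linspace(-5, 5, 200)
fig, ax = plt.subplots()

# One curve per (mu, sigma) pair to show the role of each parameter
for mu, sigma in [(0.0, 0.5), (0.0, 1.0), (1.0, 2.0)]:
    pdf = stats.norm.pdf(x, loc=mu, scale=sigma)
    ax.plot(x, pdf, label=f"mu = {mu}, sigma = {sigma}")

ax.set_xlabel("x")
ax.set_ylabel("f(x)")
ax.legend()
```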

@mjhajharia
Member Author

mjhajharia commented Mar 23, 2021

@OriolAbril thank you for taking the time to write a detailed response. It makes sense not to use ArviZ everywhere, yeah; I think I was being naive in my idea of how things "should" be. As for the arviz-darkgrid style in the documentation, I think PR #4563 covers it. I might have missed something, but for now that seemed to be all. As for the examples, yes, I'd like to work on that; it is something I personally find important. Better Jupyter notebooks make a library so much more accessible, and as an undergrad student I've found them very valuable. So that sounds cool, let me know if you have any ideas or tasks on how to go about that. I think my issue was vague, and I sort of intended it to be a discussion of some sort. Do you think it would be more sensible if I shifted this to pymc3 discourse or something? It's just that I've found that venue sort of inactive.

@OriolAbril
Member

So that sounds cool, let me know if you have any ideas or tasks on how to go about that.

I do have many ideas and tasks related to that. I hope I'll have time to actually write them down and turn them into issues at pymc-examples so the work can actually be done.

Do you think it would be more sensible if I shifted this to pymc3 discourse or something? It's just that I've found that venue sort of inactive.

Sorry you found it wasn't very active. I think it's fine to have this here for now. My guess is that the only "audience" for your post was pymc developers, unlike most other questions, about pymc3 usage for example, where it's easier for users to help one another.

@OriolAbril
Member

I have started https://github.com/pymc-devs/pymc-examples/projects. It should progressively get issues for all notebooks, sorted into the different columns. Hopefully it will be finished in ~2-3 weeks, but we'll see.

@mjhajharia
Member Author

sounds great

@mjhajharia
Member Author

mjhajharia commented Mar 25, 2021 via email


2 participants