
SMC: estimate marginal likelihood #2563

Merged
merged 3 commits into from
Sep 19, 2017


aloctavodia
Member

The implementation follows Ching, J. and Chen, Y. (2007). Transitional Markov Chain Monte Carlo Method for Bayesian Model Updating, Model Class Selection, and Model Averaging.

I am not sure where to store the value of the marginal likelihood; for now I am just adding the attribute marginal_likelihood to the model (see here). Suggestions are more than welcome!

I tested the code using two types of examples, and both show good agreement:

  • The two-Gaussian example from the above reference (I tested against all scenarios).
  • A beta-binomial model, where the Bayes factor is computed analytically. I tried combinations of different priors.
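
For readers who want to reproduce the analytic side of the beta-binomial check, here is a minimal sketch. It is not the test code from this PR. The data (`k`, `n`) and the two priors are hypothetical values chosen for illustration, and the helper names are made up. It relies only on the closed-form marginal likelihood of a binomial likelihood under a Beta(a, b) prior, p(k | n, a, b) = C(n, k) B(a + k, b + n - k) / B(a, b).

```python
from math import exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_binom_coeff(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_marginal_likelihood(k, n, a, b):
    """Log marginal likelihood of k successes in n trials under a Beta(a, b) prior."""
    return log_binom_coeff(n, k) + log_beta(a + k, b + n - k) - log_beta(a, b)


# Hypothetical data and priors, for illustration only.
k, n = 6, 10
log_ml_flat = log_marginal_likelihood(k, n, 1.0, 1.0)       # uniform prior
log_ml_peaked = log_marginal_likelihood(k, n, 30.0, 30.0)   # prior concentrated near 0.5

# Bayes factor of the flat-prior model over the peaked-prior model.
bayes_factor = exp(log_ml_flat - log_ml_peaked)
print(bayes_factor)
```

An SMC estimate of the same two marginal likelihoods can then be compared against `log_ml_flat` and `log_ml_peaked`. Working in log space avoids underflow when `n` is large.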

The test for Travis is based on the current SMC test; I just added one line. There are two reasons behind this decision: to avoid increasing the testing time and to make minimal changes to the existing code. The test corresponds to scenario IV in the Ching & Chen paper. Nevertheless, if necessary I could add other scenarios to the tests, or maybe a comparison with the beta-binomial model.

For the test I also increased n_chains to 1000 and decreased n_steps to 10. I did this because the accuracy of the marginal likelihood depends more on the number of chains than on the value of n_steps, and while n_steps = 10 seems too small, I think it is enough for the tests.

@hvasbath
Contributor

This is basically only a refactoring of things that were already there, if I am not missing anything.
Out of curiosity, and of course interest: what do you need that for, and what can be done with it?

@aloctavodia
Member Author

Yes, this is mostly a refactoring. A by-product of the SMC sampler is an estimate of the marginal likelihood from the unnormalized weights; this PR just computes that quantity (see sj and step.sjs) and makes it available to the user.
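
To make the idea concrete, here is a minimal sketch of how the mean unnormalized weights of each tempering stage combine into a marginal likelihood estimate. It is not the PyMC3 implementation. It assumes a toy conjugate model (prior theta ~ Normal(0, 1), one observation y ~ Normal(theta, 1)) so that particles at each stage can be drawn exactly from the tempered posterior, which replaces the resampling and MCMC moves of a real SMC run. All function names and the tempering schedule are hypothetical.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def log_likelihood(theta, y):
    """Log density of y under Normal(theta, 1)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (y - theta) ** 2


def tempered_draws(beta, y, n, rng):
    """Exact draws from prior * likelihood**beta, a Normal for this toy model."""
    mean = beta * y / (1.0 + beta)
    sd = math.sqrt(1.0 / (1.0 + beta))
    return [rng.gauss(mean, sd) for _ in range(n)]


def estimate_log_marginal_likelihood(y, betas, n_particles, seed=0):
    """Sum over stages of the log mean incremental weight."""
    rng = random.Random(seed)
    log_ml = 0.0
    for beta_old, beta_new in zip(betas[:-1], betas[1:]):
        particles = tempered_draws(beta_old, y, n_particles, rng)
        # Unnormalized incremental weights: likelihood**(beta_new - beta_old).
        log_w = [(beta_new - beta_old) * log_likelihood(t, y) for t in particles]
        log_ml += log_mean_exp(log_w)
    return log_ml


y = 1.0
betas = [j / 10 for j in range(11)]  # assumed schedule from 0 (prior) to 1 (posterior)
estimate = estimate_log_marginal_likelihood(y, betas, n_particles=5000)

# Exact value for this toy model: marginally, y ~ Normal(0, variance 2).
exact = -0.5 * math.log(2.0 * math.pi * 2.0) - y ** 2 / 4.0
print(estimate, exact)
```

Each stage contributes the mean of its unnormalized weights, and the product of those means (a sum in log space) estimates the marginal likelihood. With more particles the per-stage means, and therefore the estimate, become more accurate.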

The marginal likelihood is used in model comparison, hypothesis testing and model averaging. Maybe you have heard about Bayes factors (the ratio of the marginal likelihoods of two models). I am not a very big fan of Bayes factors, but I think people may find them useful. In fact, I am using them together with WAIC in a biomolecular project.

I realize now that this PR should also add a notebook that explains Bayes factors and gives an example of how to use SMC to compute them. Currently Bayes factors are just barely mentioned in the PyMC3 documentation (I will add such a notebook to this PR tomorrow).

If you check the papers suggested by @seanlaw in #2519 you will see that one of the reasons to use methods such as WHAM is to compute the "partition function" and "free energies". These quantities are very important in Statistical Mechanics/Thermodynamics; interestingly, the Bayesian equivalent of the partition function is the marginal likelihood. Luckily for us, and thanks to you, we can use SMC to estimate the marginal likelihood, and hence I think we don't need to implement methods such as WHAM.
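
One way to spell out that correspondence, as a sketch in my own notation (the symbols below are assumptions, not taken from the papers in #2519): treat the negative log-likelihood as an energy and the prior as the base measure.

```latex
% Statistical mechanics: partition function and free energy at inverse temperature \beta
Z(\beta) = \int e^{-\beta E(x)} \, dx, \qquad F(\beta) = -\frac{1}{\beta} \log Z(\beta)

% Tempered Bayesian model with E(\theta) = -\log p(y \mid \theta)
Z(\beta) = \int p(y \mid \theta)^{\beta} \, p(\theta) \, d\theta, \qquad
Z(0) = 1, \qquad Z(1) = p(y)
```

Under this identification the marginal likelihood p(y) is the partition function at β = 1, and -log p(y) plays the role of a free energy.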

@hvasbath
Contributor

Thanks a lot for the explanation @aloctavodia! I have so much to learn ... ;) Such a notebook would be great! Can't wait to see it.

@aloctavodia
Member Author

Besides the new notebook, I made a couple of changes to the GLM-model-selection notebook:

  • Remove Bayes Factor section
  • Add a comment from Watanabe to balance the comment from Avehtari :-)

@hvasbath
Contributor

hvasbath commented Sep 16, 2017

That's a nice notebook @aloctavodia! Thanks a lot for writing such a detailed description. There are a few grammar issues here and there, but maybe one of our native speakers should correct them, to be sure it is correct ;)

@junpenglao
Member

LGTM as well, very informative notebook on Bayes factors @aloctavodia ;-)

@aloctavodia aloctavodia merged commit 02a4da0 into pymc-devs:master Sep 19, 2017
@aloctavodia aloctavodia deleted the SMCml branch September 19, 2017 16:44
ColCarroll pushed a commit that referenced this pull request Nov 9, 2017
* SMC: estimate marginal likelihood

* add example marginal likelihood computation

* fix typos