SMC: estimate marginal likelihood (#2563)

aloctavodia · ColCarroll · commit fdef760b6d8c · 2017-11-09T10:43:48.000-05:00
* SMC: estimate marginal likelihood

* add example marginal likelihood computation

* fix typos
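
The estimate added here multiplies the per-stage mean of the unnormalized importance weights (`sj` in `calc_beta`, accumulated into `step.sjs`). The following is a minimal standalone sketch of that idea, not the code in this commit. It assumes a toy conjugate model (`theta ~ N(0, 1)`, `y_i ~ N(theta, 1)`) so that the tempered posterior can be sampled exactly instead of running Metropolis chains, and it accumulates in log space rather than with a running product. The names `log_mean_exp` and `log_marginal_likelihood` are hypothetical and are not part of pymc3.

```python
import numpy as np


def log_mean_exp(log_w):
    """Log of the mean of exp(log_w); the max shift is added back afterwards."""
    m = np.max(log_w)
    return m + np.log(np.mean(np.exp(log_w - m)))


def log_marginal_likelihood(y, betas, n_particles=5000, seed=0):
    """Accumulate log-mean incremental weights over a tempering schedule.

    Assumed toy model: theta ~ N(0, 1), y_i ~ N(theta, 1).
    """
    rng = np.random.default_rng(seed)
    n, s = len(y), np.sum(y)
    log_z = 0.0
    for b_old, b_new in zip(betas[:-1], betas[1:]):
        # Exact draws from prior * likelihood**b_old (Gaussian for this model);
        # a real SMC run would resample and mutate particles instead.
        prec = 1.0 + b_old * n
        theta = rng.normal(b_old * s / prec, prec ** -0.5, size=n_particles)
        loglik = (-0.5 * n * np.log(2 * np.pi)
                  - 0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1))
        # Log of the mean unnormalized weight for this stage.
        log_z += log_mean_exp((b_new - b_old) * loglik)
    return log_z


y = np.array([0.5, -0.3, 1.2, 0.8])
betas = np.linspace(0.0, 1.0, 11)  # fixed schedule; the sampler adapts beta instead
estimate = log_marginal_likelihood(y, betas)

# Closed-form log marginal likelihood for this model: y ~ N(0, I + 11^T).
n = len(y)
exact = (-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n)
         - 0.5 * (np.sum(y ** 2) - np.sum(y) ** 2 / (1 + n)))
```

For this toy model `estimate` should land close to `exact`; working in log space avoids underflow when the product runs over many stages.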
diff --git a/docs/source/examples.rst b/docs/source/examples.rst
@@ -13,6 +13,7 @@ Howto
    notebooks/posterior_predictive.ipynb
    notebooks/model_comparison.ipynb
    notebooks/model_averaging.ipynb
+   notebooks/Bayes_factor.ipynb
    notebooks/howto_debugging.ipynb
    notebooks/PyMC3_tips_and_heuristic.ipynb
    notebooks/LKJ.ipynb
diff --git a/docs/source/notebooks/Bayes_factor.ipynb b/docs/source/notebooks/Bayes_factor.ipynb
diff --git a/docs/source/notebooks/GLM-model-selection.ipynb b/docs/source/notebooks/GLM-model-selection.ipynb
@@ -1452,45 +1452,9 @@
    "source": [
     "## Final remarks and tips\n",
     "\n",
-    "It is important to keep in mind that, with more data point, the real underlying model (one that we used to generate the data) should outperforms other models. \n",
+    "It is important to keep in mind that, with more data points, the real underlying model (one that we used to generate the data) should outperform other models. \n",
     "\n",
-    "In general, PSIS-LOO is recommanded. To quote from [avehtari's comment](https://github.com/pymc-devs/pymc3/issues/938#issuecomment-313425552): \"I also recommend using PSIS-LOO instead of WAIC, because it's more reliable and has better diagnostics as discussed in http://link.springer.com/article/10.1007/s11222-016-9696-4 (preprint https://arxiv.org/abs/1507.04544), but if you insist to have one information criterion then leave WAIC.\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Bayes Factor"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Following text lifted directly from [JakeVDP blogpost](https://jakevdp.github.io/blog/2015/08/07/frequentism-and-bayesianism-5-model-selection/)\n",
-    "\n",
-    "The Bayesian approach proceeds very differently. Recall that the Bayesian model involves computing the odds ratio between two models:\n",
-    "\n",
-    "$$O_{21}=\\frac{P(M_{2} \\;|\\; D)}{P(M_{1} \\;|\\; D)}=\\frac{P(D \\;|\\; M_{2})}{P(D \\;|\\; M_{1})}\\frac{P(M_{2})}{P(M_{1})}$$\n",
-    "\n",
-    "Here the ratio $\\frac{P(M2)}{P(M1)}$ is the prior odds ratio, and is often assumed to be equal to 1 if no compelling prior evidence favors one model over another. The ratio $\\frac{P(D \\;|\\; M2)}{P(D \\;|\\; M1)}$ is the **Bayes factor**, and is the key to Bayesian model selection.\n",
-    "\n",
-    "\n",
-    "The Bayes factor can be computed by evaluating the integral over the parameter likelihood:\n",
-    "\n",
-    "$$P(D \\;|\\; M)=\\int_{\\Omega}P(D \\;|\\; \\theta,M) \\; P(\\theta \\;|\\; M) \\;d\\theta$$\n",
-    "\n",
-    "This integral is over the entire parameter space of the model, and thus can be extremely computationally intensive, especially as the dimension of the model grows beyond a few. "
+    "In general, PSIS-LOO is recommended. To quote from [avehtari's comment](https://github.com/pymc-devs/pymc3/issues/938#issuecomment-313425552): \"I also recommend using PSIS-LOO instead of WAIC, because it's more reliable and has better diagnostics as discussed in http://link.springer.com/article/10.1007/s11222-016-9696-4 (preprint https://arxiv.org/abs/1507.04544), but if you insist to have one information criterion then leave WAIC\". Alternatively, Watanabe [says](http://watanabe-www.math.dis.titech.ac.jp/users/swatanab/index.html) \"WAIC is a better approximator of the generalization error than the pareto smoothing importance sampling cross validation. The Pareto smoothing cross validation may be the better approximator of the cross validation than WAIC, however, it is not of the generalization error\"."
    ]
   },
   {
@@ -1518,7 +1482,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.5.2"
+   "version": "3.6.1"
   },
   "widgets": {
    "state": {
diff --git a/pymc3/step_methods/smc.py b/pymc3/step_methods/smc.py
@@ -165,6 +165,7 @@ def __init__(self, vars=None, out_vars=None, n_chains=100, scaling=1., covarianc
         self.accepted = 0
 
         self.beta = 0
+        self.sjs = 1
         self.stage = 0
         self.chain_index = 0
         self.resampling_indexes = np.arange(n_chains)
@@ -276,23 +277,25 @@ def calc_beta(self):
             tempering parameter of the current stage
         weights : :class:`numpy.ndarray`
             Importance weights (floats)
+        sj : float
+            Mean of unnormalized weights
         """
-        low_beta = self.beta
+        low_beta = old_beta = self.beta
         up_beta = 2.
-        old_beta = self.beta
 
         while up_beta - low_beta > 1e-6:
             current_beta = (low_beta + up_beta) / 2.
-            temp = np.exp((current_beta - self.beta) * (self.likelihoods - self.likelihoods.max()))
-            cov_temp = np.std(temp) / np.mean(temp)
+            weights_un = np.exp((current_beta - self.beta) *
+                                (self.likelihoods - self.likelihoods.max()))
+            sj = np.mean(weights_un)
+            cov_temp = np.std(weights_un) / sj
             if cov_temp > self.coef_variation:
                 up_beta = current_beta
             else:
                 low_beta = current_beta
 
-        beta = current_beta
-        weights = temp / np.sum(temp)
-        return beta, old_beta, weights
+        weights = weights_un / np.sum(weights_un)
+        return current_beta, old_beta, weights, sj
 
     def calc_covariance(self):
         """Calculate trace covariance matrix based on importance weights.
@@ -551,7 +554,8 @@ def sample_smc(n_steps, n_chains=100, step=None, start=None, homepath=None, stag
 
             step.population, step.array_population, step.likelihoods = step.select_end_points(
                 mtrace)
-            step.beta, step.old_beta, step.weights = step.calc_beta()
+            step.beta, step.old_beta, step.weights, sj = step.calc_beta()
+            step.sjs *= sj
 
             if step.beta > 1.:
                 pm._log.info('Beta > 1.: %f' % step.beta)
@@ -576,8 +580,8 @@ def sample_smc(n_steps, n_chains=100, step=None, start=None, homepath=None, stag
         pm._log.info('Sample final stage')
         step.stage = -1
         chains = stage_handler.clean_directory(step.stage, chains, rm_flag)
-        temp = np.exp((1 - step.old_beta) * (step.likelihoods - step.likelihoods.max()))
-        step.weights = temp / np.sum(temp)
+        weights_un = np.exp((1 - step.old_beta) * (step.likelihoods - step.likelihoods.max()))
+        step.weights = weights_un / np.sum(weights_un)
         step.covariance = step.calc_covariance()
         step.proposal_dist = choose_proposal(step.proposal_name, scale=step.covariance)
         step.resampling_indexes = step.resample()
@@ -590,6 +594,8 @@ def sample_smc(n_steps, n_chains=100, step=None, start=None, homepath=None, stag
         _iter_parallel_chains(**sample_args)
 
         stage_handler.dump_atmip_params(step)
+
+        model.marginal_likelihood = step.sjs
         return stage_handler.create_result_trace(step.stage,
                                                  idxs=range(n_steps),
                                                  step=step,
diff --git a/pymc3/tests/test_smc.py b/pymc3/tests/test_smc.py
@@ -18,8 +18,8 @@ def setup_class(self):
         super(TestSMC, self).setup_class()
         self.test_folder = mkdtemp(prefix='ATMIP_TEST')
 
-        self.n_chains = 300
-        self.n_steps = 100
+        self.n_chains = 1000
+        self.n_steps = 10
         self.tune_interval = 25
 
         n = 4
@@ -76,6 +76,8 @@ def test_sample_n_core(self, n_jobs, stage):
         x = mtrace.get_values('X')
         mu1d = np.abs(x).mean(axis=0)
         np.testing.assert_allclose(self.muref, mu1d, rtol=0., atol=0.03)
+        # Scenario IV Ching, J. & Chen, Y. 2007
+        assert np.round(np.log(self.ATMIP_test.marginal_likelihood)) == -11.0
 
     def test_stage_handler(self):
         stage_number = -1