|
114 | 114 | "\n",
|
115 | 115 | "It is not possible to directly sample from this distribution owing to a computationally intractable normalization term.\n",
|
116 | 116 | "\n",
|
117 |
| - "[Metropolis-Hastings algorithms](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm) are technique for for sampling from intractable-to-normalize distributions.\n", |
| 117 | + "[Metropolis-Hastings algorithms](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm) are techniques for sampling from intractable-to-normalize distributions.\n", |
118 | 118 | "\n",
|
119 | 119 | "TensorFlow Probability offers a number of MCMC options, including several based on Metropolis-Hastings. In this notebook, we'll use [Hamiltonian Monte Carlo](https://en.wikipedia.org/wiki/Hamiltonian_Monte_Carlo) (`tfp.mcmc.HamiltonianMonteCarlo`). HMC is often a good choice because it can converge rapidly, samples the state space jointly (as opposed to coordinatewise), and leverages one of TF's virtues: automatic differentiation. That said, sampling from a BGMM posterior might actually be better done by other approaches, e.g., [Gibbs sampling](https://en.wikipedia.org/wiki/Gibbs_sampling)."
|
120 | 120 | ]
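To make the Metropolis-Hastings idea concrete: the accept/reject step only ever compares a *ratio* of target densities, so the intractable normalization constant cancels. Below is a minimal random-walk Metropolis sketch in NumPy (not the notebook's HMC kernel; the function and parameter names are illustrative), targeting a standard normal whose normalizer is deliberately omitted:

```python
import numpy as np

def random_walk_metropolis(unnorm_log_prob, x0, n_steps, step=1.0, seed=0):
    """Symmetric random-walk Metropolis: only an *unnormalized*
    log-density is needed, since the normalizer cancels in the
    acceptance ratio p(proposal) / p(current)."""
    rng = np.random.default_rng(seed)
    x, lp = x0, unnorm_log_prob(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        lp_prop = unnorm_log_prob(prop)
        # Accept with probability min(1, exp(lp_prop - lp)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

# Target: standard normal, written without its normalizing constant.
samples = random_walk_metropolis(lambda x: -0.5 * x * x, 0.0, 20000)
```

HMC follows the same accept/reject logic but proposes moves by simulating Hamiltonian dynamics on the (differentiable) log-density, which is why it needs gradients.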
|
|
154 | 154 | "id": "Uj9uHZN2yUqz"
|
155 | 155 | },
|
156 | 156 | "source": [
|
157 |
| - "Before actually building the model, we'll need to define a new type of distribution. From the model specification above, its clear we're parameterizing the MVN with an inverse covariance matrix, i.e., [precision matrix](https://en.wikipedia.org/wiki/Precision_(statistics%29). To accomplish this in TF, we'll need to roll out our `Bijector`. This `Bijector` will use the forward transformation:\n", |
| 157 | + "Before actually building the model, we'll need to define a new type of distribution. From the model specification above, it's clear we're parameterizing the MVN with an inverse covariance matrix, i.e., [precision matrix](https://en.wikipedia.org/wiki/Precision_(statistics%29). To accomplish this in TF, we'll need to roll our own `Bijector`. This `Bijector` will use the forward transformation:\n", |
158 | 158 | "\n",
|
159 | 159 | "- `Y = tf.linalg.triangular_solve(tf.linalg.matrix_transpose(chol_precision_tril), X, adjoint=True) + loc`.\n",
|
160 | 160 | "\n",
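A quick sanity check on why this forward transform is the right one: if the precision matrix is $P = C C^\top$ with $C$ lower-triangular, then $Y = (C^\top)^{-1} X + \text{loc}$ applied to $X \sim N(0, I)$ has covariance $(C^\top)^{-1} C^{-1} = P^{-1}$, i.e., the inverse of the precision. A NumPy sketch (standing in for the TF call; the concrete matrix values are illustrative):

```python
import numpy as np

# A lower-triangular Cholesky factor of a precision matrix P = C @ C.T.
chol_precision_tril = np.array([[2.0, 0.0],
                                [1.0, 3.0]])
precision = chol_precision_tril @ chol_precision_tril.T

# The forward transform applies A = inv(C.T) to X ~ N(0, I),
# so the implied covariance of Y is A @ A.T.
A = np.linalg.solve(chol_precision_tril.T, np.eye(2))
implied_cov = A @ A.T

# A @ A.T = inv(C.T) @ inv(C) = inv(C @ C.T) = inv(P).
print(np.allclose(implied_cov, np.linalg.inv(precision)))  # True
```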
|
|
459 | 459 | "id": "JS8XOsxiyiBV"
|
460 | 460 | },
|
461 | 461 | "source": [
|
462 |
| - "Hamiltonian Monte Carlo (HMC) requires the target log-probability function be differentiable with respect to its arguments. Furthermore, HMC can exhibit dramatically higher statistical efficiency if the state-space is unconstrained.\n", |
| 462 | + "Hamiltonian Monte Carlo (HMC) requires the target log-probability function to be differentiable with respect to its arguments. Furthermore, HMC can exhibit dramatically higher statistical efficiency if the state-space is unconstrained.\n", |
463 | 463 | "\n",
|
464 | 464 | "This means we'll have to work out two main issues when sampling from the BGMM posterior:\n",
|
465 | 465 | "\n",
|
|
479 | 479 | "2. run the MCMC in unconstrained space\n",
|
480 | 480 | "3. transform the unconstrained variables back to the constrained space.\n",
|
481 | 481 | "\n",
|
482 |
| - "As with `MVNCholPrecisionTriL`, we'll use [`Bijector`s](https://www.tensorflow.org/api_docs/python/tf/distributions/bijectors/Bijector) to transform random variables to unconstrained space.\n", |
| 482 | + "As with `MVNCholPrecisionTriL`, we'll use [`Bijector`s](https://www.tensorflow.org/probability/api_docs/python/tfp/bijectors/Bijector) to transform random variables to unconstrained space.\n", |
483 | 483 | "\n",
|
484 | 484 | "- The [`Dirichlet`](https://en.wikipedia.org/wiki/Dirichlet_distribution) is transformed to unconstrained space via the inverse of the [softmax function](https://en.wikipedia.org/wiki/Softmax_function).\n",
|
485 | 485 | "\n",
|
486 |
| - "- Our precision random variable is a distribution over postive semidefinite matrices. To unconstrain these we'll use the `FillTriangular` and `TransformDiagonal` bijectors. These convert vectors to lower-triangular matrices and ensure the diagonal is positive. The former is useful because it enables sampling only $d(d+1)/2$ floats rather than $d^2$." |
| 486 | + "- Our precision random variable is a distribution over positive semidefinite matrices. To unconstrain these we'll use the `FillTriangular` and `TransformDiagonal` bijectors. These convert vectors to lower-triangular matrices and ensure the diagonal is positive. The former is useful because it enables sampling only $d(d+1)/2$ floats rather than $d^2$." |
487 | 487 | ]
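The `FillTriangular` + `TransformDiagonal` combination can be sketched in NumPy as below. Note this is an illustrative stand-in, not TFP's implementation: TFP's `FillTriangular` packs the vector in its own specific order, whereas this sketch uses simple row-major lower-triangular order, and the helper name is made up. The point is the counting argument: only $d(d+1)/2$ unconstrained floats are needed, and exponentiating the diagonal guarantees positivity.

```python
import numpy as np

def fill_triangular_exp_diag(v, d):
    """Sketch of chaining a fill-triangular step with a
    transform-diagonal(exp) step: pack d*(d+1)//2 unconstrained
    floats into a lower-triangular matrix with positive diagonal."""
    assert v.shape == (d * (d + 1) // 2,)
    m = np.zeros((d, d))
    m[np.tril_indices(d)] = v                    # fill lower triangle, row-major
    m[np.diag_indices(d)] = np.exp(np.diag(m))   # force a positive diagonal
    return m

v = np.array([0.0, -1.0, 0.5])   # d(d+1)/2 = 3 floats suffice for d = 2
L = fill_triangular_exp_diag(v, d=2)
# L is lower-triangular with diagonal exp(0.0) and exp(0.5), both positive.
```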
|
488 | 488 | },
|
489 | 489 | {
|
|
0 commit comments