Increase unittest check_logcdf coverage and fix issues with some distribution methods #4393

Merged
merged 10 commits into from
Jan 2, 2021
1 change: 1 addition & 0 deletions RELEASE-NOTES.md
@@ -29,6 +29,7 @@ It also brings some dreadfully awaited fixes, so be sure to go through the chang
 - Fixed mathematical formulation in `MvStudentT` random method. (see [#4359](https://github.com/pymc-devs/pymc3/pull/4359))
 - Fix issue in `logp` method of `HyperGeometric`. It now returns `-inf` for invalid parameters (see [4367](https://github.com/pymc-devs/pymc3/pull/4367))
 - Fixed `MatrixNormal` random method to work with parameters as random variables. (see [#4368](https://github.com/pymc-devs/pymc3/pull/4368))
+- Update the `logcdf` method of several continuous distributions to return -inf for invalid parameters and values, and raise an informative error when multiple values cannot be evaluated in a single call. (see [4393](https://github.com/pymc-devs/pymc3/pull/4393))

 ## PyMC3 3.10.0 (7 December 2020)

126 changes: 83 additions & 43 deletions pymc3/distributions/continuous.py
@@ -278,21 +278,24 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

 Returns
 -------
 TensorVariable
 """
+lower = self.lower
+upper = self.upper
+
 return tt.switch(
-    tt.or_(tt.lt(value, self.lower), tt.gt(value, self.upper)),
+    tt.lt(value, lower) | tt.lt(upper, lower),
     -np.inf,
     tt.switch(
-        tt.eq(value, self.upper),
+        tt.lt(value, upper),
+        tt.log(value - lower) - tt.log(upper - lower),
         0,
-        tt.log(value - self.lower) - tt.log(self.upper - self.lower),
     ),
 )
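
The branch logic of the updated `Uniform.logcdf` can be read as the following scalar sketch. It is only an illustration under stated assumptions: `uniform_logcdf` is a hypothetical helper, not part of PyMC3, and it works on plain Python floats instead of Theano tensors, so the `value == lower` case is handled explicitly because `math.log(0)` raises.

```python
import math


def uniform_logcdf(value, lower, upper):
    """Scalar sketch of the updated branch logic (hypothetical helper)."""
    # Below the support, or invalid bounds (upper < lower): log CDF is -inf.
    if value < lower or upper < lower:
        return -math.inf
    if value < upper:
        # CDF is 0 exactly at the lower bound; avoid math.log(0), which raises.
        if value == lower:
            return -math.inf
        # Inside the support: log((value - lower) / (upper - lower)).
        return math.log(value - lower) - math.log(upper - lower)
    # At or above the upper bound the CDF is 1, so the log CDF is 0.
    return 0.0
```

Compared with the removed version, the check `upper < lower` is what makes invalid parameters return `-inf`, and `value < upper` covers both `value == upper` and `value > upper` in the final branch.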

@@ -344,7 +347,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -401,7 +404,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -542,7 +545,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -900,7 +903,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -910,10 +913,10 @@ def logcdf(self, value):
 """
 sigma = self.sigma
 z = zvalue(value, mu=0, sigma=sigma)
-return tt.switch(
-    tt.lt(z, -1.0),
-    tt.log(tt.erfcx(-z / tt.sqrt(2.0))) - tt.sqr(z),
+return bound(
     tt.log1p(-tt.erfc(z / tt.sqrt(2.0))),
+    0 <= value,
+    0 < sigma,
 )
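
The new `HalfNormal.logcdf` expression relies on the identity `log1p(-erfc(u)) = log(erf(u))`, with `erf(value / (sigma * sqrt(2)))` being the half-normal CDF. A minimal scalar sketch, assuming plain floats and a hypothetical helper name `halfnormal_logcdf` (the explicit early return stands in for `bound`, since `math.log1p(-1.0)` raises instead of returning `-inf`):

```python
import math


def halfnormal_logcdf(value, sigma):
    """Scalar sketch of the updated expression (hypothetical helper)."""
    # Same conditions that the diff passes to `bound`.
    valid = 0 <= value and 0 < sigma
    # CDF(0) = 0, so the log CDF is -inf there as well.
    if not valid or value == 0:
        return -math.inf
    z = value / sigma  # zvalue(value, mu=0, sigma=sigma)
    return math.log1p(-math.erfc(z / math.sqrt(2.0)))
```

For example, `halfnormal_logcdf(1.0, 1.0)` agrees with `math.log(math.erf(1 / math.sqrt(2)))` up to floating-point rounding, while a negative `value` or a non-positive `sigma` yields `-inf`.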


@@ -1106,7 +1109,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -1297,20 +1300,30 @@ def logcdf(self, value):
 Parameters
 ----------
 value: numeric
-    Value(s) for which log CDF is calculated. If the log CDF for multiple
-    values are desired the values must be provided in a numpy array or theano tensor.
+    Value(s) for which log CDF is calculated.

 Returns
 -------
 TensorVariable
 """
-value = floatX(tt.as_tensor(value))
-a = floatX(tt.as_tensor(self.alpha))
-b = floatX(tt.as_tensor(self.beta))
-return tt.switch(
-    tt.le(value, 0),
-    -np.inf,
-    tt.switch(tt.ge(value, 1), 0, tt.log(incomplete_beta(a, b, value))),
+# incomplete_beta function can only handle scalar values (see #4342)
+if np.ndim(value):
+    raise TypeError(
+        f"Beta.logcdf expects a scalar value but received a {np.ndim(value)}-dimensional object."
+    )
+
+a = self.alpha
+b = self.beta
+
+return bound(
+    tt.switch(
+        tt.lt(value, 1),
+        tt.log(incomplete_beta(a, b, value)),
+        0,
+    ),
+    0 <= value,
+    0 < a,
+    0 < b,
 )

def _distr_parameters_for_repr(self):
@@ -1521,7 +1534,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -1636,7 +1649,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -1792,7 +1805,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -1955,20 +1968,32 @@ def logcdf(self, value):
 Parameters
 ----------
 value: numeric
-    Value(s) for which log CDF is calculated. If the log CDF for multiple
-    values are desired the values must be provided in a numpy array or theano tensor.
+    Value(s) for which log CDF is calculated.

 Returns
 -------
 TensorVariable
 """
+# incomplete_beta function can only handle scalar values (see #4342)
+if np.ndim(value):
+    raise TypeError(
+        f"StudentT.logcdf expects a scalar value but received a {np.ndim(value)}-dimensional object."
+    )
+
 nu = self.nu
 mu = self.mu
 sigma = self.sigma
+lam = self.lam
 t = (value - mu) / sigma
 sqrt_t2_nu = tt.sqrt(t ** 2 + nu)
 x = (t + sqrt_t2_nu) / (2.0 * sqrt_t2_nu)
-return tt.log(incomplete_beta(nu / 2.0, nu / 2.0, x))
+
+return bound(
+    tt.log(incomplete_beta(nu / 2.0, nu / 2.0, x)),
+    0 < nu,
+    0 < sigma,
+    0 < lam,
+)
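
The change of variable `x = (t + sqrt(t**2 + nu)) / (2 * sqrt(t**2 + nu))` used by `StudentT.logcdf` can be sanity-checked in the special case `nu = 1`, where the Student's t distribution is the Cauchy distribution and the regularized incomplete beta function `I_x(1/2, 1/2)` has the closed form `(2 / pi) * asin(sqrt(x))`. The sketch below is a check under those assumptions only; the helper names and the default `mu=0.0`, `sigma=1.0` are illustrative and not part of PyMC3.

```python
import math


def studentt_beta_argument(value, nu, mu=0.0, sigma=1.0):
    """Map value to the argument x passed to incomplete_beta in the diff."""
    t = (value - mu) / sigma
    sqrt_t2_nu = math.sqrt(t ** 2 + nu)
    return (t + sqrt_t2_nu) / (2.0 * sqrt_t2_nu)


def cauchy_cdf_via_beta(value):
    """CDF for nu = 1, using I_x(1/2, 1/2) = (2 / pi) * asin(sqrt(x))."""
    x = studentt_beta_argument(value, nu=1.0)
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def cauchy_cdf(value):
    """Standard Cauchy CDF, used as the reference."""
    return 0.5 + math.atan(value) / math.pi
```

Both functions agree up to floating-point rounding for any finite `value`, for instance `cauchy_cdf_via_beta(0.0)` and `cauchy_cdf(0.0)` are both approximately `0.5`.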


class Pareto(Continuous):
@@ -2090,7 +2115,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -2209,7 +2234,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -2317,7 +2342,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -2468,7 +2493,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -2478,7 +2503,17 @@ def logcdf(self, value):
 """
 alpha = self.alpha
 beta = self.beta
-return bound(tt.log(tt.gammainc(alpha, beta * value)), value >= 0, alpha > 0, beta > 0)
+# Avoid C-assertion when the gammainc function is called with invalid values (#4340)
+safe_alpha = tt.switch(tt.lt(alpha, 0), 0, alpha)
+safe_beta = tt.switch(tt.lt(beta, 0), 0, beta)
+safe_value = tt.switch(tt.lt(value, 0), 0, value)
+
+return bound(
+    tt.log(tt.gammainc(safe_alpha, safe_beta * safe_value)),
+    0 <= value,
+    0 < alpha,
+    0 < beta,
+)
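
The `safe_*` substitutions follow a "sanitize, compute, then mask" pattern: invalid inputs are replaced by a harmless placeholder before the strict function is called, and the result is discarded afterwards based on the original inputs. The sketch below illustrates the pattern in plain Python for the special case `alpha = 1`, where `gammainc(1, x) = 1 - exp(-x)`. Everything here is an assumption for illustration: `strict_gammainc_alpha1` is a hypothetical stand-in that asserts on negative input to mimic the C-level assertion mentioned in the comment, and `exponential_logcdf` is not a PyMC3 function.

```python
import math


def strict_gammainc_alpha1(x):
    """Hypothetical stand-in for gammainc(1, x) = 1 - exp(-x); rejects x < 0."""
    assert x >= 0, "gammainc called with a negative argument"
    return -math.expm1(-x)


def log_or_neg_inf(p):
    """log(p), returning -inf at p == 0 instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def exponential_logcdf(value, beta):
    """Gamma(alpha=1, beta) log CDF that always evaluates the inner function."""
    # Replace invalid inputs with 0 so the strict function never sees them.
    safe_beta = 0.0 if beta < 0 else beta
    safe_value = 0.0 if value < 0 else value
    # Computed unconditionally, as when every input of a switch is evaluated.
    logcdf = log_or_neg_inf(strict_gammainc_alpha1(safe_beta * safe_value))
    # Mask the result afterwards, using the original (unsanitized) inputs.
    return logcdf if (0 <= value and 0 < beta) else -math.inf
```

With this arrangement, `exponential_logcdf(-1.0, 2.0)` and `exponential_logcdf(1.0, -2.0)` return `-inf` without triggering the assertion, while valid inputs are unaffected by the sanitizing step.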

def _distr_parameters_for_repr(self):
return ["alpha", "beta"]
@@ -2632,7 +2667,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -2642,11 +2677,16 @@ def logcdf(self, value):
 """
 alpha = self.alpha
 beta = self.beta
+# Avoid C-assertion when the gammaincc function is called with invalid values (#4340)
+safe_alpha = tt.switch(tt.lt(alpha, 0), 0, alpha)
+safe_beta = tt.switch(tt.lt(beta, 0), 0, beta)
+safe_value = tt.switch(tt.lt(value, 0), 0, value)
+
 return bound(
-    tt.log(tt.gammaincc(alpha, beta / value)),
-    value >= 0,
-    alpha > 0,
-    beta > 0,
+    tt.log(tt.gammaincc(safe_alpha, safe_beta / safe_value)),
+    0 <= value,
+    0 < alpha,
+    0 < beta,
 )


@@ -2814,7 +2854,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -3114,7 +3154,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -3503,7 +3543,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -3632,7 +3672,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -3914,7 +3954,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.

@@ -4256,7 +4296,7 @@ def logcdf(self, value):

 Parameters
 ----------
-value: numeric
+value: numeric or np.ndarray or theano.tensor
     Value(s) for which log CDF is calculated. If the log CDF for multiple
     values are desired the values must be provided in a numpy array or theano tensor.
