Statistical distributions and utility functions
Fast, numerically stable implementations of log PDFs and CDFs as well as statistical utility functions.
Note
Distributions are only guaranteed to be correct within their support. E.g. the behaviour of evaluating a Gamma distribution for negative values is undefined.
-
class sciutils.stats.BoundedVariable(a=0, b=1)
A bounded variable \(y = a + \frac{(b - a)}{1 + \exp(-x)}\) on the interval \([a, b]\).
-
apply(x)
Transform a variable from an unconstrained space to a possibly constrained space.
- Parameters
x (array_like) – Variable to transform.
- Returns
y (array_like) – Transformed variable.
log_jacobian (array_like) – Logarithm of the Jacobian associated with the transform.
-
invert(y)
Transform a variable from a possibly constrained space back to the unconstrained space.
- Parameters
y (array_like) – Transformed variable.
- Returns
x – Variable after inverse transform.
- Return type
array_like
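The transform above can be sketched in NumPy as follows. This is a minimal illustration, not the sciutils implementation: the names `bounded_apply` and `bounded_invert` are hypothetical, and the sketch assumes \(b > a\).

```python
import numpy as np


def bounded_apply(x, a=0.0, b=1.0):
    """Map unconstrained x into (a, b) and return the log-Jacobian log|dy/dx|."""
    x = np.asarray(x, dtype=float)
    # Logistic sigmoid s = 1 / (1 + exp(-x)).
    s = 1.0 / (1.0 + np.exp(-x))
    y = a + (b - a) * s
    # dy/dx = (b - a) * s * (1 - s), with log s = -logaddexp(0, -x)
    # and log(1 - s) = -logaddexp(0, x).
    log_jacobian = np.log(b - a) - np.logaddexp(0.0, -x) - np.logaddexp(0.0, x)
    return y, log_jacobian


def bounded_invert(y, a=0.0, b=1.0):
    """Map y in (a, b) back to the unconstrained space (the logit of the rescaled y)."""
    y = np.asarray(y, dtype=float)
    return np.log(y - a) - np.log(b - y)
```

Writing the log-Jacobian with `np.logaddexp` avoids taking the logarithm of a sigmoid that has rounded to 0 or 1 for large `|x|`.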
-
-
class sciutils.stats.ParameterReshaper(parameters)
Reshape an array of parameters to a dictionary of named parameters and vice versa.
Trailing dimensions of each parameter are considered batch dimensions and are left unchanged.
- Parameters
parameters (dict[str, tuple]) – Mapping from parameter names to shapes.
Examples
>>> reshaper = su.stats.ParameterReshaper({'a': 2, 'b': (2, 3)})
>>> reshaper.to_dict(np.arange(reshaper.size))
{'a': array([0, 1]), 'b': array([[2, 3, 4], [5, 6, 7]])}
-
to_array(values, moveaxis=False, validate=True)
Convert a dictionary of values to an array.
- Parameters
values (dict[str, np.ndarray]) – Mapping from parameter names to values.
moveaxis (bool) – Move the first axis to the last dimension after reshaping to an array, e.g. if the batch dimensions are leading.
validate (bool) – Validate the input at some cost to performance.
- Returns
array – Array of parameters encoding the named parameters.
- Return type
np.ndarray
-
to_dict(array, moveaxis=False, validate=True)
Convert an array to a dictionary of values.
Trailing dimensions of the array are considered batch dimensions and are left unchanged.
- Parameters
array (np.ndarray) – Array of parameters encoding a parameter set.
moveaxis (bool) – Move the last axis to the first dimension before reshaping to a dictionary, e.g. if the batch dimensions are leading.
validate (bool) – Validate the input at some cost to performance.
- Returns
values – Mapping from parameter names to values.
- Return type
dict[str, np.ndarray]
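The reshaping logic can be sketched without the library. The functions `flat_to_dict` and `dict_to_flat` below are hypothetical stand-ins, not the sciutils API; they assume the first axis of the array indexes the flattened parameters and all trailing axes are batch dimensions, and they omit the `moveaxis` and `validate` options.

```python
import numpy as np


def _as_shape(shape):
    # Accept an int (as in {'a': 2}) or a tuple of ints.
    return (shape,) if isinstance(shape, int) else tuple(shape)


def flat_to_dict(array, shapes):
    """Split the first axis of `array` into named blocks; trailing axes are kept."""
    array = np.asarray(array)
    batch_shape = array.shape[1:]
    values, offset = {}, 0
    for name, shape in shapes.items():
        shape = _as_shape(shape)
        size = int(np.prod(shape))
        values[name] = array[offset:offset + size].reshape(shape + batch_shape)
        offset += size
    return values


def dict_to_flat(values, shapes):
    """Flatten each named block and concatenate the blocks along the first axis."""
    blocks = []
    for name, shape in shapes.items():
        shape = _as_shape(shape)
        value = np.asarray(values[name])
        batch_shape = value.shape[len(shape):]
        blocks.append(value.reshape((-1,) + batch_shape))
    return np.concatenate(blocks, axis=0)
```

Iterating over `shapes` rather than `values` fixes the parameter order, so the two functions are inverses of each other.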
-
class sciutils.stats.SemiBoundedVariable(loc=0, scale=1)
A semi-bounded variable \(y = \mathrm{loc} + \mathrm{scale}\times\exp(x)\) on the interval \([\mathrm{loc}, \infty)\) if \(\mathrm{scale} > 0\) and \((-\infty, \mathrm{loc}]\) if \(\mathrm{scale} < 0\).
-
apply(x)
Transform a variable from an unconstrained space to a possibly constrained space.
- Parameters
x (array_like) – Variable to transform.
- Returns
y (array_like) – Transformed variable.
log_jacobian (array_like) – Logarithm of the Jacobian associated with the transform.
-
invert(y)
Transform a variable from a possibly constrained space back to the unconstrained space.
- Parameters
y (array_like) – Transformed variable.
- Returns
x – Variable after inverse transform.
- Return type
array_like
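A minimal NumPy sketch of the semi-bounded transform follows. The names `semibounded_apply` and `semibounded_invert` are hypothetical, and taking the absolute value of the scale in the log-Jacobian is an assumption about how negative scales are handled.

```python
import numpy as np


def semibounded_apply(x, loc=0.0, scale=1.0):
    """Map unconstrained x to loc + scale * exp(x) and return log|dy/dx|."""
    x = np.asarray(x, dtype=float)
    y = loc + scale * np.exp(x)
    # dy/dx = scale * exp(x), so log|dy/dx| = log|scale| + x.
    log_jacobian = np.log(np.abs(scale)) + x
    return y, log_jacobian


def semibounded_invert(y, loc=0.0, scale=1.0):
    """Map y back to the unconstrained space."""
    y = np.asarray(y, dtype=float)
    # (y - loc) / scale is positive on the support for either sign of scale.
    return np.log((y - loc) / scale)
```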
-
-
sciutils.stats.cauchy_logcdf(x, mu, sigma)
Evaluate the log CDF of the Cauchy distribution.
-
sciutils.stats.cauchy_logpdf(x, mu, sigma)
Evaluate the log PDF of the Cauchy distribution.
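For illustration, the two Cauchy functions can be sketched as below, assuming `mu` is the location and `sigma` the scale. The `_sketch` names are hypothetical, and the library may use different formulas.

```python
import numpy as np


def cauchy_logpdf_sketch(x, mu, sigma):
    """Log density: -log(pi * sigma) - log(1 + z**2) with z = (x - mu) / sigma."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    # log1p keeps precision when z**2 is tiny.
    return -np.log(np.pi * sigma) - np.log1p(z * z)


def cauchy_logcdf_sketch(x, mu, sigma):
    """Log CDF, written so that the lower tail does not cancel to zero."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    # The CDF 0.5 + arctan(z) / pi equals arctan2(1, -z) / pi. The second form
    # stays accurate for very negative z, where the first rounds to 0.
    return np.log(np.arctan2(1.0, -z)) - np.log(np.pi)
```

The `arctan2` form is one way to obtain stability in the lower tail; evaluating `0.5 + arctan(z) / pi` directly returns exactly 0 for `z` around `-1e20`, and its logarithm is then `-inf`.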
-
sciutils.stats.evaluate_hpd_levels(pdf, pvals)
Evaluate the levels that include a given fraction of the probability mass.
- Parameters
pdf (array_like) – Probability density function evaluated over a regular grid.
pvals (array_like or int) – Probability mass to be included within the corresponding level or the number of levels.
- Returns
levels – Contour levels of the probability density function that enclose the desired probability mass.
- Return type
array_like
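One common way to compute such levels is to sort the grid values in descending order and accumulate their mass. The sketch below follows that approach; `hpd_levels_sketch` is a hypothetical name, the sketch only accepts probability fractions for `pvals` (not an integer number of levels), and it is not necessarily how sciutils does it.

```python
import numpy as np


def hpd_levels_sketch(pdf, pvals):
    """Return density levels whose super-level sets enclose at least `pvals` mass."""
    pdf = np.asarray(pdf, dtype=float)
    # Sort grid values from highest to lowest density.
    flat = np.sort(pdf.ravel())[::-1]
    # Fraction of the total mass enclosed after including each grid cell.
    cumulative = np.cumsum(flat) / flat.sum()
    # First cell at which the enclosed mass reaches each requested fraction.
    idx = np.searchsorted(cumulative, np.atleast_1d(pvals))
    idx = np.minimum(idx, flat.size - 1)
    return flat[idx]
```

Because the grid is regular, every cell has the same volume and the normalisation by `flat.sum()` makes the result independent of the grid spacing.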
-
sciutils.stats.evaluate_hpd_mass(pdf)
Evaluate the highest posterior density mass excluded from isocontours.
- Parameters
pdf (array_like) – Probability density function evaluated over a regular grid.
- Returns
excluded – The probability mass excluded at a given isocontour of the pdf.
- Return type
array_like
-
sciutils.stats.evaluate_mode(x, lin=200, **kwargs)
Evaluate the mode of a univariate distribution based on samples using a kernel density estimate.
- Parameters
x (array_like) – Univariate samples from the distribution.
lin (array_like or int) – Sample points at which to evaluate the density estimate or the number of sample points across the range of the data.
**kwargs (dict) – Additional arguments passed to the scipy.stats.gaussian_kde constructor.
- Returns
mode – Estimated mode of the distribution.
- Return type
float
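The procedure described above can be sketched with scipy.stats.gaussian_kde directly. The function name `mode_sketch` is hypothetical, and the handling of `lin` is an assumption based on the parameter description.

```python
import numpy as np
from scipy.stats import gaussian_kde


def mode_sketch(x, lin=200, **kwargs):
    """Estimate the mode of univariate samples as the argmax of a Gaussian KDE."""
    x = np.asarray(x, dtype=float)
    if np.isscalar(lin):
        # Interpret an integer as the number of grid points across the data range.
        lin = np.linspace(x.min(), x.max(), int(lin))
    else:
        lin = np.asarray(lin, dtype=float)
    # Fit the density estimate and evaluate it on the grid.
    density = gaussian_kde(x, **kwargs)(lin)
    return float(lin[np.argmax(density)])
```

The resolution of the estimate is limited by the grid spacing, so a finer `lin` gives a more precise mode at the cost of more density evaluations.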
-
sciutils.stats.halfcauchy_logcdf(x, mu, sigma)
Evaluate the log CDF of the half-Cauchy distribution.
-
sciutils.stats.halfcauchy_logpdf(x, mu, sigma)
Evaluate the log PDF of the half-Cauchy distribution.
-
sciutils.stats.maybe_build_model(model_code, root='.pystan', **kwargs)
Build a PyStan model or retrieve a cached version.
- Parameters
model_code (str) – Stan model code to build.
root (str) – Root directory at which to cache models.
**kwargs (dict) – Additional arguments passed to the pystan.StanModel constructor.
- Returns
model – Compiled Stan model.
- Return type
pystan.StanModel
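A build-or-retrieve cache of this kind is often keyed by a hash of the model code. The sketch below shows that pattern with a generic `build` callable standing in for the pystan.StanModel constructor; the function name `cached_build`, the SHA-256 file naming, and the use of pickle are all assumptions, not a description of how sciutils stores models.

```python
import hashlib
import os
import pickle


def cached_build(model_code, build, root=".pystan"):
    """Return build(model_code), reusing a pickled result keyed by the code's hash."""
    os.makedirs(root, exist_ok=True)
    # Identical model code always maps to the same cache file.
    digest = hashlib.sha256(model_code.encode("utf-8")).hexdigest()
    path = os.path.join(root, digest + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as fp:
            return pickle.load(fp)
    # Cache miss: build the model and store it for next time.
    model = build(model_code)
    with open(path, "wb") as fp:
        pickle.dump(model, fp)
    return model
```

Keying on the code itself means that any edit to the Stan program triggers a rebuild, while unchanged programs skip compilation.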
-
sciutils.stats.normal_logcdf(x, mu, sigma)
Evaluate the log CDF of the normal distribution.
-
sciutils.stats.normal_logpdf(x, mu, sigma)
Evaluate the log PDF of the normal distribution.
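As a scalar illustration of why the log CDF needs care, the sketch below uses only the standard library. The `_sketch` names are hypothetical, the functions assume `mu` is the mean and `sigma` the standard deviation, and they accept scalars only, unlike the array-capable library functions.

```python
import math


def normal_logpdf_sketch(x, mu, sigma):
    """Log density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_logcdf_sketch(x, mu, sigma):
    """Log CDF via the complementary error function."""
    z = (x - mu) / sigma
    # 0.5 * erfc(-z / sqrt(2)) equals 0.5 * (1 + erf(z / sqrt(2))) but does not
    # cancel to zero in the lower tail, so the logarithm stays finite there.
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
```

At `z = -10` the `erf` form evaluates to exactly 0 in double precision, whereas the `erfc` form returns a log CDF of roughly -53.2. For `z` below about -38 even `erfc` underflows, and an asymptotic expansion would be needed.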