socca.fitting.methods¶
Sampling methods for Bayesian inference.
This module provides various backends for posterior sampling:
Nested Sampling¶
nautilus: Neural network-accelerated nested sampling
dynesty: Nested sampling
Monte Carlo Sampling¶
pocomc: Preconditioned Monte Carlo sampling
emcee: Affine-invariant ensemble MCMC
numpyro: NUTS Hamiltonian Monte Carlo
Optimization¶
optimizer: Maximum a posteriori optimization
- socca.fitting.methods.run_nautilus(self, log_likelihood, log_prior, prior_transform, checkpoint, resume, getzprior, **kwargs)[source]¶
Run nested sampling using the Nautilus sampler.
Performs Bayesian parameter estimation using neural network-accelerated nested sampling via the Nautilus package. Supports checkpointing, resuming, and optional prior evidence computation.
- Parameters:
log_likelihood (callable) – Function that computes the log-likelihood given parameters.
log_prior (callable) – Function that computes the log-prior given parameters.
prior_transform (callable) – Function that transforms unit hypercube to parameter space.
checkpoint (str or None) – Path to checkpoint file for saving/resuming the sampler state.
resume (bool) – If True and checkpoint file exists, resume from saved state.
getzprior (bool) – If True, run a second nested sampling to estimate the prior evidence for Bayesian model comparison with prior deboosting.
**kwargs (dict) –
Additional keyword arguments passed to nautilus.Sampler and its run method. Common options include:
nlive/n_live : int, number of live points (default: 1000)
flive/f_live : float, stopping criterion (default: 0.01)
discard_exploration : bool, discard exploration phase samples (default: True)
Set Attributes
sampler (nautilus.Sampler) – The main Nautilus sampler object.
samples (ndarray) – Posterior samples from nested sampling.
logw (ndarray) – Log-weights for each sample.
weights (ndarray) – Normalized importance weights for each sample.
logz (float) – Log-evidence (marginal likelihood) estimate.
sampler_prior (nautilus.Sampler or None) – Prior sampler object if getzprior=True, else None.
logz_prior (float or None) – Prior evidence if getzprior=True, else None.
Notes
Nautilus uses neural networks to learn the iso-likelihood contours, making it efficient for high-dimensional problems. The method prints elapsed time after completion.
References
Lange, J. U., MNRAS, 525, 3181 (2023)
Nautilus documentation: https://nautilus-sampler.readthedocs.io/en/latest/
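The prior_transform callable passed to this method maps points from the unit hypercube to parameter space via the inverse CDF of each prior. A minimal sketch (the two parameters and their priors are purely illustrative, not part of socca):

```python
from statistics import NormalDist

def prior_transform(u):
    """Map a point u in the unit hypercube [0, 1]^2 to parameter space.

    Illustrative priors: amplitude ~ Uniform(0, 10),
    offset ~ Normal(0, 1) via the inverse normal CDF.
    """
    amplitude = 10.0 * u[0]                      # Uniform(0, 10)
    offset = NormalDist(0.0, 1.0).inv_cdf(u[1])  # Normal(0, 1)
    return [amplitude, offset]
```

The same transform shape is expected by the Dynesty backend below.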
- socca.fitting.methods.run_dynesty(self, log_likelihood, log_prior, prior_transform, checkpoint, resume, getzprior, **kwargs)[source]¶
Run nested sampling using the Dynesty sampler.
Performs Bayesian parameter estimation using nested sampling via the Dynesty package. Supports checkpointing, resuming, and optional prior evidence computation.
- Parameters:
log_likelihood (callable) – Function that computes the log-likelihood given parameters.
log_prior (callable) – Function that computes the log-prior given parameters.
prior_transform (callable) – Function that transforms unit hypercube to parameter space.
checkpoint (str or None) – Path to checkpoint file for saving/resuming the sampler state. If None, no checkpointing is performed.
resume (bool) – If True and checkpoint file exists, resume from saved state.
getzprior (bool) – If True, run a second nested sampling to estimate the prior evidence for Bayesian model comparison with prior deboosting.
**kwargs (dict) –
Additional keyword arguments passed to dynesty.NestedSampler and its run_nested method. Common options include:
nlive : int, number of live points (default: 1000)
dlogz : float, stopping criterion (default: 0.01)
Set Attributes
sampler (dynesty.NestedSampler) – The main Dynesty sampler object.
samples (ndarray) – Posterior samples from nested sampling.
weights (ndarray) – Importance weights for each sample.
logz (float) – Log-evidence (marginal likelihood) estimate.
sampler_prior (dynesty.NestedSampler or None) – Prior sampler object if getzprior=True, else None.
logz_prior (float or None) – Prior evidence if getzprior=True, else None.
Notes
The method automatically extracts valid kwargs for NestedSampler and run_nested based on their function signatures.
References
Speagle, J. S., MNRAS, 493, 3132 (2020)
Dynesty documentation: https://dynesty.readthedocs.io/en/v3.0.0/
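The signature-based kwarg routing described in the Notes can be sketched with inspect.signature; the two stub functions below stand in for dynesty.NestedSampler and its run_nested method and are not socca's actual code:

```python
import inspect

def split_kwargs(func, kwargs):
    """Return the subset of kwargs accepted by func's signature."""
    valid = inspect.signature(func).parameters
    return {k: v for k, v in kwargs.items() if k in valid}

# Stand-ins for dynesty.NestedSampler / run_nested (illustrative only).
def nested_sampler(loglike, ptform, ndim, nlive=500): ...
def run_nested(dlogz=0.01, maxiter=None): ...

kwargs = {"nlive": 1000, "dlogz": 0.05}
init_kw = split_kwargs(nested_sampler, kwargs)  # routed to the constructor
run_kw = split_kwargs(run_nested, kwargs)       # routed to run_nested
```

This lets a single **kwargs dict serve both the sampler constructor and its run method without manual bookkeeping.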
- socca.fitting.methods.run_pocomc(self, log_likelihood, log_prior, checkpoint, resume, getzprior, **kwargs)[source]¶
Run preconditioned Monte Carlo sampling using the pocoMC sampler.
Performs Bayesian parameter estimation using preconditioned Monte Carlo sampling via the pocoMC package. Supports checkpointing, resuming, and optional prior evidence computation.
- Parameters:
log_likelihood (callable) – Function that computes the log-likelihood given parameters.
log_prior (callable) – Function that computes the log-prior given parameters.
checkpoint (str or None) – Path prefix for checkpoint files. If None and resume=True, defaults to “run”. Checkpoint files are saved in a directory named “{checkpoint}_pocomc_dump”.
resume (bool) – If True, resume from the latest saved state in the checkpoint directory.
getzprior (bool) – If True, run a second nested sampling to estimate the prior evidence for Bayesian model comparison with prior deboosting.
**kwargs (dict) –
Additional keyword arguments passed to pocomc.Sampler and its run method. Common options include:
nlive/n_live/n_effective : int, effective sample size (default: 1000)
n_active : int, number of active particles (default: nlive // 2)
save_every : int, save state every N iterations (default: 10)
seed : int, random seed (default: 0)
Set Attributes
sampler (pocomc.Sampler) – The main pocoMC sampler object.
samples (ndarray) – Posterior samples from nested sampling.
logw (ndarray) – Log-weights for each sample.
weights (ndarray) – Normalized importance weights for each sample.
logz (float) – Log-evidence (marginal likelihood) estimate.
sampler_prior (pocomc.Sampler or None) – Prior sampler object if getzprior=True, else None.
logz_prior (float or None) – Prior evidence if getzprior=True, else None.
Notes
pocoMC uses normalizing flows for preconditioned sampling, making it efficient for complex, multimodal posteriors. Prior distributions must be NumPyro distributions.
References
Karamanis, M., Beutler, F., Peacock, J. A., Nabergoj, D., Seljak, U., MNRAS, 516, 1644 (2022)
Karamanis, M., Nabergoj, D., Beutler, F., Peacock, J. A., Seljak, U., arXiv:2207.05660 (2022)
pocoMC documentation: https://pocomc.readthedocs.io/en/latest/
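The checkpoint-path convention described above can be sketched as a small helper; this is a hypothetical function mirroring the documented behavior, including the assumption that checkpoint=None with resume=False means no checkpointing at all:

```python
def pocomc_checkpoint_dir(checkpoint, resume):
    """Resolve the pocoMC dump directory per the documented convention.

    Hypothetical helper: a None checkpoint with resume=True falls back
    to the prefix "run"; files land in "{checkpoint}_pocomc_dump".
    """
    if checkpoint is None:
        if not resume:
            return None  # assumption: no checkpointing in this case
        checkpoint = "run"
    return f"{checkpoint}_pocomc_dump"
```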
- socca.fitting.methods.run_numpyro(self, log_likelihood, **kwargs)[source]¶
Run Hamiltonian Monte Carlo sampling using NumPyro’s NUTS.
Performs Bayesian parameter estimation using the No-U-Turn Sampler (NUTS), a variant of Hamiltonian Monte Carlo (HMC), via the NumPyro package.
- Parameters:
log_likelihood (callable) – Function that computes the log-likelihood given parameters.
**kwargs (dict) –
Additional keyword arguments passed to numpyro.infer.NUTS and numpyro.infer.MCMC. Common options include:
n_warmup/nwarmup/num_warmup : int, number of warmup iterations (default: 1000)
n_samples/nsamples/num_samples : int, number of posterior samples (default: 2000)
seed : int, random seed (default: 0)
Set Attributes
samples (ndarray) – Posterior samples from MCMC, shape (n_samples, n_params).
weights (ndarray) – Uniform weights (all ones) since MCMC samples are unweighted.
Notes
NUTS automatically tunes step size and number of leapfrog steps during the warmup phase. This method does not compute evidence, so it cannot be used for Bayesian model comparison. The method requires parameter priors to be NumPyro distributions.
References
numpyro documentation: https://num.pyro.ai/en/stable/
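Several backends above accept multiple spellings for the same option (e.g. n_warmup/nwarmup/num_warmup). The alias resolution can be sketched as follows; resolve_alias is a hypothetical helper, not socca's actual code:

```python
def resolve_alias(kwargs, aliases, default):
    """Return the value of the first alias present in kwargs, else default."""
    for name in aliases:
        if name in kwargs:
            return kwargs[name]
    return default

kw = {"nwarmup": 500, "seed": 42}
num_warmup = resolve_alias(kw, ("n_warmup", "nwarmup", "num_warmup"), 1000)
num_samples = resolve_alias(kw, ("n_samples", "nsamples", "num_samples"), 2000)
```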
- socca.fitting.methods.run_emcee(self, log_likelihood, log_prior, checkpoint, resume, **kwargs)[source]¶
Run ensemble MCMC sampling using the emcee package.
Performs Bayesian parameter estimation using the affine-invariant ensemble sampler from the emcee package. Supports checkpointing via HDF5 backends and parallelization via MPI or multiprocessing.
- Parameters:
log_likelihood (callable) – Function that computes the log-likelihood given parameters.
log_prior (callable) – Function that computes the log-prior given parameters.
checkpoint (str or None) – Path to the HDF5 backend file for checkpointing. If None and resume=True, defaults to “run.hdf5”.
resume (bool) – If True, resume from the checkpoint file.
**kwargs (dict) –
Additional keyword arguments passed to emcee.EnsembleSampler and its run_mcmc method. Common options include:
nwalkers : int, number of walkers (default: 2 * ndim)
nsteps/n_steps/num_steps : int, number of MCMC steps (default: 5000). Acts as a maximum when converge=True.
discard/nburn/n_burn : int, number of burn-in steps to discard (default: 0). When converge=True and not set, auto-set to 2 * max(tau).
thin : int, thinning factor (default: 1)
seed : int, random seed (default: None)
converge : bool, if True run until convergence based on autocorrelation time estimates (default: False)
check_every : int, check convergence every N steps (default: 100)
tau_factor : float, require chain length > tau_factor * tau for convergence (default: 50)
tau_rtol : float, relative tolerance for tau stability (default: 0.01)
thin_factor : float, thinning factor applied to tau when checking convergence (default: 0.50)
discard_factor : float, factor multiplied by tau to determine burn-in discard when converge=True (default: 2.0)
Set Attributes
sampler (emcee.EnsembleSampler) – The emcee sampler object.
samples (ndarray) – Posterior samples, shape (n_samples, n_params).
weights (ndarray) – Uniform weights (all ones) since MCMC samples are unweighted.
tau (ndarray or None) – Integrated autocorrelation time per parameter, if computed.
Notes
emcee uses an affine-invariant ensemble of walkers to explore the posterior. It does not compute evidence, so it cannot be used for Bayesian model comparison. Initial walker positions are drawn from the prior distributions.
When converge=True, the sampler runs in batches and checks the integrated autocorrelation time after each batch. Convergence is declared when (1) the chain is longer than tau_factor * tau for all parameters, and (2) the tau estimate has stabilized (relative change < tau_rtol). The burn-in (discard) is then auto-set to 2 * max(tau) unless explicitly provided.
References
Foreman-Mackey, D., Hogg, D. W., Lang, D., Goodman, J., PASP, 125, 306 (2013)
emcee documentation: https://emcee.readthedocs.io/en/stable/
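The convergence rule described in the Notes can be sketched as a pure function of the chain length and two successive tau estimates; this is a sketch of the documented criterion, not socca's actual implementation:

```python
def is_converged(nsteps, tau, tau_prev, tau_factor=50, tau_rtol=0.01):
    """Check the two documented conditions for every parameter:

    (1) the chain is longer than tau_factor * tau;
    (2) the tau estimate has stabilized (relative change < tau_rtol).
    """
    long_enough = all(nsteps > tau_factor * t for t in tau)
    stable = all(abs(t - tp) / t < tau_rtol for t, tp in zip(tau, tau_prev))
    return long_enough and stable
```

With tau estimates of [40, 50] from a 6000-step chain and a previous estimate of [40.2, 50.1], both conditions hold at the defaults; at 2000 steps the length condition fails.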
- socca.fitting.methods.run_optimizer(self, pinits, **kwargs)[source]¶
Run maximum likelihood optimization using scipy.optimize.
Finds the maximum likelihood estimate (MLE) or maximum a posteriori (MAP) estimate using L-BFGS-B optimization with automatic differentiation via JAX.
- Parameters:
pinits (array_like, str) –
Initial parameter values. Can be:
array_like : specific initial values in parameter space (will be transformed to unit hypercube internally)
“median” : start from the median of each prior (0.5 in unit hypercube)
“random” : start from random values in unit hypercube
**kwargs (dict) –
Additional keyword arguments passed to scipy.optimize.minimize. Common options include:
tol : float, tolerance for termination
options : dict, solver-specific options
Set Attributes
results (scipy.optimize.OptimizeResult) –
Optimization result object containing:
x : optimal parameters in unit hypercube
fun : negative log-likelihood at optimum
success : whether optimization succeeded
message : description of termination cause
- Raises:
ValueError – If pinits is a string other than “median” or “random”.
Notes
The optimization is performed in the unit hypercube space with bounds [0, 1] for each parameter. The objective function is the negative log-likelihood, and gradients are computed automatically using JAX. The L-BFGS-B method is used for box-constrained optimization.
References
Byrd, R. H., Lu, P., Nocedal, J., SIAM J. Sci. Comput., 16, 1190 (1995)
Zhu, C., Byrd, R. H., Nocedal, J., ACM TOMS, 23, 550 (1997)
scipy.optimize.minimize documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
L-BFGS-B documentation: https://docs.scipy.org/doc/scipy/reference/optimize.minimize-lbfgsb.html#optimize-minimize-lbfgsb
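The handling of pinits described above can be sketched as follows; initial_hypercube_point and to_unit_cube are hypothetical names standing in for socca's internal logic and parameter-space-to-hypercube transform:

```python
import random

def initial_hypercube_point(pinits, ndim, to_unit_cube=None):
    """Resolve pinits into a starting point in the unit hypercube.

    Sketch of the documented options: "median" maps to 0.5 per
    parameter, "random" draws uniformly in [0, 1), and array_like
    inputs go through the (hypothetical) to_unit_cube transform.
    """
    if isinstance(pinits, str):
        if pinits == "median":
            return [0.5] * ndim
        if pinits == "random":
            return [random.random() for _ in range(ndim)]
        raise ValueError("pinits must be 'median', 'random', or array_like")
    return to_unit_cube(pinits)
```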
Nested Sampling¶
nautilus¶
Nautilus neural network-accelerated nested sampling backend.
dynesty¶
Dynesty nested sampling backend.
Monte Carlo Sampling¶
pocomc¶
pocoMC preconditioned Monte Carlo sampling backend.
emcee¶
emcee affine-invariant ensemble MCMC backend.
numpyro¶
NumPyro NUTS Hamiltonian Monte Carlo backend.
Optimization¶
optimizer¶
L-BFGS-B maximum likelihood and MAP optimization backend.
Utilities¶
Utility functions for sampler output processing.
- socca.fitting.methods.utils.get_imp_weights(logw, logz=None)[source]¶
Compute importance weights from log-weights and log-evidence.
Converts log-weights to normalized importance weights using the log-evidence for numerical stability. The weights are normalized such that they sum to 1.0.
- Parameters:
logw (array_like) – Log-weights from importance-weighted sampling.
logz (float or array_like, optional) – Log-evidence value(s). If None, uses the maximum log-weight. If not None and not iterable, converts to a single-element list. Default is None.
- Returns:
weights – Normalized importance weights in linear space.
- Return type:
ndarray
Notes
The importance weights are computed as:
\[w_i = \exp\left[(\log w_i - \log Z) - \log\sum_j \exp(\log w_j - \log Z)\right]\]
where \(\log Z\) is the log-evidence (logz[-1]).
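A pure-Python sketch of this normalization, treating logz as a scalar for simplicity (the documented behavior uses logz[-1] when given a sequence, and falls back to the maximum log-weight when logz is None):

```python
import math

def imp_weights(logw, logz=None):
    """Normalized importance weights from log-weights.

    Subtracting a reference value (logz, or max(logw) when logz is
    None) before exponentiating avoids overflow; the result is then
    normalized to sum to 1.
    """
    ref = max(logw) if logz is None else logz
    w = [math.exp(lw - ref) for lw in logw]
    total = sum(w)
    return [wi / total for wi in w]
```

For example, log-weights [0, log 3] normalize to [0.25, 0.75] regardless of the reference value, since the normalization cancels any constant shift.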