BayesiPy

BayesiPy is a Python library for post-hoc uncertainty estimation in pre-trained neural networks. In other words, given a deep neural network (DNN) that was trained via standard back-propagation, BayesiPy can enhance it with calibrated confidence estimates (error bars on the predictions) without altering the model’s original accuracy. This is crucial in applications where knowing the model’s uncertainty is as important as the prediction itself (e.g., in medical diagnosis or autonomous driving).

BayesiPy provides a suite of state-of-the-art techniques to quantify uncertainty post-hoc, each balancing fidelity and computational cost in different ways. These include:

- A full range of Linearized Laplace (LLA) methods (from aleximmer/Laplace):
  - Full Laplace
  - Layerwise Laplace
  - Subnetwork Laplace
  - Last-Layer Laplace
  - Various curvature approximations (Exact, Kron, Diagonal, KFAC, etc.)
- accElerated Linearized Laplace Approximation (ELLA)
- Variational Linearized Laplace Approximation (VaLLA)
- Scalable Linearized Laplace Approximation (ScaLLA)
- Mean-Field Variational Inference (MFVI)
- Spectral-normalized Gaussian Process (SNGP)
- Fixed-Mean Gaussian Process (FMGP)

The sections below cover installation, explain each technique with its original reference, compare the methods in terms of calibration, scalability, and computational cost, give a simple usage example, and outline how you can contribute to the project.

Installation

BayesiPy is a Python (≥3.10) / PyTorch library. You need a working PyTorch install (CPU or GPU) – see pytorch.org for platform-specific instructions.

1. Clone the repository

git clone https://github.com/Ludvins/BayesiPy.git
cd BayesiPy

2. Create and activate a virtual environment (recommended)

python -m venv .venv
source .venv/bin/activate 
python -m pip install --upgrade pip

3. Install BayesiPy

User install (just the library + core deps):

pip install .

or directly from GitHub in another project:

pip install "git+https://github.com/Ludvins/BayesiPy.git"

Developer / editable install (recommended if you modify the code):

pip install -e .

This installs BayesiPy as a package (import bayesipy) from your working copy, so changes in the repo are immediately reflected without re-installing.
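To confirm that the environment is set up, a quick check along the following lines can help. This is a minimal sketch that only uses the standard library; the helper name `check_install` is hypothetical and not part of BayesiPy.

```python
import importlib.util


def check_install(packages=("torch", "bayesipy")):
    """Return a dict mapping each top-level package name to True if it is importable."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Run it inside the activated virtual environment; a missing `torch` usually means PyTorch still has to be installed for your platform.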

Uncertainty Estimation Techniques in BayesiPy

1. Linearized Laplace Methods

BayesiPy integrates the full suite of Linearized Laplace approximations proposed by Immer et al. (2021) and subsequent works. The idea is to take a pre-trained neural network $f(\cdot; \theta)$ at its MAP estimate $\theta_\text{MAP}$ and approximate the posterior over $\theta$ locally by a Gaussian whose mean is $\theta_\text{MAP}$ and whose covariance is the inverse of the Hessian of the negative log-posterior (in practice, usually its generalized Gauss-Newton approximation). Linearizing the network in its parameters around the MAP solution then gives a predictive distribution centered on the original network’s output. The available variants are:

- Full Laplace: Uses the full Hessian matrix. Most accurate but extremely memory-intensive for large models.
- Subnetwork Laplace: Selects a subset of the network’s parameters (e.g., a subnetwork or certain layers) for the Laplace approximation, reducing complexity.
- Last-Layer Laplace: Approximates only the last layer’s weights by a Gaussian, holding the rest of the network fixed as a deterministic feature extractor.
- Curvature Approximations: You can choose from diagonal, KFAC, or exact Hessian approximations to balance computational cost with approximation accuracy.
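The variants above share one recipe, which can be sketched on a toy problem. The following NumPy example is an illustration only, not BayesiPy's implementation: it assumes a tiny 1-D regression network with an analytic Jacobian, a Gaussian likelihood with known noise variance, and an isotropic Gaussian prior. All function names and default values are hypothetical.

```python
import numpy as np


def forward(theta, x):
    """Tiny 1-D regression network: f(x) = w2 . tanh(w1 * x + b1)."""
    w1, b1, w2 = np.split(theta, 3)
    return np.tanh(np.outer(x, w1) + b1) @ w2


def jacobian(theta, x):
    """Analytic Jacobian of the outputs w.r.t. theta, shape (len(x), len(theta))."""
    w1, b1, w2 = np.split(theta, 3)
    h = np.tanh(np.outer(x, w1) + b1)           # (N, H) hidden activations
    dh = (1.0 - h**2) * w2                       # (N, H) derivative w.r.t. pre-activations
    return np.hstack([dh * x[:, None], dh, h])   # blocks ordered as w1, b1, w2


def laplace_posterior_cov(theta_map, x_train, noise_var=0.1, prior_precision=1.0):
    """Inverse of the GGN of a Gaussian likelihood plus an isotropic prior precision."""
    J = jacobian(theta_map, x_train)
    precision = J.T @ J / noise_var + prior_precision * np.eye(theta_map.size)
    return np.linalg.inv(precision)


def predictive(theta_map, cov, x_test, noise_var=0.1):
    """Linearized predictive: mean is the network output, variance is J cov J^T + noise."""
    J = jacobian(theta_map, x_test)
    mean = forward(theta_map, x_test)
    var = np.einsum("np,pq,nq->n", J, cov, J) + noise_var
    return mean, var


rng = np.random.default_rng(0)
theta_map = rng.normal(size=9)             # stand-in for trained (MAP) weights
x_train = np.linspace(-1.0, 1.0, 20)
cov = laplace_posterior_cov(theta_map, x_train)
mean, var = predictive(theta_map, cov, np.array([0.0, 5.0]))
print("predictive mean:", mean)
print("predictive variance:", var)
```

In this sketch, a last-layer variant would keep only the `w2` block of the Jacobian, and a diagonal curvature approximation would keep only the diagonal of `precision` before inverting it.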

References:

Immer et al. (2021) – “Improving Predictions of Bayesian Neural Nets via Local Linearization.” [AISTATS]

accElerated Linearized Laplace Approximation (ELLA) speeds up the linearized Laplace approximation by using data subsets, Nyström approximations, and other low-rank techniques, so that it can handle larger models and datasets. Its emphasis on scalability and memory efficiency makes it appealing for modern architectures.

Source: Deng et al. (2022) proposed an accelerated linearized Laplace method (ELLA) with a Nyström approximation to the network’s tangent kernel for improved scalability. [NeurIPS]
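The Nyström idea can be illustrated independently of any network. The sketch below is an assumption-laden toy: it uses an RBF kernel as a stand-in for the network's tangent kernel and picks landmark points uniformly at random, neither of which is claimed to match ELLA's actual procedure. The function names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def nystrom(x, landmarks, kernel=rbf_kernel, jitter=1e-6):
    """Rank-m approximation K ~= K_nm K_mm^{-1} K_mn built from m landmark points."""
    k_nm = kernel(x, landmarks)
    k_mm = kernel(landmarks, landmarks) + jitter * np.eye(len(landmarks))
    return k_nm @ np.linalg.solve(k_mm, k_nm.T)


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
landmarks = x[rng.choice(len(x), size=20, replace=False)]

k_exact = rbf_kernel(x, x)          # 200 x 200 kernel matrix
k_approx = nystrom(x, landmarks)    # built from 200 x 20 and 20 x 20 blocks only
rel_error = np.linalg.norm(k_exact - k_approx) / np.linalg.norm(k_exact)
print(f"relative Frobenius error with 20 landmarks: {rel_error:.3f}")
```

The cost is driven by the number of landmarks rather than the number of data points, which is the source of the scalability gain.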

Variational Linearized Laplace Approximation (VaLLA) is another variant of LLA that leverages sparse Gaussian processes in function space. Rather than computing Hessians directly, VaLLA uses variational inference to fit a GP whose mean is anchored at the DNN output. In practice, VaLLA can yield high-quality calibration with sub-linear complexity in the dataset size.

Source: Ortega et al. (2024a) – “Variational Linearized Laplace Approximation for Bayesian Deep Learning.” [ICML]

Scalable Linearized Laplace Approximation (ScaLLA) approximates the kernel of the Linearized Laplace Approximation (LLA) using a surrogate deep neural network. Training relies solely on efficient Jacobian–vector products, allowing for predictive uncertainty estimates on large-scale pre-trained DNNs.
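To see why Jacobian–vector products are attractive, note that the product $J(\theta)\,v$ can be obtained without ever forming the Jacobian. The sketch below is not ScaLLA itself; it assumes a toy NumPy network and uses central finite differences as a stand-in for the forward-mode automatic differentiation that a PyTorch implementation would more likely use. All names are hypothetical.

```python
import numpy as np


def forward(theta, x):
    """Tiny 1-D network: f(x) = w2 . tanh(w1 * x + b1)."""
    w1, b1, w2 = np.split(theta, 3)
    return np.tanh(np.outer(x, w1) + b1) @ w2


def jvp_fd(f, theta, v, eps=1e-6):
    """Central finite-difference estimate of J(theta) @ v; J is never formed."""
    return (f(theta + eps * v) - f(theta - eps * v)) / (2.0 * eps)


def explicit_jacobian(f, theta, eps=1e-6):
    """Reference only: build J column by column, one JVP per parameter."""
    basis = np.eye(theta.size)
    return np.stack([jvp_fd(f, theta, e, eps) for e in basis], axis=1)


rng = np.random.default_rng(0)
theta = rng.normal(size=9)
x = np.linspace(-2.0, 2.0, 5)
v = rng.normal(size=9)


def f(t):
    return forward(t, x)


jv = jvp_fd(f, theta, v)                    # two forward passes
jv_ref = explicit_jacobian(f, theta) @ v    # two forward passes per parameter
print("max difference:", np.max(np.abs(jv - jv_ref)))
```

The direct product needs a fixed number of forward passes regardless of the parameter count, whereas building the Jacobian scales with the number of parameters.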

2. Mean-Field Variational Inference (MFVI)

Mean-Field Variational Inference (MFVI) is a classic approach where we assume a factorized Gaussian over neural network weights and optimize its mean/variance parameters via the ELBO. This method can correct overconfidence, but often underestimates uncertainty due to the independence assumption.

Source: Deng et al. (2021) – “BayesAdapter: Being Bayesian, Inexpensively and Reliably, via Bayesian Fine-tuning.” [ACML]
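The two ingredients of the ELBO under a factorized Gaussian are a reparameterized weight sample and a closed-form KL term. The following sketch shows both in NumPy, assuming (for brevity) a Bayesian linear regression model rather than a neural network, a zero-mean isotropic Gaussian prior, and a softplus parameterization of the standard deviations. It is illustrative only and does not reflect BayesiPy's MFVI API.

```python
import numpy as np


def softplus(rho):
    """Map an unconstrained parameter to a positive standard deviation."""
    return np.logaddexp(0.0, rho)


def sample_weights(mu, rho, rng):
    """Reparameterization trick: w = mu + sigma * eps with eps ~ N(0, I)."""
    return mu + softplus(rho) * rng.standard_normal(mu.shape)


def kl_to_prior(mu, rho, prior_std=1.0):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, prior_std^2 I) )."""
    sigma = softplus(rho)
    return 0.5 * np.sum(
        (sigma**2 + mu**2) / prior_std**2 - 1.0 - 2.0 * np.log(sigma / prior_std)
    )


def negative_elbo(mu, rho, x, y, rng, noise_std=0.1, n_samples=8):
    """Monte Carlo estimate of -ELBO for the linear model y ~ N(x @ w, noise_std^2)."""
    nll = 0.0
    for _ in range(n_samples):
        w = sample_weights(mu, rho, rng)
        resid = y - x @ w
        nll += 0.5 * np.sum(resid**2) / noise_std**2
        nll += 0.5 * len(y) * np.log(2.0 * np.pi * noise_std**2)
    return nll / n_samples + kl_to_prior(mu, rho)


rng = np.random.default_rng(0)
x = rng.standard_normal((50, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
mu, rho = np.zeros(3), np.full(3, -3.0)    # initial variational parameters
loss = negative_elbo(mu, rho, x, y, rng)
print("initial negative ELBO:", loss)
```

Because each weight gets its own independent variance, correlations between weights are ignored, which is the source of the underestimated uncertainty mentioned above.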

3. Spectral-Normalized Gaussian Process (SNGP)

Spectral-Normalized Gaussian Process (SNGP) integrates a GP layer at the network’s output and enforces a distance-preserving feature space via spectral normalization. This makes the network’s predictions distance-aware, which helps with out-of-distribution (OOD) detection. SNGP typically involves a re-training or fine-tuning step to incorporate the spectral normalization in earlier layers.

Source: Liu et al. (2020) – “Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness.” [NeurIPS]
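The spectral normalization step can be sketched with power iteration. The example below is a minimal NumPy illustration, assuming a plain dense weight matrix and a rescaling rule that only shrinks the weights when their spectral norm exceeds a chosen bound; the bound value and function names are hypothetical and not taken from BayesiPy.

```python
import numpy as np


def spectral_norm(weight, n_iter=50, seed=0):
    """Estimate the largest singular value of a matrix by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(weight.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = weight @ v
        u /= np.linalg.norm(u)
        v = weight.T @ u
        v /= np.linalg.norm(v)
    return float(u @ weight @ v)


def bound_spectral_norm(weight, bound=0.95, **kwargs):
    """Rescale the weight only if its estimated spectral norm exceeds the bound."""
    sigma = spectral_norm(weight, **kwargs)
    return weight * (bound / sigma) if sigma > bound else weight


rng = np.random.default_rng(1)
w = 3.0 * rng.standard_normal((6, 4))
w_sn = bound_spectral_norm(w)
print("before:", np.linalg.norm(w, 2), "after:", np.linalg.norm(w_sn, 2))
```

Bounding the spectral norm of each layer limits how much the layer can stretch or collapse distances between inputs, which is what makes the downstream GP layer distance-aware.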

4. Fixed-Mean Gaussian Process (FMGP)

Fixed-Mean Gaussian Process (FMGP) is a method where we overlay a GP whose mean is fixed to the pre-trained DNN output. This GP focuses on learning the uncertainty (variance) around the DNN’s predictions. FMGP can be trained post-hoc with a sparse variational approach, scaling to large datasets and architectures. Empirically, it often outperforms other methods in calibration quality with moderate computational overhead.

Source: Ortega et al. (2024b) – “Fixed-Mean Gaussian Processes for Post-hoc Bayesian Deep Learning.” [arXiv]
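The fixed-mean idea can be sketched as follows: the predictive mean is simply the network output, while the variance comes from a sparse GP. This NumPy example uses the standard sparse variational GP variance formula with an RBF kernel; it is an assumption that this matches the spirit of FMGP, and the exact parameterization in the paper and in BayesiPy may differ. The stand-in network `net`, the inducing inputs `z`, and the covariance `s` are all hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, amplitude=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return amplitude * np.exp(-0.5 * sq_dist / lengthscale**2)


def fixed_mean_predict(net, x, z, s, jitter=1e-6):
    """Mean from the network; variance from a sparse GP with inducing inputs z.

    s is the covariance of the variational distribution over the inducing values.
    """
    k_zz = rbf_kernel(z, z) + jitter * np.eye(len(z))
    k_xz = rbf_kernel(x, z)
    a = np.linalg.solve(k_zz, k_xz.T).T        # K_xz K_zz^{-1}, shape (N, M)
    prior_var = np.diag(rbf_kernel(x, x))      # k(x, x) for each test point
    var = prior_var - np.sum(a * k_xz, axis=1) + np.sum((a @ s) * a, axis=1)
    return net(x), var


def net(x):
    """Stand-in for a pre-trained network's prediction."""
    return np.sin(x[:, 0])


z = np.linspace(-2.0, 2.0, 5)[:, None]   # inducing inputs
s = 0.01 * np.eye(5)                      # small variational covariance
x_test = np.array([[0.0], [10.0]])        # one point near the data, one far away
mean, var = fixed_mean_predict(net, x_test, z, s)
print("mean:", mean, "variance:", var)
```

Near the inducing inputs the variance is governed by `s`, while far from them it reverts to the prior variance, so the error bars grow away from the data without changing the network's predictions.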

Comparison of Techniques

Below is a brief summary comparing these techniques, with insights drawn from both the original Laplace approximation library [Immer et al., 2021] and recent works like [Ortega et al., 2024b]:

| Method | Pros | Cons | Use Case |
| --- | --- | --- | --- |
| Full Laplace | Most faithful local Gaussian approximation to MAP parameters; captures correlations in all parameters | Extremely memory- and computation-heavy; not feasible for large networks | Good for small-to-medium networks or when maximizing fidelity is paramount |
| Layerwise / Subnetwork Laplace | Balances coverage (whole network or a subset) with computational feasibility; can approximate key layers | More complex to configure (which layers to include? which blocks?). Some approximation needed for Hessian. | For medium-to-large networks if you need more coverage than last-layer but can’t afford full Laplace |
| Last-Layer Laplace | Very fast post-hoc correction; no retraining, just Hessian-based Gaussian around final layer | Focuses only on last-layer uncertainty; can miss uncertainties originating in earlier layers | A quick Bayesian “upgrade” that often fixes overconfidence on moderate tasks |
| ELLA | Scalable variant of Laplace using low-rank or Nyström approximations | Hyperparameters for kernel approximation need tuning | Large-scale scenarios where even standard LLA is too costly |
| VaLLA | Excellent calibration (function-space GP perspective) with sub-linear complexity in data size | Requires iterative variational optimization; can be slower to converge; more complicated inference step | High-fidelity uncertainty for large datasets or critical applications |
| ScaLLA | Excellent calibration with very fast evaluation | Requires training a surrogate model, which can be slow. Requires context points for good OOD performance | High-fidelity uncertainty for large datasets; superior OOD detection due to kernel biasing |
| MFVI | Classic fully Bayesian approach over weights; easy to implement (Bayes by Backprop) | Mean-field assumption often underestimates uncertainty; can be very slow or memory-heavy for large networks | For those wanting a “full BNN” approach or partial Bayesian layers with factorized posteriors |
| SNGP | Distance-aware; single forward-pass at inference; strong OOD detection | Typically requires training from scratch or at least heavy fine-tuning with spectral normalization; not purely post-hoc | Production-friendly if you can integrate spectral norms and a GP head early on |
| FMGP | High-quality calibration; scalable to large data; easy to wrap any pre-trained model (fixed mean) | Extra training to fit the GP’s variational parameters; number of inducing points is a hyperparameter that can affect memory usage | Post-hoc method offering advanced Bayesian-quality uncertainty for large-scale tasks without heavy retraining |
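Calibration claims like those in the comparison are commonly checked with the expected calibration error (ECE). The sketch below is a generic NumPy implementation with equal-width confidence bins; it is not taken from BayesiPy's benchmarks, and the function name and binning scheme are assumptions.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy over equal-width bins."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index in [0, n_bins - 1]; a confidence of exactly 1.0 lands in the last bin.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap   # weight by the fraction of samples in the bin
    return ece


# Toy usage: max softmax probability per sample and whether the prediction was right.
conf = [0.99, 0.95, 0.90, 0.60, 0.55]
hit = [1, 1, 0, 1, 0]
print("ECE:", expected_calibration_error(conf, hit))
```

A value close to zero means that, within each bin, the stated confidence roughly matches the observed accuracy.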

Usage Example

Below is a short snippet demonstrating how to apply a Fixed-Mean GP (FMGP) to a pre-trained regression model. For other methods, the usage is similar, but you would import from the relevant module (e.g., bayesipy.laplace for the linearized Laplace methods).

import copy
import torch
import numpy as np
from bayesipy.fmgp import FMGP

# Suppose 'f' is your pre-trained PyTorch model, e.g., an nn.Module for regression.
fmgp = FMGP(
    model=copy.deepcopy(f),        # copy of the MAP-trained model
    likelihood="regression",       # 'regression' or 'classification'
    kernel="RBF",                  # kernel for the GP (e.g., Radial Basis Function)
    inducing_locations="kmeans",   # how to initialize inducing points (k-means on the training data)
    num_inducing=50,               # number of inducing points (scales the GP complexity)
    noise_variance=np.exp(-5),     # initial noise variance for regression
    y_mean=0.0,                    # target mean if data was normalized
    y_std=1.0                      # target std if data was normalized
)

# Train the FMGP to learn the GP variance parameters (the base model 'f' is not changed):
loss = fmgp.fit(
    iterations=3000,
    lr=1e-3,
    train_loader=train_loader,  # PyTorch DataLoader with (X, y)
    verbose=True
)
print("Finished FMGP training with final loss:", loss)

# Get predictions with uncertainty:
X_test = ...  # some test inputs (NumPy array or torch.Tensor)
mean_pred, var_pred = fmgp.predict(torch.as_tensor(X_test, dtype=torch.float32))  # as_tensor accepts arrays and tensors alike
print("Predictive mean:", mean_pred)
print("Predictive variance:", var_pred)
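The predictive mean and variance can be turned into error bars. The helper below is a small sketch that assumes the outputs have been converted to NumPy arrays (for example with `.detach().cpu().numpy()`) and that each prediction is treated as a Gaussian marginal; `gaussian_interval` and `coverage` are hypothetical names, not BayesiPy functions.

```python
import numpy as np


def gaussian_interval(mean, var, z=1.96):
    """Return (lower, upper) bounds of a central interval; z=1.96 gives about 95%."""
    mean = np.asarray(mean, dtype=float)
    std = np.sqrt(np.clip(np.asarray(var, dtype=float), 0.0, None))
    return mean - z * std, mean + z * std


def coverage(y_true, lower, upper):
    """Fraction of targets that fall inside their interval."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Toy numbers standing in for mean_pred and var_pred.
lower, upper = gaussian_interval([0.0, 1.0, 2.0], [0.25, 0.25, 1.0])
print("lower:", lower)
print("upper:", upper)
print("coverage:", coverage([0.1, 3.0, 2.5], lower, upper))
```

On held-out data, an empirical coverage well below the nominal level suggests the predicted variances are too small.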

For Linearized Laplace methods (including Full, Subnetwork, Last-layer, etc.), you would typically import from bayesipy.laplace. For instance:

from bayesipy.laplace import Laplace

# Suppose 'f' is your pre-trained model.
laplace_model = Laplace(
    model=f,
    approximation="full",    # 'full', 'kron', 'diag', 'kfac', etc.
    subset_of_weights="all", # 'all', 'last_layer', 'subnetwork', etc.
    likelihood="classification"
)

laplace_model.fit(train_loader)
preds, preds_std = laplace_model.predict(test_loader)

Consult the repository’s examples folder for more usage details and advanced configurations.

Contributing

Contributions are welcome! If you’d like to improve BayesiPy, follow these steps:

- Open an Issue: Report bugs, request new features, or ask questions. We track all changes and discussions in GitHub issues.
- Fork the Repo & Create a Branch: For code changes, fork BayesiPy and create a feature branch for your work.
- Pull Request: When you’re ready, open a pull request describing your changes. Make sure to include relevant tests or update examples/documentation. Please follow PEP8 style guidelines.
- Testing: We encourage adding or updating tests in tests/. Ensure your changes don’t break existing functionality.
- Discussion: For major proposals, start a discussion via an issue so we can share feedback.

By contributing, you help advance accessible Bayesian deep learning methods for the community.

Bibliography

Immer et al. (2021) – “Improving Predictions of Bayesian Neural Nets via Local Linearization.” AISTATS 2021.
Kristiadi et al. (2020) – “Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks.” ICML 37:5436–5446.
Deng et al. (2022) – “Accelerated Linearized Laplace Approximation for Bayesian Deep Learning.” NeurIPS 35.
Ortega et al. (2024a) – “Variational Linearized Laplace Approximation for Bayesian Deep Learning.” ICML 41.
Deng et al. (2021) – “BayesAdapter: Being Bayesian, Inexpensively and Reliably, via Bayesian Fine-tuning.” ACML.
Liu et al. (2020) – “Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness.” NeurIPS 33.
Ortega et al. (2024b) – “Fixed-Mean Gaussian Processes for Post-hoc Bayesian Deep Learning.” arXiv:2412.04177.
