PyMC Project

The PyMC Project develops PyMC, an open-source probabilistic programming library in Python that enables intuitive specification of Bayesian statistical models and supports modern inference algorithms including MCMC and variational methods.

PyMC is one of the most widely used probabilistic programming libraries in the Python ecosystem, providing a user-friendly interface for Bayesian modeling that has made these methods accessible to a broad community of data scientists, researchers, and practitioners. Built on the philosophy that Bayesian modeling should be as natural as writing down a statistical model on paper, PyMC has grown from a research tool into a mature, community-driven project.

History and Evolution

2003

Christopher Fonnesbeck creates PyMC as a Python library for Bayesian modeling, initially using Metropolis-Hastings and Gibbs sampling algorithms.

2009

PyMC 2 is released with expanded modeling capabilities and improved documentation, attracting a growing user community.

2016

PyMC3 is released, built on Theano for automatic differentiation and featuring the No-U-Turn Sampler (NUTS) and variational inference. Gradient-based sampling makes models with many continuous parameters practical to fit, a major step up from the earlier samplers.

2022

PyMC v4 and v5 are released. Version 4 moves from Theano to its fork Aesara and adds support for JAX-based sampling backends; version 5 migrates to PyTensor, itself a fork of Aesara, for improved maintainability and performance.

Design Philosophy

PyMC's design philosophy prioritizes accessibility and expressiveness. Models are specified using a Pythonic syntax that closely mirrors the mathematical notation of the model, making it easy for researchers to translate their statistical thinking into working code. The library provides a rich set of probability distributions, supports custom distributions and likelihoods, and integrates with the broader Python scientific computing ecosystem including NumPy, pandas, and Matplotlib.

The PyMC Ecosystem

PyMC is part of a broader ecosystem of Bayesian tools in Python. ArviZ provides posterior diagnostics and visualization. Bambi offers a formula-based interface for generalized linear models. PyMC-Marketing provides tools for marketing mix modeling and customer lifetime value estimation. Together, these tools create a comprehensive Bayesian workflow in Python.

Technical Capabilities

PyMC supports a wide range of inference algorithms. The default sampler is NUTS, an adaptive form of Hamiltonian Monte Carlo that provides efficient gradient-based MCMC for continuous parameters. For models with discrete parameters, where gradients are unavailable, PyMC offers specialized samplers including Metropolis-Hastings and the categorical Gibbs sampler. Variational inference methods, including ADVI (automatic differentiation variational inference), trade some accuracy for speed and provide approximate posterior inference for large-scale problems. Recent versions can also run NUTS through JAX-based backends such as BlackJAX and NumPyro, enabling GPU-accelerated inference.
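To make the Metropolis-Hastings idea concrete, here is a from-scratch random-walk Metropolis sketch using only the Python standard library. It is not PyMC's implementation. The coin-bias model, the flat Beta(1, 1) prior, the step size, and the function names are all assumptions chosen for illustration. The sampler proposes a nearby value and accepts it with probability min(1, p(proposal) / p(current)), which requires only the unnormalized posterior.

```python
import math
import random


def log_posterior(theta, successes, trials):
    """Unnormalized log posterior of a coin bias under a flat Beta(1, 1) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the unit interval
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis(successes, trials, draws=20000, tune=2000, step=0.3, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    samples = []
    for i in range(tune + draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        if i >= tune:
            samples.append(theta)  # keep draws only after the warm-up phase
    return samples


samples = metropolis(successes=7, trials=10)
posterior_mean = sum(samples) / len(samples)
```

For 7 successes in 10 trials, the exact posterior is Beta(8, 4) with mean 8/12, so posterior_mean should land close to 0.667. Gradient-based samplers such as NUTS replace the blind random-walk proposal with trajectories guided by the gradient of the log posterior, which is why they tend to scale better to many continuous parameters.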

Community and Governance

PyMC is a community-driven open-source project fiscally sponsored by NumFOCUS. Development is led by a core team that includes Christopher Fonnesbeck, Thomas Wiecki, and Osvaldo Martin, alongside numerous other contributors. The project maintains active forums, extensive documentation, and a collection of example notebooks that serve as both tutorials and templates for common modeling tasks.

"PyMC brought Bayesian statistics into the Python world and made it feel native. It showed that Bayesian modeling doesn't have to mean leaving behind the tools and workflows that data scientists already know." - Thomas Wiecki
