In many scientific settings — particularly neuroimaging, pharmacology, and systems biology — one fits a rich "full" model and then wants to compare it against dozens or hundreds of reduced variants that set certain parameters to zero or constrain them. Refitting each reduced model is computationally expensive. Bayesian Model Reduction (BMR), developed by Karl Friston, Will Penny, and colleagues, provides an analytic shortcut: the evidence and posterior of any reduced model can be computed directly from the full model's posterior, provided the reduced model differs only in its prior.
The Core Identity
Suppose a full model m_f has prior p(θ | m_f) and posterior p(θ | y, m_f), and a reduced model m_r has a different prior p(θ | m_r) but the same likelihood. The ratio of model evidences is then an expectation under the full model's posterior:

p(y | m_r) / p(y | m_f) = ∫ p(θ | y, m_f) · [p(θ | m_r) / p(θ | m_f)] dθ
For Gaussian priors and approximate Gaussian posteriors:
log p(y | m_r) − log p(y | m_f) = ½ log|Σ_r| − ½ log|Σ_f| + ½(μ_f − μ_r)ᵀ Σ_r⁻¹(μ_f − μ_r) + …

where μ and Σ denote posterior means and covariances under the reduced and full models, and the omitted terms involve the corresponding prior means and covariances.
This identity is exact: no additional approximation is introduced beyond whatever was used to obtain the full posterior. If the full posterior was obtained via Variational Laplace (a common approximation in neuroimaging), BMR inherits that approximation but adds no further error. The key insight is that changing the prior is equivalent to reweighting the posterior — an importance-sampling-like operation.
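To make the Gaussian case concrete, here is a minimal sketch of the closed-form computation, obtained by evaluating the evidence-ratio integral above for Gaussian densities. The function name and interface are illustrative rather than taken from any published toolbox; note that the reduced model's posterior falls out of the same algebra as a by-product.

```python
import numpy as np

def bmr_log_evidence_diff(mu, C, eta_f, S_f, eta_r, S_r):
    """log p(y|m_r) - log p(y|m_f) for Gaussian priors and posteriors.

    mu, C      : posterior mean/covariance under the full model
    eta_f, S_f : prior mean/covariance under the full model
    eta_r, S_r : prior mean/covariance under the reduced model
    Returns the log-evidence difference plus the reduced model's
    posterior mean and covariance.
    """
    iC, iSf, iSr = (np.linalg.inv(A) for A in (C, S_f, S_r))
    P_r = iC + iSr - iSf                  # reduced posterior precision
    h = iC @ mu + iSr @ eta_r - iSf @ eta_f
    C_r = np.linalg.inv(P_r)              # reduced posterior covariance
    mu_r = C_r @ h                        # reduced posterior mean

    logdet = lambda A: np.linalg.slogdet(A)[1]
    dF = 0.5 * (logdet(S_f) + logdet(C_r) - logdet(S_r) - logdet(C))
    dF += 0.5 * (h @ C_r @ h - mu @ iC @ mu
                 - eta_r @ iSr @ eta_r + eta_f @ iSf @ eta_f)
    return dF, mu_r, C_r
```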
The Savage–Dickey Connection
BMR is closely related to the Savage–Dickey density ratio for computing Bayes factors. When the reduced model sets a parameter to a specific value θ₀ (a point prior), the Bayes factor is simply the ratio of the posterior density to the prior density evaluated at θ₀ under the full model. BMR generalizes this to arbitrary prior changes, not just point restrictions.
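To see the connection numerically, one can shrink the reduced prior toward a point mass and watch the Gaussian BMR identity converge to the Savage–Dickey ratio. A short sketch with illustrative numbers, reusing bmr_log_evidence_diff from the sketch above:

```python
import numpy as np
from scipy.stats import norm

# Savage-Dickey: for a point restriction theta = 0, BF(reduced vs full) is
# the full model's posterior density over its prior density, both at 0.
mu, sd = 1.2, 0.5          # full posterior N(mu, sd^2), illustrative numbers
s0 = 3.0                   # full prior N(0, s0^2)
bf_sd = norm.pdf(0, mu, sd) / norm.pdf(0, 0, s0)

# The same number from the Gaussian BMR identity, shrinking the reduced
# prior N(0, eps^2) toward a point mass (bmr_log_evidence_diff is above):
for eps in (1e-1, 1e-2, 1e-4):
    dF, _, _ = bmr_log_evidence_diff(
        np.array([mu]), np.array([[sd**2]]),
        np.array([0.0]), np.array([[s0**2]]),
        np.array([0.0]), np.array([[eps**2]]))
    print(f"eps={eps:g}: exp(dF)={np.exp(dF):.4f}  (Savage-Dickey: {bf_sd:.4f})")
```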
Computational Advantages
In Dynamic Causal Modelling (DCM) for fMRI data, a single model fit can take minutes to hours. With BMR, thousands of reduced models — corresponding to different hypotheses about which neural connections are present — can be evaluated in seconds. Combined with Bayesian Model Averaging (BMA), BMR enables exhaustive search over model spaces that would be computationally infeasible to explore by refitting. This has transformed the practice of effective connectivity analysis in neuroimaging.
Applications in Neuroimaging
BMR was originally developed for and has had its greatest impact in computational neuroimaging. In DCM, each model specifies a hypothesis about directed neural connections between brain regions. The full model includes all candidate connections, and reduced models "switch off" subsets by replacing their priors with shrinkage priors centred tightly on zero. BMR computes the evidence for each reduced model, and Bayesian Model Averaging over the reduced set yields robust posterior estimates of connection strengths.
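A toy illustration of this pruning pattern, reusing bmr_log_evidence_diff from above (the connection strengths and covariances are made up, and this is a sketch rather than SPM's implementation):

```python
import numpy as np

# Three candidate connections with full priors N(0, 1) and a full posterior
# from a single (hypothetical) model fit.
mu    = np.array([0.62, 0.05, -0.41])   # posterior means of connection strengths
C     = np.diag([0.04, 0.05, 0.03])     # posterior covariance (made-up numbers)
eta_f = np.zeros(3)
S_f   = np.eye(3)

def switched_off(j, tiny=1e-8):
    """Evidence change from shrinking connection j's prior onto zero."""
    S_r = S_f.copy()
    S_r[j, j] = tiny                    # shrinkage prior centred tightly on zero
    return bmr_log_evidence_diff(mu, C, eta_f, S_f, eta_f, S_r)[0]

for j in range(3):
    print(f"connection {j}: dF = {switched_off(j):+.2f}")
# Positive dF: evidence favours pruning the connection; negative: keep it.
```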
Historical Development
Penny et al. introduced the concept of comparing DCM models through prior changes, laying groundwork for BMR.
Friston and Penny formalized Bayesian Model Reduction, proving the evidence identity and demonstrating its use for exhaustive model comparison in DCM.
Friston, Litvak, et al. extended BMR to hierarchical (parametric empirical Bayes) settings, enabling group-level inference with automatic model reduction at both individual and group levels.
BMR has been adopted in pharmacological modelling and computational psychiatry, and has been generalized to non-Gaussian settings through variational approximations.
Assumptions and Limitations
BMR requires that reduced models share the same likelihood as the full model — only the prior differs. This is naturally satisfied when "reducing" means tightening or shifting priors on a subset of parameters. The accuracy of BMR depends on the quality of the full model's posterior approximation; if the full posterior is poorly estimated, BMR inherits those errors. Additionally, the Gaussian approximation used in practice may be inadequate for strongly non-Gaussian posteriors.
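One further consequence of the Gaussian algebra is worth checking in practice: the reduced posterior precision inv(C) + inv(S_r) − inv(S_f) must be positive definite for the evidence integral to exist. Tightening priors preserves this; broadening a prior, or working with a poorly approximated full posterior, can violate it. A quick sketch of the check (numbers are illustrative):

```python
import numpy as np

def reduction_is_valid(C, S_f, S_r):
    """Check that the reduced posterior precision is positive definite."""
    P_r = np.linalg.inv(C) + np.linalg.inv(S_r) - np.linalg.inv(S_f)
    return bool(np.all(np.linalg.eigvalsh(P_r) > 0))

C = np.array([[1.5]])   # approximate posterior broader than the prior
print(reduction_is_valid(C, S_f=np.array([[1.0]]), S_r=np.array([[1e-8]])))  # True
print(reduction_is_valid(C, S_f=np.array([[1.0]]), S_r=np.array([[50.0]])))  # False
```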
"Bayesian Model Reduction turns the problem of model comparison on its head: instead of fitting many models, fit one model well and analytically derive the rest."— Karl Friston, 2016
Worked Example: Reducing a Regression Model via Savage–Dickey Ratios
A full model has 5 regression coefficients. We use Bayesian Model Reduction to determine which parameters can be set to zero without losing model evidence, by computing Savage–Dickey density ratios analytically from the full model's posterior. The full posterior marginals (posterior mean, with posterior SD reported as SE) are:
β₁ = 2.50 (SE = 1.25) — large effect
β₂ = 0.15 (SE = 0.08) — small effect
β₃ = 1.80 (SE = 0.90) — moderate effect
β₄ = −0.05 (SE = 0.03) — negligible
β₅ = 0.02 (SE = 0.01) — negligible
Prior: each βⱼ ~ N(0, 10) (variance 10, so the prior density at zero is 0.1261)
Step 1: Savage–Dickey ratio for each single-parameter reduction
BF(reduced vs full) = p(βⱼ = 0 | data, full) / p(βⱼ = 0 | prior)
For β₁: p(0|data) = N(0; 2.50, 1.25²) = 0.0432, p(0|prior) = 0.1261
BF₁ = 0.0432/0.1261 = 0.34 → Keep (BF < 1)
For β₄: p(0|data) = N(0; −0.05, 0.03²) = 3.32, p(0|prior) = 0.1261
BF₄ = 3.32/0.1261 ≈ 26.3 → Drop (BF > 1)
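The same arithmetic for all five coefficients can be scripted in a few lines; a sketch using scipy.stats.norm (the dictionary keys are just labels):

```python
import numpy as np
from scipy.stats import norm

# Posterior means and SDs from the full model; prior is N(0, variance 10).
posterior = {"b1": (2.50, 1.25), "b2": (0.15, 0.08),
             "b3": (1.80, 0.90), "b4": (-0.05, 0.03), "b5": (0.02, 0.01)}
prior_density_at_zero = norm.pdf(0, loc=0, scale=np.sqrt(10))  # ~ 0.1261

for name, (mu, sd) in posterior.items():
    bf = norm.pdf(0, loc=mu, scale=sd) / prior_density_at_zero
    verdict = "Drop" if bf > 1 else "Keep"
    print(f"{name}: BF = {bf:.2f} -> {verdict}")
```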
Step 2: Results
β₁: BF = 0.34 → Keep (strong effect)
β₂: BF = 6.82 → Drop (small effect overwhelmed by the wide prior)
β₃: BF = 0.48 → Keep (moderate effect)
β₄: BF = 26.3 → Drop (negligible)
β₅: BF = 42.8 → Drop (negligible)
BMR identifies β₂, β₄, and β₅ as removable — their Savage–Dickey BFs (6.8, 26.3, and 42.8) favor the corresponding reduced models, the latter two strongly. β₂ is a cautionary case: against a prior as diffuse as N(0, 10), even a coefficient nearly two posterior SDs from zero can favor the point null. Critically, BMR achieves this without refitting any reduced model: the entire analysis uses only the full model's posterior, which is what makes comparing many nested variants cheap. The reduced model retains {β₁, β₃} — a 60% reduction in parameters. Strictly speaking, dropping several coefficients at once is a joint reduction that should be evaluated against the full posterior covariance; the per-coefficient ratios multiply only when the coefficients are approximately independent a posteriori, as in the sketch below.
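For completeness, the joint reduction can be evaluated in one BMR step, reusing bmr_log_evidence_diff from the first sketch. The diagonal posterior covariance here is a toy assumption; a real application would use the full covariance from the fit.

```python
import numpy as np

# Joint reduction: drop b2, b4, b5 in one step.
mu  = np.array([2.50, 0.15, 1.80, -0.05, 0.02])
C   = np.diag(np.array([1.25, 0.08, 0.90, 0.03, 0.01]) ** 2)
eta = np.zeros(5)
S_f = 10.0 * np.eye(5)            # full prior: N(0, 10) on each coefficient
S_r = S_f.copy()
for j in (1, 3, 4):               # switch off b2, b4, b5
    S_r[j, j] = 1e-8              # shrinkage prior centred tightly on zero
dF, mu_r, C_r = bmr_log_evidence_diff(mu, C, eta, S_f, eta, S_r)
print(np.exp(dF))  # ~ 6.82 * 26.3 * 42.8 ~ 7.7e3, since this posterior is diagonal
```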