Bayesian Statistics

Ecology & Species Modeling

Bayesian occupancy models, species distribution models, and population dynamics frameworks allow ecologists to rigorously account for imperfect detection and environmental uncertainty when studying where species live and how populations change.

zᵢ ~ Bernoulli(ψ); yᵢⱼ | zᵢ = 1 ~ Bernoulli(p)

Ecology confronts a fundamental observational problem: most species are not detected every time they are present. A bird may inhabit a forest patch yet remain silent during a survey; a rare amphibian may be underground when biologists visit. Bayesian methods have become a leading framework for ecological modeling because hierarchical models handle this imperfect detection naturally, separating the ecological process (is the species there?) from the observation process (did we see it?).

Occupancy Models

The single-season occupancy model, introduced by MacKenzie et al. (2002) and rapidly adopted in Bayesian form, estimates the probability that a species occupies a site while explicitly modeling detection probability. The two-level structure is inherently hierarchical: a latent true state (occupied or not) generates the observed detection history through repeated visits.

Single-Season Occupancy Model

zᵢ ~ Bernoulli(ψᵢ)     [true occupancy at site i]
yᵢⱼ | zᵢ = 1 ~ Bernoulli(pᵢⱼ)     [detection at visit j]
yᵢⱼ = 0 if zᵢ = 0     [no detections at unoccupied sites]

logit(ψᵢ) = Xᵢβ     [occupancy covariates]
logit(pᵢⱼ) = Wᵢⱼα     [detection covariates]
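
To see why the latent state matters, the model above can be simulated directly. The following is a minimal sketch, assuming constant ψ and p (no covariates) rather than the logit-linear forms; the function names and parameter values are illustrative and not taken from any package. It compares the naive occupancy estimate (the share of sites with at least one detection) with the true occupied share.

```python
import random


def simulate_occupancy(n_sites, n_visits, psi, p, seed=0):
    """Simulate z[i] and detection histories y[i][j].

    Assumption: psi and p are constant across sites and visits.
    """
    rng = random.Random(seed)
    # Latent state: z_i ~ Bernoulli(psi)
    z = [1 if rng.random() < psi else 0 for _ in range(n_sites)]
    # Observations: y_ij ~ Bernoulli(p) if z_i = 1, otherwise always 0
    y = [
        [1 if (z_i == 1 and rng.random() < p) else 0 for _ in range(n_visits)]
        for z_i in z
    ]
    return z, y


def naive_occupancy(y):
    """Share of sites with at least one detection (ignores missed presences)."""
    return sum(1 for history in y if any(history)) / len(y)


psi_true, p_true, n_visits = 0.6, 0.3, 3
z, y = simulate_occupancy(2000, n_visits, psi_true, p_true, seed=42)

true_fraction = sum(z) / len(z)
naive = naive_occupancy(y)
# Under the model, the naive estimate targets psi * (1 - (1 - p)^J), not psi
expected_naive = psi_true * (1 - (1 - p_true) ** n_visits)

print(f"true occupied share:   {true_fraction:.3f}")
print(f"naive estimate:        {naive:.3f}")
print(f"expected naive value:  {expected_naive:.3f}")
```

Because detections can only occur at occupied sites, the naive estimate can never exceed the true occupied share, and the gap widens as p or the number of visits falls.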

Bayesian estimation allows ecologists to place informative priors on detection probability based on pilot studies, propagate uncertainty from detection into occupancy estimates, and extend the framework to dynamic multi-season models that track colonization and extinction.

Species Distribution Models

Species distribution models (SDMs) relate occurrence records to environmental covariates — temperature, precipitation, elevation, land cover — to predict where a species could live. Bayesian SDMs, fitted via MCMC or INLA, offer several advantages over maximum-likelihood alternatives: they produce full posterior distributions over predicted habitat suitability, accommodate spatial autocorrelation through Gaussian random fields, and integrate multiple data sources through joint likelihood models.

Integrated species distribution models (ISDMs) combine presence-only data (museum records, citizen science) with presence-absence surveys in a single Bayesian framework, using the structured survey data to estimate and correct for the sampling bias inherent in opportunistic observations.

The Rise of Citizen Science Data

Platforms like eBird and iNaturalist generate millions of species observations annually, but these data are heavily biased toward accessible areas and charismatic species. Bayesian models can incorporate observation effort, spatial bias surfaces, and preferential sampling corrections to extract reliable ecological signals from messy citizen science datasets.

Population Dynamics and State-Space Models

Bayesian state-space models separate true population dynamics from observation error. The process model describes how abundance changes over time — through birth, death, immigration, and emigration — while the observation model accounts for imperfect counting. This separation is critical because ignoring observation error leads to overestimation of process variance and misleading conclusions about population stability.

State-Space Population Model

Nₜ₊₁ = Nₜ · λₜ · εₜ     [process: true abundance]
yₜ = Nₜ · pₜ + ηₜ     [observation: counted abundance]

λₜ ~ LogNormal(μ_λ, σ²_λ)     [growth rate]
pₜ ~ Beta(α_p, β_p)     [detection probability]
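
The effect of ignoring observation error can be illustrated by simulating the process and observation equations above. This is a rough sketch under stated assumptions: εₜ is fixed at 1 so that λₜ carries all process noise, ηₜ is taken to be Gaussian, and every parameter value is illustrative. The function names are hypothetical.

```python
import math
import random
import statistics


def simulate_state_space(n_steps, n0, mu_lam, sigma_lam, a_p, b_p, sd_eta, seed=0):
    """Simulate true abundance N_t and counts y_t.

    Assumptions: epsilon_t = 1 (all process noise is in lambda_t) and
    eta_t ~ Normal(0, sd_eta).
    """
    rng = random.Random(seed)
    # Process model: N_{t+1} = N_t * lambda_t, lambda_t ~ LogNormal
    N = [float(n0)]
    for _ in range(n_steps - 1):
        lam = rng.lognormvariate(mu_lam, sigma_lam)
        N.append(N[-1] * lam)
    # Observation model: y_t = N_t * p_t + eta_t, p_t ~ Beta
    y = []
    for n_t in N:
        p_t = rng.betavariate(a_p, b_p)
        y.append(n_t * p_t + rng.gauss(0.0, sd_eta))
    return N, y


def log_growth_variance(series):
    """Sample variance of log(x_{t+1} / x_t)."""
    rates = [math.log(b / a) for a, b in zip(series, series[1:])]
    return statistics.variance(rates)


N, y = simulate_state_space(
    n_steps=100, n0=1000, mu_lam=0.0, sigma_lam=0.05,
    a_p=80, b_p=20, sd_eta=5.0, seed=1,
)
true_var = log_growth_variance(N)
apparent_var = log_growth_variance(y)
print(f"process variance of log growth:  {true_var:.5f}")
print(f"apparent variance from counts:   {apparent_var:.5f}")
```

Treating the raw counts as if they were true abundance attributes the variation in pₜ and ηₜ to the population itself, which is why the apparent variance comes out larger than the process variance.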

Bayesian N-mixture models extend this framework to estimate abundance from replicated counts without individual identification, while capture-recapture models use individual detection histories to estimate survival, recruitment, and movement. Programs like JAGS, NIMBLE, and Stan have made these models accessible to field ecologists.
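
As an illustration of how replicated counts carry information about abundance, the sketch below computes the marginal likelihood of one site's counts in a basic N-mixture model. It assumes a Poisson prior on site abundance and binomial counts with constant p, and it truncates the sum over the unknown abundance at a hypothetical upper bound `n_max`; none of these names come from JAGS, NIMBLE, or Stan.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of abundance n given mean lam (lam > 0)."""
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))


def binom_pmf(y, n, p):
    """Binomial probability of counting y of n individuals."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)


def site_log_likelihood(counts, lam, p, n_max=200):
    """Log marginal likelihood of replicated counts at one site.

    Sums over the latent abundance N from max(counts) to n_max.
    Assumption: n_max is large enough that the truncated tail is negligible.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        lik = poisson_pmf(n, lam)      # prior weight for N = n
        for y in counts:
            lik *= binom_pmf(y, n, p)  # each visit counts a subset of N
        total += lik
    return math.log(total)


# Three repeat counts at one site, evaluated at two candidate parameter sets
counts = [3, 2, 4]
print(site_log_likelihood(counts, lam=6.0, p=0.5))
print(site_log_likelihood(counts, lam=30.0, p=0.1))
```

Summing out the discrete abundance in this way is also what makes such models usable in samplers that cannot handle discrete latent variables directly.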

Community and Multi-Species Models

Bayesian hierarchical community models treat each species as a draw from a community-level distribution, borrowing strength across species to improve estimates for rare taxa. Multi-species occupancy models estimate species richness while accounting for species that were present but never detected and are therefore missing from naive counts.

"The question is not whether a species is present or absent, but what is the probability it is present given our imperfect observations — and how that probability changes with the environment." — J. Andrew Royle and Robert M. Dorazio, Hierarchical Models in Ecology

Spatial and Spatio-Temporal Extensions

Modern ecological applications increasingly use spatially explicit Bayesian models. Gaussian process priors or conditional autoregressive (CAR) structures capture spatial correlation in species occurrence, while spatio-temporal models track range shifts in response to climate change. The R-INLA package, which provides fast approximate Bayesian inference through integrated nested Laplace approximation, has been particularly transformative for spatial ecology, enabling models with thousands of spatial locations that would be computationally prohibitive with full MCMC.
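
To make the CAR structure concrete, the following sketch builds the precision matrix of a proper CAR prior, Q = τ(D − ρW), for sites on a small rectangular grid. It assumes rook (shared-edge) adjacency, and the names `grid_adjacency`, `car_precision`, `tau`, and `rho` are illustrative; this is not how R-INLA represents its models internally.

```python
def grid_adjacency(n_rows, n_cols):
    """Binary adjacency matrix W for grid cells sharing an edge."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:        # neighbor to the right
                j = i + 1
                W[i][j] = W[j][i] = 1
            if r + 1 < n_rows:        # neighbor below
                j = i + n_cols
                W[i][j] = W[j][i] = 1
    return W


def car_precision(W, tau=1.0, rho=0.9):
    """Precision matrix Q = tau * (D - rho * W), with D = diag(row sums of W).

    Assumption: |rho| < 1, which keeps Q positive definite.
    """
    n = len(W)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                Q[i][j] = tau * sum(W[i])   # number of neighbors of site i
            elif W[i][j]:
                Q[i][j] = -tau * rho        # neighbors are positively correlated
    return Q


W = grid_adjacency(2, 3)
Q = car_precision(W, tau=2.0, rho=0.5)
for row in Q:
    print(row)
```

Each site's row is nonzero only for itself and its neighbors, and this sparsity is what makes models with thousands of locations computationally tractable.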

Interactive Calculator

Each row is a survey record with site (site ID), visit (visit number), and detected (1 if species was detected, 0 if not). The calculator fits a Bayesian occupancy model estimating both the true occupancy probability (psi) and the detection probability (p) given occupancy, using Beta priors. Sites with no detections across visits may still be occupied if detection is imperfect.
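
The page does not state how the calculator computes its posterior, so the following is only a sketch of one way to do it: a grid approximation over (psi, p) with independent Beta priors and constant detection probability. The function name `occupancy_posterior`, its parameters, and the example records are all hypothetical.

```python
from collections import defaultdict


def beta_kernel(x, a, b):
    """Unnormalized Beta(a, b) density at x."""
    return x ** (a - 1) * (1 - x) ** (b - 1)


def occupancy_posterior(records, prior_psi=(1.0, 1.0), prior_p=(1.0, 1.0), grid_size=100):
    """Posterior means of psi and p from (site, visit, detected) records.

    Grid approximation; suitable for small datasets only, since the
    unnormalized weights can underflow when there are many sites.
    """
    visits = defaultdict(int)
    detections = defaultdict(int)
    for site, _visit, detected in records:
        visits[site] += 1
        detections[site] += detected

    grid = [(k + 0.5) / grid_size for k in range(grid_size)]  # cell midpoints
    weight_sum = psi_sum = p_sum = 0.0
    for psi in grid:
        for p in grid:
            w = beta_kernel(psi, *prior_psi) * beta_kernel(p, *prior_p)
            for site, j in visits.items():
                d = detections[site]
                if d > 0:
                    # Detected at least once: the site must be occupied
                    w *= psi * p ** d * (1 - p) ** (j - d)
                else:
                    # Never detected: occupied but missed, or truly unoccupied
                    w *= psi * (1 - p) ** j + (1 - psi)
            weight_sum += w
            psi_sum += psi * w
            p_sum += p * w
    return psi_sum / weight_sum, p_sum / weight_sum


records = [
    ("A", 1, 1), ("A", 2, 0), ("A", 3, 1),
    ("B", 1, 0), ("B", 2, 0), ("B", 3, 0),
    ("C", 1, 0), ("C", 2, 1), ("C", 3, 0),
    ("D", 1, 0), ("D", 2, 0), ("D", 3, 0),
]
psi_mean, p_mean = occupancy_posterior(records)
print(f"posterior mean psi: {psi_mean:.3f}")
print(f"posterior mean p:   {p_mean:.3f}")
```

In this example two of four sites have detections, so the naive occupancy is 0.5; the posterior mean of psi lies above that because sites B and D may be occupied but missed.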
