Statistical Rethinking: A Bayesian Course with Examples in R and Stan, first published in 2015 with a second edition in 2020, has become one of the most widely recommended introductions to Bayesian statistics. Written by evolutionary anthropologist Richard McElreath, the book is distinctive for its pedagogical philosophy: it teaches statistical modeling as a craft — an iterative process of building generative models, checking their implications, and refining them against data — rather than as a set of standardized tests to be applied mechanically.
The book and its freely available video lectures have introduced thousands of researchers to Bayesian thinking, particularly in the social and biological sciences where traditional null-hypothesis significance testing has come under sustained criticism. Its emphasis on causal reasoning, simulation-based understanding, and honest uncertainty quantification has made it a touchstone of the modern applied Bayesian movement.
Pedagogical Approach
McElreath's teaching philosophy rests on several principles that distinguish Statistical Rethinking from conventional statistics textbooks.
Models, Not Tests
The book rejects the "flowchart" approach to statistics — where the analyst selects from a menu of tests (t-test, chi-squared, ANOVA) based on data type and sample size. Instead, every analysis begins with a generative model: a formal description of how the data could have been produced, including the scientific process, measurement error, and sampling design. Statistical inference is then the task of learning about the model's parameters from data.
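To make the idea concrete, here is a minimal sketch of a generative model in stdlib Python (the book itself uses R; the linear height-on-weight setup and all parameter values are hypothetical illustrations, not the book's code):

```python
import random

# Hypothetical generative model: mean height is a linear function of
# weight, and measurement adds Gaussian noise. The "true" parameters
# below are assumptions for illustration only.
ALPHA, BETA, SIGMA = 150.0, 0.9, 5.0

def simulate_heights(weights, alpha=ALPHA, beta=BETA, sigma=SIGMA, seed=1):
    rng = random.Random(seed)
    # Each observation is produced by the assumed scientific process.
    return [rng.gauss(alpha + beta * w, sigma) for w in weights]

weights = [40, 50, 60, 70]
heights = simulate_heights(weights)
```

Writing the model this way forces the analyst to state how the data could have arisen; inference then amounts to learning alpha, beta, and sigma from observed heights rather than picking a named test.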
Simulation Before Estimation
Before fitting any model to real data, the book trains readers to simulate data from the model with known parameter values and verify that the estimation procedure can recover them. Together with prior predictive simulation (simulating data from the priors alone) and posterior predictive checking (simulating from the fitted model and comparing against the observed data), this practice catches modeling errors early and builds intuition about what each model implies.
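A parameter-recovery check can be sketched in a few lines of stdlib Python (a hypothetical example with simple moment estimators, not the book's R/Stan workflow):

```python
import random
import statistics

# Simulate data from KNOWN parameters, then verify the estimator
# recovers them before trusting it on real data.
rng = random.Random(42)
true_mu, true_sigma = 10.0, 2.0
data = [rng.gauss(true_mu, true_sigma) for _ in range(10_000)]

mu_hat = statistics.fmean(data)     # estimate of the mean
sigma_hat = statistics.pstdev(data) # estimate of the spread

# With enough simulated data, estimates should land near the truth;
# a large miss here would signal a broken estimation procedure.
assert abs(mu_hat - true_mu) < 0.1
assert abs(sigma_hat - true_sigma) < 0.1
```

The same discipline scales up: simulate from any proposed model, fit it, and confirm the fitted parameters match what was put in.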
McElreath introduces statistical models as "golems" — powerful but mindless constructs that do exactly what they are told, whether or not what they are told makes sense. A linear regression does not know if your variables are confounded. A multilevel model does not know if your grouping structure is correct. The analyst's job is to think carefully about the scientific problem and build a golem that embodies the right assumptions. The golem then does the computation, but the thinking is the analyst's responsibility.
Core Content
The book covers a wide range of Bayesian methods, building from simple to complex in a carefully sequenced curriculum.
Models are specified throughout in a uniform notation that makes the generative assumptions explicit:

yᵢ ~ Normal(μᵢ, σ) (likelihood)
μᵢ = f(xᵢ, β) (linear model or link function)
β ~ Prior(hyperparameters) (priors on parameters)
σ ~ Prior(hyperparameters) (prior on the scale parameter)
Foundations (Chapters 1–4)
The opening chapters introduce Bayesian updating through the metaphor of counting possibilities, build up to linear regression as a Bayesian model, and emphasize prior predictive simulation. The treatment of priors is pragmatic: priors are chosen to encode reasonable scientific knowledge and are checked by simulating from them.
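Prior predictive simulation can be sketched in stdlib Python (the priors below are hypothetical, loosely modeled on a height regression; the book's own code is in R):

```python
import random

rng = random.Random(0)

def prior_predictive_heights(n_draws=1000):
    """Draw parameters from the priors, then simulate the data they imply."""
    sims = []
    for _ in range(n_draws):
        alpha = rng.gauss(178, 20)     # assumed prior: average height in cm
        sigma = abs(rng.gauss(0, 10))  # assumed half-normal prior on noise
        sims.append(rng.gauss(alpha, sigma))
    return sims

sims = prior_predictive_heights()
# Reasonable priors should mostly imply biologically plausible heights;
# a large share of extreme values would signal a bad prior.
plausible = sum(1 for h in sims if 100 < h < 250) / len(sims)
```

This is the pragmatic check the chapter describes: a prior is judged by the data it implies, not by an appeal to "objectivity."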
Regression and Causal Inference (Chapters 5–6)
The book gives extensive treatment to the problem of confounding, using directed acyclic graphs (DAGs) to reason about which variables to include in a regression and which to exclude. This causal reasoning framework, drawing on Judea Pearl's work, distinguishes Statistical Rethinking from most statistics textbooks, which treat variable selection as a purely statistical problem.
Generalized Linear Models (Chapters 10–12)
Logistic regression, Poisson regression, and their variants are presented as members of a unified family. The emphasis is on understanding the link function, checking model assumptions via simulation, and interpreting parameters on the natural scale.
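The role of the link function can be illustrated in a few lines (the intercept and slope below are hypothetical, not taken from the book):

```python
import math

def inv_logit(x):
    """Inverse link for logistic regression: log-odds -> probability."""
    return 1.0 / (1.0 + math.exp(-x))

alpha, beta = -1.0, 0.5  # assumed coefficients, on the log-odds scale

p0 = inv_logit(alpha)             # probability when x = 0
p1 = inv_logit(alpha + beta * 1)  # probability when x = 1
# The change in probability implied by beta depends on where on the
# curve you start -- which is why the book insists on interpreting
# parameters on the natural (probability) scale, not the link scale.
```

A coefficient of 0.5 is constant on the log-odds scale but produces different probability changes at different baselines, a point the simulation-based checks in these chapters repeatedly drive home.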
Multilevel Models (Chapters 13–14)
Multilevel (hierarchical) models are presented as the natural framework for data with grouping structure — students within schools, observations within individuals, species within genera. The Bayesian approach to multilevel models is particularly elegant: partial pooling emerges naturally from the hierarchical prior, and the degree of pooling is learned from the data.
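The partial-pooling idea can be sketched with the standard shrinkage weight from a normal-normal hierarchical model (all numbers below, including the variance components, are hypothetical; in the book the degree of pooling is learned from the data rather than fixed):

```python
# Each group's estimate is pulled toward the grand mean, and groups
# with fewer observations are pulled harder.
group_means = {"A": 10.0, "B": 14.0, "C": 6.0}
group_sizes = {"A": 50, "B": 5, "C": 2}
grand_mean = 10.0
sigma2_within, tau2_between = 4.0, 1.0  # assumed variance components

def pooled_estimate(g):
    n = group_sizes[g]
    # Weight on the group's own mean grows with its sample size.
    w = tau2_between / (tau2_between + sigma2_within / n)
    return w * group_means[g] + (1 - w) * grand_mean

# Group B (n=5) lands between its raw mean 14 and the grand mean 10;
# group C (n=2) is shrunk even more strongly toward 10.
```

In the Bayesian multilevel model this shrinkage is not bolted on: it falls out of the hierarchical prior, with the between-group variance itself estimated from the data.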
Advanced Topics (Chapters 15–17)
The later chapters cover measurement error, missing data, Gaussian processes, and social network models, always maintaining the book's emphasis on building models from scientific understanding rather than applying statistical recipes.
"Statistics is not math. It is not even the application of math. Statistics is a set of methods for making sense of data in the service of scientific questions. The mathematical machinery is essential, but it is not the point." — Richard McElreath, Statistical Rethinking, 2nd edition (2020)
The Role of Stan and R
The book uses the R programming language with two interfaces to the Stan probabilistic programming language: the rethinking package (a teaching interface written by McElreath) and, in the second edition, references to brms (Paul-Christian Bürkner's Bayesian regression modeling package). Stan's Hamiltonian Monte Carlo (HMC) sampler provides efficient posterior inference for the models developed throughout the book.
The book's popularity has spawned community-driven translations of the code examples into Python (PyMC, NumPyro, TensorFlow Probability), Julia (Turing.jl), and other platforms. These translations extend the book's reach beyond the R ecosystem and demonstrate that the pedagogical content is independent of any particular software implementation. The most widely used Python translation uses PyMC and is freely available online.
The Video Lectures
McElreath has recorded and freely released full-semester video lecture courses corresponding to both editions of the book. These lectures — available on YouTube — are among the most watched statistics lecture series in the world. Their popularity stems from McElreath's clear, engaging teaching style and his ability to convey statistical concepts through vivid examples drawn from anthropology, ecology, and everyday life.
Impact and Reception
Since its publication, Statistical Rethinking has been adopted as a textbook in graduate programs worldwide, particularly in ecology, evolutionary biology, psychology, political science, and sociology. It is widely credited with making Bayesian statistics accessible to empirical researchers who are not primarily statisticians.
The book has also influenced how Bayesian methods are taught by professional statisticians. Its emphasis on prior predictive simulation, causal reasoning with DAGs, and workflow-oriented analysis has been incorporated into courses and textbooks by other authors, including Gelman et al.'s Regression and Other Stories (2020).
Relationship to Other Bayesian Textbooks
Statistical Rethinking occupies a distinctive niche. Gelman et al.'s Bayesian Data Analysis is more comprehensive and mathematically rigorous — it is the standard graduate reference. Kruschke's Doing Bayesian Data Analysis is gentler and more focused on replacing null-hypothesis tests with Bayesian alternatives. McElreath's contribution is the integration of Bayesian modeling with causal reasoning and the emphasis on generative simulation as the foundation of statistical understanding.
The title reflects McElreath's conviction that statistics needs to be rethought from the ground up. The conventional curriculum — built around p-values, significance thresholds, and standardized tests — has produced a generation of researchers who can run tests but cannot build models. Statistical Rethinking proposes a different foundation: start with a scientific question, build a generative model, use Bayesian inference to learn from data, and check your model against reality. The "rethinking" is not just about switching from frequentist to Bayesian methods — it is about changing how scientists relate to their data.