Bayesian Statistics

Cochran–Mantel–Haenszel Statistics

The Cochran–Mantel–Haenszel (CMH) test combines evidence across stratified 2×2 contingency tables to test for a common association while controlling for the stratifying confounders, and it has a natural Bayesian analogue in a hierarchical model for stratum-specific odds ratios.


The Cochran–Mantel–Haenszel (CMH) statistic provides a method for testing the null hypothesis of no association between two binary variables, pooled across K independent strata. Originally developed in the frequentist tradition, the CMH framework has deep connections to Bayesian hierarchical modeling, where stratum-specific parameters are partially pooled toward a common effect through a shared prior distribution.

The CMH approach is ubiquitous in clinical trials (meta-analysis of 2×2 tables across sites), epidemiology (controlling for confounders in stratified case-control studies), and social science (adjusting for demographic strata in survey experiments). Its Bayesian generalization provides a principled way to handle between-stratum heterogeneity and incorporate prior information about effect sizes.

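CMH Test Statistic

CMH = [Σₖ (aₖ − E(aₖ))]² / Σₖ Var(aₖ)

where, for the 2×2 table of stratum k:
aₖ = count in cell (1,1)
E(aₖ) = r₁ₖc₁ₖ / nₖ
Var(aₖ) = r₁ₖr₂ₖc₁ₖc₂ₖ / (nₖ²(nₖ − 1))

Here r₁ₖ and r₂ₖ are the row totals, c₁ₖ and c₂ₖ the column totals, and nₖ the total count of stratum k.

Under H₀, CMH is asymptotically χ² with 1 degree of freedom.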
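The statistic above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `cmh_statistic`, the `(a, b, c, d)` tuple layout, and the example counts are assumptions made for this sketch, and no continuity correction is applied.

```python
import math

def cmh_statistic(tables):
    """CMH chi-square (no continuity correction) for a list of 2x2 tables.

    Each table is a tuple (a, b, c, d): row 1 = (a, b), row 2 = (c, d).
    Returns (statistic, p_value) using the chi-square(1) reference distribution.
    """
    dev_sum = 0.0  # sum over strata of a_k - E(a_k)
    var_sum = 0.0  # sum over strata of Var(a_k)
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # the variance formula needs n_k >= 2
        r1, r2 = a + b, c + d  # row totals
        c1, c2 = a + c, b + d  # column totals
        dev_sum += a - r1 * c1 / n
        var_sum += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    if var_sum == 0:
        raise ValueError("every stratum is degenerate; the statistic is undefined")
    stat = dev_sum ** 2 / var_sum
    # Upper tail of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts for two strata, chosen only to exercise the function.
example = [(12, 8, 7, 13), (15, 5, 9, 11)]
stat, p = cmh_statistic(example)
print(f"CMH = {stat:.3f}, p = {p:.4f}")
```

Strata whose row or column total is zero add nothing to either sum, so they can be left in the input.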

The Mantel-Haenszel Odds Ratio Estimator

Beyond the test of association, the CMH methodology provides a pooled estimator of the common odds ratio:

Mantel-Haenszel Odds Ratio

OR_MH = Σₖ (aₖdₖ / nₖ) / Σₖ (bₖcₖ / nₖ)

where aₖ, bₖ, cₖ, dₖ are the four cells of the k-th 2×2 table (first row aₖ, bₖ; second row cₖ, dₖ) and nₖ = aₖ + bₖ + cₖ + dₖ.

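This estimator is consistent under the assumption of a common odds ratio across strata (homogeneity). When odds ratios vary across strata (heterogeneity), the MH estimator still returns a weighted average of the stratum-specific values, but that average no longer estimates a single underlying parameter and must be interpreted with care. The Breslow-Day test can detect heterogeneity but has low power when strata are few or small. In that setting a Bayesian hierarchical model is attractive because it estimates the heterogeneity directly instead of testing for it.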
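As a rough sketch, the pooled estimator and the stratum-specific odds ratios it averages can be computed as follows. The function names and the example counts are hypothetical, chosen for illustration only.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio for 2x2 tables given as (a, b, c, d)."""
    num = 0.0  # sum of a_k * d_k / n_k
    den = 0.0  # sum of b_k * c_k / n_k
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ZeroDivisionError("sum of b*c/n is zero; OR_MH is undefined")
    return num / den

def stratum_odds_ratios(tables):
    """Sample odds ratio per stratum; None where b*c == 0 (undefined)."""
    return [a * d / (b * c) if b * c else None for a, b, c, d in tables]

# Hypothetical counts; the second stratum has a zero cell.
example = [(12, 8, 7, 13), (5, 0, 3, 4)]
print(stratum_odds_ratios(example))
print(mantel_haenszel_or(example))
```

Note that the pooled estimate stays defined even though one stratum has no finite odds ratio of its own, because the sums are taken before the ratio.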

Bayesian Hierarchical Reformulation

The natural Bayesian analogue of the CMH framework is a hierarchical model for stratum-specific log-odds ratios:

Bayesian Hierarchical Model

aₖ | θₖ  ~  Noncentral-Hypergeometric(r₁ₖ, c₁ₖ, nₖ; θₖ)   (θₖ = log-odds ratio of stratum k)
θₖ  ~  Normal(μ, σ²)   (random effects)
μ  ~  Normal(0, τ²)   (prior on common effect)
σ²  ~  Half-Cauchy(0, s)   (prior on heterogeneity)
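The first line of the model is the conditional likelihood of a 2×2 table with fixed margins, which is Fisher's noncentral hypergeometric distribution. A minimal sketch of its probability mass function is given below; the function name and argument order are assumptions for this illustration, and the direct use of `math.exp` is only suitable for small tables (larger counts would need a log-sum-exp formulation).

```python
import math

def noncentral_hypergeom_pmf(a, r1, c1, n, theta):
    """P(a | margins, theta) for cell (1,1) of a 2x2 table.

    r1 = first row total, c1 = first column total, n = table total,
    theta = log-odds ratio. theta = 0 gives the ordinary hypergeometric.
    """
    r2 = n - r1
    lo, hi = max(0, c1 - r2), min(r1, c1)  # support of a given the margins
    if not lo <= a <= hi:
        return 0.0

    def weight(u):
        # unnormalised probability of observing u in cell (1,1)
        return math.comb(r1, u) * math.comb(r2, c1 - u) * math.exp(theta * u)

    total = sum(weight(u) for u in range(lo, hi + 1))
    return weight(a) / total

# Likelihood of a = 8 with margins r1 = 10, c1 = 11, n = 20 at two values of theta.
for theta in (0.0, 1.5):
    print(theta, noncentral_hypergeom_pmf(8, 10, 11, 20, theta))
```

At θ = 0 the mean and variance of this distribution equal E(aₖ) and Var(aₖ) from the CMH statistic, which is the link between the classical test and the hierarchical model.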

The hierarchical model simultaneously estimates the common effect μ, the between-stratum heterogeneity σ², and the individual stratum effects θₖ. As σ² approaches 0, the θₖ collapse onto μ and the model approaches the fixed-effect (common odds ratio) assumption underlying the classical CMH test. When σ² is large, each stratum's estimate is determined primarily by its own data. Because σ² has its own posterior, the degree of pooling is learned from the data rather than fixed in advance.
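The effect of σ² on pooling can be illustrated with a deliberately simplified calculation. The sketch below is not the full model: it replaces the hypergeometric likelihood with a normal approximation to each stratum's sample log-odds ratio (using the Woolf variance 1/a + 1/b + 1/c + 1/d) and treats μ and σ² as known instead of giving them priors. The function name is hypothetical.

```python
import math

def shrink_log_odds_ratios(tables, mu, sigma2):
    """Conditional posterior means of theta_k given fixed mu and sigma2.

    Normal approximation: y_k ~ Normal(theta_k, v_k), theta_k ~ Normal(mu, sigma2).
    Requires all four cells of every table to be nonzero.
    """
    if sigma2 < 0:
        raise ValueError("sigma2 must be non-negative")
    shrunk = []
    for a, b, c, d in tables:
        y = math.log(a * d / (b * c))   # sample log-odds ratio
        v = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf variance of y
        w = sigma2 / (sigma2 + v)       # weight on the stratum's own data
        shrunk.append(w * y + (1 - w) * mu)
    return shrunk

# Hypothetical tables and a hypothetical common effect mu.
tables = [(12, 8, 7, 13), (15, 5, 9, 11)]
for sigma2 in (0.0, 0.25, 100.0):
    print(sigma2, [round(t, 3) for t in shrink_log_odds_ratios(tables, 0.8, sigma2)])
```

With σ² = 0 every stratum is pulled all the way to μ; as σ² grows, the weights approach 1 and each estimate returns to its own sample log-odds ratio.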

Simpson's Paradox and Stratification

The CMH framework directly addresses Simpson's paradox, in which an association that appears in every stratum reverses (or vanishes) when the strata are aggregated. By testing for association within strata rather than in the marginal table, the CMH test controls for the confounding variable that defines the strata. The Bayesian hierarchical model goes further: comparing the unstratified effect with the stratum-adjusted posterior for μ gives an estimate, with uncertainty, of how far the naive estimate is shifted by ignoring the strata. Because odds ratios are non-collapsible, part of that shift can arise even without confounding, so the gap is best read as a diagnostic rather than an exact measure of confounding bias.
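A small numerical sketch makes the reversal concrete. The counts below are illustrative values chosen to produce a reversal, not data from this article, and the helper names are hypothetical. Each table is (exposed success, exposed failure, unexposed success, unexposed failure).

```python
def odds_ratio(a, b, c, d):
    """Sample odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def collapse(tables):
    """Sum the strata cell by cell to obtain the marginal (crude) table."""
    return tuple(sum(cells) for cells in zip(*tables))

def mantel_haenszel_or(tables):
    """Stratum-adjusted pooled odds ratio."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

# Illustrative counts: the exposed group is concentrated in the harder stratum.
strata = [(81, 6, 234, 36), (192, 71, 55, 25)]

per_stratum = [odds_ratio(*t) for t in strata]
crude = odds_ratio(*collapse(strata))
adjusted = mantel_haenszel_or(strata)

print("stratum ORs:", [round(x, 2) for x in per_stratum])
print("crude OR:   ", round(crude, 2))
print("MH OR:      ", round(adjusted, 2))
```

Both stratum odds ratios exceed 1 while the crude odds ratio falls below 1; the Mantel-Haenszel estimate stays on the same side as the strata because it never forms the collapsed table.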

Historical Development

1954

William Cochran proposes methods for combining 2×2 tables from stratified samples in Biometrics, laying the groundwork for pooled inference across strata.

1959

Nathan Mantel and William Haenszel publish their landmark paper introducing the combined test statistic and odds ratio estimator for stratified case-control studies.

1980

Breslow and Day develop the test for homogeneity of odds ratios across strata, complementing the CMH test with a diagnostic for the common-effect assumption.

1990s

Bayesian hierarchical models for meta-analysis (DuMouchel, DerSimonian-Laird with Bayesian extensions) generalize the CMH approach, allowing heterogeneity to be estimated rather than merely tested.

Advantages of the Bayesian Approach

The Bayesian hierarchical extension of CMH offers several advantages over the classical test. Posterior distributions for stratum-specific odds ratios provide uncertainty quantification for each stratum, not just the pooled estimate. The posterior for σ² directly quantifies heterogeneity, replacing the binary reject/fail-to-reject output of the Breslow-Day test with a continuous measure. And prior information from previous studies can be formally incorporated — particularly valuable in rare-disease settings where individual strata have few events.

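Furthermore, the Bayesian model handles sparse strata (tables with zero cells) gracefully through the prior: the posterior for θₖ remains proper even when the stratum's sample odds ratio is 0 or undefined. Classical stratum-specific odds ratios and inverse-variance pooling need continuity corrections in that case. The CMH statistic and the MH estimator themselves remain computable with zero cells, but strata with a zero row or column total contribute nothing to them. In regulatory settings, the FDA has accepted Bayesian hierarchical analyses of stratified data as primary evidence in clinical trial submissions, particularly for medical devices with multi-site pivotal trials.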

"The Mantel-Haenszel method was the first widely used technique for combining evidence across studies. Its Bayesian generalization — the hierarchical model — has become the standard framework for meta-analysis." — Larry V. Hedges and Ingram Olkin, Statistical Methods for Meta-Analysis (1985)

Worked Example: Drug Efficacy Across Hospital Strata

A drug trial is conducted across three hospitals. We use the CMH test to assess treatment effect while controlling for hospital as a confounding stratum.

Given (2×2 tables per stratum):
Hospital 1: Treated+Cured=8, Treated+NotCured=2, Control+Cured=3, Control+NotCured=7
Hospital 2: Treated+Cured=6, Treated+NotCured=4, Control+Cured=5, Control+NotCured=5
Hospital 3: Treated+Cured=9, Treated+NotCured=1, Control+Cured=4, Control+NotCured=6

Step 1: Stratum-Specific Odds Ratios
Hospital 1: OR = (8×7)/(2×3) = 9.33
Hospital 2: OR = (6×5)/(4×5) = 1.50
Hospital 3: OR = (9×6)/(1×4) = 13.50

Step 2: Mantel-Haenszel Common OR
OR_MH = Σₖ (aₖdₖ/nₖ) / Σₖ (bₖcₖ/nₖ)
= (8·7/20 + 6·5/20 + 9·6/20) / (2·3/20 + 4·5/20 + 1·4/20)
= (2.80 + 1.50 + 2.70) / (0.30 + 1.00 + 0.20)
= 7.00 / 1.50 = 4.67

Step 3: CMH Test
χ²_CMH = [Σₖ (aₖ − E(aₖ))]² / Σₖ Var(aₖ)
= [(8 − 5.5) + (6 − 5.5) + (9 − 6.5)]² / (1.30 + 1.30 + 1.20) = (5.5)² / 3.80 = 7.96
p-value (df = 1) ≈ 0.005 → reject conditional independence

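The Mantel-Haenszel common odds ratio of 4.67 indicates that treated patients have roughly 4.7 times the odds of being cured compared with controls, after adjusting for hospital. The marginal (crude) OR from the collapsed table is (23×18)/(7×12) = 4.93, which has the same direction and a similar magnitude; this is expected, since every hospital has 10 treated and 10 control patients and hospital therefore cannot strongly confound the comparison. The CMH test (p ≈ 0.005) supports a treatment effect, and all three stratum-specific odds ratios exceed 1. They range from 1.50 to 13.50 with only 20 patients per hospital, however, so the data say little about whether the effect is equally strong everywhere. A pooled Bayesian estimate from the hierarchical model would express that uncertainty through the posterior for σ².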
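The quantities in this worked example can be recomputed from the three hospital tables with a short script. This is a plain-arithmetic sketch with no continuity correction; the variable names are chosen for illustration.

```python
import math

# (a, b, c, d) = (treated cured, treated not cured, control cured, control not cured)
hospitals = [(8, 2, 3, 7), (6, 4, 5, 5), (9, 1, 4, 6)]

dev_sum = var_sum = mh_num = mh_den = 0.0
for k, (a, b, c, d) in enumerate(hospitals, start=1):
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    expected = r1 * c1 / n                              # E(a_k)
    variance = r1 * r2 * c1 * c2 / (n ** 2 * (n - 1))   # Var(a_k)
    print(f"Hospital {k}: OR={a * d / (b * c):.2f}  "
          f"E(a)={expected:.2f}  Var(a)={variance:.2f}")
    dev_sum += a - expected
    var_sum += variance
    mh_num += a * d / n
    mh_den += b * c / n

or_mh = mh_num / mh_den
cmh = dev_sum ** 2 / var_sum
p_value = math.erfc(math.sqrt(cmh / 2))  # chi-square(1) upper tail

# Collapse the strata cell by cell to get the marginal table and crude OR.
A, B, C, D = (sum(col) for col in zip(*hospitals))
crude_or = A * D / (B * C)

print(f"OR_MH={or_mh:.2f}  CMH={cmh:.2f}  p={p_value:.4f}  crude OR={crude_or:.2f}")
```

Printing the per-stratum expected values and variances alongside the totals makes it easy to check each intermediate figure against the hand calculation.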

