Bayesian Statistics

Spam Filtering

Bayesian spam filtering, popularized by Paul Graham's influential 2002 essay "A Plan for Spam," applies naive Bayes classification to email content: the filter learns word-level spam probabilities and combines them to produce an overall spam score that adapts to each user's mail.

P(spam | words) ∝ P(words | spam) · P(spam)

The spam filtering problem is one of the most successful and widely deployed applications of Bayesian inference. Every day, billions of emails are classified as spam or legitimate by filters that, at their core, apply Bayes' theorem to the words contained in the message. The approach is elegant: learn the probability of each word appearing in spam versus legitimate email, then combine these probabilities using Bayes' theorem to compute the overall probability that a new message is spam.

Paul Graham's Approach

In August 2002, Paul Graham published "A Plan for Spam," arguing that statistical classification based on word probabilities could outperform the hand-crafted rules used by existing spam filters. His approach was simple: for each word, compute the probability that a message containing that word is spam, using the ratio of spam messages containing the word to all messages containing the word. For a new message, combine the individual word probabilities using a simplified form of Bayes' theorem.

Individual Word Spam Probability

P(spam | word) = P(word | spam) · P(spam) / P(word)
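
For illustration, with invented counts: if "free" appears in 40 of 100 spam messages and in 2 of 100 legitimate messages, and spam and ham are equally likely a priori, then P(spam | "free") = (0.40 · 0.5) / (0.40 · 0.5 + 0.02 · 0.5) ≈ 0.95.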

Combined Probability (Naive Bayes)

P(spam | w₁, ..., wₙ) = (p₁ · p₂ · ... · pₙ) / [(p₁ · p₂ · ... · pₙ) + (1−p₁) · (1−p₂) · ... · (1−pₙ)]

where pᵢ = P(spam | wᵢ) for each word wᵢ in the message.
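
For concreteness, here is a minimal Python sketch of this combination rule; the probabilities in the example are invented, and the function restates the formula above rather than reproducing Graham's original code:

    def graham_combine(word_probs):
        # Combine per-word spam probabilities p_i = P(spam | w_i)
        # using the simplified Bayes formula above.
        prod_spam = 1.0
        prod_ham = 1.0
        for p in word_probs:
            prod_spam *= p
            prod_ham *= (1.0 - p)
        return prod_spam / (prod_spam + prod_ham)

    # Two strong spam indicators and one neutral word:
    print(graham_combine([0.99, 0.90, 0.50]))  # ≈ 0.9989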

Graham's key insight was that the filter should be personalized, trained on each user's own email, and that the words most useful for classification are not just the obvious spam words ("viagra," "free") but also the legitimate words ("meeting," "project") whose presence signals that a message is not spam. The combination formula assumes conditional independence of words given the class (the "naive" assumption), which is technically false but works remarkably well in practice.

The Naive Bayes Classifier

The naive Bayes classifier underlying spam filtering is one of the simplest and most robust machine learning algorithms. Despite the independence assumption, it performs well because classification depends on the sign of the log-odds rather than on exact probability estimates, and the independence assumption affects calibration more than discrimination. Bayesian smoothing (using a Beta prior on word probabilities) prevents zero-frequency problems when a word has been seen only in spam or only in legitimate email.
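
A minimal sketch of the smoothed estimate, assuming a symmetric Beta(α, α) prior on each word probability (α = 1 gives Laplace smoothing; the counts in the example are invented):

    def smoothed_word_prob(spam_count, n_spam, ham_count, n_ham, alpha=1.0):
        # Posterior-mean estimates of P(word | spam) and P(word | ham)
        # under a Beta(alpha, alpha) prior; never exactly 0 or 1.
        p_word_spam = (spam_count + alpha) / (n_spam + 2 * alpha)
        p_word_ham = (ham_count + alpha) / (n_ham + 2 * alpha)
        return p_word_spam, p_word_ham

    # A word seen 5 times in 100 spam messages, never in 100 ham messages:
    print(smoothed_word_prob(5, 100, 0, 100))  # (0.0588..., 0.0098...)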

Why Naive Bayes Works for Spam

The naive independence assumption — that word occurrences are conditionally independent given the class — is clearly violated in natural text. Yet naive Bayes classifiers consistently achieve accuracy above 99% on spam filtering. This works because classification only requires getting the ordering of posterior probabilities right, not their exact values. The Bayes decision boundary is surprisingly robust to violations of the independence assumption, and the high dimensionality of text actually helps: errors in individual word probability estimates tend to cancel out when many words are combined.
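
A small sketch makes the log-odds point concrete (the likelihoods below are invented): the predicted class depends only on the sign of the accumulated score, so errors that inflate or deflate individual terms matter only if they flip that sign:

    import math

    def log_odds_score(evidence, prior_spam=0.5):
        # Sum of per-word log likelihood ratios plus the prior log-odds;
        # a positive score means "spam" under naive Bayes.
        score = math.log(prior_spam / (1.0 - prior_spam))
        for p_w_spam, p_w_ham in evidence:
            score += math.log(p_w_spam / p_w_ham)
        return score

    # Three words each 3x more likely in spam, one word 2x more likely in ham:
    evidence = [(0.03, 0.01), (0.06, 0.02), (0.09, 0.03), (0.01, 0.02)]
    print(log_odds_score(evidence) > 0)  # True: classified as spam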

Evolution and Modern Systems

Graham's original approach evolved rapidly. SpamBayes, Bogofilter, and CRM114 implemented more sophisticated Bayesian classifiers with features like bigram tokens, header analysis, and adaptive thresholds. SpamAssassin incorporated Bayesian scoring as one component in a multi-method approach. Modern email systems (Gmail, Outlook) use deep learning classifiers, but these are often calibrated using Bayesian methods and evaluated against Bayesian baselines.

Beyond Email: Bayesian Content Filtering

The Bayesian spam filtering paradigm has been extended to other content filtering tasks: detecting phishing emails, filtering abusive comments, classifying social media posts, and identifying fake reviews. In each case, the same Bayesian framework — learn class-conditional word distributions, combine using Bayes' theorem, update as new data arrive — provides an effective and interpretable baseline.

"I think Bayesian filtering will prove to be more important than any of the anti-spam techniques used today... If it's 99% accurate now, it's 99.9% accurate tomorrow, because the filter is learning." — Paul Graham, "A Plan for Spam" (2002)

Current Frontiers

Adversarial spam — messages crafted to evade Bayesian filters by injecting legitimate words — motivated research into robust Bayesian classification, adversarial priors, and ensemble methods. Bayesian online learning enables filters to adapt to evolving spam tactics in real time. And the principles of Bayesian spam filtering inform modern approaches to misinformation detection, content moderation, and the filtering of AI-generated text.
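
As a sketch of the online-learning idea (a simplified counting filter, not any particular production system), each newly labeled message can update the classifier's sufficient statistics immediately:

    from collections import Counter

    class OnlineSpamFilter:
        # Naive Bayes sufficient statistics, updated one message at a time.
        def __init__(self):
            self.word_counts = {True: Counter(), False: Counter()}
            self.msg_counts = {True: 0, False: 0}

        def update(self, words, is_spam):
            # O(len(words)) incremental update; no retraining pass needed.
            self.word_counts[is_spam].update(words)
            self.msg_counts[is_spam] += 1

    f = OnlineSpamFilter()
    f.update(["free", "winner", "claim"], is_spam=True)        # user marks spam
    f.update(["meeting", "project", "agenda"], is_spam=False)  # user marks ham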

Interactive Calculator

Each row is a word with its spam_count (occurrences in spam emails) and ham_count (occurrences in legitimate emails). The calculator builds a naive Bayes spam classifier using Laplace-smoothed word likelihoods, computes the posterior P(spam | words) given all observed words, and identifies the most discriminative spam and ham indicator words.

Click Calculate to see results, or Animate to watch the statistics update one record at a time.
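
A minimal Python sketch of the computation the calculator performs; the word table, message totals, and prior below are placeholders, not the page's actual data:

    import math

    # Hypothetical rows: word -> (spam_count, ham_count).
    COUNTS = {"free": (40, 2), "viagra": (25, 0),
              "meeting": (1, 30), "project": (2, 28)}
    N_SPAM, N_HAM = 100, 100  # assumed totals of spam and ham messages

    def posterior_spam(words, alpha=1.0, prior_spam=0.5):
        # P(spam | words) from Laplace-smoothed likelihoods, in log space
        # to avoid underflow when many words are combined.
        log_spam = math.log(prior_spam)
        log_ham = math.log(1.0 - prior_spam)
        for w in words:
            sc, hc = COUNTS.get(w, (0, 0))
            log_spam += math.log((sc + alpha) / (N_SPAM + 2 * alpha))
            log_ham += math.log((hc + alpha) / (N_HAM + 2 * alpha))
        return 1.0 / (1.0 + math.exp(log_ham - log_spam))

    def most_discriminative():
        # Rank words by |log likelihood ratio|: large positive values are
        # spam indicators, large negative values are ham indicators.
        llr = {w: math.log((sc + 1) / (hc + 1)) for w, (sc, hc) in COUNTS.items()}
        return sorted(llr, key=lambda w: abs(llr[w]), reverse=True)

    print(posterior_spam(["free", "viagra"]))  # ≈ 0.997: spam
    print(most_discriminative())               # ['viagra', 'meeting', 'free', 'project']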
