Jim Pitman (born 1949) is a probabilist at the University of California, Berkeley, whose work on random partitions, combinatorial stochastic processes, and exchangeability has shaped Bayesian nonparametric statistics. His development of exchangeable random partition theory and his co-creation of the Pitman-Yor process (also known as the two-parameter Poisson-Dirichlet process) provided new tools for modeling data with unknown numbers of categories, clusters, or species. These tools have been widely adopted in machine learning, natural language processing, and population genetics.
Education and Career
Pitman studied at the Australian National University in Canberra and received his PhD from the University of Sheffield. He then joined the University of California, Berkeley, where he has spent his career in the Department of Statistics. His research combines probabilistic theory with attention to the structures that arise in applications, and he has been influential in connecting abstract probability theory with practical statistical modeling.
Exchangeable Random Partitions
Pitman's most distinctive contribution to Bayesian statistics is his systematic development of the theory of exchangeable random partitions. Building on Kingman's work on random partitions and de Finetti's exchangeability framework, Pitman characterized the full class of exchangeable partition probability functions (EPPFs), showing how they arise from subordinators and random discrete distributions. This theory provides the mathematical foundation for understanding clustering in Bayesian nonparametric models.
In many applications—from topic modeling to species discovery to customer segmentation—the number of groups or categories is not known in advance. Exchangeable partition structures provide a principled probabilistic framework for letting the data determine the number and composition of clusters, and the Pitman-Yor process offers a flexible family of priors over such partitions.
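As a concrete instance of an exchangeable partition probability function, the two-parameter family associated with the Pitman-Yor process has a closed form. The notation below is an assumption introduced for illustration and does not come from the article: \alpha is the discount, \theta is the concentration, n_1, ..., n_k are the block sizes of a partition of n items, and (x)_m is the rising factorial.

```latex
% EPPF of the two-parameter (\alpha, \theta) family, for 0 \le \alpha < 1 and \theta > -\alpha.
% (x)_m = x (x+1) \cdots (x+m-1) is the rising factorial, with (x)_0 = 1.
p(n_1, \dots, n_k)
  = \frac{\prod_{i=1}^{k-1} (\theta + i\alpha)}{(\theta + 1)_{n-1}}
    \prod_{j=1}^{k} (1 - \alpha)_{n_j - 1},
\qquad n = n_1 + \dots + n_k .
```

The probability depends only on the block sizes, not on which items fall in which block, which is what exchangeability requires. Setting \alpha = 0 recovers the Ewens sampling formula associated with the Dirichlet process.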
The Pitman-Yor Process
The Pitman-Yor process, developed by Pitman and Marc Yor in the 1990s, generalizes the Dirichlet process by adding a second parameter, a discount between 0 and 1, that controls the tail behavior of the cluster size distribution. Under the Dirichlet process (discount zero) the number of clusters grows only logarithmically with the sample size and the ranked cluster weights decay roughly geometrically, whereas a positive discount makes the number of clusters grow as a power of the sample size and yields power-law cluster sizes. This makes the Pitman-Yor process better suited to phenomena such as word frequencies in natural language, where a few items are very common and many items are rare.
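The contrast between the two processes can be seen in a small simulation. The following is a minimal sketch, not a reference implementation: it uses the standard two-parameter Chinese restaurant seating scheme, and the function name `sample_crp_table_sizes` and the parameter names `discount` and `concentration` are illustrative choices rather than terms from the article.

```python
import random


def sample_crp_table_sizes(n_customers, discount, concentration, seed=0):
    """Seat customers by the two-parameter Chinese restaurant process.

    Returns a list of table (cluster) sizes. discount = 0 gives the
    Dirichlet-process case; 0 < discount < 1 gives the Pitman-Yor case.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    sizes = []
    for n_seated in range(n_customers):
        k = len(sizes)
        # Unnormalized weight of opening a new table; existing table j
        # has weight sizes[j] - discount. The weights sum to
        # n_seated + concentration.
        new_weight = concentration + k * discount
        u = rng.random() * (n_seated + concentration)
        if k == 0 or u < new_weight:
            sizes.append(1)  # the first customer always opens a table
            continue
        u -= new_weight
        for j, size in enumerate(sizes):
            u -= size - discount
            if u < 0:
                sizes[j] += 1
                break
        else:
            sizes[-1] += 1  # guard against floating-point round-off
    return sizes


if __name__ == "__main__":
    dp = sample_crp_table_sizes(5000, discount=0.0, concentration=10.0)
    py = sample_crp_table_sizes(5000, discount=0.5, concentration=10.0)
    print("Dirichlet process clusters:", len(dp))
    print("Pitman-Yor clusters:", len(py))
```

With the same concentration, the run with a positive discount typically produces far more clusters, most of them small, which is the heavy-tailed behavior described above.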
“The study of exchangeable random partitions reveals the deep combinatorial structures underlying Bayesian nonparametric models.” (Jim Pitman, paraphrased)
Combinatorial Stochastic Processes
Pitman's 2006 monograph Combinatorial Stochastic Processes provides a comprehensive treatment of the theory of random partitions, random trees, and fragmentation-coalescent processes, showing how these combinatorial structures connect to Bayesian nonparametric models. The book has been highly influential in both probability theory and machine learning.
Legacy
Pitman's work has bridged pure probability theory and applied Bayesian statistics in ways that have enriched both fields. The Pitman-Yor process is now a standard tool in computational statistics and machine learning, and his theoretical framework for exchangeable partitions provides the mathematical language in which much of modern Bayesian nonparametrics is expressed.
Born in Hobart, Tasmania, Australia.
Received PhD from the University of Sheffield.
Joined UC Berkeley Department of Statistics.
Developed the Pitman-Yor (two-parameter Poisson-Dirichlet) process with Marc Yor.
Published Combinatorial Stochastic Processes.