Probabilistic Soft Logic

Probabilistic soft logic is a framework for collective probabilistic reasoning that relaxes Boolean variables to the continuous interval [0,1], transforming discrete inference into a convex optimization problem that scales to millions of variables.

P(I) ∝ exp(−Σᵣ wᵣ · (max{0, lᵣ(I)})ᵖ) where I ∈ [0,1]ⁿ

Many real-world reasoning problems involve combining uncertain evidence from multiple sources with soft logical rules (rules that express tendencies rather than absolute constraints). Probabilistic soft logic (PSL) addresses this by defining a probability distribution over continuous truth values in [0,1], using the Łukasiewicz relaxation of Boolean logic. The result is a hinge-loss Markov random field (HL-MRF), in which finding the most probable interpretation (MAP inference) reduces to a convex optimization problem solvable in polynomial time.

Łukasiewicz Logic and Continuous Relaxation

Łukasiewicz operators
A ∧ B = max(A + B − 1, 0)    (t-norm)
A ∨ B = min(A + B, 1)    (t-conorm)
¬A = 1 − A    (negation)

Hinge-Loss Markov Random Field
P(I) ∝ exp(−Σᵣ wᵣ · φᵣ(I))
φᵣ(I) = (max{0, lᵣ(I)})ᵖ    p ∈ {1, 2}
where lᵣ(I) is a linear function of the continuous truth values

In standard Boolean logic, a clause is either satisfied (1) or violated (0). In PSL's Łukasiewicz relaxation, satisfaction is a continuous quantity. The hinge-loss potential φᵣ measures how far a rule is from being satisfied: it equals zero when the rule is fully satisfied and grows linearly (p=1) or quadratically (p=2) with the degree of violation. The weight wᵣ controls how heavily violations of each rule are penalized.
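
To make the relaxation concrete, here is a minimal Python sketch (illustrative predicate names only, not PSL's actual API) that grounds the single rule Friends(a,b) ∧ Smokes(a) → Smokes(b). Under the Łukasiewicz semantics its distance to satisfaction is max{0, friends + smokes_a − 1 − smokes_b}, which is exactly the hinge over the linear expression lᵣ(I).

def luk_and(a, b):
    # Łukasiewicz t-norm: relaxed conjunction of truth values in [0, 1]
    return max(0.0, a + b - 1.0)

def luk_or(a, b):
    # Łukasiewicz t-conorm: relaxed disjunction
    return min(1.0, a + b)

def luk_not(a):
    # Łukasiewicz negation
    return 1.0 - a

def rule_potential(friends_ab, smokes_a, smokes_b, p=2):
    # Hinge-loss potential of the grounded rule Friends(a,b) & Smokes(a) -> Smokes(b):
    # zero when the rule is satisfied, growing with the degree of violation.
    distance = max(0.0, friends_ab + smokes_a - 1.0 - smokes_b)
    return distance ** p

print(luk_and(0.9, 0.8))              # ≈ 0.7, the relaxed truth of the rule body
print(rule_potential(0.9, 0.8, 0.2))  # ≈ 0.25, i.e. a violation of 0.5, squared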

2009–2012

Broecheler, Mihalkova, and Getoor develop the initial PSL framework at the University of Maryland, introducing the combination of Łukasiewicz logic with log-linear models.

2013

Bach, Broecheler, Huang, and Getoor formalize hinge-loss Markov random fields, proving convexity of MAP inference and developing efficient ADMM-based solvers.

2017

PSL 2.0 is released as a mature open-source system with a declarative modeling language, weight learning, and support for large-scale applications.

2020s

PSL is applied to knowledge graph completion, fairness-aware machine learning, drug interaction prediction, and social science modeling. Integration with neural embeddings extends its reach to hybrid neuro-symbolic systems.

Convexity and Scalability

The key computational advantage of PSL over discrete Markov logic networks is that MAP inference in an HL-MRF is a convex optimization problem. When p=2, the objective is a sum of squared hinge losses — a quadratic program. When p=1, it is a linear program. Both can be solved efficiently using the alternating direction method of multipliers (ADMM), which decomposes the global problem into local subproblems that can be solved in closed form. This enables PSL to scale to problems with millions of random variables and hundreds of millions of ground rules.
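
As an illustration of this convexity (a toy sketch, not PSL's consensus-ADMM solver), the following Python code minimizes a sum of squared hinge losses over truth values confined to [0,1] using projected gradient descent. Each ground rule r is encoded by a row of the hypothetical matrix A and offset b, so that lᵣ(I) = Aᵣ·x + bᵣ.

import numpy as np

def map_inference(A, b, w, steps=2000, lr=0.05):
    # Minimize sum_r w_r * max(0, A_r . x + b_r)^2 over x in [0, 1]^n.
    # Projected gradient descent, purely for illustration; PSL itself uses
    # a consensus-ADMM decomposition over the ground rules.
    x = np.full(A.shape[1], 0.5)                 # start from maximal uncertainty
    for _ in range(steps):
        violation = np.maximum(0.0, A @ x + b)   # hinge activations, one per rule
        grad = A.T @ (2.0 * w * violation)       # gradient of the weighted squared hinges
        x = np.clip(x - lr * grad, 0.0, 1.0)     # project back onto the box [0, 1]^n
    return x

# Toy problem with unknowns x = [Smokes(b), Cancer(b)] and two soft rules:
#   r1 (evidence that b smokes):    l_1 = 0.9 - Smokes(b)
#   r2 (Smokes(b) -> Cancer(b)):    l_2 = Smokes(b) - Cancer(b)
A = np.array([[-1.0,  0.0],
              [ 1.0, -1.0]])
b = np.array([0.9, 0.0])
w = np.array([1.0, 1.0])
print(map_inference(A, b, w))   # both truth values settle around 0.9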

PSL vs. Markov Logic Networks

Both PSL and MLNs combine logical rules with probabilistic weights. The key difference is the variable domain: MLNs use Boolean variables (requiring discrete combinatorial inference), while PSL relaxes variables to [0,1] (enabling convex continuous optimization). This relaxation sacrifices some expressiveness — PSL's continuous truth values do not always have a natural interpretation — but gains enormous scalability. In practice, PSL is often the method of choice when the problem involves millions of entities and the primary goal is MAP inference rather than marginal probability computation.

Weight Learning

Given observed data, PSL's weights can be learned by maximizing the likelihood or a structured-prediction objective. Because MAP inference is convex and fast, the outer weight-learning loop can afford to re-run inference at every step, approximating the likelihood gradient with the potentials of the current MAP state. Maximum pseudo-likelihood and voted-perceptron-style approximate maximum likelihood are common learning algorithms, with more recent work exploring Bayesian approaches that place priors on the rule weights to quantify uncertainty and prevent overfitting.
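
A minimal sketch of that learning loop under the same toy encoding (hypothetical helpers, reusing the map_inference sketch above): each weight's gradient is approximated by the gap between the rule's potential at the observed training interpretation and at the current MAP state.

import numpy as np

def potentials(A, b, x, p=2):
    # phi_r(x) = max(0, A_r . x + b_r)^p for every ground rule r
    return np.maximum(0.0, A @ x + b) ** p

def learn_weights(A, b, x_observed, epochs=50, eta=0.1):
    # Perceptron-style approximate maximum likelihood: the gradient of the
    # negative log-likelihood w.r.t. w_r is phi_r(x_observed) - E[phi_r],
    # with the expectation approximated by the potential at the MAP state.
    w = np.ones(A.shape[0])
    for _ in range(epochs):
        x_map = map_inference(A, b, w)           # re-run convex inference
        grad = potentials(A, b, x_observed) - potentials(A, b, x_map)
        w = np.maximum(0.0, w - eta * grad)      # rule weights stay non-negative
    return w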

Applications

PSL has been successfully applied to knowledge graph completion (predicting missing links using ontological rules and observed triples), collective document classification (propagating labels through citation networks using similarity rules), social trust prediction (combining network structure with behavioral signals), drug-drug interaction prediction (integrating pharmacological rules with observed interaction data), and fairness-constrained machine learning (encoding fairness criteria as soft logical rules). Its declarative syntax allows domain experts to specify models in near-natural language, while the convex optimization backend ensures scalable inference.
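
For a flavor of that declarative syntax, the rules below sketch how a PSL model is typically written (weights, predicate names, and the exact grammar are illustrative and may differ across PSL releases): a weighted rule ending in ^2 requests a squared hinge, while an unweighted rule ending in a period acts as a hard constraint.

20: Friends(A, B) & Smokes(A) -> Smokes(B) ^2
5:  Similar(D1, D2) & HasLabel(D1, C) -> HasLabel(D2, C) ^2
Friends(A, B) -> Friends(B, A) .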

"By relaxing logic from true/false to degrees of truth, we transform intractable combinatorial inference into tractable convex optimization — without losing the ability to express rich relational structure." — Lise Getoor, on the design philosophy of PSL
