Probabilistic unifying relations for modelling epistemic and aleatoric uncertainty: semantics and automated reasoning with theorem proving

Read original: arXiv:2303.09692 - Published 9/30/2024 by Kangfeng Ye, Jim Woodcock, Simon Foster

↗️

Overview

Probabilistic programming combines computer programming, statistical inference, and formal semantics to help systems make decisions under uncertainty.
Probabilistic programs are widely used, including in machine intelligence, but their automated verification based on formal semantics is a relatively new research area.
The paper presents "probabilistic unifying relations" (ProbURel), which takes steps to address the challenges in this research area.

Plain English Explanation

Probabilistic programming allows computer systems to make decisions when they are not completely certain about the information they have. This is important because real-world situations often involve uncertainty.

The paper focuses on a specific type of probabilistic programming called "predicative probabilistic programming," which was developed by Hehner. However, there are obstacles to wider adoption of this approach. The researchers here have addressed these obstacles by:

Formalizing the syntax and semantics of predicative probabilistic programming, including using a notation to separate mathematical relations from probabilities.
Representing relations using a framework called Unifying Theories of Programming (UTP) and modeling probabilities using mathematical summation.
Developing a way to reason about probabilistic loops using fixed-point theory.
Extending the semantics to handle different types of probability distributions.
Proving a unique fixed-point theorem to simplify reasoning about probabilistic loops.
Implementing the theory in a theorem-proving software called Isabelle/UTP, allowing automated reasoning.

The researchers demonstrate their approach with examples from robotics, machine learning, and the behavior of probabilistic programs.

Technical Explanation

The core contribution of the paper is the probabilistic unifying relations (ProbURel) framework, which builds on Hehner's predicative probabilistic programming.

The researchers first formalize the syntax and semantics of predicative probabilistic programming, introducing an Iverson bracket notation to separate mathematical relations from probabilities. They then represent the relations using the Unifying Theories of Programming (UTP) framework and model the probabilities using summation over the real number space.

A key aspect is the constructive semantics for probabilistic loops, which the researchers develop using Kleene's fixed-point theorem. To handle this constructive semantics, they enrich the semantics to deal with subdistributions and superdistributions (variants of probability distributions).

The researchers also prove a unique fixed-point theorem, which simplifies the reasoning about probabilistic loops. Finally, they mechanize the entire ProbURel theory in the Isabelle/UTP theorem-proving system, enabling automated reasoning.

The paper demonstrates the ProbURel framework with six examples, including problems in robot localization, classification in machine learning, and the termination of probabilistic loops.

Critical Analysis

The paper makes a significant contribution to the field of probabilistic programming by addressing several key obstacles in the adoption of Hehner's predicative probabilistic programming. The formalization of syntax and semantics, as well as the use of UTP and fixed-point theory, are important advancements.

However, the paper does not discuss the potential limitations or caveats of the ProbURel framework. For example, the performance and scalability of the approach, especially when dealing with complex probabilistic models, are not addressed. Additionally, the paper does not compare ProbURel to other probabilistic programming frameworks or discuss how it might fit into the broader landscape of probabilistic programming research.

Furthermore, the paper could have provided more insights into the practical implications and real-world applications of the ProbURel framework. While the examples are helpful, a deeper discussion of the challenges and trade-offs involved in applying the framework to complex, real-world problems would have been valuable.

Conclusion

The probabilistic unifying relations (ProbURel) framework presented in this paper represents a significant step forward in the field of probabilistic programming. By addressing the obstacles in Hehner's predicative probabilistic programming and developing a formal, mechanized theory, the researchers have laid the groundwork for more robust and verifiable probabilistic systems.

The potential impact of this work extends beyond the specific examples provided, as the ProbURel framework could be applied to a wide range of domains where uncertainty plays a crucial role, such as robotics, machine learning, and decision-making. Further research on the scalability, limitations, and broader applications of ProbURel will be important to fully realize its potential.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

↗️

Probabilistic unifying relations for modelling epistemic and aleatoric uncertainty: semantics and automated reasoning with theorem proving

Kangfeng Ye, Jim Woodcock, Simon Foster

Probabilistic programming combines general computer programming, statistical inference, and formal semantics to help systems make decisions when facing uncertainty. Probabilistic programs are ubiquitous, including having a significant impact on machine intelligence. While many probabilistic algorithms have been used in practice in different domains, their automated verification based on formal semantics is still a relatively new research area. In the last two decades, it has attracted much interest. Many challenges, however, remain. The work presented in this paper, probabilistic unifying relations (ProbURel), takes a step towards our vision to tackle these challenges. Our work is based on Hehner's predicative probabilistic programming, but there are several obstacles to the broader adoption of his work. Our contributions here include (1) the formalisation of its syntax and semantics by introducing an Iverson bracket notation to separate relations from arithmetic; (2) the formalisation of relations using Unifying Theories of Programming (UTP) and probabilities outside the brackets using summation over the topological space of the real numbers; (3) the constructive semantics for probabilistic loops using Kleene's fixed-point theorem; (4) the enrichment of its semantics from distributions to subdistributions and superdistributions to deal with the constructive semantics; (5) the unique fixed-point theorem to simplify the reasoning about probabilistic loops; and (6) the mechanisation of our theory in Isabelle/UTP, an implementation of UTP in Isabelle/HOL, for automated reasoning using theorem proving. We demonstrate our work with six examples, including problems in robot localisation, classification in machine learning, and the termination of probabilistic loops.

9/30/2024

🛸

Probabilities of the third type: Statistical Relational Learning and Reasoning with Relative Frequencies

Felix Weitkamper

Dependencies on the relative frequency of a state in the domain are common when modelling probabilistic dependencies on relational data. For instance, the likelihood of a school closure during an epidemic might depend on the proportion of infected pupils exceeding a threshold. Often, rather than depending on discrete thresholds, dependencies are continuous: for instance, the likelihood of any one mosquito bite transmitting an illness depends on the proportion of carrier mosquitoes. Current approaches usually only consider probabilities over possible worlds rather than over domain elements themselves. An exception are the recently introduced lifted Bayesian networks for conditional probability logic, which express discrete dependencies on probabilistic data. We introduce functional lifted Bayesian networks, a formalism that explicitly incorporates continuous dependencies on relative frequencies into statistical relational artificial intelligence, and compare and contrast them with lifted Bayesian networks for conditional probability logic. Incorporating relative frequencies is not only beneficial to modelling; it also provides a more rigorous approach to learning problems where training and test or application domains have different sizes. To this end, we provide a representation of the asymptotic probability distributions induced by functional lifted Bayesian networks on domains of increasing sizes. Since that representation has well-understood scaling behaviour across domain sizes, it can be used to estimate parameters for a large domain consistently from randomly sampled subpopulations. Furthermore, we show that in parametric families of FLBN, convergence is uniform in the parameters, which ensures a meaningful dependence of the asymptotic probabilities on the parameters of the model.

8/21/2024

📊

On Probabilistic and Causal Reasoning with Summation Operators

Duligur Ibeling, Thomas F. Icard, Milan Moss'e

Ibeling et al. (2023). axiomatize increasingly expressive languages of causation and probability, and Mosse et al. (2024) show that reasoning (specifically the satisfiability problem) in each causal language is as difficult, from a computational complexity perspective, as reasoning in its merely probabilistic or correlational counterpart. Introducing a summation operator to capture common devices that appear in applications -- such as the $do$-calculus of Pearl (2009) for causal inference, which makes ample use of marginalization -- van der Zander et al. (2023) partially extend these earlier complexity results to causal and probabilistic languages with marginalization. We complete this extension, fully characterizing the complexity of probabilistic and causal reasoning with summation, demonstrating that these again remain equally difficult. Surprisingly, allowing free variables for random variable values results in a system that is undecidable, so long as the ranges of these random variables are unrestricted. We finally axiomatize these languages featuring marginalization (or more generally summation), resolving open questions posed by Ibeling et al. (2023).

5/21/2024

🖼️

Declarative Probabilistic Logic Programming in Discrete-Continuous Domains

Pedro Zuidberg Dos Martires, Luc De Raedt, Angelika Kimmig

Over the past three decades, the logic programming paradigm has been successfully expanded to support probabilistic modeling, inference and learning. The resulting paradigm of probabilistic logic programming (PLP) and its programming languages owes much of its success to a declarative semantics, the so-called distribution semantics. However, the distribution semantics is limited to discrete random variables only. While PLP has been extended in various ways for supporting hybrid, that is, mixed discrete and continuous random variables, we are still lacking a declarative semantics for hybrid PLP that not only generalizes the distribution semantics and the modeling language but also the standard inference algorithm that is based on knowledge compilation. We contribute the measure semantics together with the hybrid PLP language DC-ProbLog (where DC stands for distributional clauses) and its inference engine infinitesimal algebraic likelihood weighting (IALW). These have the original distribution semantics, standard PLP languages such as ProbLog, and standard inference engines for PLP based on knowledge compilation as special cases. Thus, we generalize the state of the art of PLP towards hybrid PLP in three different aspects: semantics, language and inference. Furthermore, IALW is the first inference algorithm for hybrid probabilistic programming based on knowledge compilation

9/10/2024