Lifted Inference beyond First-Order Logic

Read original: arXiv:2308.11738 - Published 4/30/2024 by Sagar Malhotra, Davide Bizzaro, Luciano Serafini

🤯

Overview

The paper discusses Weighted First Order Model Counting (WFOMC), a fundamental technique for probabilistic inference in statistical relational learning models.
WFOMC is known to be computationally intractable in general, so the researchers explore logical fragments that can be solved efficiently.
They focus on extending the domain liftability of the two-variable fragment of first-order logic with counting quantifiers (C^2), which was previously shown to be domain-liftable.
The researchers demonstrate that C^2 sentences remain domain-liftable when certain relations are restricted to represent specific graph structures, such as directed acyclic graphs, connected graphs, trees, and forests.

Plain English Explanation

The paper explores a mathematical technique called Weighted First Order Model Counting (WFOMC), which is crucial for making probabilistic predictions in certain types of AI models that work with structured data, like social networks or citation networks.

Unfortunately, WFOMC is generally a very difficult problem to solve computationally. However, the researchers have identified a specific type of logical expression, called the "two-variable fragment with counting quantifiers" (C^2), that can be solved efficiently. This is known as being "domain-liftable."

The researchers wanted to expand the types of data structures that can be handled by C^2 while still maintaining this efficient solvability. They showed that C^2 expressions remain domain-liftable even when one of the relations in the expression is restricted to represent certain kinds of graphs, like ones with no cycles (directed acyclic graphs), ones where all the nodes are connected (connected graphs), tree-like structures, and forest-like structures.

This is an important result because many real-world datasets, like social networks or citation networks, have these kinds of underlying graph structures. By being able to leverage C^2 expressions to efficiently perform probabilistic inference on these types of datasets, the researchers are expanding the toolbox available for building more powerful AI models that can understand and reason about complex, structured data.

Technical Explanation

The paper focuses on the Weighted First Order Model Counting (WFOMC) problem, which is central to probabilistic inference in statistical relational learning models. Since WFOMC is known to be computationally intractable in general (NP-complete), the researchers investigate logical fragments that admit polynomial-time WFOMC, known as domain-liftable fragments.

Previous work has shown that the two-variable fragment of first-order logic extended with counting quantifiers (C^2) is domain-liftable. However, many properties of real-world data, such as acyclicity in citation networks and connectivity in social networks, cannot be easily modeled using C^2 or first-order logic in general.

In this paper, the researchers expand the domain liftability of C^2 by introducing several restrictions on the relations in C^2 sentences. Specifically, they demonstrate that C^2 sentences remain domain-liftable when one of the relations is restricted to represent a directed acyclic graph, a connected graph, a tree (or a directed tree), or a forest (or a directed forest).

The researchers achieve these results through a novel and general methodology called "counting by splitting," which allows them to efficiently count the number of models that satisfy a given C^2 sentence under these structural restrictions. Their approach builds upon and expands a significant body of previous work in discrete mathematics on directed acyclic graphs, phylogenetic networks, and related combinatorial structures.

Critical Analysis

The paper presents a significant advancement in the field of weighted first-order model counting, a fundamental technique for probabilistic inference in statistical relational learning. By expanding the domain liftability of C^2 to encompass a broader range of graph structures, the researchers have significantly increased the applicability of this efficient inference technique to real-world datasets.

One potential limitation of the work is that it focuses solely on the theoretical aspects of domain liftability, without providing any empirical evaluation of the practical performance of the proposed methods. It would be interesting to see how these theoretical results translate to tangible improvements in the performance of probabilistic inference tasks on real-world datasets with the specified graph structures, as discussed in the paper on fine-grained LLM-based physics.

Additionally, while the researchers have expanded the types of graph structures that can be handled by C^2, there may still be other important properties of real-world data that are not easily captured by the restrictions considered in this work. Exploring the domain liftability of C^2 under additional structural constraints or extensions to the logic could further broaden the applicability of this efficient inference technique.

Conclusion

This paper makes a significant contribution to the field of probabilistic inference in statistical relational learning by expanding the domain liftability of the two-variable fragment of first-order logic with counting quantifiers (C^2). The researchers demonstrate that C^2 sentences remain domain-liftable when certain relations are restricted to represent specific graph structures, such as directed acyclic graphs, connected graphs, trees, and forests.

This result is highly relevant for modeling and reasoning about complex, structured data encountered in real-world applications, like social networks and citation networks. By leveraging the efficient inference capabilities of domain-liftable C^2 expressions, researchers and practitioners can develop more powerful AI models that can better understand and make predictions about the intricate relationships and patterns present in these types of datasets.

The researchers' novel "counting by splitting" methodology also provides a general framework for counting combinatorial structures, which could have broader applications beyond the specific context of probabilistic inference. Overall, this work expands the toolbox available for building advanced AI systems that can effectively reason about and make sense of complex, structured data.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🤯

Lifted Inference beyond First-Order Logic

Sagar Malhotra, Davide Bizzaro, Luciano Serafini

Weighted First Order Model Counting (WFOMC) is fundamental to probabilistic inference in statistical relational learning models. As WFOMC is known to be intractable in general ($#$P-complete), logical fragments that admit polynomial time WFOMC are of significant interest. Such fragments are called domain liftable. Recent works have shown that the two-variable fragment of first order logic extended with counting quantifiers ($mathrm{C^2}$) is domain-liftable. However, many properties of real-world data, like acyclicity in citation networks and connectivity in social networks, cannot be modeled in $mathrm{C^2}$, or first order logic in general. In this work, we expand the domain liftability of $mathrm{C^2}$ with multiple such properties. We show that any $mathrm{C^2}$ sentence remains domain liftable when one of its relations is restricted to represent a directed acyclic graph, a connected graph, a tree (resp. a directed tree) or a forest (resp. a directed forest). All our results rely on a novel and general methodology of counting by splitting. Besides their application to probabilistic inference, our results provide a general framework for counting combinatorial structures. We expand a vast array of previous results in discrete mathematics literature on directed acyclic graphs, phylogenetic networks, etc.

4/30/2024

Bridging Weighted First Order Model Counting and Graph Polynomials

Qipeng Kuang, Ondv{r}ej Kuv{z}elka, Yuanhong Wang, Yuyi Wang

The Weighted First-Order Model Counting Problem (WFOMC) asks to compute the weighted sum of models of a given first-order logic sentence over a given domain. It can be solved in time polynomial in the domain size for sentences from the two-variable fragment with counting quantifiers, known as $C^2$. This polynomial-time complexity is also retained when extending $C^2$ by one of the following axioms: linear order axiom, tree axiom, forest axiom, directed acyclic graph axiom or connectedness axiom. An interesting question remains as to which other axioms can be added to the first-order sentences in this way. We provide a new perspective on this problem by associating WFOMC with graph polynomials. Using WFOMC, we define Weak Connectedness Polynomial and Strong Connectedness Polynomials for first-order logic sentences. It turns out that these polynomials have the following interesting properties. First, they can be computed in polynomial time in the domain size for sentences from $C^2$. Second, we can use them to solve WFOMC with all of the existing axioms known to be tractable as well as with new ones such as bipartiteness, strong connectedness, being a spanning subgraph, having $k$ connected components, etc. Third, the well-known Tutte polynomial can be recovered as a special case of the Weak Connectedness Polynomial, and the Strict and Non-Strict Directed Chromatic Polynomials can be recovered from the Strong Connectedness Polynomials, which allows us to show that these important graph polynomials can be computed in time polynomial in the number of vertices for any graph that can be encoded by a fixed $C^2$ sentence and a conjunction of an arbitrary number of ground unary literals.

7/17/2024

📈

Lifted Algorithms for Symmetric Weighted First-Order Model Sampling

Yuanhong Wang, Juhua Pu, Yuyi Wang, Ondv{r}ej Kuv{z}elka

Weighted model counting (WMC) is the task of computing the weighted sum of all satisfying assignments (i.e., models) of a propositional formula. Similarly, weighted model sampling (WMS) aims to randomly generate models with probability proportional to their respective weights. Both WMC and WMS are hard to solve exactly, falling under the $#mathsf{P}$-hard complexity class. However, it is known that the counting problem may sometimes be tractable, if the propositional formula can be compactly represented and expressed in first-order logic. In such cases, model counting problems can be solved in time polynomial in the domain size, and are known as domain-liftable. The following question then arises: Is it also the case for weighted model sampling? This paper addresses this question and answers it affirmatively. Specifically, we prove the domain-liftability under sampling for the two-variables fragment of first-order logic with counting quantifiers in this paper, by devising an efficient sampling algorithm for this fragment that runs in time polynomial in the domain size. We then further show that this result continues to hold even in the presence of cardinality constraints. To empirically verify our approach, we conduct experiments over various first-order formulas designed for the uniform generation of combinatorial structures and sampling in statistical-relational models. The results demonstrate that our algorithm outperforms a start-of-the-art WMS sampler by a substantial margin, confirming the theoretical results.

6/17/2024

🔮

Adding Circumscription to Decidable Fragments of First-Order Logic: A Complexity Rollercoaster

Carsten Lutz, Quentin Mani`ere

We study extensions of expressive decidable fragments of first-order logic with circumscription, in particular the two-variable fragment FO$^2$, its extension C$^2$ with counting quantifiers, and the guarded fragment GF. We prove that if only unary predicates are minimized (or fixed) during circumscription, then decidability of logical consequence is preserved. For FO$^2$ the complexity increases from $textrm{coNexp}$ to $textrm{coNExp}^textrm{NP}$-complete, for GF it (remarkably!) increases from $textrm{2Exp}$ to $textrm{Tower}$-complete, and for C$^2$ the complexity remains open. We also consider querying circumscribed knowledge bases whose ontology is a GF sentence, showing that the problem is decidable for unions of conjunctive queries, $textrm{Tower}$-complete in combined complexity, and elementary in data complexity. Already for atomic queries and ontologies that are sets of guarded existential rules, however, for every $k geq 0$ there is an ontology and query that are $k$-$textrm{Exp}$-hard in data complexity.

7/31/2024