Enhancing Geometric Ontology Embeddings for $\mathcal{EL}^{++}$ with Negative Sampling and Deductive Closure Filtering

Read original: arXiv:2405.04868 - Published 6/27/2024 by Olga Mashkova, Fernando Zhapa-Camacho, Robert Hoehndorf

Overview

Ontology embeddings map classes, relations, and individuals in ontologies into a high-dimensional vector space.
This allows for similarity computations and inference of new axioms within the vector space.
The paper focuses on ontologies in the Description Logic $\mathcal{EL}^{++}$, and evaluates several embedding methods that explicitly generate models of the ontology.
The paper proposes modifications to existing methods to better utilize the deductive closure of the ontology and account for different types of negatives.
The goal is to improve ontology completion tasks compared to baseline ontology embedding methods.

Plain English Explanation

Ontologies are like structured knowledge databases that define the relationships between different concepts. Ontology embeddings take these ontologies and map the classes, relationships, and individual entities into a high-dimensional vector space. This allows us to compute the similarity between different entities in the ontology and even infer new logical connections that may not have been explicitly stated.

The paper focuses on ontologies written in $\mathcal{EL}^{++}$, a lightweight Description Logic. The authors evaluate several methods for creating these ontology embeddings and find that the existing approaches have limitations. Specifically, they do not distinguish between statements that are merely unprovable and those that are provably false, so statements the ontology actually entails can end up being used as negative examples. They also do not use the full set of logical inferences that can be drawn from the ontology.

To address these issues, the researchers propose some modifications to the embedding methods. They develop new "negative losses" that better account for the deductive closure of the ontology and the different types of negative statements. By incorporating these changes, they are able to improve the performance of the ontology embeddings on the task of knowledge base or ontology completion.

Technical Explanation

The paper focuses on ontology embeddings for ontologies in the Description Logic $\mathcal{EL}^{++}$. Ontology embeddings map the classes, relations, and individuals in an ontology into a high-dimensional vector space ($\mathbb{R}^n$), where the similarity between entities can be computed or new axioms inferred.
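To make the geometric idea concrete, here is a minimal sketch of a ball-based embedding, in which each class is a ball (center and radius) in $\mathbb{R}^n$ and a subclass axiom $C \sqsubseteq D$ corresponds to ball $C$ lying inside ball $D$. The function names, the toy classes, and the exact hinge form are illustrative assumptions, not the loss definitions used in the paper.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inclusion_loss(center_c, radius_c, center_d, radius_d, margin=0.0):
    """Hinge-style penalty that is zero when ball C lies inside ball D.

    Ball C is contained in ball D exactly when
    distance(centers) + radius_c <= radius_d.
    (Illustrative form; an assumption, not the paper's exact loss.)
    """
    return max(0.0, distance(center_c, center_d) + radius_c - radius_d - margin)

# Hypothetical 2-D embeddings: (center, radius)
dog = ((0.0, 0.0), 1.0)
animal = ((0.5, 0.0), 2.0)

loss_dog_animal = inclusion_loss(*dog, *animal)  # Dog ball fits inside Animal ball
loss_animal_dog = inclusion_loss(*animal, *dog)  # Animal ball does not fit inside Dog ball
print(loss_dog_animal, loss_animal_dog)
```

In this toy setup the first loss is zero because the Dog ball is fully inside the Animal ball, while the reverse direction is penalized. Training would adjust centers and radii so that asserted axioms reach zero loss.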

The authors evaluate a set of embedding methods for $\mathcal{EL}^{++}$ ontologies that explicitly generate models of the ontology. However, they find that these existing methods have some limitations. Specifically, the methods do not distinguish between statements that are unprovable and those that are provably false, and they may use entailed statements as negative examples. Additionally, they do not fully utilize the deductive closure of the ontology to identify statements that are inferred but not asserted.
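The problem of entailed statements being sampled as negatives can be illustrated with a small sketch. It assumes a toy ontology containing only atomic subclass axioms, for which the deductive closure reduces to a reflexive-transitive closure; a real $\mathcal{EL}^{++}$ ontology would need a proper reasoner. All names here (`subsumption_closure`, `sample_negative`, the toy classes) are hypothetical.

```python
import random

def subsumption_closure(classes, asserted):
    """Reflexive-transitive closure of asserted (sub, super) pairs.

    Simplifying assumption: only atomic subclass axioms exist, so this
    stands in for the deductive closure a reasoner would compute.
    """
    closure = {(c, c) for c in classes} | set(asserted)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def sample_negative(sub, classes, closure, rng):
    """Pick a corrupted superclass for `sub` that is NOT entailed."""
    candidates = [d for d in classes if (sub, d) not in closure]
    return (sub, rng.choice(candidates)) if candidates else None

classes = ["Dog", "Mammal", "Animal", "Plant"]
asserted = {("Dog", "Mammal"), ("Mammal", "Animal")}

closure = subsumption_closure(classes, asserted)
# ("Dog", "Animal") is inferred but not asserted; unfiltered random
# corruption could wrongly use it as a negative example.
negative = sample_negative("Dog", classes, closure, random.Random(0))
print(negative)
```

Without the closure check, corrupting the axiom Dog ⊑ Mammal could yield Dog ⊑ Animal as a "negative", even though the ontology entails it. Filtering against the closure leaves only Plant as a candidate in this example.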

To address these issues, the researchers propose novel modifications to the embedding methods. They design new negative losses that account for both the deductive closure of the ontology and the different types of negatives (unprovable vs. provably false). The goal is to better leverage the structure and logical properties of the $\mathcal{EL}^{++}$ ontology in the embedding process.
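This summary does not reproduce the paper's loss definitions, so the following is only a generic sketch of what a negative loss could look like in a ball-based embedding: for a pair $(C, D)$ treated as a negative for $C \sqsubseteq D$, it penalizes ball $C$ being contained in ball $D$. The hinge form, the margin value, and the toy embeddings are all assumptions made for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def negative_inclusion_loss(center_c, radius_c, center_d, radius_d, margin=0.1):
    """Penalty that is positive while ball C is (nearly) inside ball D.

    The quantity distance + radius_c - radius_d is <= 0 exactly when C is
    contained in D; the loss drops to zero once that quantity reaches
    `margin`. (Generic hinge, assumed here; not the paper's loss.)
    """
    overflow = distance(center_c, center_d) + radius_c - radius_d
    return max(0.0, margin - overflow)

# Hypothetical 2-D embeddings: (center, radius)
dog = ((0.0, 0.0), 1.0)
animal = ((0.5, 0.0), 2.0)
plant = ((5.0, 0.0), 1.0)

# Dog is inside Animal, so using (Dog, Animal) as a negative is penalized.
# This is why an entailed pair should be filtered out before this loss is applied.
loss_entailed_pair = negative_inclusion_loss(*dog, *animal)
# Dog is far from Plant, so the negative (Dog, Plant) is already satisfied.
loss_true_negative = negative_inclusion_loss(*dog, *plant)
print(loss_entailed_pair, loss_true_negative)
```

The example also shows why the choice of negatives matters: applying such a loss to an entailed pair pushes the embedding away from a model of the ontology, whereas applying it to a non-entailed pair does not conflict with the positive losses.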

The authors evaluate their proposed embedding methods on the task of knowledge base or ontology completion, and demonstrate that their approaches outperform the baseline ontology embedding methods. This suggests that incorporating the deductive closure and different types of negatives can lead to more effective ontology embeddings.

Critical Analysis

The paper presents a thoughtful approach to improving ontology embeddings for $\mathcal{EL}^{++}$ ontologies by addressing limitations in existing methods. The proposed modifications, such as the novel negative losses, seem well-justified and the experimental results support the benefits of these changes.

However, the paper does not delve deeply into the potential limitations or caveats of the research. For example, it would be interesting to understand how the embedding methods scale to larger or more complex ontologies, or how sensitive the performance is to the specific ontology structure or content.

Additionally, the paper does not explore the potential trade-offs or challenges in fully utilizing the deductive closure of the ontology. There may be computational or practical considerations that should be addressed, such as the time and memory requirements of working with the complete logical inferences.

Further research could also investigate how these ontology embedding techniques interact with other AI models and methods, such as hierarchical dynamic labeling or large language models used as oracles. Integrating ontology embeddings with these approaches could lead to even more powerful and versatile AI systems.

Conclusion

This paper presents a significant advancement in the field of ontology embeddings, particularly for ontologies in the $\mathcal{EL}^{++}$ Description Logic. By incorporating the deductive closure of the ontology and accounting for different types of negatives, the proposed embedding methods demonstrate improved performance on the task of ontology completion.

The insights from this research could have broad implications for knowledge representation, reasoning, and integration in AI systems. Effective ontology embeddings are a crucial component for building intelligent agents that can understand and reason about complex, structured knowledge. As the field of AI continues to evolve, advancements in this area will be essential for developing more robust and capable systems that can tackle real-world challenges.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Enhancing Geometric Ontology Embeddings for $\mathcal{EL}^{++}$ with Negative Sampling and Deductive Closure Filtering

Olga Mashkova, Fernando Zhapa-Camacho, Robert Hoehndorf

Ontology embeddings map classes, relations, and individuals in ontologies into $\mathbb{R}^n$, and within $\mathbb{R}^n$ similarity between entities can be computed or new axioms inferred. For ontologies in the Description Logic $\mathcal{EL}^{++}$, several embedding methods have been developed that explicitly generate models of an ontology. However, these methods suffer from some limitations; they do not distinguish between statements that are unprovable and provably false, and therefore they may use entailed statements as negatives. Furthermore, they do not utilize the deductive closure of an ontology to identify statements that are inferred but not asserted. We evaluated a set of embedding methods for $\mathcal{EL}^{++}$ ontologies based on high-dimensional ball representation of concept descriptions, incorporating several modifications that aim to make use of the ontology deductive closure. In particular, we designed novel negative losses that account both for the deductive closure and different types of negatives. We demonstrate that our embedding methods improve over the baseline ontology embedding in the task of knowledge base or ontology completion.

6/27/2024

Lattice-preserving $\mathcal{ALC}$ ontology embeddings

Fernando Zhapa-Camacho, Robert Hoehndorf

Generating vector representations (embeddings) of OWL ontologies is a growing task due to its applications in predicting missing facts and knowledge-enhanced learning in fields such as bioinformatics. The underlying semantics of OWL ontologies is expressed using Description Logics (DLs). Initial approaches to generate embeddings relied on constructing a graph out of ontologies, neglecting the semantics of the logic therein. Recent semantic-preserving embedding methods often target lightweight DL languages like $\mathcal{EL}^{++}$, ignoring more expressive information in ontologies. Although some approaches aim to embed more descriptive DLs like $\mathcal{ALC}$, those methods require the existence of individuals, while many real-world ontologies are devoid of them. We propose an ontology embedding method for the $\mathcal{ALC}$ DL language that considers the lattice structure of concept descriptions. We use connections between DL and Category Theory to materialize the lattice structure and embed it using an order-preserving embedding method. We show that our method outperforms state-of-the-art methods in several knowledge base completion tasks. We make our code and data available at https://github.com/bio-ontology-research-group/catE.

5/9/2024

Towards Ontology-Enhanced Representation Learning for Large Language Models

Francesco Ronzano, Jay Nanavati

Taking advantage of the widespread use of ontologies to organise and harmonize knowledge across several distinct domains, this paper proposes a novel approach to improve an embedding-Large Language Model (embedding-LLM) of interest by infusing the knowledge formalized by a reference ontology: ontological knowledge infusion aims at boosting the ability of the considered LLM to effectively model the knowledge domain described by the infused ontology. The linguistic information (i.e. concept synonyms and descriptions) and structural information (i.e. is-a relations) formalized by the ontology are utilized to compile a comprehensive set of concept definitions, with the assistance of a powerful generative LLM (i.e. GPT-3.5-turbo). These concept definitions are then employed to fine-tune the target embedding-LLM using a contrastive learning framework. To demonstrate and evaluate the proposed approach, we utilize the biomedical disease ontology MONDO. The results show that embedding-LLMs enhanced by ontological disease knowledge exhibit an improved capability to effectively evaluate the similarity of in-domain sentences from biomedical documents mentioning diseases, without compromising their out-of-domain performance.

6/3/2024

Ontology Embedding: A Survey of Methods, Applications and Resources

Jiaoyan Chen, Olga Mashkova, Fernando Zhapa-Camacho, Robert Hoehndorf, Yuan He, Ian Horrocks

Ontologies are widely used for representing domain knowledge and meta data, playing an increasingly important role in Information Systems, the Semantic Web, Bioinformatics and many other domains. However, logical reasoning that ontologies can directly support are quite limited in learning, approximation and prediction. One straightforward solution is to integrate statistical analysis and machine learning. To this end, automatically learning vector representation for knowledge of an ontology i.e., ontology embedding has been widely investigated in recent years. Numerous papers have been published on ontology embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field. To bridge this gap, we write this survey paper, which first introduces different kinds of semantics of ontologies, and formally defines ontology embedding from the perspectives of both mathematics and machine learning, as well as its property of faithfulness. Based on this, it systematically categorises and analyses a relatively complete set of over 80 papers, according to the ontologies and semantics that they aim at, and their technical solutions including geometric modeling, sequence modeling and graph propagation. This survey also introduces the applications of ontology embedding in ontology engineering, machine learning augmentation and life sciences, presents a new library mOWL, and discusses the challenges and future directions.

6/18/2024