IME: Integrating Multi-curvature Shared and Specific Embedding for Temporal Knowledge Graph Completion

Read original: arXiv:2403.19881 - Published 4/1/2024 by Jiapu Wang, Zheng Cui, Boyue Wang, Shirui Pan, Junbin Gao, Baocai Yin, Wen Gao

Introduction

Knowledge Graphs (KGs) are structured data representations that organize information as entities and their relationships. Triplets consisting of head entity, relation, and tail entity model real-world facts. For example, (Albert Einstein, born_in, Germany) states that Albert Einstein was born in Germany. KGs allow machines to understand and reason about structured knowledge, uncovering patterns and supporting applications like recommendation systems, information retrieval, and semantic search. Their semantic modeling of knowledge enables informed decision-making by machines based on the encapsulated structured data.

Figure 1. A brief description of IME. Learning multi-curvature representations through space-shared and space-specific properties. These features are later utilized for subsequent predictions by the adjustable multi-curvature pooling.

The paper focuses on Temporal Knowledge Graphs (TKGs), which add temporal information to traditional knowledge graphs so that the evolution of knowledge can be tracked over time. TKGs represent knowledge as quadruplets (subject, relation, object, timestamp). Even large existing TKGs are incomplete, which hinders knowledge-driven systems and motivates Temporal Knowledge Graph Completion (TKGC), the task of predicting the missing elements.

The paper notes that different geometric spaces (hyperspherical, hyperbolic, Euclidean) excel at capturing different data structures. However, TKGs exhibit complex structures, and most TKGC methods model them in a single space, failing to effectively capture their intricate geometry. Current methods also overlook the spatial gap between curvature spaces, limiting expressive capacity. Additionally, existing feature fusion approaches are computationally expensive, even though pooling strategies are effective at reducing that cost.

To address these challenges, the paper proposes the Integrating Multi-curvature shared and specific Embedding (IME) model for TKGC tasks. IME simultaneously models TKGs in hyperspherical, hyperbolic, and Euclidean spaces, introducing a quadruplet distributor in each space. IME learns space-shared properties to mitigate the spatial gap by capturing shared information across curvature spaces, and space-specific properties to capture complementary information in each space.

The paper also introduces an Adjustable Multi-curvature Pooling (AMP) approach to learn appropriate pooling weights for superior pooling strategies and effective information retention. AMP aggregates space-shared and -specific representations for downstream predictions.
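To make the idea of learned pooling weights concrete, here is a minimal sketch. It assumes a position-weighted pooling: for each embedding dimension, the values coming from the different representations are sorted in descending order and combined with softmax-normalized weights. The function names and the `position_logits` parameter are hypothetical, and the paper's actual AMP module may compute its weights differently.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def adjustable_pool(features, position_logits):
    """Pool K vectors of length d into one vector of length d.

    features: list of K vectors, e.g. shared and specific representations.
    position_logits: K scores (hypothetical learnable parameters),
        one per rank position after sorting.
    """
    weights = softmax(position_logits)
    dim = len(features[0])
    pooled = []
    for j in range(dim):
        # Sort the j-th coordinate across all vectors, largest first.
        column = sorted((vec[j] for vec in features), reverse=True)
        pooled.append(sum(w * v for w, v in zip(weights, column)))
    return pooled

features = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]
mean_like = adjustable_pool(features, [0.0, 0.0, 0.0])   # uniform weights
max_like = adjustable_pool(features, [50.0, 0.0, 0.0])   # weight on rank 1
```

With uniform logits the sketch reduces to average pooling, and with nearly all weight on the first rank it approaches max pooling, so learned weights can interpolate between the two.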

Key contributions include the novel IME model with space-shared and -specific properties, the AMP module for effective pooling, a structure loss that encourages structural similarity across curvature spaces, and competitive performance on TKGC tasks.

Figure 2. The framework of IME. Specifically, IME models the query (Albert Einstein, Born In, ?, 1879-3-14) in multi-curvature spaces through information aggregation and information distribution. Subsequently, IME explores space-shared and space-specific properties to learn the commonalities and characteristics across different curvature spaces, effectively reducing spatial gaps among them. Finally, these identified features are employed for adjustable multi-curvature pooling in subsequent predictions.

Related work

The paper reviews knowledge graph completion (KGC) methods from two perspectives: Euclidean embedding-based methods and non-Euclidean embedding-based methods.

Euclidean Embedding-based Methods:

Static knowledge graph completion (SKGC) focuses on predicting missing triplets in static knowledge graphs.
Methods include translation-based (TransE, RotatE, TransR, SimplE, BoxE), semantic matching-based (DistMult, ComplEx, CapsE, TuckER, McRL, MLI), and convolutional neural network-based (ConvE, R-GCN, TDN).
Temporal knowledge graph completion (TKGC) aims to predict the missing elements (entities, relations, or timestamps) of quadruplets in temporal knowledge graphs.
Methods like TTransE, TA-TransE, TA-DistMult, ChronoR, TuckERTNT, BoxTE, HyTE, TeRo, TComplEx, DE-SimplE, ATiSE, TeLM, EvoExplore, BDME, and QDN are discussed.

Non-Euclidean Embedding-based Methods:

These methods embed knowledge graphs into non-Euclidean spaces to capture complex geometric structures.
For SKGC, methods like ATTH, BiQUE, MuRMP, and GIE are discussed.
For TKGC, methods such as DyERNIE and BiQCap are mentioned.

Problem Definition

The paper defines a temporal knowledge graph as $\mathcal{G}=\{\mathcal{E},\ \mathcal{R},\ \mathcal{T},\ \mathcal{Q}\}$, where $\mathcal{E}$, $\mathcal{R}$, and $\mathcal{T}$ are the sets of entities, relations, and timestamps. Each fact in the graph is a quadruplet $(\mathbf{s},\ \mathbf{r},\ \mathbf{o},\ \mathbf{t})\in\mathcal{Q}$, where $\mathbf{s},\ \mathbf{o}\in\mathcal{E}$ are the head and tail entities, $\mathbf{r}\in\mathcal{R}$ is the relation, and $\mathbf{t}\in\mathcal{T}$ is the timestamp.
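The definition can be illustrated with a small data-structure sketch. The `Quadruplet` type, the `answer_query` helper, and the toy facts are hypothetical and only show how the sets of entities, relations, and timestamps relate to the quadruplets. They are not taken from the paper's code.

```python
from typing import NamedTuple, Optional

class Quadruplet(NamedTuple):
    subject: str
    relation: str
    obj: str
    timestamp: str

# Toy facts for illustration only (placeholder names in the second fact).
facts = {
    Quadruplet("Albert Einstein", "born_in", "Germany", "1879-03-14"),
    Quadruplet("entity_a", "relation_x", "entity_b", "2014-01-01"),
}

# The entity, relation, and timestamp sets are induced by the quadruplets.
entities = {q.subject for q in facts} | {q.obj for q in facts}
relations = {q.relation for q in facts}
timestamps = {q.timestamp for q in facts}

def answer_query(subject: str, relation: str, timestamp: str) -> Optional[str]:
    """Look up the object of the query (subject, relation, ?, timestamp)."""
    for q in facts:
        if (q.subject, q.relation, q.timestamp) == (subject, relation, timestamp):
            return q.obj
    # Not stored: this is the case a TKGC model would have to predict.
    return None
```

A lookup only succeeds for stored facts. Completion methods such as IME instead score every candidate entity for the missing slot and rank them.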

Methodology

The section describes the proposed model IME (Integrating Multi-curvature shared and specific Embedding) for temporal knowledge graph completion. IME consists of three main stages:

Multi-curvature Embeddings: Entities, relations, and timestamps are simultaneously modeled in Euclidean, hyperbolic, and hyperspherical spaces to capture complex geometric structures in temporal knowledge graphs. An information aggregation and distribution process is used to update the representations across spaces.
Space-shared and -specific Representations: Encoding functions with shared and specific parameters are employed to capture the commonalities and unique characteristics across different curvature spaces.
Adjustable Multi-curvature Pooling (AMP): An adjustable pooling approach is proposed to effectively retain important information when aggregating the space-shared and -specific representations for prediction.
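As a rough illustration of the first stage, the sketch below maps the same tangent vector into spaces of zero, negative, and positive curvature using the standard exponential map at the origin of the stereographic model. This is a common formulation for constant-curvature embeddings and is an assumption here. The paper's exact mappings, and its aggregation and distribution steps, are not reproduced.

```python
import math

def exp_map_origin(v, curvature):
    """Map a tangent vector at the origin into a constant-curvature space.

    curvature == 0: Euclidean space (identity map).
    curvature < 0:  hyperbolic space (Poincare ball), uses tanh.
    curvature > 0:  hyperspherical space (stereographic model), uses tan;
                    requires sqrt(curvature) * ||v|| < pi / 2.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if curvature == 0 or norm == 0:
        return list(v)
    root = math.sqrt(abs(curvature))
    if curvature < 0:
        scale = math.tanh(root * norm) / (root * norm)
    else:
        scale = math.tan(root * norm) / (root * norm)
    return [scale * x for x in v]

def vector_norm(v):
    return math.sqrt(sum(x * x for x in v))

v = [0.3, 0.4]                         # tangent vector with norm 0.5
euclidean = exp_map_origin(v, 0.0)
hyperbolic = exp_map_origin(v, -1.0)   # norm shrinks, stays inside the unit ball
spherical = exp_map_origin(v, 1.0)     # norm grows
```

The same input lands at different distances from the origin depending on curvature, which is one way to see why representations from different spaces are not directly comparable and why the model adds shared and specific encoders on top.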

Additionally, the loss function consists of four components: the task loss for link prediction, similarity loss to bridge spatial gaps, difference loss to capture space-specific features, and structure loss to ensure structural similarity across spaces.
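The four-part objective can be sketched as a weighted sum. The concrete forms below are assumptions chosen for simplicity: a mean squared difference for the similarity term, a squared dot product (orthogonality penalty) for the difference term, and a comparison of pairwise inner products for the structure term. The weights `alpha`, `beta`, and `gamma` are hypothetical names for the loss weights. The paper's actual loss definitions may differ.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def similarity_loss(shared_a, shared_b):
    """Pull the shared representations of two spaces together (assumed MSE)."""
    return sum((a - b) ** 2 for a, b in zip(shared_a, shared_b)) / len(shared_a)

def difference_loss(shared, specific):
    """Push shared and specific representations toward orthogonality."""
    return dot(shared, specific) ** 2

def structure_loss(batch_a, batch_b):
    """Compare pairwise inner products of a batch in two spaces."""
    n = len(batch_a)
    total = 0.0
    for i in range(n):
        for j in range(n):
            gap = dot(batch_a[i], batch_a[j]) - dot(batch_b[i], batch_b[j])
            total += gap ** 2
    return total / (n * n)

def total_loss(task, sim, diff, struct, alpha=1.0, beta=1.0, gamma=1.0):
    """Task loss plus weighted auxiliary terms."""
    return task + alpha * sim + beta * diff + gamma * struct
```

In this sketch each auxiliary term is zero exactly when its goal is met: identical shared vectors, orthogonal shared and specific vectors, or matching pairwise structure.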

The model is designed to comprehensively capture the complex structures and characteristics present in temporal knowledge graphs across multiple curvature spaces.

Experiment

This section describes the datasets used, experimental setup, evaluation metrics, and analysis of results for the proposed temporal knowledge graph completion (TKGC) model.

Three commonly used temporal knowledge graph (TKG) datasets are employed: ICEWS14, ICEWS05-15, and GDELT. The proposed model is compared against various baselines, including static KG completion (SKGC) methods like TransE, DistMult, SimplE, RotatE, and temporal KGC (TKGC) methods such as TA-DistMult, TeRo, ChronoR, ATiSE, TeLM, TuckERTNT, BoxTE, BDME, EvoExplore, DyERNIE, BiQCap, and QDN.

Link prediction metrics used are Mean Reciprocal Rank (MRR) and Hits@N (N=1,3,10). Hyperparameters like loss weights, embedding dimension, and learning rate are tuned using grid search on the validation set.
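Both metrics are computed from the rank that the model assigns to the correct entity for each test query. A minimal sketch, with hypothetical function names and made-up ranks:

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at(ranks, n):
    """Fraction of test queries whose correct answer is ranked in the top n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)

# Made-up ranks of the correct entity for four test queries.
ranks = [1, 2, 5, 20]

mrr = mean_reciprocal_rank(ranks)   # (1 + 0.5 + 0.2 + 0.05) / 4 = 0.4375
hits1 = hits_at(ranks, 1)           # 0.25
hits3 = hits_at(ranks, 3)           # 0.5
hits10 = hits_at(ranks, 10)         # 0.75
```

Higher is better for both. MRR rewards placing the correct answer near the top, while Hits@N only checks whether it falls within the first N candidates.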

The key results and analyses are:

The proposed model outperforms state-of-the-art baselines across most metrics on all three datasets, showing its effectiveness in modeling complex geometric structures and reducing spatial gaps between different curvature spaces.
Ablation studies highlight the importance of the similarity loss in learning common features, the difference loss in capturing space-specific characteristics, and the structure loss in constraining embeddings across spaces.
Using an adjustable multi-curvature pooling approach improves performance over max pooling.
There is an optimal embedding dimension (500) beyond which performance degrades, likely due to overfitting or increased complexity.

The results demonstrate the proposed model's ability to effectively represent and complete temporal knowledge graphs by leveraging multiple curvature spaces and reducing spatial gaps between them.

Conclusion

The paper proposes a novel temporal knowledge graph completion (TKGC) method called Integrating Multi-curvature shared and specific Embedding (IME). IME models temporal knowledge graphs in multi-curvature spaces to capture complex geometric structures. It learns space-specific properties to comprehensively capture characteristic information and space-shared properties to reduce spatial gaps caused by heterogeneity across different curvature spaces.

IME innovatively introduces an Adjustable Multi-curvature Pooling (AMP) approach to effectively strengthen the retention of important information. Experimental results on several well-established datasets demonstrate that IME achieves competitive performance compared to state-of-the-art TKGC methods.

The work was funded by the National Key R&D Program of China, National Natural Science Foundation of China, and the R&D Program of Beijing Municipal Education Commission.
