Heterogeneous network and graph attention auto-encoder for LncRNA-disease association prediction

Read original: arXiv:2405.02354 - Published 5/7/2024 by Jin-Xing Liu, Wen-Yu Xi, Ling-Yun Dai, Chun-Hou Zheng, Ying-Lian Gao
Overview

  • The paper addresses the problem of predicting which long non-coding RNAs (lncRNAs) are associated with complex human diseases.
  • It proposes a novel deep learning model called HGATELDA that effectively integrates both linear and nonlinear characteristics of lncRNAs and diseases to predict lncRNA-disease associations (LDAs).
  • The model achieves an AUC (area under the ROC curve) of 0.9692 in 5-fold cross-validation, outperforming several recent prediction models.

Plain English Explanation

Long non-coding RNAs (lncRNAs) are RNA molecules that do not code for proteins but play important roles in various biological processes. The paper explains that lncRNAs are associated with a range of complex human diseases, but most existing methods struggle to identify the nonlinear relationships between lncRNAs and diseases.

To address this challenge, the researchers developed a novel deep learning model called HGATELDA. This model takes advantage of multiple sources of biomedical data to construct comprehensive characteristics of lncRNAs and diseases, including both linear and nonlinear features.

The linear features are derived from the interactions between lncRNAs, microRNAs (miRNAs), and diseases, which are represented as matrices. The nonlinear features are extracted using a graph attention auto-encoder, which can capture critical information and effectively aggregate the neighborhood information of the nodes (lncRNAs and diseases) in the network.

By combining the linear and nonlinear characteristics, the HGATELDA model can accurately predict the associations between lncRNAs and diseases. The model's impressive performance in cross-validation, with an AUC of 0.9692, suggests that it is a promising computational tool for identifying new lncRNA-disease associations, which can be valuable for disease warning and treatment.

Technical Explanation

The paper proposes a novel deep learning model, HGATELDA, for the accurate identification of lncRNA-disease associations (LDAs). The model effectively integrates both linear and nonlinear characteristics of lncRNAs and diseases to predict these associations.

To construct the linear features, the researchers utilized the miRNA-lncRNA interaction matrix and the miRNA-disease interaction matrix. These matrices capture the relationships between lncRNAs, miRNAs, and diseases, which are known to be important for disease development.
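The summary does not spell out how the two matrices are combined, so the following is only a minimal sketch under one plausible assumption: multiplying a lncRNA-miRNA matrix by a miRNA-disease matrix gives, for each lncRNA-disease pair, the number of miRNAs they share. The toy matrices and the function name `linear_features` are hypothetical and not taken from the paper.

```python
import numpy as np

# Toy binary interaction matrices (hypothetical data, not from the paper).
# Rows of lnc_mi are lncRNAs, columns are miRNAs.
lnc_mi = np.array([
    [1, 0, 1],
    [0, 1, 0],
])
# Rows of mi_dis are miRNAs, columns are diseases.
mi_dis = np.array([
    [1, 0],
    [0, 1],
    [1, 1],
])

def linear_features(lnc_mi, mi_dis):
    """Count the miRNAs shared by each lncRNA-disease pair (assumed construction)."""
    return lnc_mi @ mi_dis

shared = linear_features(lnc_mi, mi_dis)
print(shared)  # entry (i, j) = number of miRNAs linking lncRNA i and disease j
```

Under this assumption the feature is linear in the interaction data: each entry is a plain sum over miRNAs, with no learned transformation.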

For the nonlinear features, the researchers employed a graph attention auto-encoder, a type of graph neural network that weights each neighbor of a node by a learned attention score before aggregating neighborhood information. This allows the model to retain critical information and capture the complex, nonlinear relationships between lncRNAs and diseases.
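To make the idea of attention-weighted neighborhood aggregation concrete, here is a minimal forward-pass sketch of one generic single-head graph attention layer followed by an inner-product decoder. It is an illustration under assumptions, not the HGATELDA architecture: the layer sizes, the random untrained weights, the toy graph, and the names `gat_layer` and `decode` are all hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(h, adj, W, a):
    """One single-head graph attention layer (forward pass only).

    h:   (n, f_in) node features
    adj: (n, n) binary adjacency matrix that includes self-loops
    W:   (f_in, f_out) weight matrix
    a:   (2 * f_out,) attention vector
    """
    z = h @ W                                   # projected features, (n, f_out)
    f_out = z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), computed as a sum of two parts
    src = z @ a[:f_out]
    dst = z @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :])
    e = np.where(adj > 0, e, -np.inf)           # ignore non-neighbors
    e = e - e.max(axis=1, keepdims=True)        # numerical stability
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax over neighbors
    return alpha @ z, alpha

def decode(z):
    """Inner-product decoder: sigmoid(z z^T) as reconstructed edge scores."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

rng = np.random.default_rng(0)
n, f_in, f_out = 4, 5, 3
adj = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
])
h = rng.normal(size=(n, f_in))
W = rng.normal(size=(f_in, f_out))              # untrained, random weights
a = rng.normal(size=2 * f_out)

emb, alpha = gat_layer(h, adj, W, a)
recon = decode(emb)
```

In a trained auto-encoder, `W` and `a` would be optimized so that `recon` approximates the input graph, and `emb` would then serve as the nonlinear node features.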

The final LDA predictions are made by fusing the linear and nonlinear characteristics of lncRNAs and diseases. In the evaluation, the HGATELDA model achieved an AUC of 0.9692 under 5-fold cross-validation, outperforming several recent LDA prediction models.
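For readers unfamiliar with the metric, the sketch below shows how an AUC can be computed from labels and prediction scores, and how samples can be split into five folds. It is a generic illustration with made-up labels and scores; the helper names `auc_score` and `k_fold_indices` are hypothetical, and the paper's exact evaluation protocol may differ.

```python
import numpy as np

def auc_score(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]          # every positive vs. every negative
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

# Made-up labels (1 = known association) and prediction scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = auc_score(labels, scores)

# Each fold is held out once for testing while the rest is used for training.
folds = k_fold_indices(10, k=5)
```

An AUC of 1.0 means every positive pair is ranked above every negative pair, and 0.5 corresponds to random ranking, which puts the reported 0.9692 in context.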

Critical Analysis

The paper presents a comprehensive and innovative approach to predicting lncRNA-disease associations, which is an important task for understanding the underlying mechanisms of complex diseases and developing new treatments. The use of both linear and nonlinear features, as well as the integration of multiple data sources, is a strength of the HGATELDA model.

However, the paper does not address some potential limitations or areas for further research. For example, the model's performance on specific disease types or rare diseases is not explored, and the interpretability of the nonlinear feature extraction process could be discussed in more depth. Additionally, the paper does not compare the model's performance to state-of-the-art methods that use advanced language models like BioBERT for biomedical text processing and feature extraction.

Overall, the HGATELDA model appears to be a valuable contribution to the field of lncRNA-disease association prediction, but further research and evaluation could help to address these potential limitations and solidify its position as a leading computational tool in this area.

Conclusion

The paper presents a novel deep learning model, HGATELDA, that effectively integrates linear and nonlinear characteristics of lncRNAs and diseases to accurately predict lncRNA-disease associations. The model's impressive performance, with an AUC of 0.9692 in cross-validation, suggests that it is a promising computational tool for identifying new lncRNA-disease associations, which can be crucial for disease warning, treatment, and the understanding of complex disease mechanisms.

The use of graph attention auto-encoders to capture nonlinear features, combined with the integration of linear characteristics, is a novel and insightful approach that could inspire further advancements in the field of biomedical data analysis and disease prediction.



Related Papers

Heterogeneous network and graph attention auto-encoder for LncRNA-disease association prediction

Jin-Xing Liu, Wen-Yu Xi, Ling-Yun Dai, Chun-Hou Zheng, Ying-Lian Gao

Emerging research shows that lncRNAs are associated with a series of complex human diseases. However, most of the existing methods have limitations in identifying nonlinear lncRNA-disease associations (LDAs), and it remains a huge challenge to predict new LDAs. Therefore, the accurate identification of LDAs is very important for the warning and treatment of diseases. In this work, multiple sources of biomedical data are fully utilized to construct characteristics of lncRNAs and diseases, and linear and nonlinear characteristics are effectively integrated. Furthermore, a novel deep learning model based on a graph attention auto-encoder is proposed, called HGATELDA. To begin with, the linear characteristics of lncRNAs and diseases are created by the miRNA-lncRNA interaction matrix and miRNA-disease interaction matrix. Following this, the nonlinear features of diseases and lncRNAs are extracted using a graph attention auto-encoder, which largely retains the critical information and effectively aggregates the neighborhood information of nodes. In the end, LDAs can be predicted by fusing the linear and nonlinear characteristics of diseases and lncRNAs. The HGATELDA model achieves an impressive AUC value of 0.9692 when evaluated using 5-fold cross-validation, indicating its superior performance in comparison to several recent prediction models. Meanwhile, the effectiveness of HGATELDA in identifying novel LDAs is further demonstrated by case studies. The HGATELDA model appears to be a viable computational model for predicting LDAs.

5/7/2024

LncRNA-disease association prediction method based on heterogeneous information completion and convolutional neural network

Wen-Yu Xi, Juan Wang, Yu-Lin Zhang, Jin-Xing Liu, Yin-Lian Gao

Emerging research shows that lncRNA has crucial research value in a series of complex human diseases. Therefore, the accurate identification of lncRNA-disease associations (LDAs) is very important for the warning and treatment of diseases. However, most of the existing methods have limitations in identifying nonlinear LDAs, and it remains a huge challenge to predict new LDAs. In this paper, a deep learning model based on a heterogeneous network and convolutional neural network (CNN) is proposed for lncRNA-disease association prediction, named HCNNLDA. First, a heterogeneous network containing lncRNA, disease, and miRNA nodes is constructed. The embedding matrix of a lncRNA-disease node pair is constructed according to various biological premises about lncRNAs, diseases, and miRNAs. Then, the low-dimensional feature representation is fully learned by the convolutional neural network. In the end, the XGBoost classifier model is trained to predict the potential LDAs. HCNNLDA obtains a high AUC value of 0.9752 and AUPR of 0.9740 under 5-fold cross-validation. The experimental results show that the proposed model performs better than several recent prediction models. Meanwhile, the effectiveness of HCNNLDA in identifying novel LDAs is further demonstrated by case studies of three diseases. To sum up, HCNNLDA is a feasible computational model for predicting LDAs.

6/6/2024

Heterogeneous Causal Metapath Graph Neural Network for Gene-Microbe-Disease Association Prediction

Kexin Zhang, Feng Huang, Luotao Liu, Zhankun Xiong, Hongyu Zhang, Yuan Quan, Wen Zhang

The recent focus on microbes in human medicine highlights their potential role in the genetic framework of diseases. To decode the complex interactions among genes, microbes, and diseases, computational predictions of gene-microbe-disease (GMD) associations are crucial. Existing methods primarily address gene-disease and microbe-disease associations, but the more intricate triple-wise GMD associations remain less explored. In this paper, we propose a Heterogeneous Causal Metapath Graph Neural Network (HCMGNN) to predict GMD associations. HCMGNN constructs a heterogeneous graph linking genes, microbes, and diseases through their pairwise associations, and utilizes six predefined causal metapaths to extract directed causal subgraphs, which facilitate the multi-view analysis of causal relations among the three entity types. Within each subgraph, we employ a causal semantic sharing message passing network for node representation learning, coupled with an attentive fusion method to integrate these representations for predicting GMD associations. Our extensive experiments show that HCMGNN effectively predicts GMD associations and addresses the association sparsity issue by enhancing the graph's semantics and structure.

6/28/2024

Optimizing Disease Prediction with Artificial Intelligence Driven Feature Selection and Attention Networks

D. Dhinakaran, S. Edwin Raja, M. Thiyagarajan, J. Jeno Jasmine, P. Raghavan

The rapid integration of machine learning methodologies in healthcare has ignited innovative strategies for disease prediction, particularly with the vast repositories of Electronic Health Records (EHR) data. This article delves into the realm of multi-disease prediction, presenting a comprehensive study that introduces a pioneering ensemble feature selection model. This model, designed to optimize learning systems, combines statistical, deep, and optimally selected features through the innovative Stabilized Energy Valley Optimization with Enhanced Bounds (SEV-EB) algorithm. The objective is to achieve unparalleled accuracy and stability in predicting various disorders. This work proposes an advanced ensemble model that synergistically integrates statistical, deep, and optimally selected features. This combination aims to enhance the predictive power of the model by capturing diverse aspects of the health data. At the heart of the proposed model lies the SEV-EB algorithm, a novel approach to optimal feature selection. The algorithm introduces enhanced bounds and stabilization techniques, contributing to the robustness and accuracy of the overall prediction model. To further elevate the predictive capabilities, an HSC-AttentionNet is introduced. This network architecture combines deep temporal convolution capabilities with LSTM, allowing the model to capture both short-term patterns and long-term dependencies in health data. Rigorous evaluations showcase the remarkable performance of the proposed model. Achieving a 95% accuracy and 94% F1-score in predicting various disorders, the model surpasses traditional methods, signifying a significant advancement in disease prediction accuracy. The implications of this research extend beyond the confines of academia.

8/7/2024