Integration of Genetic Algorithms and Deep Learning for the Generation and Bioactivity Prediction of Novel Tyrosine Kinase Inhibitors

Read original: arXiv:2408.07155 - Published 8/15/2024 by Ricardo Romero

Overview

Researchers combined genetic algorithms and deep learning models to accelerate drug discovery.
The approach focused on generating novel tyrosine kinase inhibitors and predicting their bioactivity.
Genetic algorithms were used to create new drug-like molecules with optimized properties.
A deep learning model predicted the bioactivity of these generated molecules against tyrosine kinases.
Integrating these computational methods can speed up early-stage drug discovery.

Plain English Explanation

Finding a new drug means searching an enormous space of possible molecules, most of which will not work. The researchers combine genetic algorithms with deep learning to make that search more efficient.

Genetic algorithms are an optimization technique inspired by natural selection: candidate solutions are scored, the best are recombined and mutated, and the cycle repeats. Here they are used to generate new drug-like molecules whose properties are optimized for how the body absorbs, distributes, metabolizes, and excretes the drug and how toxic it is (ADMET), as well as for overall "drug-likeness."
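To make the idea concrete, here is a minimal sketch of a genetic algorithm loop (selection, crossover, mutation). It is not the paper's implementation: the bit-string genome, the `toy_fitness` function, and every parameter value are placeholders chosen for illustration. In a molecular setting the genome would encode a molecule and the fitness would score ADMET and drug-likeness properties; this summary does not say which encoding the paper used.

```python
import random


def evolve(fitness, genome_len=16, pop_size=20, generations=30,
           mutation_rate=0.05, seed=0):
    """Minimal genetic algorithm over bit-string genomes (illustrative only)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation

    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))

        parents = ranked[: pop_size // 2]  # truncation selection: keep top half
        children = [ranked[0][:]]          # elitism: carry the best over unchanged

        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)

        population = children

    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(genome):
    """Placeholder objective: count of 1-bits. Stands in for a property score."""
    return sum(genome)


best, history = evolve(toy_fitness)
print("best fitness:", toy_fitness(best))
```

Because the best individual is copied into the next generation unchanged, the best fitness recorded in `history` can never decrease from one generation to the next.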

At the same time, the researchers used a deep learning model to predict how well these generated molecules would interact with and inhibit tyrosine kinases. Tyrosine kinases are enzymes involved in many cellular processes, including the progression of cancer.

By combining these two powerful computational approaches, the researchers demonstrate a framework that can speed up the early stages of drug discovery. Instead of having to test many random compounds, this method can generate and evaluate promising new drug candidates more efficiently.

Technical Explanation

The researchers developed a combined approach using genetic algorithms and deep learning models to address two key aspects of drug discovery:

Generating novel tyrosine kinase inhibitors: The researchers used a generative model based on genetic algorithms to create new small-molecule compounds with optimized ADMET (absorption, distribution, metabolism, excretion, and toxicity) and drug-likeness properties.
Predicting bioactivity: Concurrently, the researchers employed a deep learning model to predict the bioactivity of the generated molecules against tyrosine kinases, a family of enzymes involved in various cellular processes and cancer progression.
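The prediction step can be pictured as a classifier that maps a molecular fingerprint to a probability of activity. The sketch below is a small one-hidden-layer network written in plain Python, trained on made-up 6-bit fingerprints. The architecture, the `TinyMLP` name, the data, and the hyperparameters are all assumptions for illustration; this summary does not describe the network, input features, or training set used in the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One tanh hidden layer and a sigmoid output: P(active | fingerprint)."""

    def __init__(self, n_inputs, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr=0.1):
        """One stochastic gradient step on binary cross-entropy loss."""
        hidden, out = self.forward(x)
        delta_out = out - y  # dLoss/d(pre-sigmoid activation)
        for j, h in enumerate(hidden):
            # Backpropagate through tanh using the weight before it is updated.
            delta_h = delta_out * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out


# Made-up fingerprints and labels (1 = active, 0 = inactive), not real data.
train = [
    ([1, 1, 0, 0, 1, 0], 1),
    ([1, 0, 1, 0, 1, 0], 1),
    ([0, 0, 1, 1, 0, 1], 0),
    ([0, 1, 0, 1, 0, 1], 0),
]

model = TinyMLP(n_inputs=6)
for _ in range(200):
    for fingerprint, label in train:
        model.train_step(fingerprint, label)

predictions = [model.predict(fp) for fp, _ in train]
print([round(p, 3) for p in predictions])
```

A production model would normally be built with a deep learning library and trained on curated bioactivity measurements; the point here is only the shape of the computation, from fingerprint to predicted probability.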

By integrating these advanced computational techniques, the researchers demonstrated a powerful framework for accelerating the generation and identification of potential tyrosine kinase inhibitors. This approach can contribute to more efficient and effective early-stage drug discovery processes.
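One way such an integration could look is a single ranking score that blends a drug-likeness term with the predicted activity. The sketch below uses Lipinski's rule of five (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) as the drug-likeness term. That choice, the equal weighting, and the three candidates with their values are illustrative assumptions, not details taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float          # molecular weight in Da
    log_p: float               # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int
    predicted_activity: float  # hypothetical model output in [0, 1]


def lipinski_violations(c):
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ])


def drug_likeness(c):
    """Map 0..4 violations onto a 1.0..0.0 score (an assumed scaling)."""
    return 1.0 - lipinski_violations(c) / 4


def combined_score(c, activity_weight=0.5):
    """Weighted blend of predicted activity and drug-likeness."""
    return (activity_weight * c.predicted_activity
            + (1.0 - activity_weight) * drug_likeness(c))


def rank(candidates, activity_weight=0.5):
    return sorted(candidates,
                  key=lambda c: combined_score(c, activity_weight),
                  reverse=True)


# Made-up candidates for illustration only.
candidates = [
    Candidate("cand_A", 350.4, 3.1, 2, 6, 0.60),
    Candidate("cand_B", 720.9, 6.2, 3, 12, 0.90),
    Candidate("cand_C", 480.0, 4.5, 1, 8, 0.85),
]

for c in rank(candidates):
    print(c.name, round(combined_score(c), 3))
```

In this toy data `cand_B` has the highest predicted activity but breaks three rule-of-five limits, so it ranks last. That is the kind of trade-off a combined generate-and-predict workflow has to resolve.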

Critical Analysis

The framework as described is limited to generating and evaluating compounds against tyrosine kinases; further research would be needed to extend it to other target families.

Additionally, the deep learning model used for bioactivity prediction may be sensitive to the quality and diversity of the training data. Potential issues include bias in the dataset or insufficient coverage of chemical space, which could impact the model's ability to accurately predict the activity of novel compounds.
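A common way to check whether a generated molecule falls inside the chemical space the model was trained on is to measure its Tanimoto similarity to the nearest training compound. The sketch below represents fingerprints as sets of on-bit indices. The 0.4 threshold, the function names, and the example fingerprints are arbitrary assumptions for illustration; the summary does not say whether the paper performs such a check.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are treated as dissimilar (a convention chosen here).
    return len(a & b) / union if union else 0.0


def nearest_training_similarity(query, training_fps):
    """Similarity of the query to its closest training fingerprint."""
    return max(tanimoto(query, fp) for fp in training_fps)


def in_domain(query, training_fps, threshold=0.4):
    """Flag predictions for molecules unlike anything in the training set."""
    return nearest_training_similarity(query, training_fps) >= threshold


# Made-up fingerprints for illustration only.
training_fps = [{1, 2, 3, 4}, {2, 3, 5, 8}]
query_near = {1, 2, 3, 7}
query_far = {10, 11, 12}

print(nearest_training_similarity(query_near, training_fps), in_domain(query_near, training_fps))
print(nearest_training_similarity(query_far, training_fps), in_domain(query_far, training_fps))
```

Predictions for molecules that fail such a check are extrapolations and deserve less trust, which matters most for a generator that is explicitly searching for novel structures.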

Further research and validation would be necessary to fully assess the practical impact and limitations of this combined genetic algorithm and deep learning approach in real-world drug discovery pipelines.

Conclusion

Combining genetic algorithms for molecule generation with a deep learning bioactivity predictor gives a promising framework for finding potential tyrosine kinase inhibitors. If its predictions hold up under experimental validation, the approach could make early-stage drug discovery more efficient and support the development of new therapies for diseases such as cancer.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Integration of Genetic Algorithms and Deep Learning for the Generation and Bioactivity Prediction of Novel Tyrosine Kinase Inhibitors

Ricardo Romero

The intersection of artificial intelligence and bioinformatics has enabled significant advancements in drug discovery, particularly through the application of machine learning models. In this study, we present a combined approach using genetic algorithms and deep learning models to address two critical aspects of drug discovery: the generation of novel tyrosine kinase inhibitors and the prediction of their bioactivity. The generative model leverages genetic algorithms to create new small molecules with optimized ADMET (absorption, distribution, metabolism, excretion, and toxicity) and drug-likeness properties. Concurrently, a deep learning model is employed to predict the bioactivity of these generated molecules against tyrosine kinases, a key enzyme family involved in various cellular processes and cancer progression. By integrating these advanced computational methods, we demonstrate a powerful framework for accelerating the generation and identification of potential tyrosine kinase inhibitors, contributing to more efficient and effective early-stage drug discovery processes.

8/15/2024

Deep Lead Optimization: Leveraging Generative AI for Structural Modification

Odin Zhang, Haitao Lin, Hui Zhang, Huifeng Zhao, Yufei Huang, Yuansheng Huang, Dejun Jiang, Chang-yu Hsieh, Peichen Pan, Tingjun Hou

The idea of using deep-learning-based molecular generation to accelerate discovery of drug candidates has attracted extraordinary attention, and many deep generative models have been developed for automated drug design, termed molecular generation. In general, molecular generation encompasses two main strategies: de novo design, which generates novel molecular structures from scratch, and lead optimization, which refines existing molecules into drug candidates. Among them, lead optimization plays an important role in real-world drug design. For example, it can enable the development of me-better drugs that are chemically distinct yet more effective than the original drugs. It can also facilitate fragment-based drug design, transforming virtual-screened small ligands with low affinity into first-in-class medicines. Despite its importance, automated lead optimization remains underexplored compared to the well-established de novo generative models, due to its reliance on complex biological and chemical knowledge. To bridge this gap, we conduct a systematic review of traditional computational methods for lead optimization, organizing these strategies into four principal sub-tasks with defined inputs and outputs. This review delves into the basic concepts, goals, conventional CADD techniques, and recent advancements in AIDD. Additionally, we introduce a unified perspective based on constrained subgraph generation to harmonize the methodologies of de novo design and lead optimization. Through this lens, de novo design can incorporate strategies from lead optimization to address the challenge of generating hard-to-synthesize molecules; inversely, lead optimization can benefit from the innovations in de novo design by approaching it as a task of generating molecules conditioned on certain substructures.

5/1/2024

Synthetic Data from Diffusion Models Improve Drug Discovery Prediction

Bing Hu, Ashish Saragadam, Anita Layton, Helen Chen

Artificial intelligence (AI) is increasingly used in every stage of drug development. Continuing breakthroughs in AI-based methods for drug discovery require the creation, improvement, and refinement of drug discovery data. We posit a new data challenge that slows the advancement of drug discovery AI: datasets are often collected independently from each other, often with little overlap, creating data sparsity. Data sparsity makes data curation difficult for researchers looking to answer key research questions requiring values posed across multiple datasets. We propose a novel diffusion GNN model Syngand capable of generating ligand and pharmacokinetic data end-to-end. We show and provide a methodology for sampling pharmacokinetic data for existing ligands using our Syngand model. We show the initial promising results on the efficacy of the Syngand-generated synthetic target property data on downstream regression tasks with AqSolDB, LD50, and hERG central. Using our proposed model and methodology, researchers can easily generate synthetic ligand data to help them explore research questions that require data spanning multiple datasets.

5/8/2024

drGAT: Attention-Guided Gene Assessment of Drug Response Utilizing a Drug-Cell-Gene Heterogeneous Network

Yoshitaka Inoue, Hunmin Lee, Tianfan Fu, Augustin Luna

Drug development is a lengthy process with a high failure rate. Increasingly, machine learning is utilized to facilitate the drug development processes. These models aim to enhance our understanding of drug characteristics, including their activity in biological contexts. However, a major challenge in drug response (DR) prediction is model interpretability as it aids in the validation of findings. This is important in biomedicine, where models need to be understandable in comparison with established knowledge of drug interactions with proteins. drGAT, a graph deep learning model, leverages a heterogeneous graph composed of relationships between proteins, cell lines, and drugs. drGAT is designed with two objectives: DR prediction as a binary sensitivity prediction and elucidation of drug mechanism from attention coefficients. drGAT has demonstrated superior performance over existing models, achieving 78% accuracy (and precision), and 76% F1 score for 269 DNA-damaging compounds of the NCI60 drug response dataset. To assess the model's interpretability, we conducted a review of drug-gene co-occurrences in Pubmed abstracts in comparison to the top 5 genes with the highest attention coefficients for each drug. We also examined whether known relationships were retained in the model by inspecting the neighborhoods of topoisomerase-related drugs. For example, our model retained TOP1 as a highly weighted predictive feature for irinotecan and topotecan, in addition to other genes that could potentially be regulators of the drugs. Our method can be used to accurately predict sensitivity to drugs and may be useful in the identification of biomarkers relating to the treatment of cancer patients.

5/16/2024