Guided Multi-objective Generative AI to Enhance Structure-based Drug Design

Read original: arXiv:2405.11785 - Published 5/21/2024 by Amit Kadan, Kevin Ryczko, Adrian Roitberg, Takeshi Yamazaki

Overview

Generative AI has the potential to revolutionize drug discovery
Existing models struggle to generate molecules that satisfy all desired physicochemical properties
IDOLpro: a novel generative chemistry AI combining deep diffusion with multi-objective optimization for structure-based drug design
IDOLpro generates novel ligands by optimizing a plurality of target physicochemical properties
Demonstrated effectiveness in generating ligands with optimized binding affinity and synthetic accessibility

Plain English Explanation

The paper describes a new artificial intelligence (AI) system called IDOLpro that can design new drug molecules. Designing effective new drugs is a major challenge in the pharmaceutical industry. While recent advances in machine learning have led to progress in this area, existing models still struggle to create molecules that have all the desired physical and chemical properties needed for a successful drug.

The researchers behind IDOLpro have developed a novel approach that combines two powerful machine learning techniques: deep diffusion and multi-objective optimization. The deep diffusion model explores uncharted chemical space to generate novel drug molecule candidates, while the multi-objective optimization guides the model to generate molecules that optimize multiple target properties, such as binding affinity and synthetic accessibility.

The researchers report that IDOLpro outperforms other state-of-the-art approaches at generating molecules with optimized binding affinity and synthetic accessibility. Notably, on a test set of experimental complexes, IDOLpro is described as the first model to surpass the performance of the experimentally observed ligands. This suggests IDOLpro could accelerate the drug discovery process by rapidly generating high-quality drug candidates that can then be further developed and tested.

Technical Explanation

The paper introduces IDOLpro, a novel generative chemistry AI that combines deep diffusion with multi-objective optimization for structure-based drug design. The latent variables of the diffusion model are guided by differentiable scoring functions to explore uncharted chemical space and generate novel ligands in silico, optimizing a plurality of target physicochemical properties.
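To make the guidance idea concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a two-dimensional latent vector, skips the diffusion model entirely, and uses two made-up differentiable scores as stand-ins for binding affinity and synthetic accessibility. The names `affinity_score`, `synth_score`, `combined`, and `guide_latent`, as well as the weights and step size, are hypothetical. The latent is nudged along the gradient of a weighted sum of the scores, estimated here by finite differences.

```python
from typing import Callable, List

Vector = List[float]


def affinity_score(z: Vector) -> float:
    # Toy stand-in for a binding-affinity score (higher is better); peaks at z[0] == 1.
    return -(z[0] - 1.0) ** 2


def synth_score(z: Vector) -> float:
    # Toy stand-in for a synthetic-accessibility score (higher is better); peaks at z[1] == -2.
    return -(z[1] + 2.0) ** 2


def combined(z: Vector, w_aff: float = 1.0, w_syn: float = 0.5) -> float:
    # Weighted-sum scalarization of the two objectives (the weights are assumptions).
    return w_aff * affinity_score(z) + w_syn * synth_score(z)


def numerical_grad(f: Callable[[Vector], float], z: Vector, eps: float = 1e-5) -> Vector:
    # Central finite differences; a real system would use automatic differentiation.
    grad = []
    for i in range(len(z)):
        up = list(z)
        down = list(z)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def guide_latent(z0: Vector, steps: int = 200, lr: float = 0.05) -> Vector:
    # Plain gradient ascent on the combined score, starting from latent z0.
    z = list(z0)
    for _ in range(steps):
        g = numerical_grad(combined, z)
        z = [zi + lr * gi for zi, gi in zip(z, g)]
    return z


start = [0.0, 0.0]
end = guide_latent(start)
print("combined score before:", combined(start))
print("combined score after: ", combined(end))
```

In this toy setting the latent converges toward the point where both stand-in scores peak. In a real system the scores would be evaluated on decoded molecules, and the objectives would generally trade off against each other rather than share a common optimum.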

The researchers demonstrate the effectiveness of IDOLpro by generating ligands with optimized binding affinity and synthetic accessibility on two benchmark sets. IDOLpro produces ligands with binding affinities over 10% higher than the next best state-of-the-art on each test set. Crucially, on a test set of experimental complexes, IDOLpro is the first to surpass the performance of experimentally observed ligands.
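As a rough illustration of what "over 10% higher" means numerically, the sketch below computes a relative improvement between two scores. It assumes docking-style scores in which a more negative value means stronger predicted binding; that convention, the function name `relative_improvement`, and the example numbers are assumptions for illustration and are not taken from the paper.

```python
def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Fractional gain of new_score over baseline_score.

    Assumes more negative scores mean stronger predicted binding
    (an assumption, not something stated in the summary).
    """
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (baseline_score - new_score) / abs(baseline_score)


# Made-up scores for illustration only.
baseline = -8.0
candidate = -9.0
gain = relative_improvement(candidate, baseline)
print(f"{gain:.1%}")  # 12.5%
```

Under this convention, a candidate scoring -9.0 against a baseline of -8.0 would count as a 12.5% improvement, which would clear a 10% threshold.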

The flexible nature of IDOLpro allows it to accommodate other scoring functions (e.g. ADME-Tox) to accelerate hit-finding, hit-to-lead, and lead optimization for drug discovery. Because the guidance relies on differentiable scoring functions, further objectives can in principle be added alongside binding affinity and synthetic accessibility.
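One way to picture this flexibility is a scorer that holds a list of weighted objectives, so that a new scoring function can be added without touching the rest. The sketch below is a hypothetical design, not IDOLpro's actual interface: the `Objective` and `MultiObjective` classes, the dictionary-based `Molecule` type, and the descriptor names are all invented for illustration, and differentiability is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical representation: a molecule reduced to named numeric descriptors.
Molecule = Dict[str, float]


@dataclass
class Objective:
    name: str
    score: Callable[[Molecule], float]  # convention here: higher is better
    weight: float = 1.0


class MultiObjective:
    """Weighted sum over a pluggable list of objectives."""

    def __init__(self) -> None:
        self.objectives: List[Objective] = []

    def add(self, objective: Objective) -> None:
        self.objectives.append(objective)

    def total(self, mol: Molecule) -> float:
        return sum(o.weight * o.score(mol) for o in self.objectives)

    def breakdown(self, mol: Molecule) -> Dict[str, float]:
        # Unweighted per-objective scores, useful for inspecting trade-offs.
        return {o.name: o.score(mol) for o in self.objectives}


scorer = MultiObjective()
scorer.add(Objective("affinity", lambda m: m["affinity"], weight=1.0))
scorer.add(Objective("synthesizability", lambda m: -m["sa"], weight=0.5))

# Made-up descriptor values for illustration only.
mol = {"affinity": 8.0, "sa": 3.0, "tox_risk": 0.25}
print(scorer.total(mol))  # 6.5

# Plugging in an extra objective (a toy toxicity penalty) needs no other changes.
scorer.add(Objective("toxicity", lambda m: -m["tox_risk"], weight=2.0))
print(scorer.total(mol))  # 6.0
```

A weighted sum is only one way to combine objectives; the choice of weights decides how the properties are traded off against each other.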

Critical Analysis

The paper presents a promising approach to generative chemistry for drug discovery, but it also acknowledges several limitations and areas for further research. The authors note that while IDOLpro outperforms other state-of-the-art models, it still has room for improvement in terms of generating molecules that satisfy all desired properties.

Additionally, the paper does not address the computational cost and training time required for the IDOLpro model, which could be a practical concern for real-world drug discovery applications. Further research is needed to optimize the efficiency of the model.

The authors also highlight the need to expand the test sets and evaluate IDOLpro's performance on a wider range of drug discovery tasks and target properties. Validating the model's performance on more diverse and challenging datasets would provide a more comprehensive understanding of its capabilities and limitations.

Despite these caveats, the core ideas behind IDOLpro, such as the combination of deep diffusion and multi-objective optimization, represent a significant advancement in the field of generative chemistry. As the authors suggest, the flexibility of the approach in accommodating different scoring functions holds the potential to accelerate various stages of the drug discovery pipeline.

Conclusion

The paper presents a novel generative chemistry AI called IDOLpro that combines deep diffusion with multi-objective optimization to generate drug molecule candidates with optimized physicochemical properties. The researchers report that IDOLpro outperforms other state-of-the-art models on two benchmark sets and, notably, surpasses the performance of experimentally observed ligands on a test set of experimental complexes.

This work has the potential to significantly impact the drug discovery process by accelerating the generation of high-quality drug candidates, which can then be further developed and tested. The flexible nature of the IDOLpro approach, allowing for the integration of various scoring functions, suggests it could be a valuable tool for hit-finding, hit-to-lead, and lead optimization stages of drug discovery.

While the paper acknowledges some limitations and areas for future research, the core ideas and demonstrated performance of IDOLpro represent an important step forward in the application of generative AI to the challenging problem of rational drug design.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Deep Lead Optimization: Leveraging Generative AI for Structural Modification

Odin Zhang, Haitao Lin, Hui Zhang, Huifeng Zhao, Yufei Huang, Yuansheng Huang, Dejun Jiang, Chang-yu Hsieh, Peichen Pan, Tingjun Hou

The idea of using deep-learning-based molecular generation to accelerate discovery of drug candidates has attracted extraordinary attention, and many deep generative models have been developed for automated drug design, termed molecular generation. In general, molecular generation encompasses two main strategies: de novo design, which generates novel molecular structures from scratch, and lead optimization, which refines existing molecules into drug candidates. Among them, lead optimization plays an important role in real-world drug design. For example, it can enable the development of me-better drugs that are chemically distinct yet more effective than the original drugs. It can also facilitate fragment-based drug design, transforming virtual-screened small ligands with low affinity into first-in-class medicines. Despite its importance, automated lead optimization remains underexplored compared to the well-established de novo generative models, due to its reliance on complex biological and chemical knowledge. To bridge this gap, we conduct a systematic review of traditional computational methods for lead optimization, organizing these strategies into four principal sub-tasks with defined inputs and outputs. This review delves into the basic concepts, goals, conventional CADD techniques, and recent advancements in AIDD. Additionally, we introduce a unified perspective based on constrained subgraph generation to harmonize the methodologies of de novo design and lead optimization. Through this lens, de novo design can incorporate strategies from lead optimization to address the challenge of generating hard-to-synthesize molecules; inversely, lead optimization can benefit from the innovations in de novo design by approaching it as a task of generating molecules conditioned on certain substructures.

5/1/2024

Synthetic Data from Diffusion Models Improve Drug Discovery Prediction

Bing Hu, Ashish Saragadam, Anita Layton, Helen Chen

Artificial intelligence (AI) is increasingly used in every stage of drug development. Continuing breakthroughs in AI-based methods for drug discovery require the creation, improvement, and refinement of drug discovery data. We posit a new data challenge that slows the advancement of drug discovery AI: datasets are often collected independently from each other, often with little overlap, creating data sparsity. Data sparsity makes data curation difficult for researchers looking to answer key research questions requiring values posed across multiple datasets. We propose a novel diffusion GNN model Syngand capable of generating ligand and pharmacokinetic data end-to-end. We show and provide a methodology for sampling pharmacokinetic data for existing ligands using our Syngand model. We show the initial promising results on the efficacy of the Syngand-generated synthetic target property data on downstream regression tasks with AqSolDB, LD50, and hERG central. Using our proposed model and methodology, researchers can easily generate synthetic ligand data to help them explore research questions that require data spanning multiple datasets.

5/8/2024

TAGMol: Target-Aware Gradient-guided Molecule Generation

Vineeth Dorna, D. Subhalingam, Keshav Kolluru, Shreshth Tuli, Mrityunjay Singh, Saurabh Singal, N. M. Anoop Krishnan, Sayan Ranu

3D generative models have shown significant promise in structure-based drug design (SBDD), particularly in discovering ligands tailored to specific target binding sites. Existing algorithms often focus primarily on ligand-target binding, characterized by binding affinity. Moreover, models trained solely on target-ligand distribution may fall short in addressing the broader objectives of drug discovery, such as the development of novel ligands with desired properties like drug-likeness and synthesizability, underscoring the multifaceted nature of the drug design process. To overcome these challenges, we decouple the problem into molecular generation and property prediction. The latter synergistically guides the diffusion sampling process, facilitating guided diffusion and resulting in the creation of meaningful molecules with the desired properties. We call this guided molecular generation process TAGMol. Through experiments on benchmark datasets, TAGMol demonstrates superior performance compared to state-of-the-art baselines, achieving a 22% improvement in average Vina Score and yielding favorable outcomes in essential auxiliary properties. This establishes TAGMol as a comprehensive framework for drug generation.

6/5/2024