gRNAde: Geometric Deep Learning for 3D RNA inverse design

Read original: arXiv:2305.14749 - Published 5/28/2024 by Chaitanya K. Joshi, Arian R. Jamasb, Ramon Viñas, Charles Harris, Simon V. Mathis, Alex Morehead, Rishabh Anand, Pietro Liò

Overview

  • The paper introduces gRNAde, a new computational pipeline for designing RNA sequences that account for 3D structure and dynamics, rather than just targeting a single desired secondary structure.
  • gRNAde uses a multi-state Graph Neural Network to generate candidate RNA sequences based on one or more 3D backbone structures, where the base identities are unknown.
  • On a benchmark of 14 RNA structures, gRNAde achieved higher native sequence recovery than the Rosetta design tool (56% vs. 45% on average), while producing designs in under a second compared with the hours reported for Rosetta.
  • The paper also demonstrates gRNAde's utility for designing sequences for structurally flexible RNAs and ranking mutational fitness landscapes.

Plain English Explanation

RNA, the chemical cousin of DNA, plays a crucial role in various biological processes. Designing RNA sequences that adopt specific 3D structures is an important task in fields like synthetic biology and drug development. However, previous approaches have focused on designing sequences based on a single desired secondary structure, without considering the full 3D geometry and conformational diversity of the RNA.

The researchers behind gRNAde recognized this limitation and developed a new computational pipeline that explicitly accounts for the 3D structure and dynamics of RNA. At the heart of gRNAde is a Graph Neural Network that generates candidate RNA sequences based on one or more 3D backbone structures, where the identities of the individual bases are unknown.

The key advantage of gRNAde is that it can design RNA sequences that are more likely to fold into the desired 3D structure, rather than just targeting a specific secondary structure. This is important because the 3D structure of an RNA molecule often determines its function, and the same secondary structure can adopt multiple 3D conformations.

In tests, gRNAde outperformed the widely used Rosetta design tool in terms of recovering the native sequences of 14 RNA structures from the Protein Data Bank. Importantly, gRNAde was much faster, taking under a second to produce designs, compared to the hours reported for Rosetta.

The researchers also demonstrated the utility of gRNAde for designing sequences for structurally flexible RNAs, and for ranking the fitness of mutated sequences without any task-specific training (zero-shot) in a retrospective analysis of an RNA polymerase ribozyme, a complex RNA-based enzyme whose structure was recently solved.

Technical Explanation

Computational RNA design is often posed as an "inverse problem," where the goal is to find sequences that adopt a desired structure. The researchers note that such approaches typically target a single secondary structure and do not consider the full 3D geometry or conformational diversity of the RNA.

To address this limitation, the researchers developed gRNAde, a geometric RNA design pipeline that operates directly on 3D RNA backbones. At the core of gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures, where the identities of the bases are unknown.
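As a rough illustration of what "operating on 3D backbones" and "multi-state" can mean in practice, the sketch below builds a k-nearest-neighbour graph from one coordinate per nucleotide and then averages per-nucleotide features across conformations, so the result does not depend on the order in which the states are supplied. The one-point-per-nucleotide representation, the function names (knn_edges, centroid_distance, pool_states), the toy feature, and the toy coordinates are all assumptions made for this example; the summary does not specify how gRNAde constructs its graphs or combines states.

```python
import math


def knn_edges(coords, k):
    """Directed edges (i, j) from each nucleotide i to its k nearest neighbours j.

    Assumption: each nucleotide is reduced to a single 3D point.
    """
    edges = []
    for i, ci in enumerate(coords):
        by_distance = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        edges.extend((i, j) for _, j in by_distance[:k])
    return edges


def centroid_distance(coords):
    """Toy per-nucleotide feature: distance of each point from the centroid."""
    n = len(coords)
    centroid = tuple(sum(c[d] for c in coords) / n for d in range(3))
    return [[math.dist(c, centroid)] for c in coords]


def pool_states(per_state_features):
    """Average feature vectors across states, nucleotide by nucleotide.

    per_state_features[s][i] is the feature vector of nucleotide i in state s.
    Averaging is symmetric, so reordering the states leaves the output unchanged.
    """
    n_states = len(per_state_features)
    pooled = []
    for node_vectors in zip(*per_state_features):
        dim = len(node_vectors[0])
        pooled.append(
            [sum(v[d] for v in node_vectors) / n_states for d in range(dim)]
        )
    return pooled


# Two hypothetical conformations of the same 4-nucleotide backbone.
state_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
state_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

edges_a = knn_edges(state_a, k=1)
pooled = pool_states([centroid_distance(state_a), centroid_distance(state_b)])
```

In a learned model the toy feature would be replaced by the output of shared network layers applied to each state; the point of the sketch is only that a symmetric pooling step yields one representation per nucleotide regardless of how many states are given.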

The researchers evaluated gRNAde on a single-state fixed backbone re-design benchmark of 14 RNA structures from the Protein Data Bank, identified by Das et al. (2010). On this benchmark, gRNAde obtained higher native sequence recovery rates (56% on average) compared to the Rosetta design tool (45% on average), while taking under a second to produce designs, compared to the reported hours for Rosetta.
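Native sequence recovery, the metric quoted above, is commonly computed as the fraction of positions at which a designed sequence matches the native one. The following is a minimal sketch of that calculation, not the paper's evaluation code; the function names and the toy sequences are invented for illustration.

```python
def sequence_recovery(designed, native):
    """Fraction of positions where the designed base equals the native base."""
    if len(designed) != len(native):
        raise ValueError("designed and native sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def mean_recovery(designs, native):
    """Average recovery over several designs for the same backbone."""
    return sum(sequence_recovery(d, native) for d in designs) / len(designs)


# Hypothetical 12-nucleotide example, not taken from the benchmark.
native = "GGACUUCGGUCC"
designs = ["GGACUUCGGUCC", "GGGCUUCGGCCC", "AGACUACGGUCU"]

per_design = [sequence_recovery(d, native) for d in designs]
average = mean_recovery(designs, native)
```

A benchmark-level figure such as the 56% reported for gRNAde would then be an average of this quantity over the benchmark structures.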

The researchers further demonstrated the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent RNA polymerase ribozyme structure.
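One plausible way to rank sequence variants without task-specific training is to score each variant by how likely a structure-conditioned model considers it, for example via perplexity (lower means the model finds the sequence more compatible with the backbone). The sketch below assumes that scoring rule and a hypothetical table of per-position base probabilities; the summary only states that the ranking is zero-shot and does not describe the scoring, so the function names and numbers here are illustrative.

```python
import math


def perplexity(sequence, position_probs):
    """exp of the negative mean log-probability of the sequence.

    position_probs[i][base] is the (assumed, strictly positive) probability
    that a model assigns to `base` at position i.
    """
    log_likelihood = sum(
        math.log(position_probs[i][base]) for i, base in enumerate(sequence)
    )
    return math.exp(-log_likelihood / len(sequence))


def rank_by_perplexity(sequences, position_probs):
    """Sort sequences from most to least model-preferred (lowest perplexity first)."""
    return sorted(sequences, key=lambda s: perplexity(s, position_probs))


# Hypothetical model output for a 3-nucleotide backbone.
toy_probs = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "U": 0.1},
    {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25},
]
variants = ["CAC", "AGC", "CGC"]

ranking = rank_by_perplexity(variants, toy_probs)
```

The resulting order can be compared against experimentally measured fitness, for instance with a rank correlation, to judge how well the scores track the fitness landscape.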

Critical Analysis

The researchers acknowledge that while gRNAde outperformed Rosetta on the fixed backbone re-design benchmark, the overall sequence recovery rate (56% on average) still leaves many positions differing from the native sequence, suggesting that there is room for improvement in designing RNA sequences that adopt specific 3D structures.

Additionally, the researchers note that the multi-state design benchmark for structurally flexible RNAs is a more challenging task, and further research is needed to fully address the design of RNAs with complex conformational dynamics.

One potential limitation of the gRNAde approach is that it relies on high-quality 3D structural data for the target RNAs, which may not always be available, especially for novel or understudied RNA structures. In such cases, the performance of gRNAde may be limited, and alternative approaches, such as those based on RNA secondary structure prediction or transformer-based models, may be more suitable.

Furthermore, the researchers did not address the potential limitations of using Graph Neural Networks for this task, such as their sensitivity to the quality of the input graph representations or their limited ability to capture long-range dependencies in the RNA structure.

Conclusion

The introduction of gRNAde represents a significant advance in the field of computational RNA design by explicitly accounting for the 3D structure and dynamics of RNA molecules. The ability to design RNA sequences that adopt specific 3D conformations has important implications for applications in synthetic biology, drug development, and the study of RNA-based biological processes.

While the current performance of gRNAde is promising, the researchers acknowledge that further improvements are needed, particularly in the design of structurally flexible RNAs and the use of alternative structural data sources. Continued research in this area may lead to even more powerful computational tools for RNA design, ultimately accelerating our understanding and engineering of these crucial biomolecules.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

gRNAde: Geometric Deep Learning for 3D RNA inverse design

Chaitanya K. Joshi, Arian R. Jamasb, Ramon Viñas, Charles Harris, Simon V. Mathis, Alex Morehead, Rishabh Anand, Pietro Liò

Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. [2010], gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent RNA polymerase ribozyme structure. Open source code: https://github.com/chaitjo/geometric-rna-design

5/28/2024

RNAFlow: RNA Structure & Sequence Design via Inverse Folding-Based Flow Matching

Divya Nori, Wengong Jin

The growing significance of RNA engineering in diverse biological applications has spurred interest in developing AI methods for structure-based RNA design. While diffusion models have excelled in protein design, adapting them for RNA presents new challenges due to RNA's conformational flexibility and the computational cost of fine-tuning large structure prediction models. To this end, we propose RNAFlow, a flow matching model for protein-conditioned RNA sequence-structure design. Its denoising network integrates an RNA inverse folding model and a pre-trained RosettaFold2NA network for generation of RNA sequences and structures. The integration of inverse folding in the structure denoising process allows us to simplify training by fixing the structure prediction network. We further enhance the inverse folding model by conditioning it on inferred conformational ensembles to model dynamic RNA conformations. Evaluation on protein-conditioned RNA structure and sequence generation tasks demonstrates RNAFlow's advantage over existing RNA design methods.

6/11/2024

3D-based RNA function prediction tools in rnaglib

Carlos Oliver, Vincent Mallet, Jérôme Waldispühl

Understanding the connection between complex structural features of RNA and biological function is a fundamental challenge in evolutionary studies and in RNA design. However, building datasets of RNA 3D structures and making appropriate modeling choices remains time-consuming and lacks standardization. In this chapter, we describe the use of rnaglib, to train supervised and unsupervised machine learning-based function prediction models on datasets of RNA 3D structures.

5/6/2024

RNA-FrameFlow: Flow Matching for de novo 3D RNA Backbone Design

Rishabh Anand, Chaitanya K. Joshi, Alex Morehead, Arian R. Jamasb, Charles Harris, Simon V. Mathis, Kieran Didi, Bryan Hooi, Pietro Liò

We introduce RNA-FrameFlow, the first generative model for 3D RNA backbone design. We build upon SE(3) flow matching for protein backbone generation and establish protocols for data preparation and evaluation to address unique challenges posed by RNA modeling. We formulate RNA structures as a set of rigid-body frames and associated loss functions which account for larger, more conformationally flexible RNA backbones (13 atoms per nucleotide) vs. proteins (4 atoms per residue). Toward tackling the lack of diversity in 3D RNA datasets, we explore training with structural clustering and cropping augmentations. Additionally, we define a suite of evaluation metrics to measure whether the generated RNA structures are globally self-consistent (via inverse folding followed by forward folding) and locally recover RNA-specific structural descriptors. The most performant version of RNA-FrameFlow generates locally realistic RNA backbones of 40-150 nucleotides, over 40% of which pass our validity criteria as measured by a self-consistency TM-score >= 0.45, at which two RNAs have the same global fold. Open-source code: https://github.com/rish-16/rna-backbone-design

6/21/2024