RDesign: Hierarchical Data-efficient Representation Learning for Tertiary Structure-based RNA Design

Authors: Cheng Tan, Yijie Zhang, Zhangyang Gao, Bozhen Hu, Siyuan Li, Zicheng Liu, Stan Z. Li

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate RDesign on the tertiary structure-based RNA design task by comparing it with four categories of baseline models ...
Researcher Affiliation | Academia | 1Zhejiang University, Hangzhou, China; 2AI Lab, Research Center for Industries of the Future, Westlake University, Hangzhou, China; 3McGill University, Montréal, Québec, Canada. {tancheng,gaozhangyang}@westlake.edu.cn; yj.zhang@mail.mcgill.ca
Pseudocode | No | The paper describes algorithms and pipelines but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code and benchmark dataset are available at github.com/A4Bio/RDesign.
Open Datasets | Yes | We train and assess performance on our proposed RNA structure benchmark dataset, which aggregates and cleans data from RNAsolo (Adamczyk et al., 2022b) and the Protein Data Bank (PDB) (Bank, 1971; Berman et al., 2000). ... To test the generalization ability, we apply pre-trained models to the Rfam (Gardner et al., 2009; Nawrocki et al., 2015) and RNA-Puzzles (Miao et al., 2020) datasets that contain non-overlapping structures.
Dataset Splits | Yes | The benchmark dataset consists of 2218 RNA tertiary structures, which are divided into training (1774 structures), testing (223 structures), and validation (221 structures) sets based on their structural similarity.
Hardware Specification | Yes | We ran the models on an Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz and an NVIDIA A100 GPU.
Software Dependencies | Yes | The model was implemented based on the standard PyTorch Geometric (Fey & Lenssen, 2019) library using PyTorch 1.11.0.
Experiment Setup | Yes | We trained the model for 200 epochs using the Adam optimizer with a learning rate of 0.001. The batch size was set as 64. The model's encoder and decoder each had three layers. With a dropout rate of 0.1, it considered 30 nearest neighbors and a vocabulary size matching RNA's four alphabets.
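For concreteness, the Experiment Setup and Software Dependencies rows above can be read as the training configuration below. This is a minimal sketch under the quoted hyperparameters, not the authors' implementation (which is available at github.com/A4Bio/RDesign); the model class, data loader, and label field (`RDesignModel`, `train_loader`, `batch.y`) are hypothetical placeholders.

```python
# Minimal sketch of the quoted training setup; assumes PyTorch 1.11.0
# (the paper additionally uses PyTorch Geometric for structure graphs).
# `RDesignModel` and `train_loader` are hypothetical placeholders.
import torch
from torch import nn

config = {
    "epochs": 200,        # "trained the model for 200 epochs"
    "lr": 1e-3,           # Adam optimizer, learning rate 0.001
    "batch_size": 64,     # batch size 64
    "num_layers": 3,      # encoder and decoder each have three layers
    "dropout": 0.1,       # dropout rate 0.1
    "k_neighbors": 30,    # 30 nearest neighbors per nucleotide
    "vocab_size": 4,      # RNA alphabet: A, U, C, G
}

def train(model, train_loader, device="cuda"):
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
    criterion = nn.CrossEntropyLoss()
    for epoch in range(config["epochs"]):
        model.train()
        for batch in train_loader:
            batch = batch.to(device)
            logits = model(batch)              # (num_nucleotides, vocab_size)
            loss = criterion(logits, batch.y)  # batch.y: ground-truth bases, 0..3
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```

As a quick consistency check on the Dataset Splits row, the reported split sizes sum to the full benchmark: 1774 + 223 + 221 = 2218 structures.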