UniIF: Unified Molecule Inverse Folding
Authors: Zhangyang Gao, Jue Wang, Cheng Tan, Lirong Wu, Yufei Huang, Siyuan Li, Zhirui Ye, Stan Z. Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through comprehensive evaluations across various tasks such as protein design, RNA design, and material design, we demonstrate that our proposed method surpasses state-of-the-art methods on all tasks. |
| Researcher Affiliation | Academia | Zhangyang Gao 1,2; Jue Wang 1,2; Cheng Tan 1,2; Lirong Wu 2; Yufei Huang 2; Siyuan Li 2; Zhirui Ye 2; Stan Z. Li 2 (1 Zhejiang University, 2 Westlake University) |
| Pseudocode | No | The paper describes its model architecture and components using text and mathematical equations, but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | No | The code will be released upon acceptance. |
| Open Datasets | Yes | We evaluate UniIF on the CATH4.3 dataset [30] following prior works [11, 8]. For structural time-split evaluation, we use the CASP15 dataset [11]... For sequence time-split evaluation, we use the NovelPro dataset [8]... We conduct RNA experiments on the dataset collected by RDesign [34]... We evaluate UniIF on the CHILI-3K dataset [6]... |
| Dataset Splits | Yes | The dataset is split by the CATH topology classification code, yielding 16,631 training, 1,516 validation, and 1,864 testing samples. The RNA dataset contains 2,218 tertiary structures, divided into training (1,774 structures), testing (223 structures), and validation (221 structures) sets based on structural similarity. The CHILI-3K dataset is randomly split into training (80%), validation (10%), and testing (10%) sets (a sketch of such a split appears after the table). |
| Hardware Specification | Yes | All experiments are conducted on an NVIDIA A100 with 80 GB memory. The longest training time is about one day. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and various network architectures (MLP, GNN, Transformer) but does not provide specific software dependency versions (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | UniIF consists of 10 layers of BlockGAT with a hidden dimension of 128. It is trained using the Adam optimizer with a learning rate of 1e-3 and a batch size of 8 for 50 epochs. Experiments are repeated three times with different seeds, using early stopping with a patience of 50 epochs and training for up to 1,000 epochs. Nodes/edges are randomly dropped with probability p to prevent overfitting; the best performance is achieved when p = 0.05 (a configuration sketch follows the table). |
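
For reference, here is a minimal Python sketch of the 80/10/10 random split reported for CHILI-3K in the Dataset Splits row. The function name, seed handling, and rounding behavior are assumptions on our part, since the UniIF code has not been released.

```python
import random

def random_split(samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle and partition samples into train/val/test subsets.

    A sketch of the 80/10/10 random split the paper reports for
    CHILI-3K; seed handling and rounding are assumptions, not taken
    from the (unreleased) UniIF code.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)          # deterministic shuffle
    n_train = int(train_frac * len(items))      # ~80%
    n_val = int(val_frac * len(items))          # ~10%
    return (items[:n_train],                    # training set
            items[n_train:n_train + n_val],     # validation set
            items[n_train + n_val:])            # test set (remainder, ~10%)
```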
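Likewise, a hedged PyTorch sketch of the training configuration reported in the Experiment Setup row (Adam, learning rate 1e-3, 10 layers, hidden dimension 128, node/edge dropout p = 0.05). The BlockGAT encoder itself is not public, so `model` below is a placeholder that only mirrors the reported depth and width, and `drop_edges` is one plausible reading of the described node/edge dropout.

```python
import torch
from torch import nn

HIDDEN_DIM = 128  # reported hidden dimension
NUM_LAYERS = 10   # reported number of BlockGAT layers
DROP_P = 0.05     # reported best node/edge dropout probability

def drop_edges(edge_index: torch.Tensor, p: float = DROP_P) -> torch.Tensor:
    """Randomly drop edges of a (2, E) edge-index tensor with probability p.

    One plausible reading of the paper's node/edge dropout; the actual
    UniIF implementation may differ.
    """
    keep = torch.rand(edge_index.size(1)) >= p  # boolean mask over edges
    return edge_index[:, keep]

# Placeholder standing in for the (unreleased) 10-layer BlockGAT encoder;
# only the reported depth and width are mirrored here.
model = nn.Sequential(*[nn.Linear(HIDDEN_DIM, HIDDEN_DIM) for _ in range(NUM_LAYERS)])

# Optimizer settings as reported: Adam with learning rate 1e-3.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```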