Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Geometric Algebra-Enhanced Bayesian Flow Network for RNA Inverse Design

Authors: Rubo Wang, Xingyu Gao, Peilin Zhao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct a comprehensive comparison of RBFN against state-of-the-art methods on the Das Benchmark. The benchmark comprising 14 RNA structures extracted from the PDB database comprehensively represents diverse RNA functional categories.
Researcher Affiliation	Academia	Rubo Wang1, 2, Xingyu Gao1, 2 , Peilin Zhao3 1Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China 2University of Chinese Academy of Sciences, Beijing, China 3School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, China EMAIL, EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1 BFN Sampling with Backbone Conditioning
Open Source Code	No	We promise that we will open-source the data and code after paper acceptance.
Open Datasets	Yes	We use RNASolo [33] for training, validation, and testing. It contains all RNA-containing structures obtained from the Protein Data Bank (PDB).
Dataset Splits	Yes	For the Single-state split, we use the partitioning method in [36] to identify the structural clusters (including riboswitches, aptamers, and ribozymes) belonging to the RNAs identified in [36]. A total of 100 sequences are obtained for testing. Based on the remaining sequences, 100 are randomly selected as the validation set, and the rest are used as the training set. This benchmark is called the Das Benchmark. For the Multi-state split, we calculate the pairwise C4 RMSD of the structures corresponding to each sequence. The top 100 samples from clusters with the highest median intra-sequence RMSD are added to the test set, which is called the Multi-state Benchmark. The next 100 samples are added to the validation set, and the rest are used as the training set for training.
Hardware Specification	Yes	The implementation utilizes PyTorch 2.0.1 and is run on one NVIDIA Tesla V100-SXM2-32GB GPU.
Software Dependencies	Yes	The implementation utilizes PyTorch 2.0.1 and is run on one NVIDIA Tesla V100-SXM2-32GB GPU.
Experiment Setup	Yes	We employed the AdamW optimizer with an initial learning rate of η = 1 × 10⁻⁴. The learning rate schedule combines a linear warmup phase. Specifically, if the validation loss does not decrease for 5 consecutive epochs, the learning rate will decay by a factor of 0.9. During the training process, Exponential Moving Average (EMA) is utilized with a decay factor of 0.99. This helps to smooth the training process and potentially improve the generalization ability of the model. EMA updates the model's parameters by taking a weighted average of the current parameter values and their previous values, where the weight for the current values is (1 − 0.99) and for the previous values is 0.99. The implementation utilizes PyTorch 2.0.1 and is run on one NVIDIA Tesla V100-SXM2-32GB GPU. We set the random seed to 42 to ensure reproducibility. The entire training process lasts for 100 epochs.
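
The EMA update and the plateau-based learning rate decay quoted in the Experiment Setup row can be illustrated with a minimal plain-Python sketch. This is not the authors' code, which the paper says will be released after acceptance. The names `ema_update` and `PlateauDecay` are hypothetical, parameters are represented as a flat list of floats, the linear warmup is omitted because the excerpt gives no details, and the rule "decay on the fifth consecutive epoch without improvement" is one assumed reading of the quoted sentence.

```python
def ema_update(ema_params, params, decay=0.99):
    """Blend previous EMA values with current parameters.

    Weight `decay` (0.99) goes to the previous values and
    (1 - decay) to the current ones, as in the quoted setup.
    """
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


class PlateauDecay:
    """Hypothetical helper: multiply the learning rate by `factor` once the
    validation loss has failed to decrease for `patience` consecutive epochs."""

    def __init__(self, lr=1e-4, factor=0.9, patience=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                # Assumed behavior: decay, then start counting again.
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```

In a training loop, `step` would be called once per epoch with the validation loss, and `ema_update` after each optimizer step. A library scheduler may count the patience window slightly differently, so exact decay epochs can differ from this sketch.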