Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Equi-mRNA: Protein Translation Equivariant Encoding for mRNA Language Models

Authors: Mehdi Yazdani-Jahromi, Ali Khodabandeh Yalabadi, Ozlem Ozmen Garibay

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On downstream property-prediction tasks including expression, stability, and riboswitch switching, Equi-mRNA delivers up to 10% improvements in accuracy. In sequence generation, it produces mRNA constructs that are up to 4× more realistic under Fréchet BioDistance metrics and 28% better preserve functional properties compared to vanilla baseline.
Researcher Affiliation	Academia	Mehdi Yazdani-Jahromi, Department of Computer Science, University of Central Florida, Orlando, FL 32816; Ali Khodabandeh Yalabadi, Department of Industrial Engineering, University of Central Florida, Orlando, FL 32816; Ozlem Ozmen Garibay, Department of Computer Science and Industrial Engineering, University of Central Florida, Orlando, FL 32816
Pseudocode	No	The paper provides detailed mathematical formulations and descriptions of the methodology, but does not include any explicitly labeled pseudocode or algorithm blocks with structured, code-like steps.
Open Source Code	Yes	The data are publicly available. We can provide the link to code at any time. For the review version, we did not include the link to keep it as an anonymous review. The supplementary Zip file including code has been attached.
Open Datasets	Yes	We curate and release a unified coding-region corpus of 25M protein-coding sequences plus a stratified 1M-sequence subset to standardize benchmarking across studies. Pretraining Corpus: We constructed a large-scale pretraining corpus by drawing 25 million annotated protein-coding sequences from 56 million RefSeq entries...
Dataset Splits	Yes	All datasets are split consistently into training, validation, and test subsets at ratios of 70%, 15%, and 15%, respectively, for model training and evaluation.
Hardware Specification	Yes	Pretraining was conducted on thirty-two NVIDIA H100 GPUs (ablation used eight NVIDIA H200 GPUs); runtimes and resource utilization are detailed in Appendix A.11.
Software Dependencies	No	The paper mentions using a 'GPT2 Transformer backbone' and 'hybrid Mamba Transformer backbone', and refers to the 'geoopt library' for implementing Stiefel manifold optimization, but does not provide specific version numbers for any of these software components or other libraries.
Experiment Setup	Yes	All pretraining arguments and hyperparameters, as well as downstream generation and property-prediction hyperparameters, are provided in Appendix A.10. Table 4: Pretraining hyperparameters (identical for all 12 variants). Table 5: Pretraining hyperparameters for the 25M-sequence corpus (identical for both GPT-2 and Mamba hybrid architecture).
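
For readers who want to mirror the 70% / 15% / 15% partition quoted under Dataset Splits, the following is a minimal sketch of one way to produce such a split. It is not the authors' code: the excerpt above does not say whether the split was random, stratified, or homology-aware, so the seeded random shuffle, the function name `split_70_15_15`, and the `seed` parameter are all assumptions made for illustration.

```python
import random


def split_70_15_15(items, seed=0):
    """Shuffle items and split them into train/validation/test at 70/15/15.

    Hypothetical helper: a plain seeded shuffle is assumed here, which may
    differ from the procedure actually used in the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable

    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_val = n * 15 // 100

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test


train, val, test = split_70_15_15(range(1000))
print(len(train), len(val), len(test))  # 700 150 150
```

Fixing the seed keeps the three subsets identical across runs, which is the property a reproducibility check would typically look for. Any rounding remainder is assigned to the test subset in this sketch.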