Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search

Authors: Mengdi Liu, Xiaoxue Cheng, Zhangyang Gao, Hong Chang, Cheng Tan, Shiguang Shan, Xilin Chen

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, ProtInvTree outperforms state-of-the-art baselines across multiple benchmarks, generating structurally consistent yet diverse sequences, including those far from the native ground truth. The code is available at https://github.com/A4Bio/ProteinInvBench/. ... 5 Experiments
Researcher Affiliation	Academia	1 State Key Laboratory of AI Safety, Institute of Computing Technology, CAS, China 2 University of Chinese Academy of Sciences (CAS), China 3 AI Lab, Research Center for Industries of the Future, Westlake University 4 Gaoling School of Artificial Intelligence, Renmin University of China EMAIL, EMAIL, EMAIL
Pseudocode	Yes	A Algorithms The overall workflow of ProtInvTree is provided in Algorithm 1. Algorithm 1 ProtInvTree: Reward-Guided Tree Search for Protein Inverse Folding
Open Source Code	Yes	The code is available at https://github.com/A4Bio/ProteinInvBench/.
Open Datasets	Yes	Datasets. We conduct experiments on both CATH v4.2 and CATH v4.3 [34], where proteins are categorized based on the CATH hierarchical classification of protein structure, to ensure a comprehensive analysis. ... We also include a set of de novo proteins collected from the CASP15 competition to provide a more realistic assessment. Following the previous work ProtInvBench [14], we download the public TS-domains structures from CASP15, which consist of 45 structures, namely TS45.
Dataset Splits	Yes	Following the standard data splitting [22, 20], CATH v4.2 dataset consists of 18,024 proteins for training, 608 proteins for validation, and 1,120 proteins for testing; CATH v4.3 dataset consists of 16,153 proteins for training, 1,457 proteins for validation, and 1,797 proteins for testing.
Hardware Specification	Yes	All experiments are conducted on NVIDIA-A100 GPUs with 80G memory.
Software Dependencies	No	We choose ESM-3 [18] as our policy model because it is the first protein foundation model that directly supports inverse folding without task-specific fine-tuning. The Jumpy Denoising strategy also leverages it, which is capable of filling in arbitrary mask ratios. ... To ensure fast structural feedback for reward computation, we use ESMFold [29] to predict the 3D structures of candidate sequences.
Experiment Setup	Yes	For ProtInvTree, we set the maximum number of MCTS iterations M to 50. The selection numbers K_t at each step follow a cosine schedule. In the UCT algorithm, the weight w balancing exploration and exploitation is set to 0.01.
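
To make the reported Experiment Setup values concrete, the following is a minimal sketch of how an exploration weight w = 0.01 and a cosine schedule might enter a tree search. It is not the authors' implementation. The UCT formula is the textbook form. The function names, the direction of the cosine schedule, and the k_min/k_max bounds are assumptions for illustration only; the excerpt above states only M = 50, w = 0.01, and that K_t follows a cosine schedule.

```python
import math

MAX_MCTS_ITERATIONS = 50   # M, as reported in the Experiment Setup row
EXPLORATION_WEIGHT = 0.01  # w, as reported in the Experiment Setup row


def uct_score(mean_reward, parent_visits, child_visits, w=EXPLORATION_WEIGHT):
    """Textbook UCT: mean reward plus a w-weighted exploration bonus.

    Assumption: the paper's exact UCT variant may differ from this form.
    """
    if child_visits == 0:
        # Unvisited children are tried first.
        return math.inf
    bonus = math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + w * bonus


def cosine_schedule(t, total_steps, k_min, k_max):
    """Hypothetical cosine schedule for the selection number K_t.

    Assumption: K_t decays from k_max at t = 0 to k_min at t = total_steps.
    The source does not state the direction or the bounds.
    """
    frac = 0.5 * (1.0 + math.cos(math.pi * t / total_steps))
    return round(k_min + (k_max - k_min) * frac)


if __name__ == "__main__":
    # With a small w, the ranking is dominated by the mean reward.
    print(uct_score(mean_reward=0.80, parent_visits=20, child_visits=5))
    print([cosine_schedule(t, 10, k_min=1, k_max=9) for t in range(11)])
```

With w this small, the exploration bonus only breaks near-ties between children with similar mean rewards, so the search is strongly exploitative once every child has been visited at least once.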