Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning conformational ensembles of proteins based on backbone geometry

Authors: Nicolas Wolf, Leif Seute, Vsevolod Viliuga, Simon Wagner, Jan Stühmer, Frauke Gräter

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our experiments, we demonstrate that the proposed model achieves competitive performance with reduced inference time, across not only an established benchmark of naturally occurring proteins but also de novo proteins, for which evolutionary information is scarce or absent. BBFlow is available at https://github.com/graeter-group/bbflow. For benchmarking BBFlow, we train and test the model on the ATLAS dataset [42]...
Researcher Affiliation	Academia	1 Max Planck Institute for Polymer Research, Mainz, Germany; 2 Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; 3 IWR, Heidelberg University, Heidelberg, Germany; 4 SciLifeLab and DBB at Stockholm University, Stockholm, Sweden; 5 IAR, Karlsruhe Institute of Technology, Karlsruhe, Germany
Pseudocode	Yes	We summarize the training procedure in Algorithm 1.
Open Source Code	Yes	BBFlow is available at https://github.com/graeter-group/bbflow.
Open Datasets	Yes	For benchmarking BBFlow, we train and test the model on the ATLAS dataset [42], which contains a curated set of 300 ns long Molecular Dynamics trajectories for 1390 proteins, the same dataset used for training AlphaFlow.
Dataset Splits	Yes	we train BBFlow on the ATLAS dataset [42] with the same split into training, validation and test proteins. The ATLAS dataset consists of three trajectories of 100 ns long all-atom Molecular Dynamics (MD) simulations for 1390 structurally diverse proteins, of which Jing et al. [16] select 1265 for training, 39 for validation and 82 for testing.
Hardware Specification	Yes	We train the model, and variants where we leave out key features for an ablation study, for 3 days on two NVIDIA A100-40GB GPUs from scratch, i.e. without initial weights from a pre-trained folding model. ... We evaluate the inference time per generated conformation of the 302-residue protein 7c45A, and on the entire ATLAS test set in Fig. 5, using an NVIDIA A100-80GB GPU.
Software Dependencies	Yes	MD simulations are performed using GROMACS v2023 [1], utilizing the CHARMM27 all-atom force field. ... Throughout the simulations, covalent bonds involving hydrogen are constrained using the LINCS algorithm [13].
Experiment Setup	Yes	For all experiments, we use the same hyperparameters as in FrameFlow [51] and GAFL [44], except for the number of timesteps, which we set to 20. Also, the respective feature dimensions are increased by 128 for embedding the amino acid identity as node feature and by 22 or 25, respectively, for embedding the equilibrium structure encoding with or without direction as edge feature.