Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Authors: Daniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori, Ron Dror

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present ConforMix, an inference-time algorithm that enhances sampling of conformational distributions... Case studies of biologically critical proteins demonstrate the scalability, accuracy, and utility of this method.
Researcher Affiliation	Academia	Daniel D. Richman (Stanford University); Jessica Karaguesian (Stanford University); Carl-Mikael Suomivuori (Stanford University); Ron O. Dror (Stanford University)
Pseudocode	Yes	Algorithm 1: ConforMix RMSD for exploration of conformational landscapes. Input: biomolecular structure prediction model $p$, input system $s$, target RMSD values $R$, constraint strength $\alpha$, number of samples per RMSD $N_{\text{samples}}$. Output: samples $\{x_{r,i}\}$ for all $r \in R$ and $i$. 1: $x_d \sim p(x \mid s)$ # Predict a structure using default sampling. 2: $\text{mask} \leftarrow \text{RIGIDELEMENTS}(x_d)$ # Identify atoms within secondary structure elements. 3: $g_r(y \mid x) \leftarrow \exp\left(-\alpha \left(\text{RMSD}(x, x_d; \text{mask}) - r\right)^2\right)$ # Define conditioning potential. 4: for $r \in R$ do. 5: for $i = 1, \ldots, N_{\text{samples}}$ do. 6: $x_{r,i} \leftarrow \text{CONFORMIX}(x; g_r, s)$. 7: end for. 8: end for. 9: return $\{x_{r,i}\}_{i=1}^{N_{\text{samples}}}$ for all $r \in R$.
Open Source Code	Yes	The code for this project is available at github.com/drorlab/conformix.
Open Datasets	Yes	We first implement ConforMix in Boltz-1 [36], an open-source diffusion-based structure prediction model similar to AlphaFold 3. Importantly, Boltz was trained on the Protein Data Bank (PDB), and not with any other dynamics information. ... The set of 38 proteins that exhibit domain motions are the combination of proteins curated by Lewis et al. [17] and in OC23 by Kalakoti et al. [13]. The set of 15 membrane transporters that exhibit inward-open and outward-open conformations were curated in TP16 by Xie & Huang [38]. The set of 31 proteins that exhibit cryptic pocket formation were curated by Lewis et al. [17]. The set of 15 fold switchers proteins are a subset of those curated by Porter et al. [26].
Dataset Splits	No	The paper uses an existing model (Boltz-1) trained on the Protein Data Bank (PDB) but does not specify the training/validation/test splits used for Boltz-1. To evaluate ConforMix, the paper identifies specific sets of proteins (e.g., domain motion proteins, membrane transporters) as case studies, but these serve as evaluation sets rather than explicitly defined training/validation/test splits for the ConforMix methodology itself.
Hardware Specification	No	The paper states: 'Timing statistics are provided in Table S1.' and in the NeurIPS checklist: 'Compute resources are discussed in the Supplement.' However, the supplement is not provided in the main text, so specific hardware details are not available within the given text.
Software Dependencies	No	The paper mentions several software components, such as 'Boltz', 'BioEmu', and the 'pymbar' module, along with their licenses and links, but it does not give version numbers for any of them, which a reproducible description requires.
Experiment Setup	Yes	Algorithm 1: ConforMix RMSD for exploration of conformational landscapes. Input: biomolecular structure prediction model $p$, input system $s$, target RMSD values $R$, constraint strength $\alpha$, number of samples per RMSD $N_{\text{samples}}$. ... We seek to discover the flexibility of each protein by generating structure predictions spanning the range 0 to 20 Å RMSD to $x_d$. ... We reject samples where any 10-residue sliding window has an average pLDDT value of more than 20% below that of the default prediction, as well as structures with clashes.
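
The conditioning potential and the pLDDT rejection rule quoted in the table can be illustrated with a short NumPy sketch. This is not the authors' code (see github.com/drorlab/conformix for that); every function name here is hypothetical. It assumes the potential has the Gaussian form exp(-α (RMSD - r)^2), that the two structures are already superposed before the RMSD is taken, and that the "20% below" criterion is a relative comparison between matching 10-residue windows of the sample and the default prediction. Clash rejection is not covered.

```python
import numpy as np


def masked_rmsd(x, x_ref, mask):
    """RMSD over masked atoms only. Assumes x and x_ref are already superposed."""
    diff = np.asarray(x, dtype=float)[mask] - np.asarray(x_ref, dtype=float)[mask]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def rmsd_potential(x, x_ref, mask, r, alpha):
    """Assumed potential exp(-alpha * (RMSD - r)^2); it peaks when RMSD equals r."""
    return float(np.exp(-alpha * (masked_rmsd(x, x_ref, mask) - r) ** 2))


def window_means(plddt, window=10):
    """Mean pLDDT of every contiguous window of `window` residues."""
    plddt = np.asarray(plddt, dtype=float)
    if plddt.size < window:
        raise ValueError("sequence is shorter than the window")
    return np.convolve(plddt, np.ones(window) / window, mode="valid")


def passes_plddt_filter(sample_plddt, default_plddt, window=10, max_drop=0.20):
    """Keep a sample only if no window mean falls more than `max_drop`
    (relative) below the matching window of the default prediction.
    The window-by-window comparison is an assumption of this sketch."""
    if len(sample_plddt) != len(default_plddt):
        raise ValueError("pLDDT arrays must have the same length")
    sample = window_means(sample_plddt, window)
    default = window_means(default_plddt, window)
    return bool(np.all(sample >= (1.0 - max_drop) * default))


# Toy usage: four atoms, two of them inside the rigid-element mask.
x_default = np.zeros((4, 3))
x_sample = x_default.copy()
x_sample[:, 0] = 2.0  # shift every atom by 2 Å along the x axis
rigid_mask = np.array([True, True, False, False])
weight = rmsd_potential(x_sample, x_default, rigid_mask, r=2.0, alpha=0.5)
```

In the toy usage the masked RMSD equals the target r, so the potential takes its maximum value. A sampler guided by this potential would favor structures near the requested RMSD from the default prediction, and the filter would then discard samples whose local confidence drops sharply.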