Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

FEAT: Free energy Estimators with Adaptive Transport

Authors: Yuanqi Du, Jiajun He, Francisco Vargas, Yuanqing Wang, Carla P. Gomes, José Miguel Hernández-Lobato, Eric Vanden-Eijnden

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental validation on toy examples, molecular simulations, and quantum field theory demonstrates promising improvements over existing learning-based methods. We evaluate FEAT on a diverse range of systems, from toy examples to molecular simulations and quantum field theory.
Researcher Affiliation	Collaboration	(1) University of Cambridge, (2) Cornell University, (3) Xaira Therapeutics, (4) ML Lab, Capital Fund Management, (5) Courant Institute of Mathematical Sciences, NYU
Pseudocode	Yes	An outline of FEAT is provided in Algorithm 1. Algorithm 1 Free energy estimation with FEAT
Open Source Code	Yes	Our PyTorch implementation is available at https://github.com/jiajunhe98/FEAT.
Open Datasets	No	The paper primarily generates data through molecular simulations and toy examples (e.g., Gaussian mixtures, Lennard-Jones particles, Alanine dipeptide simulations, φ^4 quantum field theory) rather than utilizing pre-existing, publicly available datasets for which concrete access information is provided.
Dataset Splits	No	The paper describes simulation setups and sample sizes used for free energy estimation (e.g., '20,000 samples for each marginal', '1,000 samples for ALA-4'), but it does not specify explicit training/test/validation splits for a dataset.
Hardware Specification	Yes	All experiments are run on a single 80G NVIDIA H100.
Software Dependencies	Yes	Specifically, the samples were gathered from a 5 microsecond simulation under 300K with Generalized Born implicit solvent implemented in openmmtools Chodera et al. [2025]. The reference provides 'choderalab/openmmtools: 0.24.1, Jan. 2025.'
Experiment Setup	Yes	We include hyperparameters for model training and evaluation in Table 8. Table 8 details hyperparameters such as 'learning rate', 'batch size', 'iteration number', and 'number of discretization steps'.