Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Amortized Sampling with Transferable Normalizing Flows
Authors: Charlie B. Tan, Majdi Hassan, Leon Klein, Saifuddin Syed, Dominique Beaini, Michael Bronstein, Alexander Tong, Kirill Neklyudov
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive empirical evaluation we demonstrate the efficacy of PROSE as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based fine-tuning procedure to achieve competitive performance to established methods such as sequential Monte Carlo. |
| Researcher Affiliation | Collaboration | (1) University of Oxford, (2) Université de Montréal, (3) Mila Quebec AI Institute, (4) Freie Universität Berlin, (5) Valence Labs, (6) AITHYRA, (7) Institut Courtois |
| Pseudocode | No | The paper describes algorithms and mathematical formulations but does not contain a clearly labeled pseudocode block or algorithm presented in a structured, code-like format. |
| Open Source Code | Yes | We open source our codebase https://github.com/transferable-samplers/transferable-samplers, Many Peptides MD dataset https://huggingface.co/datasets/transferable-samplers/many-peptides-md and model weights https://huggingface.co/transferable-samplers/model-weights. |
| Open Datasets | Yes | We introduce Many Peptides MD; a novel dataset of peptide MD trajectories for sequences ranging from 2 to 8 residues in length¹. ... ¹Available at https://huggingface.co/datasets/transferable-samplers/many-peptides-md |
| Dataset Splits | Yes | For training, a total of 21,700 uniformly sampled sequences are simulated for 200 ns. For evaluation, 30 sequences of length 2, 4, and 8 are randomly sampled such that all amino acids are represented equally, and simulated for 5 µs. Further details on dataset collection and MD configuration provided in Appendix B. Table 1: Number of sequences used per peptide length for training and evaluation. Training: 200 (length 2), 1,000 (length 3), 1,500 (length 4), 2,000 (length 5), 3,000 (length 6), 4,000 (length 7), 10,000 (length 8). Evaluation: 30 each for lengths 2, 4, and 8. |
| Hardware Specification | Yes | All training experiments are run on NVIDIA H100 GPUs using distributed data parallelism. ... All evaluation timings are recorded using NVIDIA L40S GPUs. |
| Software Dependencies | No | The paper mentions software components like 'AdamW optimizer', 'Dormand Prince5 (dopri5) adaptive solver', and 'POT [Flamary et al., 2021] linear optimal transport solver', but does not provide specific version numbers for these or other key software libraries (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | All models are trained for 5 · 10⁵ iterations using a batch size of 512 with the AdamW optimizer [Loshchilov and Hutter, 2018]. We employ a cosine learning rate schedule in which the initial and final learning rates are a reduction of the maximal value by a factor of 500, as well as exponential moving average with decay of 0.999. ... Continuous Normalizing Flows. We use the ECNF++ training recipe defined by Tan et al. [2025]; this entails a learning rate of 5 · 10⁻⁴ and weight decay of 1 · 10⁻², with default AdamW hyperparameters β1, β2 of (0.9, 0.999). |
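
The Experiment Setup excerpt can be read as the following schedule. This is a minimal sketch and not the authors' code. The constants (maximal learning rate 5 · 10⁻⁴, 5 · 10⁵ iterations, reduction factor 500, EMA decay 0.999) come from the quoted text. The warmup length `warmup_steps`, the cosine shape of the warmup ramp, and the function names `lr_at` and `ema_update` are assumptions made for illustration, since the excerpt does not state them.

```python
import math

MAX_LR = 5e-4          # maximal learning rate (quoted ECNF++ recipe)
TOTAL_STEPS = 500_000  # 5 · 10^5 training iterations (quoted)
REDUCTION = 500        # initial and final LR are MAX_LR / 500 (quoted)
EMA_DECAY = 0.999      # exponential moving average decay (quoted)


def lr_at(step, max_lr=MAX_LR, total_steps=TOTAL_STEPS,
          warmup_steps=5_000, reduction=REDUCTION):
    """Learning rate at a given step.

    ASSUMPTION: warmup_steps and the cosine-shaped ramp are hypothetical;
    the excerpt only fixes the start, peak, and end values.
    """
    min_lr = max_lr / reduction
    if step < warmup_steps:
        # Ramp from min_lr up to max_lr along a half cosine.
        frac = step / warmup_steps
        return min_lr + (max_lr - min_lr) * 0.5 * (1 - math.cos(math.pi * frac))
    # Cosine decay from max_lr back down to min_lr.
    frac = min((step - warmup_steps) / max(1, total_steps - warmup_steps), 1.0)
    return min_lr + (max_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * frac))


def ema_update(ema_value, new_value, decay=EMA_DECAY):
    """One exponential-moving-average step for a single scalar parameter."""
    return decay * ema_value + (1 - decay) * new_value


if __name__ == "__main__":
    for s in (0, 5_000, 250_000, 500_000):
        print(s, lr_at(s))
```

Under these assumptions the schedule starts and ends at 10⁻⁶ (5 · 10⁻⁴ divided by 500) and peaks at 5 · 10⁻⁴ once warmup finishes.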