Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

On scalable and efficient training of diffusion samplers

Authors: Minkyu Kim, Kiyoung Seong, Dongyeop Woo, Sungsoo Ahn, Minsu Kim

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that SGDS, despite its simplicity, produces substantial gains over baseline diffusion samplers across benchmarks: classical Gaussian mixtures and the Manywell task; particle simulation problems like LJ-13 and LJ-55; and real-world molecules, Alanine Di-, Tri-, and Tetra-peptide. Our method significantly improves sample efficiency and scalability, marking a practical path towards high-dimensional diffusion-based inference.
Researcher Affiliation	Academia	(1) Korea Advanced Institute of Science and Technology (KAIST); (2) Mila Quebec AI Institute
Pseudocode	Yes	Algorithm 1 Training search-guided diffusion samplers (SGDS)
Open Source Code	Yes	Source code: https://github.com/minkyu1022/SGDS
Open Datasets	Yes	The reference samples can be downloaded from https://zenodo.org/records/15436773.
Dataset Splits	No	The paper describes methods for generating samples (e.g., MCMC chains, burn-in steps) and training on those generated samples, but it does not specify traditional train/validation/test splits of a pre-existing dataset.
Hardware Specification	No	The paper mentions memory limitations for PIS that require halving batch sizes due to the forward SDE computational graph, implying GPU memory constraints. However, it does not specify any particular GPU models, CPU models, or other hardware components used for running experiments.
Software Dependencies	No	The paper mentions "TorchANI [16], a PyTorch implementation of ANI deep learning potentials", indicating the use of PyTorch and TorchANI. However, specific version numbers for these software dependencies are not provided.
Experiment Setup	Yes	In all the experiments, we use four different random seeds and average the results of each run. We provide details of experimental settings in Appendix A.4, Table 4, and Table 5, and additional results in Appendix B. All methods adopt the PIS architecture [47, 39], with a joint network consisting of a two-layer MLP with 256 hidden dimensions. We run 25K epochs in both the first round and the second round. We train PIS at a learning rate of 1e-4, TB at a learning rate of 2e-4, and SGDS at a learning rate of 5e-4. We use 4 and 32 batch sizes for all methods except PIS in LJ-13 and LJ-55, respectively.
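
The hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration sketch. This is only an illustration of the reported settings, not the authors' code: the names `RunConfig`, `make_configs`, and `average_over_seeds` are hypothetical, and the seed values 0 to 3 are placeholders, since the paper excerpt states only that four seeds were used. Batch sizes are left out because the quoted sentence does not fully determine them per method.

```python
from dataclasses import dataclass
from statistics import mean

# Learning rates as quoted in the Experiment Setup row.
LEARNING_RATES = {"PIS": 1e-4, "TB": 2e-4, "SGDS": 5e-4}


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for one training run's settings."""
    method: str
    seed: int
    learning_rate: float
    hidden_dim: int = 256           # two-layer MLP with 256 hidden dimensions
    num_layers: int = 2
    epochs_per_round: int = 25_000  # 25K epochs in each round
    num_rounds: int = 2             # first round and second round


def make_configs(method, seeds=(0, 1, 2, 3)):
    """Build one config per seed. The seed values are assumed placeholders."""
    return [RunConfig(method, seed, LEARNING_RATES[method]) for seed in seeds]


def average_over_seeds(per_seed_metric):
    """Average a metric across runs, as the setup describes for four seeds."""
    return mean(per_seed_metric)
```

For example, `make_configs("SGDS")` yields four configurations that differ only in seed, and `average_over_seeds` reduces the four per-run metric values to the single reported number.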