Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Energy-based generator matching: A neural sampler for general state space

Authors: Dongyeop Woo, Minsu Kim, Minkyu Kim, Kiyoung Seong, Sungsoo Ahn

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate EGM on both discrete and multimodal tasks up to 100 and 20 dimensions, respectively.
Researcher Affiliation	Academia	1 Korea Advanced Institute of Science and Technology (KAIST); 2 Mila Quebec AI Institute
Pseudocode	Yes	Algorithm 1 Iterated training with EGM loss with bootstrapping
Open Source Code	Yes	The code is available at here.
Open Datasets	Yes	We validate our work through experiments on various target distributions: discrete Ising model and three joint discrete-continuous tasks, i.e., the Gaussian-Bernoulli restricted Boltzmann machine (GBRBM) [9], joint double-well potential (Joint DW4), and joint mixture of Gaussians (Joint MoG).
Dataset Splits	Yes	We measure the distances between 2000 empirical samples generated by our samplers and 2000 ground truth samples uniformly selected from extensive Gibbs sampling or exact sampling processes.
Hardware Specification	Yes	Training was conducted on an NVIDIA-3090 GPU (24GB VRAM). All experiments are conducted on an NVIDIA RTX 3090 GPU.
Software Dependencies	No	Evaluation metrics in our experiments primarily utilize Wasserstein distances, computed via the Python Optimal Transport (POT) library [13] using exact linear programming. (No version numbers specified for Python or POT itself)
Experiment Setup	Yes	We employed 2000 Monte Carlo samples for estimations and a training batch size of 300. Both EGM and bootstrapping utilized 100 outer-loop iterations, with each iteration collecting 2000 samples into a buffer with a maximum size of 10k. The inner-loop iterations were set to 100 for EGM and 1000 for bootstrapping. We adopted a linear masking schedule (κ_t = t), a linear conditional OT schedule (α_t = t), and an exponential variance exploding (VE) schedule (σ_t = σ_max (σ_min/σ_max)^t). All samplers were trained using the AdamW optimizer with an initial learning rate of 10^-3, applying a cosine learning rate schedule with η_min = 10^-5.
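
The schedules quoted in the Experiment Setup row can be written out as a minimal Python sketch. This is an illustration, not the authors' code: the function names, the example values of sigma_min and sigma_max, and the choice of total_steps (100 outer × 100 inner iterations, which assumes the cosine schedule spans the whole EGM run) are all assumptions not stated in the quoted text.

```python
import math


def kappa(t: float) -> float:
    """Linear masking schedule, kappa_t = t."""
    return t


def alpha(t: float) -> float:
    """Linear conditional OT schedule, alpha_t = t."""
    return t


def sigma(t: float, sigma_min: float, sigma_max: float) -> float:
    """Exponential VE schedule, sigma_t = sigma_max * (sigma_min / sigma_max) ** t.

    Equals sigma_max at t = 0 and sigma_min at t = 1.
    """
    return sigma_max * (sigma_min / sigma_max) ** t


def cosine_lr(step: int, total_steps: int,
              lr_init: float = 1e-3, lr_min: float = 1e-5) -> float:
    """Cosine learning rate decay from lr_init down to lr_min over total_steps."""
    progress = min(step, total_steps) / total_steps
    return lr_min + 0.5 * (lr_init - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical values: the quoted setup does not give sigma_min or sigma_max.
    sigma_min, sigma_max = 0.01, 10.0
    # Assumption: schedule length = 100 outer iterations x 100 inner iterations.
    total_steps = 100 * 100
    for t in (0.0, 0.5, 1.0):
        print(f"t={t:.1f}  kappa={kappa(t):.2f}  alpha={alpha(t):.2f}  "
              f"sigma={sigma(t, sigma_min, sigma_max):.4f}")
    for step in (0, total_steps // 2, total_steps):
        print(f"step={step}  lr={cosine_lr(step, total_steps):.6f}")
```

With this time convention the noise scale decreases from sigma_max at t = 0 to sigma_min at t = 1, and the learning rate falls from 10^-3 at the first step to 10^-5 at the last.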