Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Hamiltonian Score Matching and Generative Flows
Authors: Peter Holderrieth, Yilun Xu, Tommi Jaakkola
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 9 Experiments |
| Researcher Affiliation | Collaboration | Peter Holderrieth (MIT CSAIL), Yilun Xu (NVIDIA), Tommi Jaakkola (MIT CSAIL) |
| Pseudocode | No | No explicit pseudocode or algorithm block found. Methods are described in prose and mathematical equations. |
| Open Source Code | No | Code can be provided upon request. |
| Open Datasets | Yes | Specifically, we train an Oscillation HGF on CIFAR-10 unconditional and conditional. FFHQ (unconditional)-64x64 |
| Dataset Splits | No | Appendix L lists training details like "We set the reference batch size to 516 on CIFAR-10 and 256 on FFHQ. We train for 200 million images in total", but does not explicitly state the train/validation/test data splits. |
| Hardware Specification | Yes | All the experiments are run on 8 NVIDIA A100 GPUs. |
| Software Dependencies | No | We used PyTorch as a library for automatic differentiation [38]. |
| Experiment Setup | Yes | We set the reference batch size to 516 on CIFAR-10 and 256 on FFHQ. We train for 200 million images in total, corresponding to approximately 3000 epochs and 48 hours of training time for CIFAR-10 and 96 hours for FFHQ. As outlined in the experiments section, the hyperparameters and training procedure are the same as [26]: namely, we used the Adam optimizer with learning rate 0.001, exponential moving average (EMA) with momentum 0.5, data augmentation pipeline adapted from [28], dropout probability of 0.13, and FP32 precision. For sampling, we use the 2nd order Heun's sampler [26]. |
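
The Experiment Setup row can be made concrete with a small sketch. The Python below only collects the values quoted in the table into a config object and derives the implied number of optimizer steps. The class and field names (`TrainConfig`, `optimizer_steps`, and so on) are hypothetical and do not come from the authors' code, which is not public. The `heun_step` function is the textbook second-order Heun method for a generic ODE, included only to illustrate what a second-order Heun update looks like; it is not claimed to match the sampler of [26] in its time discretization or noise schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values quoted in the Experiment Setup row; all field names are hypothetical."""
    dataset: str
    batch_size: int
    total_images: int = 200_000_000   # "200 million images in total"
    learning_rate: float = 1e-3       # Adam learning rate
    ema_momentum: float = 0.5         # EMA momentum as reported
    dropout: float = 0.13
    precision: str = "fp32"

    def optimizer_steps(self) -> int:
        # Number of full batches needed to process total_images samples.
        return self.total_images // self.batch_size


def heun_step(f, x, t, dt):
    """One step of the textbook 2nd-order Heun method for dx/dt = f(x, t)."""
    k1 = f(x, t)                      # slope at the current point
    x_euler = x + dt * k1             # Euler predictor
    k2 = f(x_euler, t + dt)           # slope at the predicted point
    return x + 0.5 * dt * (k1 + k2)   # average the two slopes (corrector)


CIFAR10 = TrainConfig("CIFAR-10", batch_size=516)
FFHQ = TrainConfig("FFHQ-64x64", batch_size=256)

if __name__ == "__main__":
    for cfg in (CIFAR10, FFHQ):
        print(cfg.dataset, cfg.optimizer_steps())
```

Under these assumptions, the reported batch sizes imply several hundred thousand optimizer steps per dataset. The calculation ignores any warm-up or batch-size changes that the original training script may apply.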