Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning normalized image densities via dual score matching
Authors: Florentin Guth, Zahra Kadkhodaie, Eero P. Simoncelli
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train an energy network with this dual score matching objective on the ImageNet64 dataset, and obtain a cross-entropy (negative log likelihood) value comparable to the state of the art. We further validate our approach by showing that our energy model strongly generalizes: log probabilities estimated with two networks trained on nonoverlapping data subsets are nearly identical. |
| Researcher Affiliation | Academia | Florentin Guth (Center for Data Science, New York University; Flatiron Institute, Simons Foundation); Zahra Kadkhodaie (Flatiron Institute, Simons Foundation); Eero P. Simoncelli (New York University; Flatiron Institute, Simons Foundation) |
| Pseudocode | No | The paper describes the methodology and architecture in detail using mathematical equations and textual descriptions, but it does not include an explicit pseudocode block or algorithm figure. |
| Open Source Code | Yes | Pre-trained models and software for running all experiments are available at https://github.com/FlorentinGuth/DualScoreMatching. |
| Open Datasets | Yes | We train our energy model on ImageNet64 (Russakovsky et al., 2015; Chrabaszcz et al., 2017) |
| Dataset Splits | Yes | We train an energy network with this dual score matching objective on the ImageNet64 dataset... The previous section demonstrated that our energy-based model achieves near state-of-the-art NLL on ImageNet64. That is, the model on average assigns high probability to a set of held-out test images. To this end, we use the strong generalization test developed in Kadkhodaie et al. (2024). We partition the training data into two non-overlapping sets, train a separate energy-based model on each set, and then compare the energies computed by these two models on images from both training subsets. Figure 2 shows the results of this experiment. The two models assign very different probabilities to the same image when the training set size, N, is small. But they converge gradually and compute nearly the same values at N = 10^5. |
| Hardware Specification | Yes | All models are trained on a single NVIDIA H100 GPU, which takes about 5 days for ImageNet64. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and Fourier features for time embedding but does not specify version numbers for any software libraries or programming languages. |
| Experiment Setup | Yes | All models are trained for 1M steps, with a batch size of 128. We use the Adam optimizer with default parameters and an initial learning rate of 0.0005 (except for the generalization experiments which used a learning rate of 0.0002) that is halved every 100,000 steps. In our experiments we use t_min = 10^-9 and t_max = 10^3, and the training image intensities are rescaled to have values in [0, 1]. A time embedding e(t) ∈ R^256 is computed with Fourier features cos(ω_k t), sin(ω_k t) (we use 32 frequencies (ω_k)_k that are linearly spaced in the log domain and ranging from 1/t_max to 1/t_min) followed by a shallow MLP. |
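
The Experiment Setup row can be read as a small configuration. The following is a minimal NumPy sketch of the learning-rate schedule and the raw Fourier time features it describes, not the authors' implementation. The function names are hypothetical, the stepwise form of the halving is an assumption, and the shallow MLP that maps the 64 raw features to R^256 is omitted because the table gives no details about it.

```python
import numpy as np

# Values quoted in the Experiment Setup row.
T_MIN, T_MAX = 1e-9, 1e3
NUM_FREQS = 32


def fourier_frequencies(t_min=T_MIN, t_max=T_MAX, num=NUM_FREQS):
    """Frequencies spaced linearly in the log domain from 1/t_max to 1/t_min."""
    return np.geomspace(1.0 / t_max, 1.0 / t_min, num)


def fourier_features(t, freqs):
    """Raw features [cos(w_k t), sin(w_k t)] for each time t.

    Returns an array of shape (..., 2 * len(freqs)). In the paper these
    features are then passed through a shallow MLP (not shown here).
    """
    t = np.asarray(t, dtype=np.float64)
    phases = t[..., None] * freqs
    return np.concatenate([np.cos(phases), np.sin(phases)], axis=-1)


def learning_rate(step, base_lr=5e-4, halve_every=100_000):
    """Assumed stepwise schedule: the rate is halved every `halve_every` steps."""
    return base_lr * 0.5 ** (step // halve_every)


freqs = fourier_frequencies()
features = fourier_features(np.array([0.0, 1.0]), freqs)
```

Under this reading, the rate after the full 1M steps would be `learning_rate(1_000_000)`, that is, the initial rate divided by 2^10. The generalization experiments would use `base_lr=2e-4` instead.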