Improved sampling via learned diffusions

Authors: Lorenz Richter, Julius Berner

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "4 NUMERICAL EXPERIMENTS. We evaluate the different methods on the following three numerical benchmark examples."
Researcher Affiliation | Collaboration | Lorenz Richter (Zuse Institute Berlin; dida Datenschmiede GmbH; richter@zib.de), Julius Berner (Caltech; jberner@caltech.edu)
Pseudocode | Yes | "Algorithm 1: Training of a generalized time-reversed diffusion sampler" (a training-loop sketch follows the table)
Open Source Code | Yes | The repository can be found at https://github.com/juliusberner/sde_sampler.
Open Datasets | Yes | Gaussian mixture model (GMM): "We consider ρ(x) = (1/m) ∑_{i=1}^m N(x; μ_i, Σ_i) and choose m = 9, Σ_i = 0.3 I, (μ_i)_{i=1}^9 = {−5, 0, 5} × {−5, 0, 5} ⊂ R^2 to obtain well-separated modes, see Figure 2." (a code sketch of this target follows the table)
Dataset Splits | No | The paper describes generating samples from target distributions and evaluating their quality against ground truth. It does not mention training, validation, or test splits, since the method learns to transport a prior distribution to a target distribution rather than splitting a pre-existing dataset; split information is therefore not applicable.
Hardware Specification | No | The paper only states that "every experiment is executed on a single GPU", which does not specify the GPU model, memory, or any other relevant hardware details.
Software Dependencies | No | The paper mentions a "PyTorch implementation" and the "Adam optimizer" but does not specify version numbers for PyTorch or any other software libraries, which is crucial for reproducibility.
Experiment Setup | Yes | "In particular, we use the Fourier MLPs of Zhang & Chen (2022), a batch size of 2048, and the Adam optimizer. To facilitate the comparisons, we use a fixed number of 200 steps for the Euler-Maruyama scheme. A difference to Berner et al. (2024) is that we observed better performance (for all considered methods and losses) by using an exponentially decaying learning rate starting at 0.005 and decaying every 100 steps to a final learning rate of 10^-4. We use 60000 gradient steps for the experiments with d ≤ 10 and 120000 gradient steps otherwise to approximately achieve convergence." (a configuration sketch follows the table)
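The GMM target quoted in the Open Datasets row can be written down explicitly: 9 equally weighted Gaussians with covariance 0.3 I and means on the grid {−5, 0, 5} × {−5, 0, 5} in R^2. The following is a minimal PyTorch sketch of that density for illustration only; it is not taken from the sde_sampler repository.

```python
import torch
from torch.distributions import Categorical, Independent, MixtureSameFamily, Normal

# 9 means on the grid {-5, 0, 5} x {-5, 0, 5} in R^2.
grid = torch.tensor([-5.0, 0.0, 5.0])
means = torch.cartesian_prod(grid, grid)        # shape (9, 2)
scales = torch.full_like(means, 0.3 ** 0.5)     # Sigma_i = 0.3 * I, so per-dimension std = sqrt(0.3)

gmm = MixtureSameFamily(
    mixture_distribution=Categorical(probs=torch.full((9,), 1.0 / 9.0)),
    component_distribution=Independent(Normal(means, scales), 1),
)

x = gmm.sample((2048,))       # reference samples, e.g. for evaluating sample quality
log_rho = gmm.log_prob(x)     # target log-density at the sampled points
```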
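The Pseudocode row refers to Algorithm 1 (training of a generalized time-reversed diffusion sampler). The paper's actual objectives are divergences between path measures (e.g. the log-variance loss), which are not reproduced here; the sketch below is only a hypothetical skeleton, assuming a simple placeholder objective (running control cost plus a terminal term built from the target log-density, reusing the `gmm` target from the sketch above). The function name `simulate_and_loss` and the network interface are assumptions made for illustration.

```python
import torch

def simulate_and_loss(control_net, target_log_prob, batch=2048, n_steps=200, d=2, T=1.0):
    """Simulate a controlled SDE with Euler-Maruyama and return a scalar training loss.

    The loss below is a placeholder control objective; the paper instead studies
    divergences between path measures (e.g. the log-variance loss).
    """
    dt = T / n_steps
    x = torch.randn(batch, d)                  # samples from the prior
    running_cost = torch.zeros(batch)
    for k in range(n_steps):
        t = torch.full((batch, 1), k * dt)
        u = control_net(torch.cat([x, t], dim=-1))            # learned control/drift
        x = x + u * dt + (dt ** 0.5) * torch.randn_like(x)    # Euler-Maruyama step
        running_cost = running_cost + 0.5 * (u ** 2).sum(-1) * dt
    return (running_cost - target_log_prob(x)).mean()
```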
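The hyperparameters quoted in the Experiment Setup row translate into the configuration sketch below. The paper states the initial and final learning rates (0.005 and 10^-4) and the decay interval (every 100 steps) but not the decay factor, so `gamma` here is one consistent choice for the 60000-step runs; `control_net` is a placeholder network (the paper uses the Fourier MLPs of Zhang & Chen, 2022), and `simulate_and_loss` is the assumed helper from the previous sketch.

```python
import torch

# Placeholder drift network with inputs (x, t) in R^3 and outputs in R^2.
control_net = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.GELU(), torch.nn.Linear(64, 2)
)

n_grad_steps = 60_000                                       # 120_000 for d > 10
optimizer = torch.optim.Adam(control_net.parameters(), lr=5e-3)

# Decay every 100 steps; gamma chosen so that 5e-3 * gamma**(n_grad_steps / 100) == 1e-4.
gamma = (1e-4 / 5e-3) ** (100 / n_grad_steps)               # ~0.9935
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=gamma)

for step in range(n_grad_steps):                            # illustrative loop; slow as written
    loss = simulate_and_loss(control_net, gmm.log_prob, batch=2048, n_steps=200)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```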