Improved sampling via learned diffusions
Authors: Lorenz Richter, Julius Berner
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4, Numerical Experiments: We evaluate the different methods on the following three numerical benchmark examples. |
| Researcher Affiliation | Collaboration | Lorenz Richter (Zuse Institute Berlin; dida Datenschmiede GmbH), richter@zib.de; Julius Berner (Caltech), jberner@caltech.edu |
| Pseudocode | Yes | Algorithm 1 Training of a generalized time-reversed diffusion sampler |
| Open Source Code | Yes | The repository can be found at https://github.com/juliusberner/sde_sampler. |
| Open Datasets | Yes | Gaussian mixture model (GMM): We consider ρ(x) = (1/m) Σᵢ₌₁ᵐ N(x; µᵢ, Σᵢ) and choose m = 9, Σᵢ = 0.3 I, (µᵢ)ᵢ₌₁⁹ = {−5, 0, 5} × {−5, 0, 5} ⊂ ℝ² to obtain well-separated modes, see Figure 2. (A minimal sketch of this target follows below the table.) |
| Dataset Splits | No | The paper describes generating samples from target distributions and evaluating their quality against ground truth. It does not mention traditional training, validation, or test splits of a fixed dataset, as its methodology involves learning to transport a prior distribution to a target distribution rather than splitting a pre-existing dataset. Therefore, specific dataset split information is not applicable or provided. |
| Hardware Specification | No | The paper vaguely mentions: “Every experiment is executed on a single GPU”. This is insufficient as it does not specify the model, memory, or any other relevant details of the GPU or other hardware components used. |
| Software Dependencies | No | The paper mentions using a “PyTorch implementation” and the “Adam optimizer”, but it does not specify the version numbers for PyTorch or any other software libraries, which is crucial for reproducibility. |
| Experiment Setup | Yes | In particular, we use the Fourier MLPs of Zhang & Chen (2022), a batch size of 2048, and the Adam optimizer. To facilitate the comparisons, we use a fixed number of 200 steps for the Euler-Maruyama scheme. A difference to Berner et al. (2024) is that we observed better performance (for all considered methods and losses) by using an exponentially decaying learning rate starting at 0.005 and decaying every 100 steps to a final learning rate of 10⁻⁴. We use 60000 gradient steps for the experiments with d ≤ 10 and 120000 gradient steps otherwise to approximately achieve convergence. (A minimal sketch of this schedule follows below the table.) |
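As a reading aid for the GMM target quoted in the Open Datasets row, the following is a minimal PyTorch sketch of a 9-component mixture with covariance 0.3 I and means on the grid {−5, 0, 5} × {−5, 0, 5}. The variable names and the use of `torch.distributions` are our own assumptions and are not taken from the authors' `sde_sampler` repository.

```python
import torch
from torch.distributions import Categorical, Independent, MixtureSameFamily, Normal

# Hypothetical reconstruction of the GMM target: 9 equally weighted
# Gaussians in R^2 with covariance 0.3 * I and means on {-5, 0, 5}^2.
means = torch.tensor(
    [[x, y] for x in (-5.0, 0.0, 5.0) for y in (-5.0, 0.0, 5.0)]
)  # shape (9, 2)
scales = torch.full_like(means, 0.3 ** 0.5)  # per-dimension std = sqrt(0.3)

gmm = MixtureSameFamily(
    mixture_distribution=Categorical(logits=torch.zeros(9)),  # uniform weights 1/m
    component_distribution=Independent(Normal(means, scales), 1),
)

samples = gmm.sample((2048,))    # ground-truth samples for evaluating a sampler
log_rho = gmm.log_prob(samples)  # target log-density log rho(x)
```

Because the target has a closed form, ground-truth samples and log-densities are available for evaluation, which is consistent with the Dataset Splits row's observation that no conventional train/validation/test split is involved.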
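The Experiment Setup row can likewise be illustrated with a short, hedged sketch of the reported optimization schedule: Adam, batch size 2048, and a learning rate decaying exponentially from 0.005 to 10⁻⁴ every 100 steps over 60000 gradient steps. The model `drift_net` and the loss below are placeholders, not the Fourier MLP of Zhang & Chen (2022) or the losses of Algorithm 1.

```python
import torch

# Placeholder network standing in for the Fourier MLP of Zhang & Chen (2022).
drift_net = torch.nn.Linear(2, 2)

num_steps, decay_every = 60_000, 100   # 120_000 steps are reported for d > 10
lr_init, lr_final = 5e-3, 1e-4
# Per-decay factor chosen so the learning rate reaches lr_final at num_steps.
gamma = (lr_final / lr_init) ** (decay_every / num_steps)

optimizer = torch.optim.Adam(drift_net.parameters(), lr=lr_init)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=decay_every, gamma=gamma)

for step in range(num_steps):
    optimizer.zero_grad()
    # Dummy quadratic loss on a batch of 2048 samples; the paper instead trains
    # its generalized time-reversed diffusion sampler with the loss in Algorithm 1.
    loss = drift_net(torch.randn(2048, 2)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

With the decay applied every 100 steps, gamma = (10⁻⁴ / 5·10⁻³)^(100/60000) ≈ 0.9935, so the learning rate reaches the reported final value of 10⁻⁴ at step 60000.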