Improved off-policy training of diffusion samplers
Authors: Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at (link) as a base for future work on diffusion models for amortized inference. |
| Researcher Affiliation | Collaboration | Marcin Sendera (Mila, Université de Montréal; Jagiellonian University); Minsu Kim (Mila, Université de Montréal; KAIST); Sarthak Mittal (Mila, Université de Montréal); Pablo Lemos (Mila, Université de Montréal; Ciela Institute; Dreamfold); Luca Scimeca (Mila, Université de Montréal); Jarrid Rector-Brooks (Mila, Université de Montréal; Dreamfold); Alexandre Adam (Mila, Université de Montréal; Ciela Institute); Yoshua Bengio (Mila, Université de Montréal; CIFAR); Nikolay Malkin (Mila, Université de Montréal; University of Edinburgh) |
| Pseudocode | Yes | Algorithm 1: GFlowNet Training with Local Search |
| Open Source Code | Yes | Our code for the sampling methods and benchmarks studied is made public at (link) as a base for future work on diffusion models for amortized inference. |
| Open Datasets | Yes | sampling from energy distributions: a 2-dimensional mixture of Gaussians with 25 modes (25GMM), the 10-dimensional Funnel, the 32-dimensional Manywell distribution, and the 1600-dimensional Log-Gaussian Cox process; and conditional sampling from the latent posterior of a variational autoencoder (VAE; [41, 61]). |
| Dataset Splits | No | The paper mentions training and testing data but does not explicitly provide details about a validation dataset split, specific percentages, or sample counts. |
| Hardware Specification | Yes | In each experiment, we train models on a single NVIDIA A100-Large GPU, if not stated explicitly otherwise. |
| Software Dependencies | No | The paper does not provide specific version numbers for key software dependencies or libraries used in the experiments, such as deep learning frameworks or numerical libraries. |
| Experiment Setup | Yes | For all our experiments, we used a learning rate of 10⁻³. Additionally, we used a higher learning rate for learning the flow parameterization, which is set as 10⁻¹ when using the TB loss and 10⁻² with the SubTB loss. |
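
The exploration strategy quoted in the Research Type and Pseudocode rows above combines forward simulation of the sampler, local search in the target space under the known energy, and an off-policy update from a replay buffer. The snippet below is a minimal, hypothetical sketch of that loop, not the authors' released code: the `sampler.sample_terminal` and `sampler.off_policy_loss` methods, the toy two-mode target, and all hyperparameters are assumptions made for illustration.

```python
import torch
from collections import deque

def log_reward(x):
    # Toy target: a 2-D mixture of two Gaussians (stand-in for the 25GMM task).
    centers = torch.tensor([[-2.0, -2.0], [2.0, 2.0]])
    sq_dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return torch.logsumexp(-0.5 * sq_dists, dim=1)

def local_search(x, n_steps=10, step_size=0.05):
    # Langevin-style refinement of terminal samples in the target space.
    x = x.clone().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(log_reward(x).sum(), x)[0]
        with torch.no_grad():
            x = x + step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
        x = x.detach().requires_grad_(True)
    return x.detach()

replay_buffer = deque(maxlen=10_000)  # stores refined terminal states

def training_step(sampler, optimizer, batch_size=64):
    # 1) Exploration: simulate the diffusion sampler forward to terminal states.
    x_end = sampler.sample_terminal(batch_size)            # hypothetical API
    # 2) Local search under the target energy, then store refined points.
    replay_buffer.extend(local_search(x_end))
    # 3) Off-policy update on states drawn from the replay buffer.
    idx = torch.randint(len(replay_buffer), (batch_size,)).tolist()
    x_buffer = torch.stack([replay_buffer[i] for i in idx])
    loss = sampler.off_policy_loss(x_buffer, log_reward)   # hypothetical API
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```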
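
The Experiment Setup row quotes a base learning rate of 10⁻³ with a higher rate for the flow parameterization (10⁻¹ under TB, 10⁻² under SubTB). A minimal sketch of how such per-module learning rates can be configured with PyTorch parameter groups is shown below; `policy_net` and `flow_net` are hypothetical module names, not identifiers from the paper's codebase.

```python
import torch
import torch.nn as nn

# Hypothetical networks: a drift/policy network and a scalar flow head.
policy_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
flow_net = nn.Linear(2, 1)

use_tb_loss = True                  # True -> TB objective, False -> SubTB
flow_lr = 1e-1 if use_tb_loss else 1e-2

# Base learning rate 1e-3 for the policy, a higher rate for the flow head.
optimizer = torch.optim.Adam([
    {"params": policy_net.parameters(), "lr": 1e-3},
    {"params": flow_net.parameters(), "lr": flow_lr},
])
```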