Moment Matching Denoising Gibbs Sampling
Authors: Mingtian Zhang, Alex Hawkins-Hooker, Brooks Paige, David Barber
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore the benefits of our approach compared to related methods and demonstrate how to scale the method to high-dimensional datasets. ... We demonstrate the generation of high-quality images using only a single level of fixed noise. Furthermore, we showcase the application of our proposed method in multi-level noise scenarios, closely resembling a diffusion model. |
| Researcher Affiliation | Collaboration | Mingtian Zhang, Centre for Artificial Intelligence, University College London, m.zhang@cs.ucl.ac.uk; Alex Hawkins-Hooker, Centre for Artificial Intelligence, University College London, a.hawkins-hooker@cs.ucl.ac.uk; Brooks Paige, Centre for Artificial Intelligence, University College London, b.paige@ucl.ac.uk; David Barber, Centre for Artificial Intelligence, University College London, david.barber@ucl.ac.uk. This work was partially done during an internship in Huawei Noah's Ark Lab. |
| Pseudocode | Yes | Algorithm 1 Sampling with Langevin Dynamics, Algorithm 2 Sampling with the proposed pseudo Gibbs Sampling |
| Open Source Code | Yes | The code of the experiments can be found in https://github.com/zmtomorrow/MMDGS_NeurIPS. |
| Open Datasets | Yes | We then apply the proposed method to model the grey-scale MNIST [21] dataset. ... We then apply the same method to model the more complicated CIFAR-10 [20] dataset. ... In Figures 9 and 10, we visualize the samples from models that are trained on CIFAR-10 and CelebA separately. |
| Dataset Splits | No | The paper mentions using training data (e.g., "we sample 10,000 data points from pd as our training data") but does not provide specific train/validation/test dataset splits with percentages, sample counts, or explicit instructions for how data was partitioned for validation purposes. |
| Hardware Specification | Yes | All the experiments conducted in this paper are run on one single NVIDIA GTX 3090. |
| Software Dependencies | No | The paper mentions "PyTorch [27]" but does not provide specific version numbers for software dependencies such as PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | For the KL-trained Gibbs sampler described in Section 2.2, we use a network with 3 hidden layers with 400 hidden units, Swish activation [28] and output size 4 to generate both the mean and log standard deviation of the Gaussian approximation. For the moment-matching Gibbs sampler (including both full and isotropic covariance), we use the same network architecture but with output size 1 to get the scalar energy and DSM as the training objective. Both networks are trained with batch size 100 and the Adam [19] optimizer with learning rate 1 × 10⁻⁴ for 100 epochs. ... We train both networks for 300 epochs with learning rate 1 × 10⁻⁴ and batch size 100. |
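The experiment-setup row above fully specifies the energy network for the moment-matching Gibbs sampler: 3 hidden layers of 400 units, Swish activation, a scalar output, and Adam with learning rate 1 × 10⁻⁴. A minimal PyTorch sketch of that architecture is below; the input dimension (here 2, for the toy 2D experiments) is an assumption, not stated in this row, and this is an illustrative reconstruction rather than the authors' released code.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Sketch of the described energy network: 3 hidden layers of 400
    units with Swish (SiLU) activation and a scalar energy output."""

    def __init__(self, in_dim: int = 2, hidden: int = 400):
        # in_dim=2 is an assumption (2D toy data); the paper's row
        # does not state the input dimensionality.
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.SiLU(),   # SiLU == Swish
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),                   # scalar energy
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = EnergyNet()
# Adam optimizer with the reported learning rate of 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.randn(100, 2)   # batch size 100, as reported
energy = model(x)
print(tuple(energy.shape))  # (100, 1)
```

The KL-trained variant described in the same row would differ only in the final layer (output size 4, splitting into mean and log standard deviation) and in its training objective.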