Moment Matching Denoising Gibbs Sampling

Authors: Mingtian Zhang, Alex Hawkins-Hooker, Brooks Paige, David Barber

NeurIPS 2023

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental
"We explore the benefits of our approach compared to related methods and demonstrate how to scale the method to high-dimensional datasets. ... We demonstrate the generation of high-quality images using only a single level of fixed noise. Furthermore, we showcase the application of our proposed method in multi-level noise scenarios, closely resembling a diffusion model."
Researcher Affiliation: Collaboration
Mingtian Zhang (Centre for Artificial Intelligence, University College London, m.zhang@cs.ucl.ac.uk); Alex Hawkins-Hooker (Centre for Artificial Intelligence, University College London, a.hawkins-hooker@cs.ucl.ac.uk); Brooks Paige (Centre for Artificial Intelligence, University College London, b.paige@ucl.ac.uk); David Barber (Centre for Artificial Intelligence, University College London, david.barber@ucl.ac.uk). "This work was partially done during an internship in Huawei Noah's Ark Lab."
Pseudocode: Yes
Algorithm 1 ("Sampling with Langevin Dynamics") and Algorithm 2 ("Sampling with the proposed pseudo Gibbs Sampling").
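For reference, a minimal PyTorch sketch of the two samplers in their textbook forms, not necessarily the paper's exact pseudocode: Algorithm 1 corresponds to unadjusted Langevin dynamics, and Algorithm 2 alternates Gaussian corruption with a draw from a Gaussian approximation of the denoising posterior. The callables score, post_mean, and post_std, along with the step sizes, are illustrative assumptions; in the paper the posterior moments come from moment matching.

```python
import torch

def langevin_sampling(score, x0, step_size=1e-3, n_steps=1000):
    """Unadjusted Langevin dynamics (textbook form of Algorithm 1).

    score(x) is assumed to approximate grad_x log p(x); the step size
    and step count are illustrative placeholders.
    """
    x = x0.clone()
    for _ in range(n_steps):
        x = (x + 0.5 * step_size * score(x)
             + step_size ** 0.5 * torch.randn_like(x))
    return x

def denoising_gibbs_sampling(post_mean, post_std, x0, sigma, n_steps=1000):
    """Pseudo Gibbs sampling in the spirit of Algorithm 2.

    Alternates (i) corrupting x with N(0, sigma^2 I) noise and
    (ii) sampling x from a Gaussian approximation of the denoising
    posterior p(x | x_tilde); post_mean and post_std stand in for the
    paper's moment-matched posterior approximation.
    """
    x = x0.clone()
    for _ in range(n_steps):
        x_tilde = x + sigma * torch.randn_like(x)  # sample x_tilde | x
        x = post_mean(x_tilde) + post_std(x_tilde) * torch.randn_like(x)  # x | x_tilde
    return x
```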
Open Source Code: Yes
"The code of the experiments can be found in https://github.com/zmtomorrow/MMDGS_NeurIPS."
Open Datasets: Yes
"We then apply the proposed method to model the grey-scale MNIST [21] dataset. ... We then apply the same method to model the more complicated CIFAR-10 [20] dataset. ... In Figures 9 and 10, we visualize the samples from models that are trained on CIFAR-10 and CelebA separately."
Dataset Splits: No
The paper mentions training data (e.g., "we sample 10,000 data points from p_d as our training data") but does not specify train/validation/test splits, whether as percentages, sample counts, or an explicit partitioning procedure.
Hardware Specification: Yes
"All the experiments conducted in this paper are run on one single NVIDIA GTX 3090."
Software Dependencies: No
The paper mentions "PyTorch [27]" but does not provide version numbers for software dependencies such as PyTorch, Python, or CUDA.
Experiment Setup: Yes
"For the KL-trained Gibbs sampler described in Section 2.2, we use a network with 3 hidden layers with 400 hidden units, Swish activation [28] and output size 4 to generate both the mean and log standard deviation of the Gaussian approximation. For the moment-matching Gibbs sampler (including both full and isotropic covariance), we use the same network architecture but with output size 1 to get the scalar energy and DSM as the training objective. Both networks are trained with batch size 100 and the Adam [19] optimizer with learning rate 1 × 10^-4 for 100 epochs. ... We train both networks for 300 epochs with learning rate 1 × 10^-4 and batch size 100."
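The quoted setup maps onto a small MLP. Below is a hedged PyTorch sketch of the architecture and training configuration; the input dimensionality of 2 is an assumption (output size 4 reads as a 2-D mean plus a 2-D log standard deviation for the toy data), and "DSM" is taken to be the standard denoising score matching objective of Vincent (2011).

```python
import torch
import torch.nn as nn

def make_mlp(in_dim=2, out_dim=4, hidden=400, n_hidden=3):
    """3 hidden layers of 400 units with Swish (SiLU) activations, per the
    quoted setup; in_dim=2 is an assumption for the 2-D toy experiments."""
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden), nn.SiLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

# KL-trained sampler: outputs mean and log-std of the Gaussian approximation.
kl_net = make_mlp(out_dim=4)
# Moment-matching sampler: outputs a scalar energy, trained with DSM.
energy_net = make_mlp(out_dim=1)
optimizer = torch.optim.Adam(energy_net.parameters(), lr=1e-4)  # batch size 100

def dsm_loss(energy_net, x, sigma):
    """Standard denoising score matching (Vincent, 2011) for an energy
    model: match the model score -grad E(x_tilde) to the score of the
    Gaussian corruption kernel, -(x_tilde - x) / sigma^2."""
    x_tilde = (x + sigma * torch.randn_like(x)).requires_grad_(True)
    score = -torch.autograd.grad(energy_net(x_tilde).sum(), x_tilde,
                                 create_graph=True)[0]
    target = -(x_tilde - x) / sigma ** 2
    return ((score - target) ** 2).sum(dim=1).mean()
```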