Improved Autoregressive Modeling with Distribution Smoothing
Authors: Chenlin Meng, Jiaming Song, Yang Song, Shengjia Zhao, Stefano Ermon
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show with extensive experimental results that our approach is able to drastically improve the sample quality of current autoregressive models on several synthetic datasets and real-world image datasets, while obtaining competitive likelihoods on synthetic datasets. We empirically demonstrate that our method can also be applied to density estimation, image inpainting, and image denoising. We also present results on image inpainting in Appendix C.2. |
| Researcher Affiliation | Academia | Chenlin Meng, Jiaming Song, Yang Song, Shengjia Zhao & Stefano Ermon (Stanford University); {chenlin,tsong,yangsong,sjzhao,ermon}@cs.stanford.edu |
| Pseudocode | No | The paper describes algorithms and methods but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code release. |
| Open Datasets | Yes | In this section, we focus on three common image datasets, namely MNIST, CIFAR-10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2015). |
| Dataset Splits | No | The paper uses MNIST, CIFAR-10, and CelebA, which have conventional train/test/validation splits, but it does not explicitly state the splits used in its experiments (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using PixelCNN++ as the model architecture, but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We train all the models with the Adam optimizer with learning rate 0.0002. For the image experiments, we first rescale images to [-1, 1] and then perturb the images with q(x̃\|x) = N(x̃\|x, σ²I). We use σ = 0.5 for MNIST and σ = 0.3 for both CIFAR-10 and CelebA. (A minimal code sketch of this perturbation follows the table.) |
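
Since the paper does not release code, the following is a minimal PyTorch sketch of the smoothing step described in the Experiment Setup row. The framework choice, the function names (`rescale`, `smooth`), and the dummy batch are illustrative assumptions, not the authors' implementation; only the rescaling range, the perturbation distribution q(x̃|x) = N(x̃|x, σ²I), the σ values, and the Adam learning rate come from the paper.

```python
import torch

def rescale(x):
    # Map pixel intensities from [0, 255] to [-1, 1], per the reported setup.
    return x / 127.5 - 1.0

def smooth(x, sigma):
    # Sample x_tilde ~ q(x_tilde | x) = N(x_tilde | x, sigma^2 I)
    # by adding isotropic Gaussian noise to the rescaled image.
    return x + sigma * torch.randn_like(x)

# Reported noise levels: sigma = 0.5 for MNIST, sigma = 0.3 for CIFAR-10 and CelebA.
x = torch.rand(16, 3, 32, 32) * 255.0   # dummy batch standing in for CIFAR-10 images
x_tilde = smooth(rescale(x), sigma=0.3)

# All models are reportedly trained with Adam at learning rate 0.0002, e.g.:
# optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
```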