Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
Authors: Vivek Jayaram, John Thickstun
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present qualitative and quantitative results of PnF sampling for WaveNet models of audio (van den Oord et al., 2016a) and a PixelCNN++ model of images (Salimans et al., 2017). In Section 4.2 we show that PnF sampling can produce samples of comparable quality to ancestral sampling. In Section 4.3 we show that stochastic PnF sampling is faster than ancestral sampling, when parallelized across a modest number of devices. |
| Researcher Affiliation | Academia | Vivek Jayaram*1, John Thickstun*1. 1Department of Computer Science, University of Washington. Correspondence to: Vivek Jayaram <vjayaram@cs.washington.edu>, John Thickstun <thickstn@cs.washington.edu>. |
| Pseudocode | Yes | Algorithm 1 (Parallel and Flexible Sampling) and Algorithm 2 (Stochastic Parallel and Flexible Sampling) |
| Open Source Code | Yes | Code and examples of PnF sampling are available at: https://grail.cs.washington.edu/projects/pnf-sampling/ |
| Open Datasets | Yes | For audio experiments we use the VCTK dataset (Veaux et al., 2016) consisting of 44 hours of speech, as well as the Supra piano dataset (Shi et al., 2019) consisting of 52 hours of piano recordings. For image experiments we use the CIFAR-10 dataset (Krizhevsky, 2009) with the standard train-test split. |
| Dataset Splits | No | We use a random 80-20 train-test split of VCTK speakers and piano recordings for evaluation. For image experiments we use the CIFAR-10 dataset (Krizhevsky, 2009) with the standard train-test split. Although train and test splits are described, no explicit validation split is reported. |
| Hardware Specification | Yes | This behavior is demonstrated in Figure 3 for spectrogram-conditioned WaveNet stochastic PnF sampling using a cluster of 8 Nvidia Titan Xp GPUs and T = 256. |
| Software Dependencies | No | The paper mentions models such as WaveNet and PixelCNN++, but does not specify software dependencies or version numbers. |
| Experiment Setup | No | The paper states "Additional training and hyperparameter details can be found in the appendix." Since the appendix is not included, and the main text gives no specific hyperparameters such as learning rate, batch size, or optimizer settings, these details are not available. |