Sample as you Infer: Predictive Coding with Langevin Dynamics
Authors: Umais Zahid, Qinghai Guo, Zafeirios Fountas
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare LPC against VAEs by training generative models on benchmark datasets; our experiments demonstrate superior sample quality and faster convergence for LPC in a fraction of SGD training iterations, while matching or exceeding VAE performance across key metrics like FID, diversity and coverage. |
| Researcher Affiliation | Industry | 1Huawei Technologies R&D, London, UK 2Huawei Technologies Co., Ltd., Shenzhen, Guangdong, China. |
| Pseudocode | Yes | Our final preconditioned algorithm with amortised warm-starts is described in Algorithm 1 ("Preconditioned Langevin PC with Amortized Warm-Starts trained with Jeffrey's Divergence"); a hedged sketch of the sampling loop is given below the table. |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | We begin by investigating the performance of our three approximate inference objectives, the forward KL, reverse KL and Jeffrey's divergence, on the quality of our samples when trained with CIFAR-10 (Krizhevsky, 2009), SVHN (Netzer et al., 2011) and CelebA (64x64) (Liu et al., 2015). |
| Dataset Splits | No | The paper mentions using benchmark datasets (CIFAR-10, SVHN, CelebA) and training for a certain number of epochs, but it does not explicitly state the specific training, validation, and test dataset splits (e.g., percentages or sample counts) used for reproducibility. |
| Hardware Specification | Yes | Batch times and end-to-end slowdowns for LPC algorithms as recorded on a single GPU equipped with 24GB of GDDR6X memory, providing approximately 83 teraFLOPS. |
| Software Dependencies | No | The paper mentions 'Optimizer Adam' in Table 3 but does not provide specific version numbers for Adam or any other software components (e.g., programming languages, libraries, or frameworks) used in the experiments. |
| Experiment Setup | Yes | Default hyperparameters used in experiments unless explicitly stated. Note: some of these are varied as part of ablation tests; see main text for more details. Optimizer Adam, Learning Rate (α) 1e-3, Batch size 64, Output Likelihood Discretised Gaussian, Max Sampling Steps (T) 300, Preconditioning Decay Rate (β) 0.99. Optimal learning rates for the VAE were found to be 1e-3, 8e-4 and 1e-3 for CIFAR-10, CelebA and SVHN respectively. For LPC, optimal inference learning rates were found to be 1e-1, 1e-1, and 1e-3 with β equal to 0.25, 0.25 and 0 (no preconditioning) for CIFAR-10, CelebA and SVHN respectively. These values are collected into a small config snippet below the table. |
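
To make the quoted algorithm description concrete, here is a minimal, hedged sketch of a preconditioned Langevin inference loop with an amortized warm start, written in PyTorch. The function name `langevin_infer`, the `encoder`/`decoder` callables, and the unit-Gaussian prior with a plain Gaussian reconstruction term are illustrative assumptions rather than the authors' implementation; only the roles of the inference step size, the preconditioning decay rate β, and the step budget T are taken from the quotes above, and the outer weight-training loop (e.g. against Jeffrey's divergence) is omitted.

```python
# Hedged sketch of preconditioned Langevin inference with an amortized warm start.
# Assumed names: `langevin_infer`, `encoder`, `decoder` (not from the paper's code).
import torch


def langevin_infer(x, encoder, decoder, steps=300, step_size=1e-1, beta=0.25, eps=1e-8):
    """Draw latents for x with preconditioned Langevin dynamics.

    The chain is warm-started from the amortized encoder instead of from noise,
    and each step uses an RMSprop-style diagonal preconditioner with decay `beta`.
    """
    z = encoder(x).detach().requires_grad_(True)   # amortized warm start
    v = torch.zeros_like(z)                        # second-moment estimate for preconditioning

    for _ in range(steps):
        # Energy = negative log joint under a unit-Gaussian prior and a Gaussian
        # reconstruction term (the paper uses a discretised Gaussian likelihood;
        # this is a simplification for the sketch).
        recon = decoder(z)
        energy = 0.5 * ((x - recon) ** 2).sum() + 0.5 * (z ** 2).sum()
        grad, = torch.autograd.grad(energy, z)

        v = beta * v + (1 - beta) * grad ** 2      # decayed gradient second moment
        precond = 1.0 / (v.sqrt() + eps)           # diagonal preconditioner

        # Langevin step: drift down the energy plus preconditioned Gaussian noise.
        # (The correction term for a state-dependent preconditioner is omitted here.)
        noise = torch.randn_like(z)
        with torch.no_grad():
            z = (z - 0.5 * step_size * precond * grad
                 + (step_size * precond).sqrt() * noise)
        z.requires_grad_(True)

    return z.detach()
```

With the per-dataset values quoted in the Experiment Setup row (e.g. `step_size=1e-1`, `beta=0.25` for CIFAR-10), this loop would stand in for the single amortized encoder pass a VAE performs, producing the latent samples that the generative model is then trained on.

For quick reference, the defaults and per-dataset optima transcribed from the Experiment Setup row are collected below as plain Python mappings; the dictionary structure and key names are ours, the values are as quoted.

```python
# Values transcribed from the Experiment Setup row above; structure/keys are illustrative.
DEFAULTS = {
    "optimizer": "Adam",
    "learning_rate": 1e-3,
    "batch_size": 64,
    "output_likelihood": "Discretised Gaussian",
    "max_sampling_steps": 300,          # T
    "preconditioning_decay_rate": 0.99,  # beta (default; varied in ablations)
}

LPC_INFERENCE = {  # optimal inference learning rate and beta per dataset
    "CIFAR-10": {"inference_lr": 1e-1, "beta": 0.25},
    "CelebA":   {"inference_lr": 1e-1, "beta": 0.25},
    "SVHN":     {"inference_lr": 1e-3, "beta": 0.0},  # no preconditioning
}

VAE_LR = {"CIFAR-10": 1e-3, "CelebA": 8e-4, "SVHN": 1e-3}
```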
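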