Sample as you Infer: Predictive Coding with Langevin Dynamics

Authors: Umais Zahid, Qinghai Guo, Zafeirios Fountas

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare LPC against VAEs by training generative models on benchmark datasets; our experiments demonstrate superior sample quality and faster convergence for LPC in a fraction of SGD training iterations, while matching or exceeding VAE performance across key metrics like FID, diversity and coverage.
Researcher Affiliation | Industry | 1 Huawei Technologies R&D, London, UK; 2 Huawei Technologies Co., Ltd., Shenzhen, Guangdong, China.
Pseudocode | Yes | Our final preconditioned algorithm with amortised warm-starts is described in Algorithm 1 (Preconditioned Langevin PC with Amortized Warm-Starts, trained with Jeffrey's divergence); a hedged sketch of such an inference loop follows after this table.
Open Source Code | No | The paper contains no explicit statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | Yes | We begin by investigating the performance of our three approximate inference objectives, the forward KL, reverse KL and Jeffrey's divergence, on the quality of our samples when trained with CIFAR-10 (Krizhevsky, 2009), SVHN (Netzer et al., 2011) and CelebA (64x64) (Liu et al., 2015).
Dataset Splits | No | The paper mentions using benchmark datasets (CIFAR-10, SVHN, CelebA) and training for a set number of epochs, but it does not state the training/validation/test splits (e.g., percentages or sample counts) needed for reproducibility.
Hardware Specification | Yes | Batch times and end-to-end slowdowns for LPC algorithms were recorded on a single GPU with 24 GB of GDDR6X memory, providing approximately 83 teraFLOPS.
Software Dependencies | No | The paper mentions "Optimizer Adam" in Table 3 but does not give version numbers for Adam or any other software components (programming languages, libraries, or frameworks) used in the experiments.
Experiment Setup | Yes | Default hyperparameters (used unless explicitly stated; some are varied in the ablation tests, see main text): Optimizer Adam; learning rate (α) 1e-3; batch size 64; output likelihood discretised Gaussian; max sampling steps (T) 300; preconditioning decay rate (β) 0.99. Optimal VAE learning rates were 1e-3, 8e-4 and 1e-3 for CIFAR-10, CelebA and SVHN respectively. For LPC, optimal inference learning rates were 1e-1, 1e-1 and 1e-3, with β equal to 0.25, 0.25 and 0 (no preconditioning), for CIFAR-10, CelebA and SVHN respectively. A hedged configuration sketch follows after this table.
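For orientation, here is a minimal sketch of what the preconditioned Langevin inference loop behind Algorithm 1 could look like, assuming an RMSProp-style diagonal preconditioner with decay rate β and an amortised encoder that supplies the warm-start. The names `encoder` and `log_joint`, the exact update scaling, and the omission of any drift-correction term for the changing preconditioner are our simplifications, not the authors' implementation.

```python
# Illustrative sketch only: preconditioned Langevin sampling of latents z with
# an amortised warm-start. `encoder` and `log_joint` are hypothetical stand-ins
# for the paper's amortised network and the generative model's log density.
import torch

def langevin_infer(x, encoder, log_joint, steps=300, infer_lr=1e-1,
                   beta=0.99, eps=1e-8):
    """Draw approximate posterior samples z ~ p(z | x) via Langevin dynamics."""
    z = encoder(x).detach().requires_grad_(True)   # amortised warm-start
    v = torch.zeros_like(z)                        # running squared-gradient stats

    for _ in range(steps):                         # T sampling steps per batch
        grad = torch.autograd.grad(log_joint(x, z).sum(), z)[0]
        v = beta * v + (1.0 - beta) * grad.pow(2)  # RMSProp-style accumulator
        precond = 1.0 / (v.sqrt() + eps)           # diagonal preconditioner

        noise = torch.randn_like(z)
        with torch.no_grad():
            # Langevin step: drift along the preconditioned score plus matched
            # Gaussian noise (correction term for the preconditioner omitted).
            z = z + 0.5 * infer_lr * precond * grad \
                + torch.sqrt(infer_lr * precond) * noise
        z.requires_grad_(True)

    return z.detach()
```

In a predictive-coding setup, the returned latents would then drive the weight update (Adam, in the configuration quoted above); the sketch covers only the inference half.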
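As a compact restatement of the Experiment Setup row, a configuration sketch follows; the dictionary layout and key names are our own, and only the numeric values come from the quoted text.

```python
# Values transcribed from the quoted defaults (the paper's Table 3) and the
# reported per-dataset optima; layout and key names are illustrative.
DEFAULTS = {
    "optimizer": "Adam",
    "learning_rate": 1e-3,          # alpha, weight updates
    "batch_size": 64,
    "output_likelihood": "discretised_gaussian",
    "max_sampling_steps": 300,      # T
    "precond_decay_rate": 0.99,     # beta, default
}

# Reported optimal LPC inference settings per dataset.
LPC_INFERENCE = {
    "cifar10":  {"inference_lr": 1e-1, "beta": 0.25},
    "celeba64": {"inference_lr": 1e-1, "beta": 0.25},
    "svhn":     {"inference_lr": 1e-3, "beta": 0.0},  # no preconditioning
}

# Reported optimal VAE learning rates for the baseline comparison.
VAE_LR = {"cifar10": 1e-3, "celeba64": 8e-4, "svhn": 1e-3}
```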