Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Contrastive Divergence for Combining Variational Inference and MCMC
Authors: Francisco Ruiz, Michalis Titsias
ICML 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show experimentally that optimizing the VCD leads to better predictive performance on two latent variable models: logistic matrix factorization and variational autoencoders (VAEs). Here we demonstrate the algorithm described in Section 2.5, which minimizes the variational contrastive divergence (VCD) with respect to the variational parameters θ. |
| Researcher Affiliation | Collaboration | ¹University of Cambridge, Cambridge, UK ²Columbia University, New York, USA ³DeepMind, London, UK. |
| Pseudocode | Yes | Algorithm 1 Minimization of the VCD |
| Open Source Code | Yes | Code is available online at https://github.com/franrruiz/vcd_divergence. |
| Open Datasets | Yes | We use two datasets. The first one is the binarized MNIST data (Salakhutdinov & Murray, 2008), which contains 50,000 training images and 10,000 test images of hand-written digits. The second dataset is Fashion-MNIST (Xiao et al., 2017), which contains 60,000 training images and 10,000 test images of clothing items. |
| Dataset Splits | No | The paper specifies training and test set sizes (e.g., '50,000 training images and 10,000 test images' for MNIST) but does not explicitly state a separate validation set split or how it was used. |
| Hardware Specification | No | The paper states 'No parallelism or GPU acceleration was used.' but does not provide specific details on the CPU, memory, or other hardware components used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for dependencies such as programming languages, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | We set the number of HMC iterations t = 8, using 5 leapfrog steps. We set the learning rate η = 5 × 10⁻⁴ for the variational parameters corresponding to the mean, η = 2.5 × 10⁻⁴ for the variational parameters corresponding to the covariance, and η = 5 × 10⁻⁴ for the model parameters φ. We additionally decrease the learning rate by a factor of 0.9 every 15,000 iterations. We run 400,000 iterations of each optimization algorithm. We perform stochastic VI by subsampling a minibatch of observations at each iteration (Hoffman et al., 2013); we set the minibatch size to 100. |
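
The learning-rate schedule quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the exponents in the quoted rates are negative (5 × 10⁻⁴ and 2.5 × 10⁻⁴), that the decay is a staircase applied after each full block of 15,000 iterations, and that iterations are counted from 0. The names `BASE_LR`, `learning_rate`, and the group labels are hypothetical.

```python
# Values quoted in the Experiment Setup row; exponents assumed negative.
BASE_LR = {
    "mean": 5e-4,          # variational parameters for the mean
    "covariance": 2.5e-4,  # variational parameters for the covariance
    "model": 5e-4,         # model parameters (phi)
}
DECAY_FACTOR = 0.9         # multiplicative decay
DECAY_EVERY = 15_000       # iterations between decays
NUM_ITERATIONS = 400_000   # total optimization iterations
MINIBATCH_SIZE = 100       # observations subsampled per iteration


def learning_rate(group: str, iteration: int) -> float:
    """Staircase decay (an assumption): multiply the base rate by 0.9
    once for every completed block of 15,000 iterations (0-indexed)."""
    num_decays = iteration // DECAY_EVERY
    return BASE_LR[group] * DECAY_FACTOR ** num_decays


if __name__ == "__main__":
    # Print the rate for the mean parameters at a few checkpoints.
    for it in (0, 15_000, 30_000, NUM_ITERATIONS - 1):
        print(it, learning_rate("mean", it))
```

Under these assumptions the rate is decayed 26 times over the 400,000 iterations, since 400,000 // 15,000 = 26. A smooth exponential decay would also match the quoted description, so the staircase form here is only one plausible reading.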