Monte Carlo Variational Auto-Encoders
Authors: Achille Thin, Nikita Kotelevskii, Arnaud Doucet, Alain Durmus, Eric Moulines, Maxim Panov
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply these new methods to build novel Monte Carlo VAEs, and show their efficiency on real-world datasets. |
| Researcher Affiliation | Academia | ¹CMAP, École Polytechnique, Université Paris-Saclay, France ²CDISE, Skolkovo Institute of Science and Technology, Moscow, Russia ³École Normale Supérieure Paris-Saclay, France ⁴HDI Lab, HSE University, Moscow, Russia ⁵University of Oxford. |
| Pseudocode | Yes | Algorithm 1 Langevin Monte Carlo VAE |
| Open Source Code | Yes | The code to reproduce all of the experiments is available online at https://github.com/premolab/metflow/. |
| Open Datasets | Yes | We evaluate our models on three different datasets: MNIST, CIFAR-10 and CelebA. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits with specific details, or detailed splitting methodology) needed to reproduce the data partitioning. It mentions 'held-out loglikelihood' but not how the split was made. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | All the models are implemented using PyTorch (Paszke et al., 2019) and optimized using the Adam optimizer (Kingma & Ba, 2014) for 100 epochs each. The training process is using the PyTorch Lightning toolkit (Falcon, 2019). |
| Experiment Setup | Yes | A crucial hyperparameter of our method is the step size η. In principle, it could be learned by including it as an additional inference parameter φ and by maximizing the ELBO. However, it is then difficult to find a good tradeoff between having a high A/R ratio and a large step size η at the same time. Instead, we suggest adjusting η by targeting a fixed A/R ratio ρ. It has proven effective to use a preconditioned version of (11), i.e. $Z_k = Z_{k-1} + \eta \odot \nabla \log \gamma_k(Z_{k-1}) + \sqrt{2\eta} \odot U_k$ with $\eta \in \mathbb{R}^p$, where we adapt each component of η using the following rule: $\eta^{(i)} = 0.9\,\eta^{(i)} + 0.1\,\eta_0 / (\epsilon + \mathrm{std}[\partial_{z^{(i)}} \log p_\theta(x, z)])$. Here std denotes the standard deviation over the batch x of the quantity $\partial_{z^{(i)}} \log p_\theta(x, z)$, and ϵ > 0. The scalar $\eta_0$ is a tuning parameter which is adjusted to target the A/R ratio ρ. This strategy follows the same heuristics as Adam (Kingma & Ba, 2014). In the following, ρ is set to 0.8 for A-MCVAE and 0.9 for L-MCVAE (keeping it high for L-MCVAE ensures that the Langevin dynamics stays almost reversible, thus keeping a low-variance SIS estimator). An optimal choice of the temperature schedule $\{\beta_k\}_{k=0}^{K}$ for SIS and AIS is a difficult problem. We have focused in our experiments on three different settings. First, we consider the temperature schedule fixed and regularly spaced between 0 and 1. Following (Grosse et al., 2015), the second option is the sigmoidal tempering scheme where $\beta_k = (\tilde{\beta}_k - \tilde{\beta}_1)/(\tilde{\beta}_K - \tilde{\beta}_1)$ with $\tilde{\beta}_k = \sigma(\delta(2k/K - 1))$, where σ is the sigmoid function and δ > 0 is a parameter that we optimize during the training phase. The last schedule consists in learning the temperatures $\{\beta_k\}_{k=0}^{K}$ directly as additional inference parameters φ. |
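
For readers reimplementing the setup quoted above, the following PyTorch sketch illustrates the component-wise step-size adaptation, the preconditioned Langevin update, and the sigmoidal tempering schedule. It is not the authors' released code (see the linked repository); the function names, the `eps` constant, and the normalization of the sigmoidal schedule by its first and last values are assumptions made for illustration.

```python
import torch

def adapt_step_size(eta, grad_log_p, eta0, eps=1e-8):
    """Adaptation rule eta_i <- 0.9*eta_i + 0.1*eta0 / (eps + std[d/dz_i log p(x, z)]).

    eta:        (p,) current component-wise step sizes
    grad_log_p: (batch, p) gradients of log p_theta(x, z) w.r.t. z over the batch
    eta0:       scalar tuned so that the target A/R ratio rho is reached
    """
    grad_std = grad_log_p.std(dim=0)  # standard deviation over the batch, per component
    return 0.9 * eta + 0.1 * eta0 / (eps + grad_std)

def preconditioned_langevin_step(z, grad_log_gamma, eta):
    """Z_k = Z_{k-1} + eta * grad log gamma_k(Z_{k-1}) + sqrt(2*eta) * U_k, component-wise."""
    noise = torch.randn_like(z)
    return z + eta * grad_log_gamma + torch.sqrt(2.0 * eta) * noise

def sigmoidal_schedule(K, delta):
    """Sigmoidal tempering (Grosse et al., 2015): beta_tilde_k = sigmoid(delta*(2k/K - 1)),
    normalized by its first and last values so the schedule runs from 0 to 1."""
    k = torch.arange(K + 1, dtype=torch.float32)
    beta_tilde = torch.sigmoid(delta * (2.0 * k / K - 1.0))
    return (beta_tilde - beta_tilde[0]) / (beta_tilde[-1] - beta_tilde[0])

# Illustrative shapes only: z and grads are (batch, p); eta starts as a (p,) tensor.
# eta   = adapt_step_size(eta, grads, eta0=1e-2)
# z     = preconditioned_langevin_step(z, grads, eta)
# betas = sigmoidal_schedule(K=5, delta=3.0)
```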