Explaining the effects of non-convergent MCMC in the training of Energy-Based Models

Authors: Elisabeth Agoritsas, Giovanni Catania, Aurélien Decelle, Beatriz Seoane

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we test these predictions numerically on a ConvNet EBM and a Boltzmann machine." and "In Fig. 3, we reproduce essentially the same results using the ConvNet from Ref. (Nijkamp et al., 2020) (same architecture and hyperparameters) trained on the CIFAR-10 (Krizhevsky et al., 2009). Here, we use the Fréchet Inception Distance score (Heusel et al., 2017) to evaluate the generation quality at different training times and as a function of sampling time." (A sketch of the FID computation follows the table.)
Researcher Affiliation | Academia | 1 Department of Quantum Matter Physics, University of Geneva, 1211 Geneva, Switzerland; 2 Departamento de Física Teórica, Universidad Complutense de Madrid, 28040 Madrid, Spain; 3 Université Paris-Saclay, CNRS, INRIA Tau team, LISN, 91190 Gif-sur-Yvette, France.
Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating open-source code availability for the described methodology.
Open Datasets | Yes | "trained on the CIFAR-10 (Krizhevsky et al., 2009)."
Dataset Splits | No | The paper does not explicitly provide percentages, sample counts, or citations to predefined training/validation/test splits.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions the ConvNet architecture and sampling methods but does not give version numbers for the software libraries or dependencies used in the experiments.
Experiment Setup | Yes | "In Fig. 3, we reproduce essentially the same results using the ConvNet from Ref. (Nijkamp et al., 2020) (same architecture and hyperparameters) trained on the CIFAR-10 (Krizhevsky et al., 2009)." and "Learning is done with k steps of non-convergent heat-bath Markov chains initialized with random conditions. For k = 5 and for machines obtained at different stages of the learning process... The learning rate is set to γ = 10⁻²." (A sketch of this training scheme follows the table.)
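For reference on the evaluation metric quoted above: FID (Heusel et al., 2017) fits Gaussians to Inception feature activations of real and generated images and measures the Fréchet distance between them. Below is a minimal sketch of that distance, assuming the Inception activations have already been extracted; the function name is illustrative, not from the paper.

```python
import numpy as np
from scipy import linalg


def frechet_distance(act_real: np.ndarray, act_gen: np.ndarray) -> float:
    """Frechet distance between Gaussian fits to two activation sets.

    act_real, act_gen: (n_samples, n_features) Inception activations.
    FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^{1/2})
    """
    mu_r, mu_g = act_real.mean(axis=0), act_gen.mean(axis=0)
    cov_r = np.cov(act_real, rowvar=False)
    cov_g = np.cov(act_gen, rowvar=False)

    # Matrix square root of the covariance product; numerical error can
    # introduce tiny imaginary components, which we discard.
    covmean = linalg.sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))
```

In the setting of the paper's Fig. 3, such a score would be computed repeatedly, at different training times and as a function of sampling time, to track generation quality.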
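The Experiment Setup row describes training with k non-convergent heat-bath steps from random initial conditions. Below is a minimal sketch of that scheme for a Bernoulli Boltzmann machine (an RBM), assuming heat-bath sampling means block Gibbs over the two layers; class and variable names are illustrative, with k = 5 and learning rate 10⁻² taken from the quote.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Bernoulli RBM trained with k non-convergent heat-bath chains."""

    def __init__(self, n_vis, n_hid, lr=1e-2):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)  # visible biases
        self.c = np.zeros(n_hid)  # hidden biases
        self.lr = lr

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def train_step(self, v_data, k=5):
        # Positive phase: hidden statistics clamped to the data.
        ph_data, _ = self.sample_h(v_data)

        # Negative phase: chains start from RANDOM configurations and
        # run only k heat-bath sweeps, so they do not reach equilibrium.
        v = (rng.random(v_data.shape) < 0.5).astype(float)
        for _ in range(k):
            _, h = self.sample_h(v)
            _, v = self.sample_v(h)
        ph_model, _ = self.sample_h(v)

        # Approximate likelihood gradient from the two phases.
        n = v_data.shape[0]
        self.W += self.lr * (v_data.T @ ph_data - v.T @ ph_model) / n
        self.b += self.lr * (v_data - v).mean(axis=0)
        self.c += self.lr * (ph_data - ph_model).mean(axis=0)
```

One call to train_step per minibatch gives a single gradient update; the short, randomly initialized chains are exactly what makes the MCMC non-convergent in the sense the paper studies.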