Explaining the effects of non-convergent MCMC in the training of Energy-Based Models
Authors: Elisabeth Agoritsas, Giovanni Catania, Aurélien Decelle, Beatriz Seoane
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we test these predictions numerically on a ConvNet EBM and a Boltzmann machine." and "In Fig. 3, we reproduce essentially the same results using the ConvNet from Ref. (Nijkamp et al., 2020) (same architecture and hyperparameters) trained on CIFAR-10 (Krizhevsky et al., 2009). Here, we use the Fréchet Inception Distance score (Heusel et al., 2017) to evaluate the generation quality at different training times and as a function of sampling time." |
| Researcher Affiliation | Academia | 1 Department of Quantum Matter Physics, University of Geneva, 1211 Geneva, Switzerland; 2 Departamento de Física Teórica, Universidad Complutense de Madrid, 28040 Madrid, Spain; 3 Université Paris-Saclay, CNRS, INRIA TAU team, LISN, 91190 Gif-sur-Yvette, France |
| Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology. |
| Open Datasets | Yes | trained on the CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or clear citations to predefined splits for training, validation, and test datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used for running experiments. |
| Software Dependencies | No | The paper mentions Conv Net architecture and sampling methods but does not provide specific version numbers for software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | "In Fig. 3, we reproduce essentially the same results using the ConvNet from Ref. (Nijkamp et al., 2020) (same architecture and hyperparameters) trained on CIFAR-10 (Krizhevsky et al., 2009)." and "Learning is done with k steps of non-convergent heat-bath Markov chains initialized with random conditions. For k = 5 and for machines obtained at different stages of the learning process... The learning rate is set to γ = 10⁻²." |
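The training scheme quoted in the "Experiment Setup" row — k non-convergent heat-bath (block Gibbs) steps from randomly initialized chains, with learning rate γ = 10⁻² — can be sketched on a toy restricted Boltzmann machine. This is an illustrative reconstruction, not the authors' code: the class, helper names, and the toy data are ours, and the RBM here stands in for the Boltzmann machine and ConvNet EBM used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Toy restricted Boltzmann machine trained with k-step
    non-convergent heat-bath (block Gibbs) chains from random inits.
    Hypothetical sketch of the setup described in the paper."""

    def __init__(self, n_vis, n_hid, lr=1e-2):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.a = np.zeros(n_vis)   # visible biases
        self.b = np.zeros(n_hid)   # hidden biases
        self.lr = lr               # gamma = 1e-2, matching the quoted value

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.b)
        return (rng.random(p.shape) < p).astype(float), p

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.a)
        return (rng.random(p.shape) < p).astype(float), p

    def train_step(self, data, k=5):
        # Positive phase: hidden activations with visibles clamped to data.
        _, ph_data = self.sample_h(data)
        # Negative phase: k heat-bath sweeps from RANDOM initial chains,
        # deliberately too short for the chains to equilibrate.
        v = (rng.random(data.shape) < 0.5).astype(float)
        for _ in range(k):
            h, _ = self.sample_h(v)
            v, _ = self.sample_v(h)
        _, ph_model = self.sample_h(v)
        # Stochastic gradient of the (truncated-chain) log-likelihood.
        n = data.shape[0]
        self.W += self.lr * (data.T @ ph_data - v.T @ ph_model) / n
        self.a += self.lr * (data - v).mean(axis=0)
        self.b += self.lr * (ph_data - ph_model).mean(axis=0)


# Toy run on random binary data, k = 5 as in the quoted setup.
data = (rng.random((64, 16)) < 0.5).astype(float)
rbm = RBM(n_vis=16, n_hid=8)
for _ in range(20):
    rbm.train_step(data, k=5)
```

Because the chains never reach equilibrium, the model trained this way encodes the sampling dynamics rather than the equilibrium Gibbs measure — which is the effect the paper analyzes.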