Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model

Authors: Erik Nijkamp, Mitch Hill, Song-Chun Zhu, Ying Nian Wu

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5, Experimental Results: In this section, we will demonstrate (1) realistic synthesis, (2) smooth interpolation, (3) faithful reconstruction of observed examples, and (4) the influence of hyperparameters. K denotes the number of MCMC steps in equation (4). nf denotes the number of output feature maps in the first layer of fθ. See Appendix for additional results. We emphasize the simplicity of the algorithm and models; see Appendices 7.3 and 7.4, respectively. Section 5.1, Fidelity: We evaluate the fidelity of generated examples on various datasets, each reduced to 40,000 observed examples. Figure 6 depicts generated samples for various datasets with K = 100 Langevin steps for both training and evaluation. For CIFAR-10 we set the number of features nf = 128, whereas for CelebA and LSUN we use nf = 64. We use 200,000 iterations of model updates, then gradually decrease the learning rate η and injected noise ε_i ∼ N(0, σ²I) for observed examples. Table 1 (a) compares the Inception Score (IS) [45, 4] and Fréchet Inception Distance (FID) [20] with the Inception v3 classifier [47] on 40,000 generated examples. Despite its simplicity, short-run MCMC is competitive. (A code sketch of this short-run Langevin training procedure appears after the table.)
Researcher Affiliation | Academia | Erik Nijkamp, UCLA Department of Statistics (enijkamp@ucla.edu); Mitch Hill, UCLA Department of Statistics (mkhill@ucla.edu); Song-Chun Zhu, UCLA Department of Statistics (sczhu@stat.ucla.edu); Ying Nian Wu, UCLA Department of Statistics (ywu@ucla.edu)
Pseudocode | Yes | "Algorithm 1: Learning short-run MCMC. See code in Appendix 7.3."
Open Source Code | Yes | "The code can be found in the Appendix." and "Algorithm 1: Learning short-run MCMC. See code in Appendix 7.3."
Open Datasets | Yes | "Figure 1: Synthesis by short-run MCMC: Generating synthesized examples by running 100 steps of Langevin dynamics initialized from uniform noise for CelebA (64 × 64)." and "Table 1: Quality of synthesis and reconstruction for CIFAR-10 (32 × 32), CelebA (64 × 64), and LSUN Bedroom (64 × 64)." These are well-known public datasets.
Dataset Splits | No | The paper mentions using 40,000 observed examples for evaluating fidelity and 1,000 held-out observed examples for reconstruction, but it does not specify complete training/validation/test splits (e.g., percentages, explicit counts per split, or how these relate to each dataset's total size), nor does it cite predefined splits for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions ADAM [26] as the optimizer but does not give version numbers for any key software components or libraries (e.g., Python, TensorFlow, or PyTorch) that would be needed for replication.
Experiment Setup | Yes | Section 5.1: For CIFAR-10 we set the number of features nf = 128, whereas for CelebA and LSUN we use nf = 64. We use 200,000 iterations of model updates, then gradually decrease the learning rate η and injected noise ε_i ∼ N(0, σ²I) for observed examples. Algorithm 1 specifies the training steps T, initial weights θ_0, observed examples {x_i}, i = 1, …, n, batch size m, variance of the noise σ², Langevin discretization τ and number of steps K, and learning rate η. Tables 2 and 3 also provide specific values for K, σ, and nf.
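The Pseudocode and Experiment Setup rows above jointly describe the training procedure: synthesized examples are generated by K steps of Langevin dynamics (equation (4)) initialized from uniform noise, observed examples are perturbed with noise ε_i ∼ N(0, σ²I), and the parameters of fθ are updated with ADAM for 200,000 iterations. Below is a minimal PyTorch-style sketch of that loop, not the authors' released code (which is in Appendix 7.3 of the paper): the network f, the batch iterator, the image range, and the default values of τ, σ, and the learning rate are illustrative assumptions, and the sign convention assumes p_θ(x) ∝ exp(fθ(x)).

    # Hedged sketch of Algorithm 1 (learning short-run MCMC); not the authors' code.
    # Assumed: p_theta(x) ∝ exp(f_theta(x)); images scaled to [-1, 1]; f is any
    # energy ConvNet returning one scalar per example; the tau/sigma/lr defaults
    # are placeholders for illustration, not values reported in the paper.
    import torch

    def short_run_langevin(f, x, k=100, tau=0.01):
        """K steps of Langevin dynamics (equation (4)), started from noise."""
        for _ in range(k):
            x = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(f(x).sum(), x)[0]             # ∇_x f_θ(x)
            x = x + 0.5 * tau ** 2 * grad + tau * torch.randn_like(x)
        return x.detach()

    def train(f, batches, steps=200_000, k=100, tau=0.01, sigma=0.03, lr=1e-4):
        """Algorithm 1: update θ with ADAM on the observed-vs-synthesized energy gap."""
        opt = torch.optim.Adam(f.parameters(), lr=lr)
        for t in range(steps):
            x_obs = next(batches)                                    # observed mini-batch of size m
            x_obs = x_obs + sigma * torch.randn_like(x_obs)          # inject ε_i ~ N(0, σ²I)
            x_init = 2 * torch.rand_like(x_obs) - 1                  # non-persistent: fresh uniform noise
            x_syn = short_run_langevin(f, x_init, k=k, tau=tau)      # short-run MCMC, K steps
            loss = f(x_syn).mean() - f(x_obs).mean()                 # minimizing this ascends data log-likelihood
            opt.zero_grad()
            loss.backward()
            opt.step()

Because the chains are non-persistent, each parameter update restarts its MCMC from fresh uniform noise rather than continuing earlier chains; the quoted Figure 1 caption describes the same 100-step Langevin sampler being used at evaluation time.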