Learning Non-Convergent Non-Persistent Short-Run MCMC Toward Energy-Based Model

Authors: Erik Nijkamp, Mitch Hill, Song-Chun Zhu, Ying Nian Wu

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5, Experimental Results: In this section, we will demonstrate (1) realistic synthesis, (2) smooth interpolation, (3) faithful reconstruction of observed examples, and (4) the influence of hyperparameters. K denotes the number of MCMC steps in equation (4). nf denotes the number of output feature maps in the first layer of fθ. See Appendix for additional results. We emphasize the simplicity of the algorithm and models; see Appendices 7.3 and 7.4, respectively. Section 5.1, Fidelity: We evaluate the fidelity of generated examples on various datasets, each reduced to 40,000 observed examples. Figure 6 depicts generated samples for various datasets with K = 100 Langevin steps for both training and evaluation. For CIFAR-10 we set the number of features nf = 128, whereas for CelebA and LSUN we use nf = 64. We use 200,000 iterations of model updates, then gradually decrease the learning rate η and injected noise ε_i ∼ N(0, σ²I) for observed examples. Table 1 (a) compares the Inception Score (IS) [45, 4] and Fréchet Inception Distance (FID) [20] with the Inception v3 classifier [47] on 40,000 generated examples. Despite its simplicity, short-run MCMC is competitive. (A code sketch of this short-run Langevin training procedure appears after the table.)
Researcher Affiliation | Academia | Erik Nijkamp, UCLA Department of Statistics (enijkamp@ucla.edu); Mitch Hill, UCLA Department of Statistics (mkhill@ucla.edu); Song-Chun Zhu, UCLA Department of Statistics (sczhu@stat.ucla.edu); Ying Nian Wu, UCLA Department of Statistics (ywu@ucla.edu)
Pseudocode | Yes | "Algorithm 1: Learning short-run MCMC. See code in Appendix 7.3."
Open Source Code | Yes | "The code can be found in the Appendix." and "Algorithm 1: Learning short-run MCMC. See code in Appendix 7.3."
Open Datasets | Yes | "Figure 1: Synthesis by short-run MCMC: Generating synthesized examples by running 100 steps of Langevin dynamics initialized from uniform noise for CelebA (64 × 64)." and "Table 1: Quality of synthesis and reconstruction for CIFAR-10 (32 × 32), CelebA (64 × 64), and LSUN Bedroom (64 × 64)." These are well-known public datasets.
Dataset Splits | No | The paper mentions using 40,000 observed examples for evaluating fidelity and 1,000 held-out observed examples for reconstruction, but it does not specify complete training/validation/test splits (e.g., percentages, explicit counts per split, or how these relate to each dataset's total size), nor does it cite predefined splits for reproducibility.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions ADAM [26] as the optimizer but does not give version numbers for any key software components or libraries (e.g., Python, TensorFlow, or PyTorch) that would be needed for replication.
Experiment Setup | Yes | Section 5.1: For CIFAR-10 we set the number of features nf = 128, whereas for CelebA and LSUN we use nf = 64. We use 200,000 iterations of model updates, then gradually decrease the learning rate η and injected noise ε_i ∼ N(0, σ²I) for observed examples. Algorithm 1 specifies the training steps T, initial weights θ_0, observed examples {x_i}, i = 1, …, n, batch size m, variance of the noise σ², Langevin discretization τ and number of steps K, and learning rate η. Tables 2 and 3 also provide specific values for K, σ, and nf.
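The Pseudocode and Experiment Setup rows above jointly describe the training procedure: synthesized examples are generated by K steps of Langevin dynamics (equation (4)) initialized from uniform noise, observed examples are perturbed with noise ε_i ∼ N(0, σ²I), and the parameters of fθ are updated with ADAM for 200,000 iterations. Below is a minimal PyTorch-style sketch of that loop, not the authors' released code (which is in Appendix 7.3 of the paper): the network f, the batch iterator, the image range, and the default values of τ, σ, and the learning rate are illustrative assumptions, and the sign convention assumes p_θ(x) ∝ exp(fθ(x)).

    # Hedged sketch of Algorithm 1 (learning short-run MCMC); not the authors' code.
    # Assumed: p_theta(x) ∝ exp(f_theta(x)); images scaled to [-1, 1]; f is any
    # energy ConvNet returning one scalar per example; the tau/sigma/lr defaults
    # are placeholders for illustration, not values reported in the paper.
    import torch

    def short_run_langevin(f, x, k=100, tau=0.01):
        """K steps of Langevin dynamics (equation (4)), started from noise."""
        for _ in range(k):
            x = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(f(x).sum(), x)[0]             # ∇_x f_θ(x)
            x = x + 0.5 * tau ** 2 * grad + tau * torch.randn_like(x)
        return x.detach()

    def train(f, batches, steps=200_000, k=100, tau=0.01, sigma=0.03, lr=1e-4):
        """Algorithm 1: update θ with ADAM on the observed-vs-synthesized energy gap."""
        opt = torch.optim.Adam(f.parameters(), lr=lr)
        for t in range(steps):
            x_obs = next(batches)                                    # observed mini-batch of size m
            x_obs = x_obs + sigma * torch.randn_like(x_obs)          # inject ε_i ~ N(0, σ²I)
            x_init = 2 * torch.rand_like(x_obs) - 1                  # non-persistent: fresh uniform noise
            x_syn = short_run_langevin(f, x_init, k=k, tau=tau)      # short-run MCMC, K steps
            loss = f(x_syn).mean() - f(x_obs).mean()                 # minimizing this ascends data log-likelihood
            opt.zero_grad()
            loss.backward()
            opt.step()

Because the chains are non-persistent, each parameter update restarts its MCMC from fresh uniform noise rather than continuing earlier chains; the quoted Figure 1 caption describes the same 100-step Langevin sampler being used at evaluation time.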