Bounds all around: training energy-based models with bidirectional bounds
Authors: Cong Geng, Jia Wang, Zhiyong Gao, Jes Frellsen, Søren Hauberg
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the bounds we develop a new and efficient estimator of the Jacobi-determinant of the EBM generator. We demonstrate that these developments stabilize training and yield high-quality density estimation and sample generation. Experimentally, we demonstrate that our approach matches or surpasses state-of-the-art on diverse tasks at negligible performance increase. |
| Researcher Affiliation | Academia | Cong Geng, Jia Wang, Zhiyong Gao (Shanghai Jiao Tong University) {gengcong, jiawang, zhiyong.gao}@sjtu.edu.cn; Jes Frellsen, Søren Hauberg (Technical University of Denmark) {jefr, sohau}@dtu.dk |
| Pseudocode | No | The paper describes algorithms (e.g., LOBPCG) and methods, but does not present them in a pseudocode block or a clearly labeled algorithm format. |
| Open Source Code | Yes | Our implementation is available at https://github.com/gengcong940126/EBM-BB. |
| Open Datasets | Yes | We train our generative model on the Stacked MNIST dataset... on MNIST (Le Cun, 1998)... on the standard benchmark 32x32 CIFAR-10 (Krizhevsky et al., 2009) dataset and the 64x64 cropped ANIMEFACE dataset... |
| Dataset Splits | No | The paper mentions training and test sets and performs cross-validation for some metrics, but it does not explicitly state the split percentages or sample counts for training, validation, and test sets across all experiments. |
| Hardware Specification | Yes | All experiments are conducted on a single 12GB NVIDIA Titan GPU using a pytorch (Paszke et al., 2017) implementation. All models were trained on a single 12GB Titan GPU. |
| Software Dependencies | Yes | All experiments are conducted on a single 12GB NVIDIA Titan GPU using a pytorch (Paszke et al., 2017) implementation. |
| Experiment Setup | Yes | On toy data and MNIST, all models are realized with multi-layer perceptrons (MLPs), while for natural images, we use the convolutional architecture from StudioGAN (Kang and Park, 2020). Details of the network architecture are given in the supplementary material. To improve training, we use a positive margin $\zeta$ for our energy function to balance the bounds. Specifically, we let $\tilde{\mathcal{L}}(\theta) = \mathcal{L}(\theta) + M\,\mathbb{E}_{x \sim p_g(x)}\big[\lVert \nabla_x E_\theta(x) + \nabla_x \log p_g(x) \rVert_p - \zeta\big]_+$, where $[\cdot]_+ = \max(0, \cdot)$ is the usual hinge. In most experiments, $\zeta = 1$ as this allows us to apply the simplified bound in Equation (15). But in general we recommend starting with $\zeta = 0$ and trying successively larger values to stabilize training. (A code sketch of this margin penalty follows the table.) |
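
The Experiment Setup row quotes a hinge-margin penalty on the score difference. The minimal PyTorch sketch below shows how such a term could be computed; `energy_net` (the energy E_θ) and `log_pg` (the generator log-density, which the paper obtains via its Jacobi-determinant estimator) are hypothetical callables, so this is an illustration under those assumptions rather than the authors' implementation.

```python
import torch

def margin_penalty(energy_net, log_pg, x, zeta=1.0, p=2):
    """Hinge-margin penalty on the score difference
    || grad_x E_theta(x) + grad_x log p_g(x) ||_p, averaged over samples x ~ p_g.

    `energy_net` and `log_pg` are hypothetical callables returning per-sample
    energies and generator log-densities; the paper's Jacobi-determinant
    estimator for log p_g is not reproduced here."""
    x = x.detach().requires_grad_(True)
    # gradient of the energy with respect to the input
    grad_E = torch.autograd.grad(energy_net(x).sum(), x, create_graph=True)[0]
    # gradient of the generator log-density with respect to the input
    grad_logpg = torch.autograd.grad(log_pg(x).sum(), x, create_graph=True)[0]
    score_diff = (grad_E + grad_logpg).flatten(start_dim=1)
    norm = score_diff.norm(p=p, dim=1)
    # [ ||.||_p - zeta ]_+  with the usual hinge [.]_+ = max(0, .)
    return torch.clamp(norm - zeta, min=0.0).mean()

# Usage (schematic): total_loss = base_loss + M * margin_penalty(E, log_pg, x_fake, zeta=1.0)
```

Following the quoted recommendation, the penalty would be added to the base loss with weight M, using ζ = 1 by default and starting from ζ = 0 when tuning for stability.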
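
The Research Type and Pseudocode rows mention an LOBPCG-based estimator of the Jacobi determinant of the EBM generator, for which the paper gives no pseudocode. As a rough illustration of the underlying linear-algebra step only, the sketch below applies `torch.lobpcg` to an explicitly materialized Jacobian of a small generator (assumed here to map a single latent vector in R^{d_z} to a flattened sample in R^{d_x}); the paper's estimator is designed to scale to real networks, so this toy version should not be read as the authors' method.

```python
import torch

def extremal_sq_singular_values(generator, z):
    """Largest and smallest eigenvalues of J^T J (squared singular values of the
    generator Jacobian at a single latent z) via torch.lobpcg.
    Explicit-Jacobian sketch only."""
    # J has shape (d_x, d_z) for generator: R^{d_z} -> R^{d_x}; assumes d_z >= 3
    J = torch.autograd.functional.jacobian(generator, z)
    JtJ = J.T @ J  # symmetric positive semi-definite, (d_z, d_z)
    lam_max, _ = torch.lobpcg(JtJ, k=1, largest=True)
    lam_max = lam_max.max()
    # spectral shift: the largest eigenvalue of (lam_max * I - J^T J)
    # equals lam_max - lam_min, which recovers the smallest eigenvalue
    eye = torch.eye(JtJ.shape[0], dtype=JtJ.dtype, device=JtJ.device)
    shifted_max, _ = torch.lobpcg(lam_max * eye - JtJ, k=1, largest=True)
    lam_min = lam_max - shifted_max.max()
    return lam_max, lam_min
```

Since log det(JᵀJ) is the sum of the log-eigenvalues, these extremes give the elementary bracket d_z · log λ_min ≤ log det(JᵀJ) ≤ d_z · log λ_max.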