Training Deep Energy-Based Models with f-Divergence Minimization

Authors: Lantao Yu, Yang Song, Jiaming Song, Stefano Ermon

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate the superiority of f-EBM over contrastive divergence, as well as the benefits of training EBMs using f-divergences other than KL. |
| Researcher Affiliation | Academia | Department of Computer Science, Stanford University, Stanford, CA 94305 USA. |
| Pseudocode | Yes | Algorithm 1 Single-Step f-EBM. See code implementation in Appendix E.1. |
| Open Source Code | Yes | Our implementation of f-EBM can be found at: https://github.com/ermongroup/f-EBM |
| Open Datasets | Yes | We conduct experiments with two commonly used image datasets, CelebA (Liu et al., 2015) and CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | We conduct experiments with two commonly used image datasets, CelebA (Liu et al., 2015) and CIFAR-10 (Krizhevsky et al., 2009). These are standard benchmark datasets with well-defined splits, so the train/test partitions are implicitly specified. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with specific version numbers. |
| Experiment Setup | Yes | Since the performance is sensitive to the model architectures, for fair comparisons, we use the same architecture and training hyper-parameters for f-EBMs and the contrastive divergence baseline (Du & Mordatch, 2019). See Appendix G.9 for more details. |
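For context on the training setup referenced in the Pseudocode and Experiment Setup rows, the sketch below shows the general structure shared by energy-based training loops of this kind, in the style of the contrastive divergence baseline (Du & Mordatch, 2019): an energy network is updated using data samples and approximate model samples drawn by short-run Langevin dynamics. This is a minimal, hypothetical sketch, not the paper's f-EBM algorithm, which additionally trains an auxiliary variational function in a minimax fashion (Algorithm 1 in the paper). All names (EnergyNet, langevin_sample, cd_style_step), the MLP architecture, and the hyper-parameters are illustrative assumptions, and regularizers used in practice are omitted.

```python
import torch
import torch.nn as nn


class EnergyNet(nn.Module):
    # Toy MLP energy function E_theta(x) -> scalar per example (illustrative architecture only;
    # the paper's experiments use convolutional energy networks on images).
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)


def langevin_sample(energy, x_init, steps=30, step_size=0.01, noise_scale=0.005):
    # Short-run Langevin dynamics: approximate samples from q_theta(x) proportional to exp(-E_theta(x)).
    x = x_init.clone().detach()
    for _ in range(steps):
        x.requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = (x - step_size * grad + noise_scale * torch.randn_like(x)).detach()
    return x


def cd_style_step(energy, optimizer, x_data):
    # One contrastive-divergence-style update: lower the energy of data, raise it on model samples.
    x_model = langevin_sample(energy, torch.randn_like(x_data))
    loss = energy(x_data).mean() - energy(x_model).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim = 32 * 32 * 3                      # flattened toy input size (stand-in for a CIFAR-10 image)
    energy = EnergyNet(dim)
    optimizer = torch.optim.Adam(energy.parameters(), lr=1e-4)
    x_data = torch.randn(16, dim)          # placeholder batch; real experiments use CIFAR-10 / CelebA
    print(cd_style_step(energy, optimizer, x_data))
```

The f-EBM algorithm replaces the simple contrastive loss above with an f-divergence-based minimax objective over the energy network and an auxiliary network; the exact objective and update rules are given in Algorithm 1 and the released implementation at https://github.com/ermongroup/f-EBM.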