Guiding Energy-based Models via Contrastive Latent Variables

Authors: Hankook Lee, Jongheon Jeong, Sejun Park, Jinwoo Shin

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of the proposed framework through extensive experiments. For example, our EBM achieves 8.61 FID under unconditional CIFAR-10 generation, which is lower than those of existing EBM models." and "4 EXPERIMENTS: We verify the effectiveness of our Contrastive Latent-guided Energy Learning (CLEL) framework under various scenarios: (a) unconditional generation (Section 4.1), (b) out-of-distribution detection (Section 4.2), (c) conditional sampling (Section 4.3), and (d) compositional sampling (Section 4.4)."
Researcher Affiliation | Collaboration | Hankook Lee (LG AI Research), Jongheon Jeong (KAIST), Sejun Park (Korea University), Jinwoo Shin (KAIST); hankook.lee@lgresearch.ai, {jongheonj, jinwoos}@kaist.ac.kr, sejun.park000@gmail.com
Pseudocode | Yes | Appendix A ("Training Procedure of CLEL") provides Algorithm 1: Contrastive Latent-guided Energy Learning (CLEL); see the training-step sketch after this table.
Open Source Code | Yes | The code is available at https://github.com/hankook/CLEL.
Open Datasets | Yes | "To this end, we train our CLEL framework on CIFAR-10 (Krizhevsky et al., 2009) and ImageNet 32×32 (Deng et al., 2009; Chrabaszcz et al., 2017) under the unsupervised setting."
Dataset Splits | No | The paper describes training parameters like iterations and batch size but does not explicitly specify how the dataset was split into training, validation, and test sets, or mention a dedicated validation set.
Hardware Specification | Yes | e.g., "we use single RTX3090 GPU only" and "training time and GPU memory footprint on single RTX3090 GPU of 24G memory".
Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and architectural components (ResNet, SimCLR) but does not provide specific version numbers for software libraries, programming languages, or other dependencies needed to replicate the experiment environment.
Experiment Setup | Yes | "For the EBM parameter θ, we use the Adam optimizer (Kingma & Ba, 2015) with β1 = 0, β2 = 0.999, and a learning rate of 10^-4. We use a linear learning-rate warmup for the first 2k training iterations. For the encoder parameter ϕ, we use the SGD optimizer with a learning rate of 3×10^-2, a weight decay of 5×10^-4, and a momentum of 0.9, as described in Chen & He (2020). For all experiments, we train our models up to 100k iterations with a batch size of 64, unless otherwise stated. For hyperparameters, we use α = 1 following Du & Mordatch (2019), and β = 0.01." See the configuration sketch after this table.
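
For context on the Pseudocode row, Algorithm 1 (Appendix A) describes the CLEL training procedure. Below is a minimal, assumption-laden sketch of what one such training step could look like in PyTorch; the helpers `sgld_sample` and `latent_head`, the negative-sample initialization, and the exact form of the latent-guidance loss are illustrative guesses rather than the authors' released implementation (see the GitHub link above for that).

```python
import torch
import torch.nn.functional as F


def sgld_sample(energy_model, x_init, n_steps=20, step_size=1.0, noise_std=0.01):
    """Short-run Langevin sampling from the EBM (step count and sizes are illustrative)."""
    x = x_init.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        grad, = torch.autograd.grad(energy_model(x).sum(), x)
        x = (x - 0.5 * step_size * grad
             + noise_std * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()


def clel_training_step(energy_model, latent_head, encoder, x_real, opt_theta,
                       alpha=1.0, beta=0.01):
    """One hypothetical CLEL step: a contrastive-divergence-style EBM loss plus a
    latent-guidance term weighted by beta; alpha weights the energy-magnitude
    regularizer (following Du & Mordatch, 2019)."""
    # Negative samples via Langevin dynamics from a uniform initialization (assumed).
    x_neg = sgld_sample(energy_model, torch.rand_like(x_real))

    e_real, e_neg = energy_model(x_real), energy_model(x_neg)
    loss_ebm = (e_real - e_neg).mean() + alpha * (e_real ** 2 + e_neg ** 2).mean()

    # Latent guidance: pull the EBM's predicted latents toward the (frozen)
    # contrastive encoder's latents; the exact alignment loss is an assumption.
    with torch.no_grad():
        z_target = F.normalize(encoder(x_real), dim=-1)
    z_pred = F.normalize(latent_head(x_real), dim=-1)
    loss_guide = -(z_pred * z_target).sum(dim=-1).mean()

    loss = loss_ebm + beta * loss_guide
    opt_theta.zero_grad()
    loss.backward()
    opt_theta.step()
    # The encoder itself is trained separately with a SimCLR-style contrastive
    # objective (omitted here).
    return loss.item()
```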
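
The Experiment Setup row maps directly onto optimizer configuration. The following is a minimal sketch of those settings, assuming PyTorch; the placeholder `ebm`/`encoder` modules and the choice of `LinearLR` for the warmup are assumptions, while the numeric hyperparameters are taken from the quoted text.

```python
import torch
import torch.nn as nn

# Placeholder networks; the paper uses a ResNet-style EBM and a SimCLR-style encoder.
ebm = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

# Values quoted from the paper's setup.
LR_EBM, LR_ENC = 1e-4, 3e-2          # Adam for theta, SGD for phi
WEIGHT_DECAY, MOMENTUM = 5e-4, 0.9   # encoder (phi) optimizer settings
WARMUP_ITERS, TOTAL_ITERS, BATCH_SIZE = 2_000, 100_000, 64
ALPHA, BETA = 1.0, 0.01              # loss weights

opt_theta = torch.optim.Adam(ebm.parameters(), lr=LR_EBM, betas=(0.0, 0.999))
opt_phi = torch.optim.SGD(encoder.parameters(), lr=LR_ENC,
                          weight_decay=WEIGHT_DECAY, momentum=MOMENTUM)

# Linear warmup of the EBM learning rate over the first 2k iterations
# (the specific scheduler class is an assumed choice).
warmup = torch.optim.lr_scheduler.LinearLR(
    opt_theta, start_factor=1e-3, end_factor=1.0, total_iters=WARMUP_ITERS)
```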