Guiding Energy-based Models via Contrastive Latent Variables
Authors: Hankook Lee, Jongheon Jeong, Sejun Park, Jinwoo Shin
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of the proposed framework through extensive experiments. For example, our EBM achieves 8.61 FID under unconditional CIFAR-10 generation, which is lower than those of existing EBM models. Section 4 (Experiments): We verify the effectiveness of our Contrastive Latent-guided Energy Learning (CLEL) framework under various scenarios: (a) unconditional generation (Section 4.1), (b) out-of-distribution detection (Section 4.2), (c) conditional sampling (Section 4.3), and (d) compositional sampling (Section 4.4). |
| Researcher Affiliation | Collaboration | Hankook Lee (LG AI Research), Jongheon Jeong (KAIST), Sejun Park (Korea University), Jinwoo Shin (KAIST); hankook.lee@lgresearch.ai, {jongheonj, jinwoos}@kaist.ac.kr, sejun.park000@gmail.com |
| Pseudocode | Yes | Appendix A (Training Procedure of CLEL): Algorithm 1, Contrastive Latent-guided Energy Learning (CLEL) |
| Open Source Code | Yes | The code is available at https://github.com/hankook/CLEL. |
| Open Datasets | Yes | To this end, we train our CLEL framework on CIFAR-10 (Krizhevsky et al., 2009) and ImageNet 32×32 (Deng et al., 2009; Chrabaszcz et al., 2017) under the unsupervised setting. |
| Dataset Splits | No | The paper describes training parameters like iterations and batch size but does not explicitly specify how the dataset was split into training, validation, and test sets, or mention a dedicated validation set. |
| Hardware Specification | Yes | e.g., we use a single RTX3090 GPU only; training time and GPU memory footprint are reported on a single RTX3090 GPU with 24GB of memory. |
| Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and architectural components (ResNet, SimCLR) but does not provide specific version numbers for software libraries, programming languages, or other dependencies needed to replicate the experiment environment. |
| Experiment Setup | Yes | For the EBM parameter θ, we use the Adam optimizer (Kingma & Ba, 2015) with β1 = 0, β2 = 0.999, and a learning rate of 10⁻⁴. We use a linear learning rate warmup for the first 2k training iterations. For the encoder parameter ϕ, we use the SGD optimizer with a learning rate of 3×10⁻², a weight decay of 5×10⁻⁴, and a momentum of 0.9, as described in Chen & He (2020). For all experiments, we train our models up to 100k iterations with a batch size of 64, unless otherwise stated. For hyperparameters, we use α = 1, following Du & Mordatch (2019), and β = 0.01. (A hedged optimizer sketch follows the table.) |
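
The snippet below is a minimal PyTorch sketch of the optimizer and hyperparameter settings quoted in the Experiment Setup row, assuming a standard PyTorch training pipeline. It is not the authors' released CLEL code: `ebm` and `encoder` are hypothetical placeholder modules, and only the Adam/SGD settings, the 2k-iteration linear warmup, the 100k-iteration budget, the batch size of 64, and the α/β weights come from the quoted text.

```python
# Hypothetical reconstruction of the quoted training setup (not the authors' code).
import torch
import torch.nn as nn

# Placeholder modules standing in for the EBM and the contrastive latent encoder.
ebm = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))

# EBM parameters: Adam with beta1 = 0, beta2 = 0.999, learning rate 1e-4.
ebm_opt = torch.optim.Adam(ebm.parameters(), lr=1e-4, betas=(0.0, 0.999))

# Encoder parameters: SGD with lr = 3e-2, weight decay = 5e-4, momentum = 0.9.
enc_opt = torch.optim.SGD(encoder.parameters(), lr=3e-2,
                          weight_decay=5e-4, momentum=0.9)

# Linear learning-rate warmup over the first 2k of the 100k training iterations.
warmup_iters, total_iters = 2_000, 100_000
ebm_sched = torch.optim.lr_scheduler.LambdaLR(
    ebm_opt, lambda it: min(1.0, (it + 1) / warmup_iters)
)

# Loss weights and batch size quoted in the setup.
alpha, beta = 1.0, 0.01
batch_size = 64
```

In a training loop, `ebm_sched.step()` would be called once per iteration so that the EBM learning rate ramps linearly from near zero to 1e-4 over the first 2k steps and then stays constant, matching the warmup schedule described in the quote.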