Improving Adversarial Energy-Based Model via Diffusion Process

Authors: Cong Geng, Tian Han, Peng-Tao Jiang, Hao Zhang, Jinwei Chen, Søren Hauberg, Bo Li

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show significant improvement in generation compared to existing adversarial EBMs, while also providing a useful energy function for efficient density estimation. From Section 4 (Experiments): We evaluate our DDAEBM in different scenarios across different data scales, from 2-dimensional toy datasets to large-scale image datasets. We test our energy function mainly on toy datasets and MNIST, which are easy to visualize and intuitive to measure. For large-scale datasets, we focus on image generation. We further perform additional studies such as out-of-distribution (OOD) detection and ablation studies to verify our model's superiority.
Researcher Affiliation | Collaboration | (1) vivo Mobile Communication Co., Ltd, China; (2) Department of Computer Science, Stevens Institute of Technology, USA; (3) Technical University of Denmark, Copenhagen, Denmark.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about the release of open-source code or a direct link to a code repository for the methodology described.
Open Datasets | Yes | We evaluate our DDAEBM in different scenarios across different data scales, from 2-dimensional toy datasets to large-scale image datasets. We test our energy function mainly on toy datasets and MNIST, which are easy to visualize and intuitive to measure. For large-scale datasets, we focus on image generation. We further perform additional studies such as out-of-distribution (OOD) detection and ablation studies to verify our model's superiority. ... training on 32×32 CIFAR-10 (Krizhevsky et al., 2009), 64×64 CelebA (Liu et al., 2015), and 128×128 LSUN church (Yu et al., 2015) datasets. ... SVHN (Netzer et al., 2011), Texture (Cimpoi et al., 2014), CIFAR-100 (Krizhevsky et al., 2009), and CelebA. (A dataset-loading sketch follows the table.)
Dataset Splits | No | The paper mentions training epochs and test sets, but does not provide specific details about validation dataset splits (e.g., percentages, counts, or methodology for creation).
Hardware Specification | No | The paper mentions '4 GPUs' in Table 10 (hyper-parameters for training optimization), but it does not specify GPU models, CPU models, memory, or any other concrete hardware configuration used to run the experiments.
Software Dependencies | Yes | We use PyTorch 1.10.0 and CUDA 11.3 for training. (An environment-check sketch follows the table.)
Experiment Setup | Yes | We specify the hyperparameters used for our generators and training optimization on each dataset in Table 9 and Table 10. Table 9 lists hyper-parameters for the generator network on CIFAR-10, CelebA, and LSUN church: # of ResNet blocks per scale, initial # of channels, channel multiplier, scale of attention block, latent dimension, # of latent mapping layers, and latent embedding dimension. Table 10 lists hyper-parameters for training optimization on MNIST, CIFAR-10, CelebA, and LSUN church: initial learning rate, βmin and βmax in Eq. (47), w and wmid in Eq. (21), Adam β1 and β2, EMA, batch size, # of training epochs, and # of GPUs. (A configuration skeleton follows the table.)
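
The image datasets named in the Open Datasets row are all publicly available. As a minimal sketch, not taken from the paper and with root paths and transform choices assumed for illustration, they could be loaded with torchvision at the resolutions the paper reports:

```python
# Minimal sketch (not from the paper): loading the public datasets cited in the
# Open Datasets row with torchvision, resized to the resolutions the paper reports.
# Root paths and transform details are assumptions for illustration only.
from torchvision import datasets, transforms

def make_transform(size):
    # Resize and center-crop to the target resolution, then convert to a tensor.
    return transforms.Compose([
        transforms.Resize(size),
        transforms.CenterCrop(size),
        transforms.ToTensor(),
    ])

cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                           transform=make_transform(32))
celeba = datasets.CelebA(root="./data", split="train", download=True,
                         transform=make_transform(64))
# LSUN has no automatic download; the lmdb files must be fetched separately.
lsun_church = datasets.LSUN(root="./data/lsun", classes=["church_outdoor_train"],
                            transform=make_transform(128))
```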
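
The Software Dependencies row pins PyTorch 1.10.0 and CUDA 11.3. A short sanity check such as the one below, which is not part of the paper, can confirm that a reproduction environment matches those versions before training:

```python
# Environment check (not from the paper): verify the PyTorch / CUDA versions
# reported in the Software Dependencies row before attempting a reproduction.
import torch

assert torch.__version__.startswith("1.10"), f"expected PyTorch 1.10.x, got {torch.__version__}"
assert torch.version.cuda == "11.3", f"expected CUDA 11.3, got {torch.version.cuda}"
print("GPUs visible:", torch.cuda.device_count())  # Table 10 reports training on 4 GPUs
```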
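
Tables 9 and 10 are only referenced by name here, so their values are not available. Purely as an illustrative sketch, the hyper-parameter fields they enumerate could be collected in per-dataset configurations along the following lines; every numeric value below is a placeholder, not a setting reported by the paper (only the '4 GPUs' figure appears in the text).

```python
# Illustrative skeleton (values are placeholders, NOT the paper's settings):
# the hyper-parameter fields enumerated in Table 9 (generator network) and
# Table 10 (training optimization), organized as per-dataset configs.
from dataclasses import dataclass

@dataclass
class GeneratorConfig:          # fields listed in Table 9
    resnet_blocks_per_scale: int
    initial_channels: int
    channel_multiplier: int
    attention_scale: int
    latent_dim: int
    latent_mapping_layers: int
    latent_embedding_dim: int

@dataclass
class TrainingConfig:           # fields listed in Table 10
    initial_lr: float
    beta_min: float             # βmin in Eq. (47)
    beta_max: float             # βmax in Eq. (47)
    w: float                    # w in Eq. (21)
    w_mid: float                # wmid in Eq. (21)
    adam_beta1: float
    adam_beta2: float
    ema_decay: float
    batch_size: int
    epochs: int
    num_gpus: int

# Hypothetical CIFAR-10 instance; all values except num_gpus are invented.
cifar10_train = TrainingConfig(initial_lr=1e-4, beta_min=0.1, beta_max=20.0,
                               w=1.0, w_mid=1.0, adam_beta1=0.0, adam_beta2=0.9,
                               ema_decay=0.999, batch_size=64, epochs=200,
                               num_gpus=4)
```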