Learning Latent Space Hierarchical EBM Diffusion Models

Authors: Jiali Cui, Tian Han

ICML 2024

Reproducibility assessment. Each entry lists the variable, the result, and the LLM's supporting response:
Research Type: Experimental
"Our extensive experiments demonstrate the superior performance of our diffusion-learned EBM prior on various challenging tasks. We assess our model on the standard benchmark CIFAR-10 as well as the challenging high-resolution CelebA-HQ-256 and large-scale LSUN-Church-64. We use the Fréchet Inception Distance (FID) and Inception Score (IS) metrics to evaluate the quality of image synthesis. We report our results in Tab. 1 and Tab. 2, together with the FID score of the reconstructed images. More quantitative and qualitative results can be found in the ablation studies and App. B."
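The evaluation protocol quoted above can be illustrated with a short sketch. The snippet below computes FID and IS with torchmetrics; the library choice, tensor shapes, and random placeholder images are assumptions for illustration, since the paper does not describe its evaluation tooling.

```python
# Hedged sketch: FID and IS evaluation with torchmetrics.
# Library choice and placeholder data are assumptions, not the paper's setup.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.image.inception import InceptionScore

fid = FrechetInceptionDistance(feature=2048)  # Inception-v3 pool features
inception = InceptionScore()

# uint8 tensors of shape (N, 3, H, W) in [0, 255]; random placeholders here
real_images = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)
fake_images = torch.randint(0, 256, (64, 3, 32, 32), dtype=torch.uint8)

fid.update(real_images, real=True)
fid.update(fake_images, real=False)
inception.update(fake_images)

print("FID:", fid.compute().item())
is_mean, is_std = inception.compute()
print("IS : %.3f +/- %.3f" % (is_mean.item(), is_std.item()))
```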
Researcher Affiliation: Academia
"Department of Computer Science, Stevens Institute of Technology. Correspondence to: Tian Han <than6@stevens.edu>."
Pseudocode: Yes
"Algorithm 1: Learning EBM parameter ω; Algorithm 2: Sampling and Image Synthesis"
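Since only the algorithm titles are reported, a hedged Python sketch of what Algorithm 1 plausibly involves is given below: a standard maximum-likelihood EBM update whose negative samples come from K steps of Langevin dynamics on the prior. The energy network, latent dimension, step size, and the assumption that posterior samples are given are all illustrative choices, not the paper's exact procedure.

```python
# Hedged sketch of learning an EBM prior f_w(z) by maximum likelihood.
# Architecture and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

energy = nn.Sequential(nn.Linear(128, 256), nn.SiLU(), nn.Linear(256, 1))  # f_w(z)
opt = torch.optim.Adam(energy.parameters(), lr=1e-4)

def langevin_sample(z, K=50, step=0.1):
    """Draw prior samples with K Langevin steps on the energy landscape."""
    z = z.clone().requires_grad_(True)
    for _ in range(K):
        grad = torch.autograd.grad(energy(z).sum(), z)[0]
        z = z - 0.5 * step ** 2 * grad + step * torch.randn_like(z)
        z = z.detach().requires_grad_(True)
    return z.detach()

def update_energy(z_posterior):
    """One MLE step: lower energy on posterior samples, raise it on prior samples."""
    z_prior = langevin_sample(torch.randn_like(z_posterior))
    loss = energy(z_posterior).mean() - energy(z_prior).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```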
Open Source Code: Yes
"Our project page is available at https://jcui1224.github.io/diffusion-hierarchical-ebm-proj/."
Open Datasets: Yes
"We assess our model on the standard benchmark CIFAR-10 as well as the challenging high-resolution CelebA-HQ-256 and large-scale LSUN-Church-64."
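For the public benchmarks, a minimal loading sketch with torchvision is shown below; torchvision is an assumed toolchain, as the paper does not state how the data were prepared, and CelebA-HQ-256 has no torchvision loader.

```python
# Hedged sketch: loading two of the three benchmarks with torchvision.
# The preprocessing pipeline is an assumption, not the paper's recipe.
from torchvision import datasets, transforms

cifar10 = datasets.CIFAR10(root="./data", train=True, download=True,
                           transform=transforms.ToTensor())

lsun_church = datasets.LSUN(root="./data/lsun", classes=["church_outdoor_train"],
                            transform=transforms.Compose([
                                transforms.Resize(64),
                                transforms.CenterCrop(64),
                                transforms.ToTensor(),
                            ]))
# CelebA-HQ-256 is not bundled with torchvision and must be obtained separately.
```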
Dataset Splits: No
The paper mentions the CIFAR-10, CelebA-HQ-256, and LSUN-Church-64 datasets but does not explicitly state training, validation, and test splits (e.g., percentages or per-split counts).
Hardware Specification: No
No specific hardware (GPU models, CPU models, memory) used for the experiments is mentioned in the paper.
Software Dependencies: No
The paper does not specify any software dependencies with version numbers.
Experiment Setup: Yes
"We conduct such experiments to demonstrate the smooth energy landscape learned for our EBM prior. Diffusion step T: first, we train our diffusion-based EBM prior with more diffusion steps, e.g., T = 6. Langevin step K: by using more Langevin steps, we should explore the energy landscape better and obtain more effective EBM samples for learning. We show our results in Tab. 3, where using 50 steps (denoted as K = 50) delivers better synthesis than using 30 steps, while using 100 steps shows only a minor improvement but costs much more training and sampling time."
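The K trade-off described in this ablation can be probed with a toy sweep, reusing the langevin_sample function from the earlier sketch; the timings it prints are illustrative placeholders, not the paper's measurements in Tab. 3.

```python
# Hedged sketch: sweep the Langevin step count K and time each setting.
# Reuses langevin_sample from the sketch above; numbers are illustrative.
import time
import torch

for K in (30, 50, 100):
    z0 = torch.randn(64, 128)  # batch of initial latents (assumed dimension)
    t0 = time.time()
    _ = langevin_sample(z0, K=K)
    print(f"K={K:3d}: {time.time() - t0:.2f}s per batch of prior samples")
```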