Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models
Authors: Sangwoong Yoon, Himchan Hwang, Dohyun Kwon, Yung-Kyun Noh, Frank Park
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental results demonstrating the effectiveness of DxMI in training diffusion models and EBMs. On image generation tasks, DxMI can train strong short-run diffusion models that generate samples in 4 or 10 neural network evaluations. Also, DxMI can be used to train strong energy-based anomaly detectors. |
| Researcher Affiliation | Collaboration | Sangwoong Yoon1, Himchan Hwang2, Dohyun Kwon1,3, Yung-Kyun Noh1,4, Frank C. Park2,5 — 1Korea Institute for Advanced Study, 2Seoul National University, 3University of Seoul, 4Hanyang University, 5Saige Research |
| Pseudocode | Yes | Algorithm 1 Diffusion by Maximum Entropy IRL; Algorithm 2 Diffusion by Maximum Entropy IRL for Image Generation |
| Open Source Code | Yes | The code for DxMI can be found in https://github.com/swyoon/Diffusion-by-MaxEntIRL.git. |
| Open Datasets | Yes | On image generation tasks, we show that DxMI can be used to fine-tune a diffusion model with reduced generation steps, such as T = 4 or 10. We test DxMI on unconditional CIFAR-10 [52] (32 × 32), conditional ImageNet [53] downsampled to 64 × 64, and LSUN Bedroom [54] (256 × 256), using three diffusion model backbones: DDPM [3], DDGAN [46], and the variance-exploding version of EDM [50]. |
| Dataset Splits | Yes | On image generation tasks, we show that DxMI can be used to fine-tune a diffusion model with reduced generation steps, such as T = 4 or 10. We test DxMI on unconditional CIFAR-10 [52] (32 × 32), conditional ImageNet [53] downsampled to 64 × 64, and LSUN Bedroom [54] (256 × 256)... When computing FID, the whole 50,000 training images of CIFAR-10 are used. To select the best model, we periodically generate 10,000 images for CIFAR-10 and 5,000 images for ImageNet. The checkpoint with the best FID score is selected as the final model. For ImageNet, we use the batch stat file provided by https://github.com/openai/guided-diffusion. For MVTec-AD, "The training dataset contains normal object images from 15 categories without any labels. The test set consists of both normal and defective object images..." |
| Hardware Specification | Yes | In practice, our CIFAR-10 experiment completes in under 24 hours on two A100 GPUs, while the ImageNet 64 experiment takes approximately 48 hours on four A100 GPUs. |
| Software Dependencies | No | The paper mentions optimizers (Adam, RAdam) and mixed precision training but does not provide specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python version). |
| Experiment Setup | Yes | We set τ1 = 0.1 and τ2 = 0.01. The sigmoid time cost is used for all image generation experiments. For all runs, we use a batch size of 128. In the CIFAR-10 experiments, we use the Adam optimizer with a learning rate of 10^-7 for the sampler weights, 10^-5 for the value weights, and 10^-5 for the σ_t's. |
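For reference, the reported CIFAR-10 hyperparameters can be gathered into a single configuration. This is a minimal Python sketch: the dictionary and its key names are illustrative placeholders, not identifiers from the authors' released code.

```python
# Hedged sketch: the CIFAR-10 settings quoted above, as a plain config dict.
# Key names are hypothetical; only the values come from the paper's excerpt.
CIFAR10_CONFIG = {
    "tau1": 0.1,          # temperature τ1
    "tau2": 0.01,         # temperature τ2
    "time_cost": "sigmoid",
    "batch_size": 128,
    "optimizer": "Adam",
    "lr_sampler": 1e-7,   # learning rate for sampler weights
    "lr_value": 1e-5,     # learning rate for value weights
    "lr_sigma": 1e-5,     # learning rate for the σ_t's
}
```

Note that the sampler uses a learning rate two orders of magnitude smaller than the value network and noise scales, which is consistent with fine-tuning a pretrained diffusion backbone rather than training it from scratch.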