Understanding the Generalization of Pretrained Diffusion Models on Out-of-Distribution Data
Authors: Sai Niranjan Ramachandran, Rudrabha Mukhopadhyay, Madhav Agarwal, C.V. Jawahar, Vinay Namboodiri
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct thorough empirical evaluations to support and validate our claims. |
| Researcher Affiliation | Academia | Sai Niranjan Ramachandran¹*, Rudrabha Mukhopadhyay²*, Madhav Agarwal²*, C.V. Jawahar², Vinay Namboodiri³; ¹Indian Institute of Science, Bangalore; ²International Institute of Information Technology, Hyderabad; ³University of Bath |
| Pseudocode | Yes | Algorithm 1: Geodesic Interpolation. Data: I_S, I_T ∈ ℝ^{H×W}. Result: I_int. z_S ← Invert(I_S); z_T ← Invert(I_T); φ ← cos⁻¹(⟨z_S, z_T⟩); α ← Uni([0, 1]); z_int ← (sin((1−α)φ)/sin(φ)) z_S + (sin(αφ)/sin(φ)) z_T; I_int ← Recon(z_int) |
| Open Source Code | No | Please find more details about our project at http://cvit.iiit.ac.in/research/projects/cvit-projects/diffusionOOD |
| Open Datasets | Yes | For models trained on the FFHQ dataset (Karras et al. 2020), we add the dataset with images matching FFHQ's preprocessing criteria, including those from CelebA (Liu et al. 2015). In contrast, diverse datasets like ImageNet (Deng et al. 2009) and ImageNet Sketches (Wang et al. 2019) register PAD values between 1.98 and 2.28, marking them as ideal OOD examples. |
| Dataset Splits | No | The paper mentions using specific datasets for training and testing (e.g., FFHQ, LHQ-256) but does not provide specific train/validation/test dataset splits, percentages, or absolute sample counts. |
| Hardware Specification | No | The paper describes the experimental setup in terms of models and datasets but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for training or inference. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and an MLP network, but does not provide specific version numbers for software dependencies or libraries such as Python, PyTorch, TensorFlow, or CUDA. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba 2014) with a learning rate of 10⁻³. Significant improvement is achieved through spherical regularization (Menon et al. 2020), which projects the network output back onto the sphere of our latent space. This approach, feasible due to the known manifold geometry of any diffusion space, ensures consistent latent space optimization, thereby preserving quality and accelerating convergence. We employ a standard L1 loss between the predicted and ground-truth latents, which suffices without additional image space losses. The models are trained and tested on the LHQ-256 (Skorokhodov, Sotnikov, and Elhoseiny 2021) dataset. As measured by the PSNR and SSIM metrics, evaluation results are summarized in Table 4. |
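The geodesic interpolation in the Pseudocode row is spherical linear interpolation (slerp) between two inverted latents. A minimal NumPy sketch of that step is below; the paper's `Invert` and `Recon` operations (diffusion inversion and reconstruction) are elided, and the function name `slerp` is our own label, not from the paper.

```python
import numpy as np

def slerp(z_s, z_t, alpha):
    """Spherical linear interpolation between two latents, following the
    formula in Algorithm 1 (Geodesic Interpolation):
    z_int = sin((1-a)phi)/sin(phi) * z_s + sin(a*phi)/sin(phi) * z_t."""
    z_s = np.asarray(z_s, dtype=np.float64)
    z_t = np.asarray(z_t, dtype=np.float64)
    # Angle between the latents; clip guards against rounding outside [-1, 1].
    cos_phi = np.dot(z_s.ravel(), z_t.ravel()) / (
        np.linalg.norm(z_s) * np.linalg.norm(z_t))
    phi = np.arccos(np.clip(cos_phi, -1.0, 1.0))
    # Note: degenerate when z_s and z_t are (anti)parallel, since sin(phi) = 0.
    return (np.sin((1.0 - alpha) * phi) * z_s
            + np.sin(alpha * phi) * z_t) / np.sin(phi)

# For orthogonal unit vectors, the midpoint stays on the unit sphere:
slerp(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5)
# → approximately [0.7071, 0.7071]
```

Unlike linear interpolation, this keeps the interpolant on the latent sphere, which matches the paper's point that the manifold geometry of the diffusion latent space is known.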
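The spherical regularization mentioned in the Experiment Setup row can be sketched as a renormalization of the network's predicted latent back onto the sphere, paired with the stated L1 loss. The helper names and the radius argument here are illustrative assumptions; the paper only states that the output is projected back onto the latent sphere.

```python
import numpy as np

def spherical_regularization(z_pred, radius):
    """Project a predicted latent back onto the sphere of the diffusion
    latent space. The radius is assumed known from the manifold geometry
    (e.g. high-dimensional Gaussian latents concentrate near sqrt(d))."""
    return radius * z_pred / np.linalg.norm(z_pred)

def l1_loss(z_pred, z_true):
    """Standard L1 loss between predicted and ground-truth latents."""
    return np.mean(np.abs(z_pred - z_true))
```

In training, this projection would be applied to the MLP output before the L1 loss, so optimization (Adam, learning rate 10⁻³ per the paper) stays on the latent sphere throughout.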