Locating What You Need: Towards Adapting Diffusion Models to OOD Concepts In-the-Wild
Authors: Jianan Yang, Chenchao Gao, Zhiqing Xiao, Junbo Zhao, Sai Wu, Gang Chen, Haobo Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive results show that CATOD significantly outperforms the prior approaches with an 11.10 boost on the CLIP score and a 33.08% decrease on the CMMD metric. |
| Researcher Affiliation | Academia | (1) College of Computer Science and Technology, Zhejiang University; (2) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; (3) School of Software Technology, Zhejiang University; (4) International School of Information Science and Engineering, Dalian University of Technology |
| Pseudocode | Yes | An overall algorithm for a cycle of CATOD is provided in Alg. 1 ("Algorithm 1: An active selection cycle for CATOD"). |
| Open Source Code | Yes | The source code is attached in the Supplementary. |
| Open Datasets | Yes | This dataset consists of 5 categories: insect, lizard, penguin, seafish, and snake, and each category contains data from 5 OOD concepts. Each concept has 1,000 examples in total with 100 samples left out for validation. The dataset is collected from publicly available datasets including ImageNet, iNaturalist 2018 [67], IP102 [71]. |
| Dataset Splits | Yes | Each concept has 1,000 examples in total with 100 samples left out for validation. |
| Hardware Specification | No | The paper mentions using Stable Diffusion 2.0 pre-trained on LAION-5B, but does not specify the hardware (e.g., GPU models, CPU models, memory) used to run their experiments. |
| Software Dependencies | No | The paper mentions using Stable Diffusion 2.0 but does not provide version numbers for any software dependencies, libraries, or programming languages. |
| Experiment Setup | Yes | Each experiment starts with 20 randomly sampled instances, and we conducted 5 cycles of data accumulation in which we selected 20 good samples to add to the training pool. We train 20 epochs for all combinations of adaption techniques and sampling strategies in each active learning cycle, with a batch size of 1. Furthermore, we generate 100 images for each concept for evaluation. We use the commonly adopted Stable Diffusion 2.0 pre-trained on LAION-5B [51], following Rombach's work [45]. |
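
To make the Pseudocode row concrete, below is a minimal Python sketch of an active selection cycle consistent with Alg. 1 as summarized above. `finetune` and `score_sample` are hypothetical placeholders, not names from the authors' supplementary code; the hyperparameters mirror the Experiment Setup row (20 seed instances, 5 accumulation cycles, 20 selections per cycle, 20 epochs, batch size 1).

```python
import random

def active_selection(candidate_pool, score_sample, finetune,
                     n_init=20, n_cycles=5, n_per_cycle=20, epochs=20):
    # Seed the training pool with 20 randomly sampled instances.
    train_pool = random.sample(candidate_pool, n_init)
    model = finetune(train_pool, epochs=epochs, batch_size=1)
    for _ in range(n_cycles):  # 5 cycles of data accumulation
        remaining = [x for x in candidate_pool if x not in train_pool]
        # Rank the remaining candidates with the quality scorer and keep
        # the 20 best "good samples" for the next round of adaptation.
        ranked = sorted(remaining, key=lambda x: score_sample(model, x),
                        reverse=True)
        train_pool += ranked[:n_per_cycle]
        model = finetune(train_pool, epochs=epochs, batch_size=1)
    return model, train_pool
```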
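
The Experiment Setup row names Stable Diffusion 2.0 but, per the Software Dependencies row, no library versions. One plausible loading-and-generation sketch with Hugging Face `diffusers` follows; the `stabilityai/stable-diffusion-2` checkpoint id and the prompt template are assumptions, since the paper pins neither.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the publicly released Stable Diffusion 2.0 weights (assumed checkpoint id).
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2", torch_dtype=torch.float16
).to("cuda")

# Generate the 100 evaluation images per concept described in the setup;
# "<ood-concept>" stands in for the adapted concept token.
images = [pipe("a photo of a <ood-concept>").images[0] for _ in range(100)]
```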
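
Finally, the headline result in the Research Type row cites a CLIP-score boost. As a reference point for reproduction, here is one standard way to compute a CLIP score between a generated image and its concept prompt, using the `transformers` CLIP implementation; the ViT-B/32 checkpoint and the 100x scaling are conventional choices, not necessarily the paper's exact evaluation protocol.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_score(image: Image.Image, prompt: str) -> float:
    """Cosine similarity between image and text embeddings, scaled by 100."""
    inputs = processor(text=[prompt], images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return 100.0 * (img * txt).sum(dim=-1).item()
```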