Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques
Authors: Benyuan Meng, Qianqian Xu, Zitai Wang, Zhiyong Yang, Xiaochun Cao, Qingming Huang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Experimental Validation |
| Researcher Affiliation | Academia | Benyuan Meng (1,2), Qianqian Xu (3,4), Zitai Wang (3), Zhiyong Yang (5), Xiaochun Cao (6), Qingming Huang (5,3,7). (1) Institute of Information Engineering, CAS; (2) School of Cyber Security, University of Chinese Academy of Sciences; (3) Key Lab. of Intelligent Information Processing, Institute of Computing Technology, CAS; (4) Peng Cheng Laboratory; (5) School of Computer Science and Tech., University of Chinese Academy of Sciences; (6) School of Cyber Science and Tech., Shenzhen Campus of Sun Yat-sen University; (7) Key Laboratory of Big Data Mining and Knowledge Management, CAS |
| Pseudocode | No | The paper describes procedures and guidelines but does not include formal pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Our code is available at this url. |
| Open Datasets | Yes | SPair-71k [26] dataset, label-scarce semantic segmentation using Bedroom-28 [54] and Horse-21 [54] datasets, and standard semantic segmentation using ADE20K [61] and Cityscapes [5] datasets. ... SPair-71k: Available at https://cvlab.postech.ac.kr/research/SPair-71k/. Horse-21 and Bedroom-28 (LSUN): Available at https://github.com/fyu/lsun. ADE20K: Custom license (research-only, non-commercial), at https://groups.csail.mit.edu/vision/datasets/ADE20K/terms/. Cityscapes: Custom license, at https://www.cityscapes-dataset.com/license/. |
| Dataset Splits | No | The paper mentions using training data and running experiments on 'different splits' for some datasets (Horse-21 and Bedroom-28), but does not provide specific percentages or counts for training, validation, and test splits for the main datasets used in the experiments. |
| Hardware Specification | Yes | We use Nvidia(R) RTX 3090 and Nvidia(R) RTX 4090 GPUs for the experiments, all with 24GB VRAM. |
| Software Dependencies | No | For the diffusion model to extract features, we choose Stable Diffusion v1.5 [34] to be consistent with SOTA competitors. ... Fortunately, the community has offered many handy tools for ordinary users, such as kohya_ss, which can be directly utilized for our purpose. |
| Experiment Setup | Yes | All tasks extract features at t = 50. When ControlNet is applied, except for standard semantic segmentation, we additionally start multi-step denoising from t = 60. ... γ1 and γ2, the hyper-parameters for the proposed regularization terms, are tuned for each dataset. |
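
The timestep rule quoted in the Experiment Setup row can be written down as a small lookup. The following is a minimal sketch, not the authors' code: the names `ExtractionSchedule`, `extraction_schedule`, and the task string `"standard_semantic_segmentation"` are hypothetical and chosen only for illustration. The values t = 50 and t = 60 come from the quoted text; γ1 and γ2 are omitted because the source gives no values for them.

```python
from dataclasses import dataclass
from typing import Optional

# Values quoted in the Experiment Setup row.
FEATURE_TIMESTEP = 50           # all tasks extract features at t = 50
CONTROLNET_DENOISE_START = 60   # multi-step denoising starts at t = 60

# Hypothetical task label; the paper does not define such an identifier.
STANDARD_SEGMENTATION = "standard_semantic_segmentation"


@dataclass(frozen=True)
class ExtractionSchedule:
    """Timesteps used for feature extraction (illustrative container)."""
    feature_timestep: int
    denoise_start: Optional[int]  # None means no multi-step denoising


def extraction_schedule(task: str, use_controlnet: bool) -> ExtractionSchedule:
    """Return the schedule implied by the quoted setup description.

    Multi-step denoising from t = 60 is used only when ControlNet is
    applied and the task is not standard semantic segmentation.
    """
    multi_step = use_controlnet and task != STANDARD_SEGMENTATION
    return ExtractionSchedule(
        feature_timestep=FEATURE_TIMESTEP,
        denoise_start=CONTROLNET_DENOISE_START if multi_step else None,
    )
```

For example, `extraction_schedule("semantic_correspondence", use_controlnet=True)` yields a schedule with `denoise_start=60`, whereas the same call for standard semantic segmentation leaves `denoise_start` as `None`.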