Inserting Anybody in Diffusion Models via Celeb Basis

Authors: Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on the self-collected 2K synthetic facial images generated by StyleGAN [2].
Researcher Affiliation | Collaboration | 1 School of Computer Science and Engineering, Sun Yat-sen University; 2 Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; 3 Guangdong Key Laboratory of Information Security Technology; 4 Tencent AI Lab; 5 The Hong Kong University of Science and Technology
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | Yes | Project page is at: http://celeb-basis.github.io. Code is at: https://github.com/ygtxr1997/CelebBasis.
Open Datasets | Yes | We conduct experiments on the self-collected 2K synthetic facial images generated by StyleGAN [2].
Dataset Splits | No | The paper describes a single-shot personalization method where coefficients are optimized from a single facial photo, rather than training on predefined dataset splits. The 2K synthetic images are used as inputs for evaluation.
Hardware Specification | Yes | We train the MLP with a learning rate of 0.005 and batch size of 2 on a single NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions various models and tools used (e.g., Stable Diffusion, CLIP, ArcFace), but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | We train the MLP with a learning rate of 0.005 and batch size of 2 on a single NVIDIA A100 GPU. The training augmentation includes horizontal flip, color jitter, and random scaling ranging in 0.1–1.0. For single identity training, the optimization costs 400 steps, taking 3 minutes. For 10 identities joint training, we found that training 2,500 steps is enough, taking 18 minutes (averaged about 250 steps and 108 seconds for each identity).
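
The reported experiment setup can be collected into a small configuration record, which also makes the per-identity averages easy to check. The following is a minimal sketch, not the authors' code: the class name `TrainSetup`, its field names, the augmentation labels, and the helper `per_identity_cost` are all hypothetical and chosen for illustration; only the numeric values come from the setup quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    """Hypothetical container for the hyperparameters quoted above."""
    learning_rate: float = 0.005          # MLP learning rate
    batch_size: int = 2
    # Augmentation labels are illustrative names, not library identifiers.
    augmentations: tuple = ("horizontal_flip", "color_jitter", "random_scale")
    scale_range: tuple = (0.1, 1.0)       # random scaling range
    single_identity_steps: int = 400      # about 3 minutes on one A100
    joint_steps: int = 2500               # joint training of 10 identities
    joint_identities: int = 10
    joint_minutes: float = 18.0


def per_identity_cost(setup: TrainSetup) -> tuple:
    """Return (steps, seconds) averaged per identity for joint training."""
    steps = setup.joint_steps / setup.joint_identities
    seconds = setup.joint_minutes * 60 / setup.joint_identities
    return steps, seconds


setup = TrainSetup()
steps, seconds = per_identity_cost(setup)
print(f"{steps} steps, {seconds} seconds per identity")  # 250.0 steps, 108.0 seconds
```

Dividing the joint budget by the number of identities reproduces the averages stated in the setup (about 250 steps and 108 seconds per identity), compared with 400 steps and roughly 180 seconds when a single identity is trained alone.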