PHOTOSWAP: Personalized Subject Swapping in Images
Authors: Jing Gu, Yilin Wang, Nanxuan Zhao, Tsu-Jui Fu, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments underscore the efficacy and controllability of Photoswap in personalized subject swapping. Furthermore, Photoswap significantly outperforms baseline methods in human ratings across subject swapping, background preservation, and overall quality, revealing its vast application potential, from entertainment to professional editing. |
| Researcher Affiliation | Collaboration | Jing Gu¹, Yilin Wang², Nanxuan Zhao², Tsu-Jui Fu³, Wei Xiong², Qing Liu², Zhifei Zhang², He Zhang², Jianming Zhang², Hyun Joon Jung², Xin Eric Wang¹ (¹University of California, Santa Cruz; ²Adobe; ³University of California, Santa Barbara) |
| Pseudocode | Yes | Algorithm 1 The Photoswap Algorithm (a minimal sketch of its attention-swap schedule follows the table) |
| Open Source Code | No | The paper provides a project website 'https://photoswap.github.io/' but does not explicitly state that the source code for the methodology is available there or at a specific repository link. The URL points to a project overview page without a direct link to a code repository. |
| Open Datasets | No | For real images, the paper states: 'All prompts, along with the collected image, will be made available in our next revision.' For synthetic images, it says: 'All prompts used in synthetic image generation will also be released too.' This indicates future availability of data rather than current public access to the full datasets used. |
| Dataset Splits | No | The paper mentions generating images for evaluation and sampling 200 images from real and synthetic datasets for human evaluation, but it does not specify explicit train, validation, and test dataset splits with percentages or sample counts for the models used. |
| Hardware Specification | Yes | The DreamBooth training takes around 10 minutes on a machine with 8 A100 GPU cards. |
| Software Dependencies | Yes | For concept learning, we mainly utilize DreamBooth (Ruiz et al., 2023) to fine-tune Stable Diffusion 2.1 to learn the new concept from 3–5 images (see the fine-tuning sketch after the table). |
| Experiment Setup | Yes | During inference, we utilize the DDIM sampling method with 50 denoising steps and classifier-free guidance of 7.5. The default step λA for cross-attention map replacement is 20, the default step λM for self-attention map replacement is 25, and the default step λϕ for self-attention feature replacement is 10. ... The learning rate is set to 1e-6. We use the AdamW optimizer with 800 training steps (see the inference sketch after the table). |
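The paper's Algorithm 1 swaps attention variables from the source generation into the target generation during the early denoising steps. Below is a minimal sketch of that schedule under the defaults quoted above (λA = 20 for cross-attention maps, λM = 25 for self-attention maps, λϕ = 10 for self-attention features). The `SwapSchedule` and `swap_step` names and the dict layout are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' code) of Photoswap-style attention
# swapping: for the first lambda_* denoising steps, the target generation
# reuses the source generation's attention variables.

from dataclasses import dataclass

@dataclass
class SwapSchedule:
    lambda_A: int = 20    # steps of cross-attention map (A) replacement
    lambda_M: int = 25    # steps of self-attention map (M) replacement
    lambda_phi: int = 10  # steps of self-attention feature (phi) replacement

def swap_step(t: int, source: dict, target: dict, sched: SwapSchedule) -> dict:
    """Pick which attention tensors the target U-Net uses at step t.

    `source` and `target` hold the step-t tensors captured from the two
    denoising passes: {"A": ..., "M": ..., "phi": ...} (hypothetical layout).
    Early steps copy from the source to preserve layout and geometry; later
    steps fall back to the target's own attention to render the new subject.
    """
    return {
        "A":   source["A"]   if t < sched.lambda_A   else target["A"],
        "M":   source["M"]   if t < sched.lambda_M   else target["M"],
        "phi": source["phi"] if t < sched.lambda_phi else target["phi"],
    }
```

Note that whenever the feature ϕ is copied, the copied map M is moot for that step, so under these defaults the schedule effectively transitions from feature copying to map copying as denoising proceeds.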
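For the concept-learning stage quoted in the table, the following sketch shows the wiring implied by those hyperparameters: DreamBooth-style fine-tuning of Stable Diffusion 2.1 with AdamW at learning rate 1e-6 for 800 steps. The model id and the decision to tune only the U-Net are assumptions, and the denoising loss loop over the 3–5 subject images is omitted.

```python
# Sketch of the quoted concept-learning setup, assuming Hugging Face
# diffusers; not the authors' training code.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # assumed SD 2.1 checkpoint
).to("cuda")

# DreamBooth fine-tunes the denoising U-Net on the subject images
# (the text encoder can optionally be tuned as well).
optimizer = torch.optim.AdamW(pipe.unet.parameters(), lr=1e-6)
max_train_steps = 800  # as quoted from the paper
```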
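The quoted inference settings map directly onto a standard diffusers sampling call, as sketched below. The prompt and the `sks` placeholder token are illustrative; Photoswap's attention swapping (sketched above) would hook into the U-Net's attention layers on top of this plain sampling pass.

```python
# Sketch of the quoted inference settings: DDIM sampling, 50 denoising
# steps, classifier-free guidance 7.5.
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
).to("cuda")

# Swap in DDIM sampling, then generate with the quoted settings.
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
image = pipe(
    "a photo of sks cat wearing a wizard hat",  # illustrative prompt
    num_inference_steps=50,  # 50 denoising steps
    guidance_scale=7.5,      # classifier-free guidance of 7.5
).images[0]
```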