Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
Authors: Hong Chen, Yipeng Zhang, Simin Wu, Xin Wang, Xuguang Duan, Yuwei Zhou, Wenwu Zhu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed DisenBooth framework outperforms baseline models for subject-driven text-to-image generation with the identity-preserved embedding. |
| Researcher Affiliation | Academia | (1) Department of Computer Science and Technology, Tsinghua University; (2) Beijing National Research Center for Information Science and Technology; (3) Lanzhou University |
| Pseudocode | No | The paper describes its methods through text and equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/forchchch/DisenBooth |
| Open Datasets | Yes | We adopt the subject-driven text-to-image generation dataset DreamBench proposed by Ruiz et al. (2022), which are downloaded from Unsplash. This dataset contains 30 subjects, including unique objects like backpacks, stuffed animals, cats, etc. |
| Dataset Splits | No | The paper describes using a small set of images for finetuning (3-5 images per subject) and the DreamBench dataset for evaluation, but it does not specify explicit train/validation/test splits for the DreamBench dataset or for the finetuning process in general. |
| Hardware Specification | Yes | The finetuning process is conducted on one Tesla V100 with batch size of 1, while the finetuning iterations are 3,000. |
| Software Dependencies | No | The paper mentions implementing based on 'Stable Diffusion 2-1' and using the 'AdamW' optimizer, but it does not provide specific version numbers for software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The learning rate is 1e-4 with the AdamW (Loshchilov & Hutter, 2018) optimizer. The finetuning process is conducted on one Tesla V100 with batch size of 1, while the finetuning iterations are 3,000. As for the LoRA rank, we use r = 4 for all the experiments. We use λ2 = 0.01 for all our experiments. λ3 is a hyper-parameter which is set to 0.001 for all our experiments. |
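
The finetuning settings quoted in the Experiment Setup row can be gathered into a single configuration object, which may help when trying to reproduce the reported runs. The following is a minimal sketch, not the authors' code: the class name `FinetuneConfig`, its field names, and the `samples_seen` helper are hypothetical, and only the values come from the table above. The roles of λ2 and λ3 are not described in this summary, so they are stored as plain hyper-parameters.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FinetuneConfig:
    """Hypothetical container; values are those quoted in the table above."""
    base_model: str = "Stable Diffusion 2-1"  # as named in the Software Dependencies row
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4
    batch_size: int = 1
    iterations: int = 3000
    lora_rank: int = 4        # r = 4 for all experiments
    lambda2: float = 0.01     # hyper-parameter λ2 (role not stated in this summary)
    lambda3: float = 0.001    # hyper-parameter λ3 (role not stated in this summary)
    gpu: str = "Tesla V100"   # a single GPU is reported

    def samples_seen(self) -> int:
        # Total training samples processed: batch size times iterations.
        return self.batch_size * self.iterations


config = FinetuneConfig()
print(asdict(config))
print("samples seen:", config.samples_seen())
```

Freezing the dataclass keeps the reported values from being modified by accident; a different subject or ablation would be expressed by constructing a new instance with overridden fields.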