Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Customizable Image Synthesis with Multiple Subjects
Authors: Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
NeurIPS 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Both qualitative and quantitative experimental results demonstrate our superiority over state-of-the-art alternatives under a variety of settings for multi-subject customization. Project page can be found here. ... 4 Experiments ... 4.1 Experimental setups ... 4.2 Main results ... 4.4 Ablation studies |
| Researcher Affiliation | Collaboration | 1USTC 2SJTU 3Ant Group 4Alibaba Group |
| Pseudocode | Yes | Algorithm 1 N-Subject Customization with Layout Guidance |
| Open Source Code | Yes | Project page can be found here. |
| Open Datasets | Yes | For fair and unbiased evaluation, we select subjects from previous papers [15, 18, 16, 17] spanning various categories for a total of 15 customized subjects. |
| Dataset Splits | No | The paper describes training and testing procedures but does not explicitly provide details on a validation dataset split or strategy. |
| Hardware Specification | Yes | All experiments are conducted using one A-100 GPU. |
| Software Dependencies | No | The paper mentions using 'Stable Diffusion v2-1-base' as the pre-trained model and refers to 'huggingface [39]' for implementations, but it does not provide specific version numbers for software dependencies like Python, PyTorch, or the Hugging Face libraries themselves. |
| Experiment Setup | Yes | Textual Inversion [18]... batch size of 4 and a learning rate of 0.002 for 3000 steps. ... Ours. ... batch size of 1 and a learning rate of 1 × 10⁻⁶ for 3,000 steps. At inference time... We use a positive value of +2.5 to strengthen the signal of the target subject and we use a negative value of 1 10 5 to weaken the signal of irrelevant subjects. Furthermore, we guide all 50 steps with the layout guidance in the whole generation process to get good customized generation results. |
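
The settings quoted in the Experiment Setup row can be collected into a small configuration object for anyone attempting a rerun. The following is a minimal sketch, not the authors' code: the class and field names (`TrainConfig`, `LayoutGuidanceConfig`, `samples_seen`) are hypothetical, the learning rate of the proposed method is taken as 1e-6 (an assumption about the sign of the exponent), and the weakening value is left out because its exponent cannot be read reliably from the row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters as quoted in the Experiment Setup row."""
    batch_size: int
    learning_rate: float
    max_steps: int


@dataclass(frozen=True)
class LayoutGuidanceConfig:
    """Inference-time layout guidance settings (hypothetical field names)."""
    strengthen_value: float  # added to reinforce the target subject
    guided_steps: int        # number of denoising steps that receive guidance


# Baseline: Textual Inversion, as quoted in the row.
TEXTUAL_INVERSION = TrainConfig(batch_size=4, learning_rate=2e-3, max_steps=3000)

# Proposed method; 1e-6 is an assumed reading of the quoted learning rate.
OURS = TrainConfig(batch_size=1, learning_rate=1e-6, max_steps=3000)

# The weakening value is intentionally omitted (not legible in the row).
GUIDANCE = LayoutGuidanceConfig(strengthen_value=2.5, guided_steps=50)


def samples_seen(cfg: TrainConfig) -> int:
    """Total training samples processed: batch size times number of steps."""
    return cfg.batch_size * cfg.max_steps


if __name__ == "__main__":
    for name, cfg in [("textual_inversion", TEXTUAL_INVERSION), ("ours", OURS)]:
        print(name, cfg, "samples:", samples_seen(cfg))
    print("guidance", GUIDANCE)
```

Keeping the two training setups in the same structure makes the difference in training budget explicit: at the same step count, the baseline processes four times as many samples as the proposed method.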