Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Customizable Image Synthesis with Multiple Subjects

Authors: Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Both qualitative and quantitative experimental results demonstrate our superiority over state-of-the-art alternatives under a variety of settings for multi-subject customization. Project page can be found here. ... 4 Experiments ... 4.1 Experimental setups ... 4.2 Main results ... 4.4 Ablation studies
Researcher Affiliation | Collaboration | 1USTC 2SJTU 3Ant Group 4Alibaba Group
Pseudocode | Yes | Algorithm 1 N-Subject Customization with Layout Guidance
Open Source Code | Yes | Project page can be found here.
Open Datasets | Yes | For fair and unbiased evaluation, we select subjects from previous papers [15, 18, 16, 17] spanning various categories for a total of 15 customized subjects.
Dataset Splits | No | The paper describes training and testing procedures but does not explicitly provide details on a validation dataset split or strategy.
Hardware Specification | Yes | All experiments are conducted using one A-100 GPU.
Software Dependencies | No | The paper mentions using 'Stable Diffusion v2-1-base' as the pre-trained model and refers to 'huggingface [39]' for implementations, but it does not provide specific version numbers for software dependencies like Python, PyTorch, or the Hugging Face libraries themselves.
Experiment Setup | Yes | Textual Inversion [18]... batch size of 4 and a learning rate of 0.002 for 3000 steps. ... Ours. ... batch size of 1 and a learning rate of 1 × 10^-6 for 3,000 steps. At inference time... We use a positive value of +2.5 to strengthen the signal of the target subject and we use a negative value of 1 10 5 to weaken the signal of irrelevant subjects. Furthermore, we guide all 50 steps with the layout guidance in the whole generation process to get good customized generation results.
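
For anyone attempting to reproduce the setup, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The following is a minimal sketch, not the authors' configuration format: the class and field names are hypothetical, and only the numeric values come from the row. The learning rate for the proposed method is taken to be 1 × 10^-6, which is an interpretation of the quoted text. The negative guidance weight is left unset because its magnitude is not legible in the quoted text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters quoted in the Experiment Setup row."""
    batch_size: int
    learning_rate: float
    max_steps: int


@dataclass(frozen=True)
class LayoutGuidanceConfig:
    """Inference-time layout guidance settings quoted in the same row."""
    positive_weight: float = 2.5  # strengthens the target subject
    # Weakens irrelevant subjects. The magnitude is not legible in the
    # quoted text, so it is left unset here rather than guessed.
    negative_weight: Optional[float] = None
    guidance_steps: int = 50  # all 50 generation steps are guided


# Names are hypothetical; only the values are taken from the row above.
TEXTUAL_INVERSION = TrainConfig(batch_size=4, learning_rate=2e-3, max_steps=3000)
# Assumption: the quoted learning rate is read as 1e-6.
OURS = TrainConfig(batch_size=1, learning_rate=1e-6, max_steps=3000)
GUIDANCE = LayoutGuidanceConfig()

if __name__ == "__main__":
    for name, cfg in [("textual_inversion", TEXTUAL_INVERSION), ("ours", OURS)]:
        print(name, cfg)
    print("guidance", GUIDANCE)
```

The frozen dataclasses keep the recorded values immutable, so a reproduction script cannot change them by accident. The negative weight has to be supplied from the paper or the project page before the guidance settings are complete.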