Cones: Concept Neurons in Diffusion Models for Customized Generation
Authors: Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive qualitative and quantitative studies on diverse scenarios show the superiority of our method in interpreting and manipulating diffusion models. (Section 4, Experiments) |
| Researcher Affiliation | Collaboration | (1) University of Science and Technology of China, Hefei, China; (2) Shanghai Jiao Tong University, Shanghai, China; (3) Ant Group, Hangzhou, China; (4) Alibaba Group, Hangzhou, China. |
| Pseudocode | Yes | Algorithm 1 (Computing Concept Neuron Mask) and Algorithm A2 (Accelerated Computation of Concept Neuron Mask). A generic gradient-masking illustration follows the table. |
| Open Source Code | No | The paper references third-party implementations for competing methods (e.g., Dreambooth and Custom Diffusion) with links, but does not state that the code for Cones (their own method) is open-source or provide a link to it. |
| Open Datasets | No | All images used in the paper are downloaded from anonymous e-commerce websites or Unsplash, like the dataset of Custom Diffusion (Kumari et al., 2022). |
| Dataset Splits | No | The paper describes how images were generated for evaluation (e.g., "sample 20 output images using 20 random seeds", "generate 50 images per prompt") but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | Yes | All experiments are conducted using an A100 GPU. |
| Software Dependencies | No | The paper mentions specific models and samplers used (e.g., Stable Diffusion V1.4, CLIP model (ViT-L/14), DPM-Solver), but it does not specify software dependencies with version numbers (e.g., Python version, PyTorch version, CUDA version). |
| Experiment Setup | Yes | We use 50 steps of the DPM-Solver (Lu et al., 2022) sampler with a guidance scale of 7.5 for all of the above methods. Textual Inversion: trained with the recommended batch size of 4 and a learning rate of 0.005 (scaled by batch size for an effective learning rate of 0.02) for 5,000 steps. Dreambooth: trained with a batch size of 1, a learning rate of 5 × 10⁻⁶, and 800 training steps. Custom Diffusion: batch size 4, 600 training steps, and a base learning rate of 10⁻⁵, scaled by batch size for an effective learning rate of 4 × 10⁻⁵. Cones (Ours): experiments are conducted on an A100 GPU with a batch size of 2, using Algorithm A2 to find the concept neurons; the base learning rate of 3 × 10⁻⁵ is scaled by the number of GPUs and the batch size to an effective 6 × 10⁻⁵. For single-subject generation, a base learning rate of 2 × 10⁻⁵ gives better results, trained for 1,000 steps per subject. These hyperparameters are collected in the config sketch after the table. |
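To make the Experiment Setup row easier to scan, here is a minimal, hypothetical Python sketch collecting the quoted hyperparameters in one place. The `CONFIGS` dict, its key names, and the `effective_lr` helper are illustrative assumptions rather than the authors' code; every numeric value is taken from the cell above.

```python
# Hypothetical summary of the per-method training hyperparameters quoted above.
SAMPLER = {"name": "DPM-Solver", "steps": 50, "guidance_scale": 7.5}

CONFIGS = {
    "textual_inversion": {"batch_size": 4, "base_lr": 5e-3, "train_steps": 5000, "lr_scaled_by_batch": True},
    "dreambooth":        {"batch_size": 1, "base_lr": 5e-6, "train_steps": 800,  "lr_scaled_by_batch": False},
    "custom_diffusion":  {"batch_size": 4, "base_lr": 1e-5, "train_steps": 600,  "lr_scaled_by_batch": True},
    # Cones: base LR 3e-5 on one A100 with batch size 2, scaled to an effective 6e-5;
    # single-subject runs instead use base LR 2e-5 and 1,000 training steps.
    "cones":             {"batch_size": 2, "base_lr": 3e-5, "train_steps": 1000, "lr_scaled_by_batch": True},
}

def effective_lr(cfg: dict, num_gpus: int = 1) -> float:
    """Scale the base learning rate by world size, following the quoted scaling rule."""
    if not cfg["lr_scaled_by_batch"]:
        return cfg["base_lr"]
    return cfg["base_lr"] * num_gpus * cfg["batch_size"]

# Sanity checks against the effective learning rates quoted in the table.
assert abs(effective_lr(CONFIGS["textual_inversion"]) - 0.02) < 1e-12
assert abs(effective_lr(CONFIGS["custom_diffusion"]) - 4e-5) < 1e-12
assert abs(effective_lr(CONFIGS["cones"]) - 6e-5) < 1e-12
```

The asserts simply confirm that the base-LR-times-batch-size scaling rule described in the paper reproduces the effective learning rates it reports.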
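The Pseudocode row references Algorithm 1 (Computing Concept Neuron Mask). The paper's actual algorithm is not reproduced here; as a generic illustration of what gradient-based neuron masking can look like, the sketch below flags a parameter when a first-order Taylor estimate says zeroing it does not increase the loss, i.e. L(θ)|₍θᵢ=0₎ − L(θ) ≈ −θᵢgᵢ ≤ 0.

```python
import numpy as np

def gradient_zeroing_mask(theta: np.ndarray, grad: np.ndarray) -> np.ndarray:
    """Illustrative only (NOT the paper's Algorithm 1): mark parameters whose
    removal is estimated not to increase the loss.

    Setting theta_i to zero changes it by -theta_i, so to first order
        delta_L_i ~= grad_i * (-theta_i) = -theta_i * grad_i.
    Parameter i is flagged when delta_L_i <= 0, i.e. theta_i * grad_i >= 0.
    """
    return theta * grad >= 0.0

# Tiny usage example with made-up numbers.
theta = np.array([0.5, -0.2, 0.1])
grad = np.array([0.3, 0.4, -0.2])
print(gradient_zeroing_mask(theta, grad))  # [ True False False]
```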