Exploring Low-Dimensional Subspace in Diffusion Models for Controllable Image Editing

Authors: Siyi Chen, Huijie Zhang, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, extensive empirical experiments demonstrate the effectiveness and efficiency of LOCO Edit. The code and the arXiv version can be found on the project website. |
| Researcher Affiliation | Academia | Siyi Chen, Huijie Zhang, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu (University of Michigan) {siyich,huijiezh,vincegmz,yifulu,pengwa,qingqu}@umich.edu |
| Pseudocode | Yes | Algorithm 1 Unsupervised LOCO Edit |
| Open Source Code | Yes | The code and the arXiv version can be found on the project website: https://chicychen.github.io/LOCO |
| Open Datasets | Yes | We evaluated DDPM (U-Net [49]) on CIFAR-10 dataset [50], U-ViT [51] (Transformer) on CelebA [52], ImageNet [53] datasets and DeepFloyd IF [19] trained on LAION-5B [54] dataset. |
| Dataset Splits | No | The paper lists the datasets used for experiments (e.g., CIFAR-10, CelebA) but does not give explicit train/validation/test splits (e.g., percentages or counts) or reference predefined splits for reproducibility. |
| Hardware Specification | Yes | All the experiments can be conducted with a single A40 GPU having 48G memory. |
| Software Dependencies | No | The paper refers to specific models and architectures (e.g., U-Net [49], U-ViT [51]) but does not specify version numbers for general software dependencies such as Python, PyTorch, or other relevant libraries. |
| Experiment Setup | Yes | We empirically choose the edit time step t for different datasets in the range [0.5, 0.8]. In practice, we found time steps within the above range give similar editing results. In most of the experiments, the edit time steps chosen are: 0.5 for FFHQ, 0.6 for CelebA-HQ and LSUN-church, 0.7 for AFHQ, Flowers, and MetFace. In practice, we choose the edit strength λ in the range of [-15.0, 15.0], where a larger λ leads to stronger semantic editing and a negative λ leads to the change of semantics in the opposite direction. |
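
The settings in the Experiment Setup row can be collected into a small configuration, sketched below in Python. The per-dataset time steps and the two ranges are taken from the row above; everything else is an assumption made for illustration. In particular, the names `EDIT_TIME_STEP`, `validate_setup`, and `apply_linear_edit` are hypothetical, and the update `x_t + λ·v` along a unit-norm direction is only a guess that matches the quoted description (larger λ gives a stronger edit, negative λ reverses it). It is not the paper's Algorithm 1.

```python
import numpy as np

# Edit time steps quoted in the Experiment Setup row.
EDIT_TIME_STEP = {
    "FFHQ": 0.5,
    "CelebA-HQ": 0.6,
    "LSUN-church": 0.6,
    "AFHQ": 0.7,
    "Flowers": 0.7,
    "MetFace": 0.7,
}
T_RANGE = (0.5, 0.8)          # reported range for the edit time step t
LAMBDA_RANGE = (-15.0, 15.0)  # reported range for the edit strength lambda


def validate_setup(dataset, edit_strength):
    """Return (t, lambda) after checking both against the reported ranges."""
    t = EDIT_TIME_STEP[dataset]
    if not T_RANGE[0] <= t <= T_RANGE[1]:
        raise ValueError(f"time step {t} is outside {T_RANGE}")
    if not LAMBDA_RANGE[0] <= edit_strength <= LAMBDA_RANGE[1]:
        raise ValueError(f"edit strength {edit_strength} is outside {LAMBDA_RANGE}")
    return t, edit_strength


def apply_linear_edit(x_t, direction, edit_strength):
    """Assumed update rule: shift x_t along the unit-normalized direction."""
    v = direction / np.linalg.norm(direction)
    return x_t + edit_strength * v


if __name__ == "__main__":
    t, lam = validate_setup("AFHQ", 10.0)
    x_t = np.zeros(4)                            # placeholder noisy sample
    direction = np.array([1.0, 0.0, 0.0, 0.0])   # placeholder edit direction
    print(t, apply_linear_edit(x_t, direction, lam))
    print(t, apply_linear_edit(x_t, direction, -lam))  # opposite direction
```

Normalizing the direction keeps λ comparable across directions, which is one plausible reason a single range such as [-15.0, 15.0] could serve every dataset. The source does not state this, so treat it as a design assumption of the sketch.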