reproducibilityindex.ai

Exploring Low-Dimensional Subspace in Diffusion Models for Controllable Image Editing

Authors: Siyi Chen, Huijie Zhang, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, extensive empirical experiments demonstrate the effectiveness and efficiency of LOCO Edit. The code and the arXiv version can be found on the project website.
Researcher Affiliation	Academia	Siyi Chen, Huijie Zhang, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu (all University of Michigan); {siyich,huijiezh,vincegmz,yifulu,pengwa,qingqu}@umich.edu
Pseudocode	Yes	Algorithm 1 Unsupervised LOCO Edit
Open Source Code	Yes	The code and the arXiv version can be found on the project website: https://chicychen.github.io/LOCO
Open Datasets	Yes	We evaluated DDPM (U-Net [49]) on the CIFAR-10 dataset [50], U-ViT [51] (Transformer) on the CelebA [52] and ImageNet [53] datasets, and DeepFloyd IF [19] trained on the LAION-5B [54] dataset.
Dataset Splits	No	The paper lists the datasets used for experiments (e.g., CIFAR-10, CelebA), but does not provide explicit train/validation/test splits (e.g., percentages or counts) or reference predefined splits for reproducibility.
Hardware Specification	Yes	All the experiments can be conducted with a single A40 GPU with 48 GB of memory.
Software Dependencies	No	The paper refers to specific models and architectures (e.g., U-Net [49], U-ViT [51]), but does not specify version numbers for general software dependencies such as Python, PyTorch, or other relevant libraries.
Experiment Setup	Yes	We empirically choose the edit time step t for different datasets in the range [0.5, 0.8]. In practice, we found that time steps within this range give similar editing results. In most of the experiments, the edit time steps chosen are: 0.5 for FFHQ; 0.6 for CelebA-HQ and LSUN-church; 0.7 for AFHQ, Flowers, and MetFace. In practice, we choose the edit strength λ in the range [-15.0, 15.0], where a larger λ leads to stronger semantic editing and a negative λ changes the semantics in the opposite direction.
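
The setup values in the last row can be collected into a small configuration, which may help when trying to reproduce the reported edits. The sketch below is only illustrative: the numeric values come from the row above, but the names `EDIT_TIME_STEPS`, `edit_time_step`, and `apply_edit` are hypothetical, and the linear update x_t + λ·v is an assumption about how the edit strength is applied, not something this summary states. It is not the authors' implementation.

```python
# Illustrative configuration sketch. Only the numbers are taken from the
# "Experiment Setup" row; all names below are hypothetical.

# Edit time step t per dataset, as reported in the setup row.
EDIT_TIME_STEPS = {
    "FFHQ": 0.5,
    "CelebA-HQ": 0.6,
    "LSUN-church": 0.6,
    "AFHQ": 0.7,
    "Flowers": 0.7,
    "MetFace": 0.7,
}

T_RANGE = (0.5, 0.8)          # reported range for the edit time step t
LAMBDA_RANGE = (-15.0, 15.0)  # reported range for the edit strength λ


def edit_time_step(dataset):
    """Look up the reported edit time step and check it lies in T_RANGE."""
    t = EDIT_TIME_STEPS[dataset]  # raises KeyError for unlisted datasets
    if not T_RANGE[0] <= t <= T_RANGE[1]:
        raise ValueError(f"time step {t} outside {T_RANGE}")
    return t


def apply_edit(x_t, direction, strength):
    """Assumed linear edit: return x_t + strength * direction, elementwise.

    x_t and direction are flat lists of floats here for simplicity; a real
    pipeline would operate on image tensors.
    """
    if not LAMBDA_RANGE[0] <= strength <= LAMBDA_RANGE[1]:
        raise ValueError(f"edit strength {strength} outside {LAMBDA_RANGE}")
    if len(x_t) != len(direction):
        raise ValueError("x_t and direction must have the same length")
    return [x + strength * v for x, v in zip(x_t, direction)]
```

Under this assumed update, flipping the sign of the strength moves the sample the same distance in the opposite direction, which matches the row's description of negative λ; for example, `apply_edit(x, v, 2.0)` and `apply_edit(x, v, -2.0)` are mirror images around `x`.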