IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models

Authors: Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu, Jaskirat Singh, Jing Zhang, Dylan Campbell, Peter Tu, Richard Hartley

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments validate that our IMPUS can achieve smooth, direct, and realistic image morphing and is adaptable to several other generative tasks.
Researcher Affiliation | Collaboration | GE Research; Australian National University
Pseudocode | Yes | Algorithm 1: Finetuning & inference process of IMPUS
Open Source Code | Yes | Code is available at: https://github.com/GoL2022/IMPUS
Open Datasets | Yes | The data used comes from three main sources: 1) benchmark datasets for image generation, including Faces (50 pairs of random images for each subset of images from CelebA-HQ (Karras et al., 2018)), Animals (50 pairs of random images for each subset of images from AFHQ (Choi et al., 2020), including dog, cat, dog-cat, and wild), and Outdoors (50 pairs of church images from LSUN (Yu et al., 2015)); 2) internet images, e.g., the flower and beetle car examples; and 3) 25 image pairs from Wang & Golland (2023).
Dataset Splits | No | The data used comes from three main sources: 1) benchmark datasets for image generation, including Faces (50 pairs of random images for each subset of images from CelebA-HQ (Karras et al., 2018)), Animals (50 pairs of random images for each subset of images from AFHQ (Choi et al., 2020), including dog, cat, dog-cat, and wild), and Outdoors (50 pairs of church images from LSUN (Yu et al., 2015)).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are mentioned in the paper.
Software Dependencies | No | For all the experiments, we use a latent diffusion model (Rombach et al., 2022), with pre-trained weights from Stable-Diffusion-v-1-4. Textual inversion is trained with the AdamW optimizer (Loshchilov & Hutter, 2019), with the learning rate set to 0.002 for 2500 steps. For the benchmark datasets, we perform textual inversion for 1000 steps. LoRA is trained with the Adam optimizer (Kingma & Ba, 2015), with the learning rate set to 0.001.
Experiment Setup | Yes | We set the LoRA rank for unconditional score estimates to 2, and the default LoRA rank for conditional score estimates is chosen heuristically (auto). The conditional and unconditional parts are finetuned for 150 steps and 15 steps, respectively. The finetuning learning rate is set to 10^-3. Hyperparameters for textual inversion as well as guidance scales vary based on the dataset.
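The hyperparameters reported above can be collected into a single configuration sketch. This is a hypothetical summary for reference only: the key names and the `total_finetune_steps` helper are illustrative and do not come from the IMPUS codebase; the values are those stated in the paper.

```python
# Hypothetical config sketch of the reported IMPUS hyperparameters.
# Key names are illustrative, not taken from the official repository.
IMPUS_CONFIG = {
    "base_model": "Stable-Diffusion-v-1-4",  # pre-trained latent diffusion weights
    "textual_inversion": {
        "optimizer": "AdamW",       # Loshchilov & Hutter, 2019
        "learning_rate": 2e-3,
        "steps": 2500,              # 1000 steps for the benchmark datasets
    },
    "lora": {
        "optimizer": "Adam",        # Kingma & Ba, 2015
        "learning_rate": 1e-3,      # also the finetuning learning rate (10^-3)
        "rank_unconditional": 2,
        "rank_conditional": "auto", # chosen heuristically per the paper
        "steps_conditional": 150,
        "steps_unconditional": 15,
    },
}

def total_finetune_steps(cfg: dict) -> int:
    """Sum the conditional and unconditional LoRA finetuning steps."""
    lora = cfg["lora"]
    return lora["steps_conditional"] + lora["steps_unconditional"]

print(total_finetune_steps(IMPUS_CONFIG))  # 150 + 15 = 165
```

Guidance scales and per-dataset textual-inversion settings are omitted, since the paper states they vary by dataset.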