Real-World Image Variation by Aligning Diffusion Inversion Chain
Authors: Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that our proposed approach outperforms existing methods concerning semantic similarity and perceptual quality. |
| Researcher Affiliation | Collaboration | ¹The Chinese University of Hong Kong, ²SmartMore |
| Pseudocode | No | The paper describes its methods using equations and prose but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project page: https://rival-diff.github.io |
| Open Datasets | Yes | Our study obtained a high-quality test set of reference images from the Internet and DreamBooth [12] to ensure a diverse image dataset. |
| Dataset Splits | No | The paper mentions using a "test set" and "evaluation samples are from two datasets" but does not provide specific details on training, validation, or test dataset splits or percentages. |
| Hardware Specification | Yes | Experiments run on a single NVIDIA RTX 4090 GPU, taking 8 seconds to generate an image variation with batch size 1. |
| Software Dependencies | Yes | Our baseline model is Stable-Diffusion V1.5. During the image inversion and generation, we employed DDIM sample steps T = 50 for each image and set the classifier-free guidance scale m = 7 in Eq. (8). We split two stages at t_align = t_early = 30 for attention alignment in Eq. (3) and latent alignment in Eq. (8). In addition, we employ the shuffle strategy described in Eq. (6) to initialize the starting latent x_T^G. Experiments run on a single NVIDIA RTX 4090 GPU, taking 8 seconds to generate an image variation with batch size 1. |
| Experiment Setup | Yes | During the image inversion and generation, we employed DDIM sample steps T = 50 for each image and set the classifier-free guidance scale m = 7 in Eq. (8). We split two stages at t_align = t_early = 30 for attention alignment in Eq. (3) and latent alignment in Eq. (8). In addition, we employ the shuffle strategy described in Eq. (6) to initialize the starting latent x_T^G. |
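
For context, the sketch below shows how the sampler settings quoted in the table (Stable Diffusion v1.5, DDIM with T = 50 steps, classifier-free guidance scale 7, single-image batches) could be set up with the Hugging Face `diffusers` library. This is an assumption for illustration only: the paper's actual implementation is linked from the project page, and RIVAL's cross-image attention alignment, latent alignment, and shuffle-based latent initialization require custom denoising hooks that are not implemented here.

```python
# Minimal sketch of the reported sampler configuration only.
# RIVAL's attention alignment (Eq. 3), latent alignment (Eq. 8), and
# latent-shuffle initialization (Eq. 6) are NOT implemented here.
import torch
from diffusers import StableDiffusionPipeline, DDIMScheduler

MODEL_ID = "runwayml/stable-diffusion-v1-5"  # baseline checkpoint from the paper
NUM_STEPS = 50        # DDIM sample steps T
GUIDANCE_SCALE = 7.0  # classifier-free guidance scale m
T_ALIGN = 30          # two-stage split t_align = t_early (reference only)

# Load the pipeline and swap in a DDIM scheduler.
pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")  # the paper reports a single RTX 4090 at batch size 1

# Generate one image with the quoted step count and guidance scale.
image = pipe(
    "a photo of a dog on the beach",  # placeholder prompt, not from the paper
    num_inference_steps=NUM_STEPS,
    guidance_scale=GUIDANCE_SCALE,
).images[0]
image.save("variation_baseline.png")
```

Obtaining the starting latent x_T^G from a reference image would additionally require DDIM inversion (e.g., via `diffusers`' `DDIMInverseScheduler`) before applying the paper's alignment and shuffle steps.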