Rethinking Score Distillation as a Bridge Between Image Distributions

Authors: David McAllister, Songwei Ge, Jia-Bin Huang, David Jacobs, Alexei Efros, Aleksander Holynski, Angjoo Kanazawa

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we test our proposed method on several generation problems where SDS is adopted. We compare against SDS and other task-specific baselines.
Researcher Affiliation | Academia | David McAllister (1), Songwei Ge (2), Jia-Bin Huang (2), David W. Jacobs (2), Alexei A. Efros (1), Aleksander Holynski (1), Angjoo Kanazawa (1); (1) UC Berkeley, (2) University of Maryland
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | We will release the code with an open-source license when the paper is published.
Open Datasets | Yes | We use the MS-COCO [36] dataset for the evaluation. Consistent with the prior study [3], we randomly sample 5K captions from the COCO validation set as conditions for generating images.
Dataset Splits | Yes | We use the MS-COCO [36] dataset for the evaluation. Consistent with the prior study [3], we randomly sample 5K captions from the COCO validation set as conditions for generating images.
Hardware Specification | No | The paper notes in the NeurIPS checklist that 'each run of our text-to-image generation with baseline VSD takes 1.3K GPU hours', indicating GPU usage, but it does not specify GPU models or other hardware details in the main text or experimental setup sections.
Software Dependencies | No | The paper mentions software components such as the 'stable-diffusion-v2-1-base model', 'LoRA', and the 'threestudio [19] repository', but does not provide version numbers for these or other ancillary software dependencies.
Experiment Setup | Yes | For all the methods, we use the same learning rate of 0.01 and optimize for 2,500 steps, where we generally observe convergence. We compute the zero-shot FID [21] and CLIP FID scores [31] between these generated images and the ground truth images. We also report results generated by DDIM with 20 steps as a lower bound for reference.
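The FID and CLIP FID metrics cited in the setup both reduce to a Fréchet distance between two Gaussians fit to image features (Inception-v3 features for FID, CLIP features for CLIP FID). As a minimal illustration of that distance, not the paper's evaluation pipeline, the sketch below assumes diagonal covariances so that plain NumPy suffices; real FID implementations fit full covariance matrices and take a matrix square root of their product.

```python
import numpy as np

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Frechet distance between two axis-aligned Gaussians.

    For diagonal covariances the general formula
        ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2})
    reduces to an elementwise expression over the variances.
    """
    diff = mu1 - mu2
    return float(diff @ diff + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2)))

# Identical feature distributions give a distance of 0;
# shifting every mean by 1 in 4 dimensions gives ||diff||^2 = 4.
print(frechet_distance_diag(np.zeros(4), np.ones(4), np.zeros(4), np.ones(4)))  # 0.0
print(frechet_distance_diag(np.zeros(4), np.ones(4), np.ones(4), np.ones(4)))   # 4.0
```

In practice one would extract features for the 5K generated images and the ground-truth COCO images, fit a mean and full covariance to each set, and evaluate the general formula.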