Noise-free Score Distillation

Authors: Oren Katzir, Or Patashnik, Daniel Cohen-Or, Dani Lischinski

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the efficacy of NFSD, we provide qualitative examples that compare NFSD and SDS, as well as several other methods. We validate our formulation and approach by utilizing Stable Diffusion (Rombach et al., 2022) as our score function, with a focus on images and NeRFs as our representations. We implement NFSD using the threestudio (Guo et al., 2023) framework for text-based 3D generation. We quantitatively evaluate both our NFSD and SDS for the task of 2D-image generation using the COCO2014 caption dataset.
Researcher Affiliation | Academia | Oren Katzir¹, Or Patashnik¹, Daniel Cohen-Or¹, Dani Lischinski² (¹Tel-Aviv University, ²The Hebrew University of Jerusalem)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using the open-source threestudio (Guo et al., 2023) framework, but does not state that the authors are releasing their own implementation of NFSD.
Open Datasets | Yes | We quantitatively evaluate both our NFSD and SDS for the task of 2D-image generation using the COCO2014 caption dataset.
Dataset Splits | No | We have sampled 5K captions and images from the validation dataset. (This indicates use of a validation set, but does not describe a full train/validation/test partitioning needed for reproducibility.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or memory specifications) used for its experiments. It only mentions running experiments 'on a single GPU' in Appendix A.8, without specifying the model.
Software Dependencies | No | The paper mentions 'Stable Diffusion 2.1-base (Rombach et al., 2022)', the 'AdamW optimizer (Loshchilov & Hutter, 2017)', and the 'threestudio (Guo et al., 2023) framework'. However, it does not provide specific version numbers for all key software components (e.g., for the framework providing AdamW, or for threestudio itself).
Experiment Setup | Yes | Unless stated otherwise, all 3D models are optimized for 25,000 iterations using the AdamW optimizer (Loshchilov & Hutter, 2017) with a learning rate of 0.01. The initial rendering resolution of 64×64 is increased to 512×512 after 5,000 iterations; at the same time we anneal the maximum diffusion time to 500, as proposed by Lin et al. (2023) and Wang et al. (2023b). The implicit volume is initialized according to the object-centric initialization (Lin et al., 2023; Wang et al., 2023b). We alternate the background between a random solid color and a learned neural environment map. The pre-trained text-to-image diffusion model for all experiments is Stable Diffusion 2.1-base (Rombach et al., 2022), a latent diffusion model with ϵ-prediction.
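The schedule quoted in the Experiment Setup row (25,000 iterations, a 64×64 → 512×512 resolution switch, and maximum-diffusion-time annealing to 500 at iteration 5,000) can be sketched as a small helper. This is a minimal sketch, not the authors' code: the initial timestep bounds (`T_MAX_INITIAL`, `T_MIN`) and the hard step change at the switch point are assumptions, since the excerpt does not specify them.

```python
import random

TOTAL_ITERS = 25_000    # optimization length stated in the paper
SWITCH_ITER = 5_000     # resolution / timestep-annealing switch point
T_MAX_INITIAL = 980     # assumed initial upper bound on diffusion time
T_MAX_ANNEALED = 500    # annealed maximum diffusion time (from the paper)
T_MIN = 20              # assumed lower bound on diffusion time

def render_resolution(iteration):
    """64x64 renders for the first 5,000 iterations, then 512x512."""
    return 64 if iteration < SWITCH_ITER else 512

def sample_timestep(iteration, rng=random):
    """Sample t uniformly below a maximum that drops to 500 at the switch."""
    t_max = T_MAX_INITIAL if iteration < SWITCH_ITER else T_MAX_ANNEALED
    return rng.randint(T_MIN, t_max)
```

A real threestudio configuration would express the same schedule declaratively; the helper above only makes the two-phase structure explicit.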
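Since the table contrasts NFSD with SDS throughout, a framework-free sketch of the two per-step gradients may help. This is our reading of the method, not the authors' implementation: `add_noise` uses a toy noise schedule, `denoiser` stands in for Stable Diffusion's ϵ-prediction network, and the guidance scale, small-timestep threshold `t_small`, and negative-prompt handling in `nfsd_grad` are assumptions.

```python
import numpy as np

def add_noise(z, eps, t, T=1000):
    """Toy cosine alpha-bar schedule; stands in for the real forward process."""
    a_bar = np.cos(0.5 * np.pi * t / T) ** 2
    return np.sqrt(a_bar) * z + np.sqrt(1.0 - a_bar) * eps

def sds_grad(denoiser, z, t, text_emb, null_emb, guidance=7.5, rng=None):
    """Score Distillation Sampling: the update is eps_hat - eps, so the
    randomly sampled noise eps appears directly in every gradient step."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(z.shape)
    z_t = add_noise(z, eps, t)
    eps_null = denoiser(z_t, t, null_emb)
    eps_cond = denoiser(z_t, t, text_emb)
    eps_hat = eps_null + guidance * (eps_cond - eps_null)  # classifier-free guidance
    return eps_hat - eps  # w(t) weighting omitted for brevity

def nfsd_grad(denoiser, z, t, text_emb, null_emb, neg_emb,
              guidance=7.5, t_small=200, rng=None):
    """Noise-free variant: drop the random eps term and keep only score
    directions (delta_D + s * delta_C in the paper's notation)."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal(z.shape)
    z_t = add_noise(z, eps, t)
    eps_null = denoiser(z_t, t, null_emb)
    delta_c = denoiser(z_t, t, text_emb) - eps_null  # condition direction
    if t >= t_small:
        delta_d = eps_null                           # domain direction
    else:
        delta_d = eps_null - denoiser(z_t, t, neg_emb)
    return delta_d + guidance * delta_c
```

With a denoiser that ignores its noisy input, the NFSD update is deterministic while the SDS update still carries the sampled noise, which is the intuition behind the "noise-free" naming.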