Denoising Diffusion via Image-Based Rendering

Authors: Titas Anciukevičius, Fabian Manhardt, Federico Tombari, Paul Henderson

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the model on several challenging datasets of real and synthetic images, and demonstrate superior results on generation, novel view synthesis and 3D reconstruction. From Section 4 (Experiments), Datasets: We evaluate our approach on three datasets: (i) real-world chairs, tables and sofas from MVImgNet (Yu et al., 2023b); (ii) real-world hydrants, apples, sandwiches and teddybears from CO3D (Reizenstein et al., 2021); (iii) the renderings of ShapeNet (Chang et al., 2015) cars from Anciukevičius et al. (2023).
Researcher Affiliation | Collaboration | Titas Anciukevičius (University of Edinburgh, Google), Fabian Manhardt (Google), Federico Tombari (Google, Technical University of Munich), Paul Henderson (University of Glasgow)
Pseudocode | No | The paper describes its methods and processes in descriptive text and uses figures (e.g., Figure 1) to illustrate concepts, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | https://anciukevicius.github.io/generative-image-based-rendering
Open Datasets | Yes | We evaluate our approach on three datasets: (i) real-world chairs, tables and sofas from MVImgNet (Yu et al., 2023b); (ii) real-world hydrants, apples, sandwiches and teddybears from CO3D (Reizenstein et al., 2021); (iii) the renderings of ShapeNet (Chang et al., 2015) cars from Anciukevičius et al. (2023).
Dataset Splits | Yes | We partition each dataset into training, validation, and test sets, following a 90-5-5% split based on lexicographic ordering of provided scene names. The validation set was used for model development and hyperparameter tuning, while the test set was reserved solely for final model evaluation to mitigate any overfitting risks.
Hardware Specification | No | The paper mentions 'GPU memory consumption' and 'To reduce GPU memory consumption during training', but it does not provide specific hardware details such as GPU models (e.g., NVIDIA A100), CPU types, or detailed computer specifications used for running experiments.
Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba, 2015)' as an optimizer but does not specify version numbers for any key software components or libraries (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | We employed the Adam (Kingma & Ba, 2015) optimizer with a learning rate of 8×10⁻⁵ and beta values of β1 = 0.9 and β2 = 0.999 for model training. Norm-based gradient clipping was applied with a value of 1.0. We used a batch size of 8. For evaluation, we used an Exponential Moving Average (EMA) model with a decay factor of ema_decay = 0.995. To reduce GPU memory consumption during training, we render 12% or 5% of pixels depending on resolution. In our volumetric rendering process, each pixel was rendered by sampling 64 depths along the ray with stratified sampling, followed by 64 importance samples. We adopt a sigmoid noise schedule (Jabri et al., 2022) with 1000 timesteps for our denoising diffusion process. To generate samples, we use 250 DDIM steps (Song et al., 2020) for unconditional generation and 50 DDIM steps for conditional generation (single-image and sparse-view reconstruction).
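The 90-5-5% split by lexicographic ordering of scene names (Dataset Splits row) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name `split_scenes` and the truncation-based rounding are assumptions.

```python
def split_scenes(scene_names, train_frac=0.90, val_frac=0.05):
    """Partition scene names into train/val/test by lexicographic order.

    Hypothetical sketch of the paper's 90-5-5% split; exact rounding
    behaviour at the boundaries is an assumption.
    """
    ordered = sorted(scene_names)          # lexicographic ordering
    n = len(ordered)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]       # remainder goes to test
    return train, val, test

train, val, test = split_scenes([f"scene_{i:04d}" for i in range(100)])
print(len(train), len(val), len(test))  # 90 5 5
```

Because the split is deterministic given the scene names, the same partition is reproducible across runs without storing index files.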
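The sigmoid noise schedule cited in the Experiment Setup row (Jabri et al., 2022) is commonly implemented as a shifted, normalized sigmoid over normalized time t ∈ [0, 1]. The sketch below follows that common form; the defaults start = -3, end = 3, tau = 1.0 are typical values from public implementations, not parameters confirmed by this paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_gamma(t, start=-3.0, end=3.0, tau=1.0):
    """Signal level gamma(t) in [0, 1] for a sigmoid noise schedule.

    gamma(0) = 1 (clean data), gamma(1) = 0 (pure noise). The start/end/tau
    defaults are assumptions based on common implementations.
    """
    v_start = sigmoid(start / tau)
    v_end = sigmoid(end / tau)
    v_t = sigmoid((t * (end - start) + start) / tau)
    return (v_end - v_t) / (v_end - v_start)

# Discretize into the paper's 1000 timesteps:
gammas = [sigmoid_gamma(k / 999) for k in range(1000)]
```

A sampler (e.g., DDIM with 250 or 50 steps, as the paper reports) would then evaluate this schedule only at a subset of the 1000 timesteps.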
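The volumetric rendering step samples 64 depths per ray with stratified sampling before drawing 64 importance samples. Stratified sampling (one uniform draw per equal-width depth bin, NeRF-style) can be sketched as below; the near/far bounds and function name are illustrative, and the importance-sampling second pass is omitted.

```python
import random

def stratified_depths(near, far, n_samples=64):
    """Draw one depth uniformly at random inside each of n_samples
    equal-width bins spanning [near, far] (stratified sampling).

    Minimal sketch of the coarse pass only; the paper additionally draws
    64 importance samples guided by the coarse densities.
    """
    bin_size = (far - near) / n_samples
    return [near + (i + random.random()) * bin_size for i in range(n_samples)]

depths = stratified_depths(near=0.1, far=4.0)
```

Compared with purely uniform sampling, stratification guarantees every depth interval is covered on every ray, which reduces variance in the rendered color.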