Novel View Synthesis with Diffusion Models
Authors: Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We benchmark 3DiMs on the SRN ShapeNet dataset (Sitzmann et al., 2019) to allow comparisons with prior work on novel view synthesis from a single image. |
| Researcher Affiliation | Industry | Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi (Google Research, Brain) |
| Pseudocode | Yes | In order to maximize the reproducibility of our results, we provide code in JAX (Bradbury et al., 2018) for our proposed X-UNet neural architecture from Section 2.3. |
| Open Source Code | Yes | In order to maximize the reproducibility of our results, we provide code in JAX (Bradbury et al., 2018) for our proposed X-UNet neural architecture from Section 2.3. |
| Open Datasets | Yes | We benchmark 3DiMs on the SRN ShapeNet dataset (Sitzmann et al., 2019) to allow comparisons with prior work on novel view synthesis from a single image. |
| Dataset Splits | No | No explicit training/validation/test dataset splits with percentages or sample counts are provided for the main model training. |
| Hardware Specification | Yes | we could not fit ch=512 in TPUv4 memory without model parallelism |
| Software Dependencies | No | In order to maximize the reproducibility of our results, we provide code in JAX (Bradbury et al., 2018) for our proposed X-UNet neural architecture from Section 2.3. |
| Experiment Setup | Yes | For our neural architecture, our main experiments use ch=256 (~471M params), and we also experiment with ch=448 (~1.3B params) in Section 4. One of our early findings that we kept throughout all experiments in the paper is that ch_mult=(1, 2, 2, 4)... We use a learning rate with peak value 0.0001... We use a global batch size of 128. |
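The Experiment Setup row above can be collected into a small configuration sketch. This is a hypothetical illustration only: the key names (`ch`, `ch_mult`, etc. follow the paper's notation) are assumptions, not the authors' released JAX code.

```python
# Hypothetical config dict summarizing the hyperparameters the paper reports.
# Key names mirror the paper's notation; they are not from the official code.
xunet_config = {
    "ch": 256,                 # base channel width (~471M params; ch=448 gives ~1.3B)
    "ch_mult": (1, 2, 2, 4),   # per-resolution channel multipliers
    "peak_learning_rate": 1e-4,
    "global_batch_size": 128,
}

# Rough parameter count scales with the base width, e.g. the larger model:
xunet_large = dict(xunet_config, ch=448)
```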