Is Attention All That NeRF Needs?
Authors: Mukund Varma T, Peihao Wang, Xuxi Chen, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to compare GNT against state-of-the-art methods for novel view synthesis. Our experiment settings include both per-scene optimization and cross-scene generalization. |
| Researcher Affiliation | Collaboration | ¹Indian Institute of Technology Madras, ²University of Texas at Austin, ³Google Research |
| Pseudocode | Yes | We provide a simple and efficient PyTorch pseudo-code to implement the attention operations in the view and ray transformer blocks in Alg. 1 and 2, respectively. We do not indicate the feedforward and layer normalization operations for simplicity. As seen in Alg. 3, we reuse the epipolar view features Xj to derive keys and values across view transformer blocks. (An illustrative sketch of these attention patterns follows the table.) |
| Open Source Code | No | Please refer to our project page for video results: https://vita-group.github.io/GNT/. This link is given only for video results; the paper contains no explicit statement or other link for the source code of the methodology. |
| Open Datasets | Yes | Local Light Field Fusion (LLFF) dataset: Introduced by Mildenhall et al. (2019), it consists of 8 forward-facing captures of real-world scenes using a smartphone. |
| Dataset Splits | Yes | In these experiments, we use the same resolution and train/test splits as NeRF (Mildenhall et al., 2020). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and provides 'PyTorch-like Pseudocode', but does not specify version numbers for any software dependencies like PyTorch or CUDA. |
| Experiment Setup | Yes | The base learning rates for the feature extraction network and GNT are 10⁻³ and 5×10⁻⁴ respectively, which decay exponentially over training steps. For all our experiments, we train for 250,000 steps with 4096 rays sampled in each iteration. Unlike most NeRF methods, we do not use separate coarse and fine networks, and therefore, to bring GNT to a comparable experimental setup, we sample 192 coarse points per ray across all experiments (unless otherwise specified). (A configuration sketch follows the table.) |
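Since the paper provides only PyTorch-like pseudocode for the view and ray transformer attention (Alg. 1–3), below is a minimal runnable sketch of the two attention patterns, not the authors' implementation. All module names, feature dimensions, and head counts are illustrative assumptions, and the feedforward and layer-normalization sub-layers are omitted, as in the paper's pseudocode.

```python
import torch
import torch.nn as nn

class ViewTransformerBlock(nn.Module):
    """Cross-attention from one sampled ray point to its epipolar features.

    Hypothetical analogue of the paper's Alg. 1/3: the point feature acts as
    the query, while the epipolar view features X_j supply the keys/values,
    which can be computed once and reused across view transformer blocks.
    """
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, point_feat: torch.Tensor, view_feats: torch.Tensor) -> torch.Tensor:
        # point_feat: (B, 1, D) -- one query per sampled point
        # view_feats: (B, N, D) -- epipolar features from N source views
        out, _ = self.attn(point_feat, view_feats, view_feats)
        return out  # (B, 1, D)

class RayTransformerBlock(nn.Module):
    """Self-attention across the M sampled points of a ray (Alg. 2 analogue)."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, ray_feats: torch.Tensor) -> torch.Tensor:
        # ray_feats: (B, M, D) -- features of M points along each ray;
        # attention weights along the ray play the role of volume rendering
        out, _ = self.attn(ray_feats, ray_feats, ray_feats)
        return out  # (B, M, D)

# Toy usage: 2 rays, 8 source views, 192 points per ray, feature dim 64.
vt, rt = ViewTransformerBlock(64), RayTransformerBlock(64)
point_out = vt(torch.randn(2, 1, 64), torch.randn(2, 8, 64))  # (2, 1, 64)
ray_out = rt(torch.randn(2, 192, 64))                         # (2, 192, 64)
```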
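The experiment-setup row can likewise be read as a training-loop configuration. The sketch below only wires up the reported values (base learning rates of 10⁻³ and 5×10⁻⁴, exponential decay, 250,000 steps, 4096 rays per iteration, 192 points per ray); the placeholder networks, the decay rate `gamma`, and the dummy loss are assumptions, since the paper does not specify them.

```python
import torch
import torch.nn as nn

# Placeholder sub-networks standing in for the real feature extractor and GNT.
feature_net = nn.Linear(32, 32)
gnt = nn.Linear(32, 32)

# Two parameter groups with the reported base learning rates.
optimizer = torch.optim.Adam([
    {"params": feature_net.parameters(), "lr": 1e-3},   # feature extraction network
    {"params": gnt.parameters(), "lr": 5e-4},           # GNT
])
# The paper states exponential decay but not its rate; gamma here is illustrative.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999995)

NUM_STEPS, RAYS_PER_ITER, POINTS_PER_RAY = 250_000, 4096, 192
for step in range(NUM_STEPS):
    # In a real run: sample RAYS_PER_ITER rays with POINTS_PER_RAY points each,
    # render colors, and compute the photometric loss. Dummy loss shown instead.
    optimizer.zero_grad()
    loss = (gnt(feature_net(torch.randn(8, 32))) ** 2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()  # exponential decay over training steps
```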