ResFields: Residual Neural Fields for Spatiotemporal Signals
Authors: Marko Mihajlovic, Sergey Prokudin, Marc Pollefeys, Siyu Tang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a comprehensive analysis of the properties of ResFields and propose a matrix factorization technique to reduce the number of trainable parameters and enhance generalization capabilities. Importantly, our formulation seamlessly integrates with existing MLP-based neural fields and consistently improves results across various challenging tasks: 2D video approximation, dynamic shape modeling via temporal SDFs, and dynamic NeRF reconstruction. |
| Researcher Affiliation | Collaboration | Marko Mihajlovic1, Sergey Prokudin1,3, Marc Pollefeys1,2, Siyu Tang1 ETH Zurich1; Microsoft2; ROCS, University Hospital Balgrist, University of Zurich3 |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code, data, and pre-trained models are released at https://github.com/markomih/ResFields |
| Open Datasets | Yes | We use four sequences from the Owlii (Xu et al., 2017) dataset to evaluate the methods. Compared to fully synthetic sequences previously utilized for the task (Pumarola et al., 2021), the dynamic Owlii sequences exhibit more rapid and complex high-frequency motions, making it a harder task for MLP-based methods. At the same time, the presence of ground truth 3D scans allows us to evaluate both geometry and appearance reconstruction quality, as compared to the sequences with only RGB data available (Li et al., 2022; Shao et al., 2023). We render 400 RGB training images from four static camera views from 100 frames/time intervals and 100 test images from a rotating camera from 100 frames. |
| Dataset Splits | Yes | For this, we leave out 10% of randomly sampled pixels for validation and fit the video signal on the remaining ones. |
| Hardware Specification | Yes | All the reported runtime in this paper is measured on an NVIDIA RTX 3090 GPU card. |
| Software Dependencies | Yes | All models are trained with the Adam optimizer (Kingma & Ba, 2015) with default parameters defined by the PyTorch framework (Paszke et al., 2019). |
| Experiment Setup | Yes | All models are trained with the Adam optimizer (Kingma & Ba, 2015) with default parameters defined by the PyTorch framework (Paszke et al., 2019). We observe stable convergence with the learning rate of 5 · 10⁻⁴ and gradual cosine annealing (Loshchilov & Hutter, 2016) until the minimum learning rate of 5 · 10⁻⁵ for the experiments on dynamic neural radiance fields (Sec. 4.3). For other experiments (sections 4.1 and 4.2), we use the learning rate of 5 · 10⁻⁵ and cosine annealing until 5 · 10⁻⁶. All methods are trained respectively for 10⁵, 2 · 10⁵, 4 · 10⁵, and 6 · 10⁵ iterations on the 2D video approximation task (Sec. 4.1), temporal SDF reconstruction (Sec. 4.2), and dynamic volumetric reconstruction (Sec. 4.3) on Owlii (Xu et al., 2017) and our captured sequences. |
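The Dataset Splits row states that 10% of randomly sampled pixels are held out for validation while the remainder are used for fitting. A minimal sketch of such a split (the function name, seed, and image size here are illustrative assumptions, not taken from the paper or its released code):

```python
import numpy as np

def split_pixels(num_pixels: int, val_fraction: float = 0.1, seed: int = 0):
    """Randomly hold out a fraction of pixel indices for validation;
    the rest are used to fit the video signal."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_pixels)
    n_val = int(num_pixels * val_fraction)
    return perm[n_val:], perm[:n_val]  # (train indices, validation indices)

# e.g. for a 512x512 frame
train_idx, val_idx = split_pixels(512 * 512)
```

Fixing the seed keeps the train/validation partition identical across runs, which matters when comparing methods on the same held-out pixels.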
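The Experiment Setup row describes a learning rate that starts at 5 · 10⁻⁴ and is cosine-annealed down to 5 · 10⁻⁵ (Loshchilov & Hutter, 2016). Assuming the standard single-cycle cosine schedule without warm restarts (equivalent to PyTorch's `CosineAnnealingLR` with `eta_min`), the per-step learning rate can be sketched as:

```python
import math

def cosine_annealed_lr(step: int, total_steps: int,
                       lr_max: float = 5e-4, lr_min: float = 5e-5) -> float:
    """Single-cycle cosine annealing from lr_max at step 0 to lr_min at
    step == total_steps (no warm restarts)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))

# Dynamic NeRF setting from the table: lr 5e-4 annealed to 5e-5 over 4e5 iterations.
lr_start = cosine_annealed_lr(0, 400_000)        # 5e-4
lr_end = cosine_annealed_lr(400_000, 400_000)    # 5e-5
```

For the tasks in sections 4.1 and 4.2, the same schedule would be used with `lr_max=5e-5` and `lr_min=5e-6`.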