ResFields: Residual Neural Fields for Spatiotemporal Signals

Authors: Marko Mihajlovic, Sergey Prokudin, Marc Pollefeys, Siyu Tang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type: Experimental — "We conduct a comprehensive analysis of the properties of ResFields and propose a matrix factorization technique to reduce the number of trainable parameters and enhance generalization capabilities. Importantly, our formulation seamlessly integrates with existing MLP-based neural fields and consistently improves results across various challenging tasks: 2D video approximation, dynamic shape modeling via temporal SDFs, and dynamic NeRF reconstruction."
Researcher Affiliation: Collaboration — Marko Mihajlovic1, Sergey Prokudin1,3, Marc Pollefeys1,2, Siyu Tang1. ETH Zurich1; Microsoft2; ROCS, University Hospital Balgrist, University of Zurich3.
Pseudocode: No — The paper does not contain any pseudocode or algorithm blocks.
Open Source Code: Yes — Code, data, and pre-trained models are released at https://github.com/markomih/ResFields
Open Datasets: Yes — "We use four sequences from the Owlii (Xu et al., 2017) dataset to evaluate the methods. Compared to fully synthetic sequences previously utilized for the task (Pumarola et al., 2021), the dynamic Owlii sequences exhibit more rapid and complex high-frequency motions, making it a harder task for MLP-based methods. At the same time, the presence of ground truth 3D scans allows us to evaluate both geometry and appearance reconstruction quality, as compared to the sequences with only RGB data available (Li et al., 2022; Shao et al., 2023). We render 400 RGB training images from four static camera views from 100 frames/time intervals and 100 test images from a rotating camera from 100 frames."
Dataset Splits: Yes — "For this, we leave out 10% of randomly sampled pixels for validation and fit the video signal on the remaining ones."
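The per-pixel validation split described above can be sketched as follows. This is a minimal illustration, not the authors' code; the image resolution is a placeholder assumption.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical frame resolution; the paper fits a 2D video signal pixel-wise.
H, W = 128, 128
n_pixels = H * W

# Leave out 10% of randomly sampled pixels for validation and
# fit the signal on the remaining 90%, as described in the paper.
perm = rng.permutation(n_pixels)
n_val = n_pixels // 10
val_idx, train_idx = perm[:n_val], perm[n_val:]

# Boolean mask marking which flattened pixels are used for fitting.
train_mask = np.zeros(n_pixels, dtype=bool)
train_mask[train_idx] = True
```

The same random indices are typically reused across all frames so that validation pixels are never seen during fitting.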
Hardware Specification: Yes — "All the reported runtimes in this paper are measured on an NVIDIA RTX 3090 GPU card."
Software Dependencies: Yes — "All models are trained with the Adam optimizer (Kingma & Ba, 2015) with default parameters defined by the PyTorch framework (Paszke et al., 2019)."
Experiment Setup: Yes — "All models are trained with the Adam optimizer (Kingma & Ba, 2015) with default parameters defined by the PyTorch framework (Paszke et al., 2019). We observe stable convergence with the learning rate of 5·10^-4 and gradual cosine annealing (Loshchilov & Hutter, 2016) until the minimum learning rate of 5·10^-5 for the experiments on dynamic neural radiance fields (Sec. 4.3). For other experiments (sections 4.1 and 4.2), we use the learning rate of 5·10^-5 and cosine annealing until 5·10^-6. All methods are trained respectively for 10^5, 2·10^5, 4·10^5, and 6·10^5 iterations on the 2D video approximation task (Sec. 4.1), temporal SDF reconstruction (Sec. 4.2), and dynamic volumetric reconstruction (Sec. 4.3) on Owlii (Xu et al., 2017) and our captured sequences."
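The reported optimizer setup maps directly onto standard PyTorch components. The sketch below mirrors the dynamic-NeRF configuration (Adam with default parameters, learning rate 5·10^-4 cosine-annealed to 5·10^-5); the tiny MLP, loss, and iteration count are placeholders for illustration, not the paper's model.

```python
import torch

# Placeholder model; the paper uses MLP-based neural fields, not this toy net.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 4),
)

# Adam with PyTorch-default betas/eps, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)

# Cosine annealing from 5e-4 down to 5e-5 over the training run.
max_iters = 100  # paper: 1e5 to 6e5 iterations depending on the task
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=max_iters, eta_min=5e-5
)

for _ in range(max_iters):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 4)).pow(2).mean()  # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()
```

For the video-approximation and temporal-SDF experiments (Secs. 4.1 and 4.2), the same schedule would instead run from 5e-5 to 5e-6.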