Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Dynamic 3D Gaussian Fields for Urban Areas
Authors: Tobias Fischer, Jonas Kulhanek, Samuel Rota Bulò, Lorenzo Porzi, Marc Pollefeys, Peter Kontschieder
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, we surpass the state-of-the-art by over 3 dB in PSNR and more than 200× in rendering speed. |
| Researcher Affiliation | Collaboration | Tobias Fischer¹, Jonas Kulhanek¹·³, Samuel Rota Bulò², Lorenzo Porzi², Marc Pollefeys¹, Peter Kontschieder² (¹ETH Zürich, ²Meta Reality Labs, ³CTU Prague) |
| Pseudocode | No | The paper describes the method and its components but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We released the full source code for reproducing our experiments. All datasets used are publicly available. |
| Open Datasets | Yes | We utilize the recently proposed NVS benchmark [17] of Argoverse 2 [81]... We use the established Waymo Open [23], KITTI [21] and VKITTI2 [22] benchmarks... |
| Dataset Splits | Yes | For Argoverse 2, we follow the experimental setup of [17]. In particular, we use the full resolution 1550 × 2080 images for training and evaluation and use all cameras of every 10th temporal frame as the testing split. ... For KITTI and VKITTI, we follow the established benchmark used in [16, 83, 17, 73]. We use the full resolution 375 × 1242 images for training and evaluation and evaluate at varying training set fractions. |
| Hardware Specification | Yes | In our multi-sequence experiments in Table 1 and Table 5, we train our model on 8 NVIDIA A100 40GB GPUs for 125,000 steps, taking approximately 2.5 days. In our single-sequence experiments, we train our model on a single RTX 4090 GPU for several hours. |
| Software Dependencies | No | We implement our method in PyTorch [80] with tools from nerfstudio [85]. |
| Experiment Setup | Yes | We use λ_rgb := 0.8, λ_ssim := 0.2 and λ_depth := 0.05. We use the Adam optimizer [86] with β1 := 0.9, β2 := 0.999. We use separate learning rates for each 3D Gaussian attribute, the neural fields, and the sequence latent codes ω_s^t. In particular, for means µ, we use an exponential decay learning rate schedule from 1.6 × 10⁻⁵ to 1.6 × 10⁻⁶, for opacity α, we use a learning rate of 5 × 10⁻², for scales a and rotations q, we use a learning rate of 10⁻³. The neural fields are trained with an exponential decay learning rate schedule from 2.5 × 10⁻³ to 2.5 × 10⁻⁴. The sequence latent vectors ω_s^t are optimized with a learning rate of 5 × 10⁻⁴. We optimize camera and object pose parameters with an exponential decay learning rate schedule from 10⁻⁵ to 10⁻⁶. To counter pose drift, we apply weight decay with a factor 10⁻². |
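The experiment-setup row above lists per-attribute learning rates, several of which follow an exponential decay schedule over the 125,000-step multi-sequence training run. A minimal sketch of how such a schedule is commonly implemented (log-linear interpolation between the initial and final rate); the function name and the grouping of parameters are illustrative assumptions, not taken from the released code:

```python
import math

def exp_decay_lr(step: int, max_steps: int, lr_init: float, lr_final: float) -> float:
    """Log-linear (exponential) interpolation from lr_init at step 0
    to lr_final at max_steps, a common schedule in Gaussian-splatting codebases."""
    t = min(step / max_steps, 1.0)
    return math.exp((1.0 - t) * math.log(lr_init) + t * math.log(lr_final))

# Hypothetical grouping of the rates quoted in the table; names are illustrative.
decayed = {
    "means":         (1.6e-5, 1.6e-6),
    "neural_fields": (2.5e-3, 2.5e-4),
    "poses":         (1e-5, 1e-6),
}
fixed = {"opacity": 5e-2, "scales_rotations": 1e-3, "sequence_latents": 5e-4}

MAX_STEPS = 125_000  # multi-sequence training length from the hardware row
lr_means_mid = exp_decay_lr(MAX_STEPS // 2, MAX_STEPS, *decayed["means"])
```

At the halfway point the schedule sits at the geometric mean of the two endpoints, so the rate decays by a constant factor per step rather than linearly.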