MovingParts: Motion-based 3D Part Discovery in Dynamic Radiance Field

Authors: Kaizhi Yang, Xiaoshuai Zhang, Zhiao Huang, Xuejin Chen, Zexiang Xu, Hao Su

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, our method can achieve fast and high-quality dynamic scene reconstruction from even a single moving camera, and the induced part-based representation allows direct applications of part tracking, animation, 3D scene editing, etc. (Sec. 5, Experiments and Results) Our method not only enables high-quality dynamic scene reconstruction but also allows for the discovery of reasonable rigid parts. In this section, we first evaluate the reconstruction and part discovery performance of our method on the D-NeRF 360 synthetic dataset. Then, we construct a synthetic dataset with ground-truth motion masks to quantitatively evaluate our motion grouping results. Finally, we provide direct applications for structural scene modeling and editing.
Researcher Affiliation | Collaboration | Kaizhi Yang (1), Xiaoshuai Zhang (2), Zhiao Huang (2), Xuejin Chen (1), Zexiang Xu (3), Hao Su (2); (1) University of Science and Technology of China, (2) University of California, San Diego, (3) Adobe Research
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | We adopt the 360 Synthetic dataset provided by D-NeRF (Pumarola et al. (2020)) to evaluate our method quantitatively and qualitatively. We created a synthetic dataset with ground-truth image-segmentation pairs using the Kubric toolkit (Greff et al. (2022)). Each created scene contains 1 to 5 realistic real-world objects from the GSO dataset (Downs et al. (2022)). (See the Kubric sketch after the table.)
Dataset Splits | No | The paper mentions training on images and evaluating performance, but it does not specify explicit training/validation/test dataset splits or mention a dedicated validation set for model tuning.
Hardware Specification | Yes | All the experiments were conducted on a single NVIDIA RTX 3090 GPU.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify version numbers for any software dependencies such as Python, PyTorch/TensorFlow, or CUDA.
Experiment Setup | Yes | We use 50 × 50 × 50 voxels for the Eulerian and Lagrangian volumes and a 160 × 160 × 160 voxel grid for the canonical volume. We use the Adam optimizer for a total of 20k iterations, sampling 4096 rays from a randomly sampled image in each iteration. We set the learning rate to 0.08 for the Eulerian and Lagrangian volumes, 0.01 for the canonical volume, 6 × 10^-4 for E and D, and 8 × 10^-4 for the other networks. We set w_per-pt and w_entropy to 0.01 and 0.001. To encourage reciprocity of motion between the Eulerian and Lagrangian views, we use the cycle consistency loss L_cycle with a weight parameter w_cycle of 0.1. Additionally, we apply the total variation loss to smooth the features of the motion volumes V_E and V_L. We set w_tv = 0.01 and w_E = 1 for the D-NeRF synthetic dataset. For the motion grouping evaluation, due to the more complex object geometry and motion patterns, we decrease these two weights to w_tv = 0.001 and w_E = 0.1. (A hedged sketch of this setup follows the table.)
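The experiment-setup row above pins down grid resolutions, per-group learning rates, and loss weights but no code. The following is a minimal sketch of that optimization setup, assuming PyTorch; the module names, toy MLPs, and feature channel count (8) are hypothetical stand-ins of ours, while the grid sizes, learning rates, and loss weights are the values quoted from the paper. It is a sketch under those assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Feature volumes; the channel count of 8 is an assumption.
V_E = nn.Parameter(torch.zeros(8, 50, 50, 50))         # Eulerian motion volume
V_L = nn.Parameter(torch.zeros(8, 50, 50, 50))         # Lagrangian motion volume
V_canon = nn.Parameter(torch.zeros(8, 160, 160, 160))  # canonical volume
E_net, D_net = nn.Linear(8, 8), nn.Linear(8, 8)        # stand-ins for E and D
other_nets = nn.Linear(8, 4)                           # stand-in for the remaining networks

def total_variation_loss(vol: torch.Tensor) -> torch.Tensor:
    """Mean squared difference between neighboring voxels along each
    spatial axis of a (C, X, Y, Z) feature volume."""
    return ((vol[:, 1:] - vol[:, :-1]).pow(2).mean()
            + (vol[:, :, 1:] - vol[:, :, :-1]).pow(2).mean()
            + (vol[:, :, :, 1:] - vol[:, :, :, :-1]).pow(2).mean())

def cycle_consistency_loss(pts, e2l, l2e):
    """Round-trip penalty: mapping points through the Eulerian-to-Lagrangian
    view and back again should return the original points."""
    return F.mse_loss(l2e(e2l(pts)), pts)

# Per-group learning rates as quoted in the paper.
optimizer = torch.optim.Adam([
    {"params": [V_E, V_L], "lr": 0.08},
    {"params": [V_canon], "lr": 0.01},
    {"params": list(E_net.parameters()) + list(D_net.parameters()), "lr": 6e-4},
    {"params": other_nets.parameters(), "lr": 8e-4},
])

# Loss weights for the D-NeRF synthetic dataset; the paper lowers
# w_tv to 0.001 and w_E to 0.1 for the motion grouping evaluation.
w_per_pt, w_entropy, w_cycle, w_tv, w_E = 0.01, 0.001, 0.1, 0.01, 1.0

# Example regularizer term, evaluated once per iteration (the paper runs
# 20k iterations with 4096 rays sampled per iteration).
reg = w_tv * (total_variation_loss(V_E) + total_variation_loss(V_L))

In a full training loop, reg would be added to the rendering loss together with the w_per_pt, w_entropy, and w_cycle-weighted terms before each optimizer.step(); the paper does not describe how E, D, and the other networks are wired, so those pieces are left abstract here.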
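The Open Datasets row notes that the motion-grouping benchmark was built with the Kubric toolkit using GSO objects. Below is a minimal sketch of how such image-segmentation pairs can be produced, assuming Kubric's publicly documented API (kb.Scene, AssetSource.from_manifest, render_still); the scene layout, resolution, and "ASSET_ID" placeholder are our own illustrations and are not taken from the paper.

import kubric as kb
from kubric.renderer.blender import Blender as KubricRenderer

# Static scene scaffolding: floor, light, and camera.
scene = kb.Scene(resolution=(256, 256))
renderer = KubricRenderer(scene)
scene += kb.Cube(name="floor", scale=(10, 10, 0.1), position=(0, 0, -0.1))
scene += kb.DirectionalLight(name="sun", position=(-1, -0.5, 3),
                             look_at=(0, 0, 0), intensity=1.5)
scene += kb.PerspectiveCamera(name="camera", position=(3, -1, 4),
                              look_at=(0, 0, 1))

# Add a GSO object; ASSET_ID is a placeholder for a real GSO asset name,
# and the paper uses 1 to 5 such objects per scene.
gso = kb.AssetSource.from_manifest("gs://kubric-public/assets/GSO/GSO.json")
scene += gso.create(asset_id="ASSET_ID", position=(0, 0, 1))

# Render RGB plus the per-object segmentation that serves as a
# ground-truth motion mask.
frame = renderer.render_still()
kb.write_png(frame["rgba"], "rgba.png")
kb.write_palette_png(frame["segmentation"], "segmentation.png")

The paper's dynamic scenes would additionally animate object poses over frames; Kubric supports keyframed motion, but since the report quotes no animation details, this sketch stays with a single still frame.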