Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis

Authors: Diwen Wan, Yuxiang Wang, Ruijie Lu, Gang Zeng

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the effectiveness and efficiency of our method in obtaining reposable 3D objects. Not only can our approach achieve excellent visual fidelity, but it also allows for the real-time rendering of high-resolution images.
Researcher Affiliation | Academia | Diwen Wan, Yuxiang Wang, Ruijie Lu, Gang Zeng (National Key Laboratory of General Artificial Intelligence, School of IST, Peking University, China)
Pseudocode | No | The paper describes its method in detailed paragraphs and mathematical equations across several sections (e.g., 3.2, 3.3, 3.4, 3.5, Appendix A) but does not include formal pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | We released the source code on GitHub, i.e., https://github.com/dnvtmf/SK_GS.
Open Datasets | Yes | To ensure fair comparison with previous work, we choose the same datasets and configurations as AP-NeRF [4]. Specifically, we choose three multi-view video datasets. First, the D-NeRF [14] dataset... The second dataset, Robots [3], contains... The third dataset, ZJU-MoCap [34], is commonly used...
Dataset Splits | No | The paper mentions 'evaluation' sets and training iterations, but it does not specify explicit validation dataset splits (e.g., percentages or counts for a separate validation set) or a clear methodology for hyperparameter tuning using such a set.
Hardware Specification | Yes | We conducted all experiments on a single NVIDIA Tesla V100 (32GB).
Software Dependencies | No | We implement our framework using PyTorch. The paper does not specify version numbers for PyTorch or any other software components used.
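Because no version numbers are reported, one lightweight way to document them when re-running the released code is to log the environment directly. This is a minimal sketch using standard PyTorch attributes (not something the paper provides):

```python
import torch

# Minimal environment report: records the versions the paper does not specify.
print("PyTorch:", torch.__version__)
print("CUDA (build):", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU only")
```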
Experiment Setup | Yes | The number of superpoints is initialized as 512. For both deformable fields Φ and Ψ, we adopt the architecture of NeRF [1], i.e., an 8-layer MLP in which each hidden layer is a 256-dimensional fully connected layer with ReLU activation. We also employ positional encoding for the input coordinates and time. For optimization, we employ the Adam optimizer and use different learning rate decay schedules for each component: the learning rate for the 3D Gaussians is the same as in 3D-GS, while the learning rate of the other components undergoes exponential decay from 1e-3 to 1e-5. ... In total, our training loss for the dynamic stage is L = λ_0 L_rgb + λ_1 L_joint + λ_2 L_arap + λ_3 L_smooth + λ_4 L_sparse (19), where λ = {1, 1, 1e-3, 0.1, 0.1} in our experiments. ... We densify and prune superpoints every 1000 steps between iterations 20k and 30k, with hyperparameters δ_grad = 0.0002 and δ_prune = 0.001. We merge superpoints every 1000 steps between iterations 30k and 40k, with threshold δ_merge = 0.0005.
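For readers cross-checking these hyperparameters against the released code, the following is a minimal PyTorch sketch (not the authors' implementation) of the reported pieces: a NeRF-style 8-layer, 256-unit ReLU MLP over positionally encoded coordinates and time, the exponential learning-rate decay from 1e-3 to 1e-5, and the Eq. (19) weighting with λ = {1, 1, 1e-3, 0.1, 0.1}. All function and class names, the number of encoding frequencies, and the loss-term placeholders are assumptions, since the paper does not spell them out.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=10):
    """Map each input dimension to [sin(2^k * pi * x), cos(2^k * pi * x)], k = 0..num_freqs-1.
    num_freqs is an assumed value; the paper only states that positional encoding is used."""
    freqs = (2.0 ** torch.arange(num_freqs, device=x.device)) * torch.pi
    angles = x.unsqueeze(-1) * freqs                      # (..., dim, num_freqs)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(-2)

class DeformationMLP(nn.Module):
    """NeRF-style field: 8 hidden layers, 256 units each, ReLU activations."""
    def __init__(self, in_dim, out_dim, hidden=256, depth=8):
        super().__init__()
        layers, d = [], in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, hidden), nn.ReLU(inplace=True)]
            d = hidden
        layers.append(nn.Linear(d, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, xyz, t):
        # Positionally encode 3D coordinates and scalar time, then concatenate.
        feat = torch.cat([positional_encoding(xyz), positional_encoding(t)], dim=-1)
        return self.net(feat)

def lr_at(step, total_steps, lr_init=1e-3, lr_final=1e-5):
    """Exponential decay of the non-Gaussian learning rate from 1e-3 to 1e-5."""
    return lr_init * (lr_final / lr_init) ** (step / total_steps)

# Eq. (19): L = λ_0 L_rgb + λ_1 L_joint + λ_2 L_arap + λ_3 L_smooth + λ_4 L_sparse
LAMBDAS = (1.0, 1.0, 1e-3, 0.1, 0.1)

def dynamic_stage_loss(l_rgb, l_joint, l_arap, l_smooth, l_sparse):
    """Weighted sum of the dynamic-stage loss terms (terms themselves are placeholders)."""
    terms = (l_rgb, l_joint, l_arap, l_smooth, l_sparse)
    return sum(w * t for w, t in zip(LAMBDAS, terms))

# Example usage: with num_freqs=10, xyz (3 dims) and t (1 dim) encode to 60 + 20 = 80 features.
# field = DeformationMLP(in_dim=80, out_dim=3)
```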