Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos
Authors: Shi Chen, Erik Sandström, Sandro Lombardi, Siyuan Li, Martin R. Oswald
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our camera tracking on the Bonn RGB-D Dynamic Dataset [47] and the TUM RGB-D Dataset [62] (dynamic scenes). Since existing works report tracking results on different sets of sequences, we select four mostly used sequences from each dataset to evaluate our method. For rendering, we report novel view synthesis (NVS) results both qualitatively and quantitatively on the iPhone dataset [12]. To align with Shape of Motion [68], we evaluate our method and all baselines on the 5 sequences used in [68] with the 2x downsampled image resolution. Metrics. For tracking, we evaluate ATE RMSE [cm] [63] after aligning the estimated camera trajectory with the ground truth via Umeyama alignment [67]. For NVS, we report PSNR, SSIM and LPIPS evaluated within the covisibility masks provided by [12] and averaged over all novel views. |
| Researcher Affiliation | Collaboration | Shi Chen (ETH Zürich), Erik Sandström (Google), Sandro Lombardi (Independent Researcher), Siyuan Li (ETH Zürich), Martin R. Oswald (University of Amsterdam) |
| Pseudocode | No | The paper describes methods and processes in paragraph text, and mentions 'Detailed algorithms are in the supplemental material.' in Section 3.1, but no structured pseudocode or algorithm blocks are present in the main paper. |
| Open Source Code | No | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We wished to have provided the code at submission time, but to improve readability of the code, we choose to release this after submission time, but as soon as possible. |
| Open Datasets | Yes | Datasets. We evaluate our camera tracking on the Bonn RGB-D Dynamic Dataset [47] and the TUM RGB-D Dataset [62] (dynamic scenes). For rendering, we report novel view synthesis (NVS) results both qualitatively and quantitatively on the iPhone dataset [12]. |
| Dataset Splits | No | The paper states: 'Since existing works report tracking results on different sets of sequences, we select four mostly used sequences from each dataset to evaluate our method. For rendering, we report novel view synthesis (NVS) results both qualitatively and quantitatively on the iPhone dataset [12]. To align with Shape of Motion [68], we evaluate our method and all baselines on the 5 sequences used in [68] with the 2x downsampled image resolution.' This describes which sequences were selected for evaluation and defers to external work [68] for the evaluation set, but the paper does not give training/validation/test split percentages, sample counts, or a splitting methodology for any dataset. |
| Hardware Specification | Yes | Implementation Details. All experiments were conducted on a cluster with an AMD EPYC 7H12 or 7742 CPU and an NVIDIA A6000 GPU. |
| Software Dependencies | No | The paper mentions various tools and frameworks such as 'Splat-SLAM [54]', 'SAM2 [50]', 'CoTracker3 [20]', and '3DGS [23]', but it does not specify any version numbers for these or other software libraries/dependencies. |
| Experiment Setup | Yes | Implementation Details. All experiments were conducted on a cluster with an AMD EPYC 7H12 or 7742 CPU and an NVIDIA A6000 GPU. The kernel size of the median filter used to denoise the coarse motion masks is 5 × 5. The spherical search radius for newly-seen pixel identification is r_search = 0.02 m. For geometry and photometric optimization, we keep our loss weights identical with those applied in MoSca [29], and set λ_mask = 1 as the weight of L_mask. For more implementation details, we refer to the supplemental material. |
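
The tracking metric quoted in the Research Type row (ATE RMSE after Umeyama alignment) can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code, which has not been released. The function names `umeyama_alignment` and `ate_rmse` are hypothetical, and the choice to estimate scale (`with_scale=True`) is an assumption that is common for monocular methods but is not stated in the quoted text. The result is in the same units as the input positions, so positions in meters would need a factor of 100 to match the reported centimeters.

```python
import numpy as np


def umeyama_alignment(src, dst, with_scale=True):
    """Least-squares similarity transform (s, R, t) mapping src onto dst.

    src, dst: (N, 3) arrays of corresponding camera positions.
    Hypothetical helper; follows the closed-form solution by Umeyama.
    """
    n = src.shape[0]
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between centered target and source points.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt

    if with_scale:
        var_src = (src_c ** 2).sum() / n
        s = float(np.trace(np.diag(D) @ S) / var_src)
    else:
        s = 1.0  # assumption: rigid alignment only

    t = mu_dst - s * R @ mu_src
    return s, R, t


def ate_rmse(est, gt, with_scale=True):
    """Absolute trajectory error (RMSE) after aligning est to gt."""
    s, R, t = umeyama_alignment(est, gt, with_scale)
    aligned = s * (est @ R.T) + t
    err = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))


if __name__ == "__main__":
    # Synthetic example: the estimate is a scaled, rotated, shifted copy
    # of the ground truth plus small noise.
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(100, 3))
    a = 0.3
    Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
    est = 2.0 * gt @ Rz.T + np.array([1.0, -2.0, 0.5])
    est += 0.01 * rng.normal(size=est.shape)
    print("ATE RMSE:", ate_rmse(est, gt))
```

Aligning before measuring error removes the arbitrary global pose (and, when scale is estimated, the arbitrary scale) of the estimated trajectory, so the metric reflects trajectory shape rather than the choice of coordinate frame.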