DynPoint: Dynamic Neural Point For View Synthesis
Authors: Kaichen Zhou, Jia-Xing Zhong, Sangyun Shin, Kai Lu, Yiyuan Yang, Andrew Markham, Niki Trigoni
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results obtained demonstrate the considerable acceleration of training time achieved (typically an order of magnitude) by our proposed method while yielding comparable outcomes compared to prior approaches. Furthermore, our method exhibits strong robustness in handling long-duration videos without learning a canonical representation of video content. Comprehensive experiments are conducted on datasets including Nerfie, Nvidia, HyperNeRF, iPhone, and Davis, showcasing the speed and accuracy of DynPoint for view synthesis. |
| Researcher Affiliation | Academia | Kaichen Zhou, Jia-Xing Zhong, Sangyun Shin, Kai Lu, Yiyuan Yang, Andrew Markham, Niki Trigoni; Department of Computer Science, University of Oxford; {rui.zhou, jiaxing.zhong, sangyun.shin, kai.lu}@cs.ox.ac.uk; {yiyuan.yang, andrew.markham, niki.trigoni}@cs.ox.ac.uk |
| Pseudocode | No | The paper describes the proposed method using text, mathematical equations, and figures, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | To evaluate the view synthesis capabilities of DynPoint, we performed experiments on four extensively utilized datasets, namely the Nvidia dataset in [31], Nerfie in [42], HyperNeRF in [43] and iPhone in [17]. ... Additionally, we also assessed DynPoint's performance on a recent dataset, iPhone [17], which specifically addresses the challenge of camera teleportation. Furthermore, we examined the efficacy of monocular depth estimation and scene flow estimation by visualizing the results obtained from the Davis dataset, as in [67]. |
| Dataset Splits | No | The paper mentions training and evaluation but does not specify explicit train/validation/test dataset splits with percentages or sample counts. For example, it states: “we adopt a training paradigm in which our model is trained on the input monocular video with the assistance of pre-trained optic flow and monocular depth models [47, 54, 26]. During the training process, the RGB information obtained from the observed viewpoint is utilized as the supervision signal, without relying on any canonical information.” There is no mention of a dedicated validation split for hyperparameter tuning or early stopping. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to conduct the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using specific pre-trained models for monocular depth and optic flow estimation: “In our research, we employed the pretrained Dense Prediction Transformer (DPT) network in [46], for monocular depth estimation. For optic flow estimation, we utilized the pretrained FlowFormer model [26].” However, it does not list specific version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used for implementation. A hedged loading sketch for these two priors follows the table. |
| Experiment Setup | Yes | The paper provides some details about the experimental setup, including the number of adjacent frames used (K, with K=3 and K=6 mentioned in the ablation), hierarchical levels (H=3), the use of an L2 loss for supervision, and the pretraining strategy: “During the training process, we compute the scene flow between frame t and its 2K adjacent frames, where k ∈ {1, ..., K}. ... In our case, we set H = 3. ... The L2 loss function is used to supervise our rendered pixel values similar to the setting of [37]. ... To facilitate the fine-tuning process, we initialized the weights of the Rendering MLP by pretraining it on the DTU dataset, employing a similar training set to that used in [61].” A minimal sketch of this L2 supervision over the 2K neighbouring frames follows the table. |
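Since the paper names its pre-trained priors (DPT for monocular depth, FlowFormer for optic flow) but no library versions, the following is a minimal sketch of how such priors can be obtained, assuming the standard MiDaS torch.hub entry point for DPT. FlowFormer ships its weights through its own repository rather than a hub entry, so torchvision's RAFT is used here purely as a stand-in flow estimator; `depth_and_flow` is a hypothetical helper, not part of the paper's released code.

```python
import torch
import torchvision.transforms.functional as TF
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"

# Monocular depth prior: DPT, exposed through the MiDaS torch.hub entry point.
dpt = torch.hub.load("intel-isl/MiDaS", "DPT_Large").to(device).eval()
dpt_transform = torch.hub.load("intel-isl/MiDaS", "transforms").dpt_transform

# Optic-flow prior: RAFT from torchvision as a stand-in for FlowFormer.
raft_weights = Raft_Large_Weights.DEFAULT
raft = raft_large(weights=raft_weights).to(device).eval()
flow_preprocess = raft_weights.transforms()

@torch.no_grad()
def depth_and_flow(frame_t, frame_k):
    """frame_t, frame_k: HxWx3 uint8 RGB arrays (H, W divisible by 8 for RAFT)."""
    # Inverse-depth map for the target frame t.
    depth = dpt(dpt_transform(frame_t).to(device))
    # Dense flow from frame t to the neighbouring frame k.
    img_t = TF.to_tensor(frame_t).unsqueeze(0)
    img_k = TF.to_tensor(frame_k).unsqueeze(0)
    img_t, img_k = flow_preprocess(img_t, img_k)
    flow = raft(img_t.to(device), img_k.to(device))[-1]
    return depth, flow
```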
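The supervision described in the setup row (observed RGB of the target frame compared to the rendering produced from its 2K neighbours via an L2 loss) can be wired up as below. `render_fn` is a hypothetical stand-in for DynPoint's point aggregation and Rendering MLP; only the neighbour selection and the L2 photometric loss reflect the paper's description.

```python
import torch

def photometric_l2_loss(render_fn, frames, t, K=3):
    """frames: list of (3, H, W) RGB tensors from the monocular video; t: target index.

    render_fn(frames, t, neighbor_ids) is assumed to return a (3, H, W) rendering
    of frame t aggregated from the listed neighbouring frames.
    """
    # The K frames on either side of t, i.e. up to 2K neighbours in total.
    neighbor_ids = [t + k for k in range(-K, K + 1)
                    if k != 0 and 0 <= t + k < len(frames)]
    rendered = render_fn(frames, t, neighbor_ids)
    target = frames[t]                            # observed RGB as the supervision signal
    return torch.mean((rendered - target) ** 2)   # L2 / MSE photometric loss
```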