VGOS: Voxel Grid Optimization for View Synthesis from Sparse Inputs
Authors: Jiakai Sun, Zhanjie Zhang, Jiafu Chen, Guangyuan Li, Boyan Ji, Lei Zhao, Wei Xing
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that VGOS achieves state-of-the-art performance for sparse inputs with super-fast convergence. |
| Researcher Affiliation | Academia | Zhejiang University {csjk, cszzj, chenjiafu, jiby, cszhl, wxing}@zju.edu.cn, lgy1428275037@163.com |
| Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Code will be available at https://github.com/SJoJoK/VGOS. |
| Open Datasets | Yes | We perform experiments on inward-facing scenes from the Realistic Synthetic 360 dataset [Mildenhall et al., 2020] and forward-facing scenes from the LLFF dataset [Mildenhall et al., 2019]. |
| Dataset Splits | Yes | Following the protocol of InfoNeRF [Kim et al., 2022], we randomly sample 4 views out of 100 training images as sparse inputs and evaluate the model with 200 testing images. [...] We hold out 1/8 of the images as test sets following the standard protocol [Mildenhall et al., 2020] and report results for 3 input views randomly sampled from the remaining images. |
| Hardware Specification | Yes | For a fair comparison, the training time of each method is measured on our machine with a single NVIDIA RTX 3090 GPU using respective official implementations. |
| Software Dependencies | No | We implement our model on top of the DVGOv2 codebase using PyTorch [Paszke et al., 2019]. The paper mentions PyTorch but does not provide a specific version number for it or for any other software dependencies. |
| Experiment Setup | Yes | We use the Adam optimizer [Kingma and Ba, 2015] to optimize the voxel grids with an initial learning rate of 0.1 for all voxels and 10^-3 for the shallow MLP, and exponential learning rate decay is applied. For scenes in the Realistic Synthetic 360 dataset, we train the voxel grids for 5K iterations with a batch size of 2^13 rays for input views and 2^14 rays for sampled views in both stages. For scenes in the LLFF dataset, we train the voxel grids for 9K iterations with a batch size of 2^12 rays for input views and 2^14 rays for sampled views in only one stage. |
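
The Dataset Splits row follows two standard protocols: the InfoNeRF-style split for Realistic Synthetic 360 scenes (4 input views sampled from the 100 training images, evaluation on the 200 test images) and the standard LLFF split (every 8th image held out for testing, 3 input views sampled from the rest). A minimal sketch of both splits is below; the helper name, seeding, and the LLFF scene size are illustrative assumptions, not taken from the VGOS code.

```python
import numpy as np

def sample_sparse_views(pool, num_input, seed=0):
    """Hypothetical helper: randomly pick sparse input views from a pool of
    training image indices. The paper only states that views are randomly
    sampled; the seeding and naming here are assumptions."""
    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(pool, size=num_input, replace=False))

# Realistic Synthetic 360 (InfoNeRF protocol): 4 of the 100 training views as
# sparse inputs, evaluated on the 200 test views shipped with the dataset.
blender_inputs = sample_sparse_views(np.arange(100), num_input=4)

# LLFF (standard protocol): every 8th image is held out as the test set, and
# 3 input views are sampled from the remaining images. A scene size of 40
# images is an assumed example value.
all_ids = np.arange(40)
test_ids = all_ids[::8]                      # 1/8 of the images for testing
train_pool = np.setdiff1d(all_ids, test_ids)
llff_inputs = sample_sparse_views(train_pool, num_input=3)
```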
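
The Experiment Setup row maps naturally onto a PyTorch Adam configuration with two parameter groups (voxel grids at learning rate 0.1, shallow MLP at 10^-3) and an exponential learning-rate schedule. The sketch below is an assumption-laden illustration, not the official VGOS/DVGOv2 training loop: the grid shapes, MLP sizes, decay target, and all variable names are invented for the example.

```python
import torch

# Minimal sketch of the reported optimization setup, assuming a DVGO-style
# model with dense voxel grids plus a shallow MLP. Grid/MLP shapes are
# illustrative assumptions, not values from the paper.
density_grid = torch.nn.Parameter(torch.zeros(1, 1, 96, 96, 96))
feature_grid = torch.nn.Parameter(torch.zeros(1, 12, 96, 96, 96))
rgb_mlp = torch.nn.Sequential(
    torch.nn.Linear(39, 128), torch.nn.ReLU(), torch.nn.Linear(128, 3)
)

# Adam with per-group learning rates: 0.1 for the voxel grids, 1e-3 for the MLP.
optimizer = torch.optim.Adam([
    {"params": [density_grid, feature_grid], "lr": 0.1},
    {"params": rgb_mlp.parameters(), "lr": 1e-3},
])

# Exponential learning-rate decay; decaying to 0.1x over the run is an assumed
# target, since the paper only says "exponential learning rate decay".
num_iters = 5_000                        # 5K iterations (Blender); 9K for LLFF
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=0.1 ** (1.0 / num_iters)
)

for step in range(num_iters):
    # Each step would sample 2^13 rays from input views and 2^14 rays from
    # sampled (unobserved) views on Blender; 2^12 / 2^14 on LLFF.
    optimizer.zero_grad()
    # loss = render_rays_and_compute_loss(...)   # rendering loss omitted here
    # loss.backward()
    optimizer.step()
    scheduler.step()
```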