Robust Camera Pose Refinement for Multi-Resolution Hash Encoding

Authors: Hwan Heo, Taekyung Kim, Jiyoung Lee, Jaewon Lee, Soohyun Kim, Hyunwoo J. Kim, Jin-Hwa Kim

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on the novel-view synthesis datasets validate that our learning frameworks achieve state-of-the-art performance and rapid convergence of neural rendering. In this section, we validate our proposed method using the multi-resolution hash encoding (Müller et al., 2022) with inaccurate or unknown camera poses."
Researcher Affiliation | Collaboration | "1 Department of Computer Science, Korea University, Republic of Korea; 2 NAVER AI Lab, Republic of Korea; 3 AI Institute of Seoul National University, Republic of Korea."
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper mentions re-implementing a training pipeline but does not explicitly state that the source code is released, and no repository link is provided.
Open Datasets | Yes | "We evaluate the proposed method against the two previous works, BARF (Lin et al., 2021) and GARF (Chng et al., 2022)... NeRF-Synthetic (Mildenhall et al., 2020) has 8 synthetic object-centric scenes, which consist of 100 rendered images with ground-truth camera poses (intrinsic and extrinsic) for each scene... LLFF (Mildenhall et al., 2019) has 8 forward-facing scenes captured by a hand-held camera, including RGB images and camera poses that have been estimated using the off-the-shelf algorithm (Schönberger & Frahm, 2016)."
Dataset Splits | No | The paper uses standard datasets (NeRF-Synthetic and LLFF) for novel-view synthesis and evaluation, but it does not explicitly provide the specific training/validation/test splits (e.g., percentages or sample counts) used for its experiments.
Hardware Specification | No | The paper mentions using the 'tiny-cuda-nn' framework and the 'NAVER Smart Machine Learning (NSML)' platform, but does not specify exact hardware components such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions using 'PyTorch' and 'tiny-cuda-nn (tcnn)' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | "For the multi-resolution hash encoding, we follow the approach of Instant-NGP (Müller et al., 2022), which uses a table size of T = 2^19 and a dimensionality of F = 2 for each level feature. Each feature table is initialized with a uniform distribution U[−1e-4, 1e-4]... The decoding network consists of 6-layer MLPs with ReLU (Glorot et al., 2011) activation and 256 hidden dimensions... We use the Adam optimizer and train all models for 200K iterations, with a learning rate of 5 × 10^−4 that exponentially decays to 1 × 10^−4. For multi-level learning rate scheduling, we set the scheduling interval [ts, te] = [20K, 100K]... While we set λ = 1 by default for the straight-through estimator..."
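The reported hyperparameters above can be collected into a minimal sketch. This is an illustrative reconstruction, not the authors' code: the config-dict keys mirror tiny-cuda-nn's hash-grid naming conventions, and `learning_rate` is an assumed exponential interpolation between the stated initial and final rates over the 200K-iteration run.

```python
# Hedged sketch of the paper's reported setup; all names here are
# illustrative assumptions, not taken from the authors' implementation.

# Hash-encoding settings as reported: table size T = 2**19, F = 2
# features per level (keys follow tiny-cuda-nn's config style).
HASH_ENCODING = {
    "otype": "HashGrid",
    "log2_hashmap_size": 19,    # T = 2**19 entries per level
    "n_features_per_level": 2,  # F = 2
}

# Decoder as reported: 6-layer MLP, ReLU activation, 256 hidden dims.
DECODER = {"n_layers": 6, "hidden_dim": 256, "activation": "ReLU"}

# Optimizer schedule as reported: Adam, 200K iterations, learning rate
# decaying exponentially from 5e-4 to 1e-4. The multi-level scheduling
# interval [ts, te] = [20K, 100K] applies to the paper's separate
# per-level weighting, which is not modeled here.
LR_INIT, LR_FINAL, N_ITERS = 5e-4, 1e-4, 200_000

def learning_rate(step: int) -> float:
    """Exponential decay from LR_INIT to LR_FINAL over N_ITERS steps."""
    t = min(step / N_ITERS, 1.0)
    return LR_INIT * (LR_FINAL / LR_INIT) ** t
```

For example, `learning_rate(0)` returns 5e-4 and `learning_rate(200_000)` returns 1e-4, with a smooth geometric interpolation in between.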