Locality-Aware Generalizable Implicit Neural Representation

Authors: Doyup Lee, Chiheon Kim, Minsu Cho, Wook-Shin Han

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to demonstrate the effectiveness of our locality-aware generalizable INR on image reconstruction and novel view synthesis. In addition, we conduct in-depth analysis to validate the efficacy of our selective token aggregation and multi-band feature modulation to localize the information of data to capture fine-grained details. (See the illustrative sketch of multi-band modulation after the table.)
Researcher Affiliation | Collaboration | Doyup Lee, Kakao Brain (doyup.lee@kakaobrain.com); Chiheon Kim, Kakao Brain (chiheon.kim@kakaobrain.com); Minsu Cho, POSTECH (mscho@postech.ac.kr); Wook-Shin Han, POSTECH (wshan@dblab.postech.ac.kr)
Pseudocode | No | The paper describes its methods in detail using natural language and mathematical equations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation and experimental settings are based on the official codes of Instance Pattern Composers [19] for a fair comparison. We attach the implementation details to Appendix A. https://github.com/kakaobrain/ginr-ipc
Open Datasets | Yes | We follow the protocols in previous studies [8, 19, 35] to evaluate our framework on image reconstruction of CelebA, FFHQ, and ImageNette with 178×178 resolution.
Dataset Splits | Yes | For a fair comparison, we use the same train-valid splits as previous studies of generalizable INR [8, 19, 35].
Hardware Specification | Yes | When we use four NVIDIA V100 GPUs, the training takes 5.5, 6.7, and 4.3 days, respectively. ... We use eight NVIDIA A100 GPUs to train our framework on ImageNet for 20 epochs, where the training takes about 2.5 days. ... using four NVIDIA V100 GPUs for FFHQ with 256×256 and about 1.4 days using eight V100 GPUs for FFHQ with 512×512. For FFHQ 1024×1024, we use... The training of 400 epochs takes about 3.4 days using eight NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions using the Adam optimizer and basing its implementation on IPC's open-sourced code, but it does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages used.
Experiment Setup | Yes | We use the Adam [20] optimizer with (β1, β2) = (0.9, 0.999) and a constant learning rate of 0.0001. The batch size is 16 and 32 for image reconstruction and novel view synthesis, respectively. ... We train our framework on CelebA, FFHQ, and ImageNette for 300, 1000, and 4000 epochs, respectively.
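
The Research Type row names the paper's two core components, selective token aggregation and multi-band feature modulation. As a rough illustration of the latter idea only, below is a minimal generic sketch in which Fourier features of input coordinates are grouped into frequency bands and each band is scaled by an instance-specific modulation vector. The band count, feature sizes, frequency spacing, and elementwise modulation rule are illustrative assumptions, not the paper's exact formulation; see the official code at https://github.com/kakaobrain/ginr-ipc for the real implementation.

import torch
from torch import nn

class MultiBandModulation(nn.Module):
    # Generic sketch: Fourier features of input coordinates are grouped into
    # frequency bands, and each band is scaled by an instance-specific
    # modulation vector. All hyperparameters here are illustrative assumptions.
    def __init__(self, num_bands: int = 3, feats_per_band: int = 32):
        super().__init__()
        # Log-spaced frequencies partitioned into low-to-high bands.
        freqs = 2.0 ** torch.linspace(0.0, 8.0, num_bands * feats_per_band)
        self.register_buffer("freqs", freqs.view(num_bands, feats_per_band))

    def forward(self, coords: torch.Tensor, mods: torch.Tensor) -> torch.Tensor:
        # coords: (N, 1) 1-D coordinates; mods: (num_bands, 2 * feats_per_band)
        phases = coords.unsqueeze(1) * self.freqs             # (N, B, F)
        feats = torch.cat([phases.sin(), phases.cos()], -1)   # (N, B, 2F)
        return (feats * mods).flatten(1)                      # per-band modulation

coords = torch.linspace(0.0, 1.0, 5).unsqueeze(-1)  # (5, 1) query coordinates
mods = torch.randn(3, 64)                           # hypothetical per-instance latents
out = MultiBandModulation()(coords, mods)           # (5, 192) modulated features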
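
The Experiment Setup row reduces to a short optimizer configuration. A minimal sketch of those reported settings, assuming PyTorch (the paper does not name its framework) and using a placeholder module in place of the actual generalizable INR:

import torch
from torch import nn

# Stand-in for the actual generalizable INR model (hypothetical placeholder).
model = nn.Linear(2, 3)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,             # constant learning rate of 0.0001 (no schedule reported)
    betas=(0.9, 0.999),  # (beta1, beta2) as reported
)

# Reported batch sizes per task and epoch counts per dataset.
BATCH_SIZE = {"image_reconstruction": 16, "novel_view_synthesis": 32}
EPOCHS = {"CelebA": 300, "FFHQ": 1000, "ImageNette": 4000}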