Decomposing NeRF for Editing via Feature Field Distillation
Authors: Sosuke Kobayashi, Eiichi Matsumoto, Vincent Sitzmann
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments validate that the distilled feature fields can transfer recent progress in 2D vision and language foundation models to 3D scene representations, enabling convincing 3D segmentation and selective editing of emerging neural graphics representations. In extensive experiments, we investigate the applications of neural feature fields with two different pre-trained teacher networks. (A sketch of the distillation loss follows the table.) |
| Researcher Affiliation | Collaboration | Sosuke Kobayashi (Preferred Networks, Inc., sosk@preferred.jp); Eiichi Matsumoto (Preferred Networks, Inc., matsumoto@preferred.jp); Vincent Sitzmann (Massachusetts Institute of Technology, sitzmann@mit.edu) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The complete code for the reproduction of all the experimental results is not publicly available. |
| Open Datasets | Yes | We construct a 3D semantic segmentation benchmark from four scenes in the Replica dataset [86] with data split and posed images provided by [112]. |
| Dataset Splits | No | The paper mentions that 'data split and posed images provided by [112]' were used, but it does not specify the exact percentages or counts for training, validation, or test splits. No explicit validation set details are provided. |
| Hardware Specification | No | The paper states: 'It is difficult to completely track and sum the total amount of computing in the experiments. Instead, we reported the setup of the main experiments.' It does not provide specific hardware details such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions using specific teacher networks (LSeg [44], DINO [12]) and follows settings from another paper ([112]) for NeRF implementation, but it does not provide version numbers for any ancillary software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | During the training of 200K iterations, the loss L in Equation 4 is minimized by Adam with a linearly decaying learning rate (5e-4 to 8e-5). During training, Gaussian noise for density is also applied. The number of coarse and fine samplings is 64 and 128, respectively. The MLP of the neural radiance field consists of eight ReLU layers with 256 dimensions, followed by a linear layer for density, three layers for color, and three layers for feature, as shown in Fig. 1. Positional encoding of length 10 is used for the input coordinate and its skip connection, and positional encoding of length 4 for the viewing direction. The size of a training image is 320×240 for the Replica dataset and 1008×756 for the other datasets. The batch size of training rays is 1024 for Replica and 2048 for the others. During finetuning of feature fields or radiance fields, Gaussian noise is removed, and the learning rate is set to 1e-4. See appendix A and C for further training details. (Sketches of the network and training schedule follow the table.) |
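
The feature field distillation quoted in the Research Type row works by volume-rendering a per-point feature vector with the same density-derived compositing weights used for color, then matching the rendered feature to the frozen 2D teacher's feature at the corresponding pixel. Below is a minimal PyTorch sketch of the combined loss (cf. the reference to Equation 4 above); the L1 feature norm, the `lam` weighting, and the gradient stop on the weights are illustrative assumptions, not values confirmed by this excerpt.

```python
import torch

def volume_render(weights, values):
    # Alpha-composite per-sample values along each ray.
    # weights: (num_rays, num_samples) compositing weights derived from density
    # values:  (num_rays, num_samples, dim) per-sample colors or features
    return (weights.unsqueeze(-1) * values).sum(dim=1)

def dff_loss(weights, colors, features, gt_rgb, teacher_feat, lam=0.04):
    # Photometric NeRF loss plus feature distillation against a frozen
    # 2D teacher (e.g. LSeg or DINO) sampled at the rays' pixels.
    # Detaching the weights keeps the feature loss from altering the
    # geometry; whether the paper does this is not stated in this excerpt.
    rendered_rgb = volume_render(weights, colors)              # (num_rays, 3)
    rendered_feat = volume_render(weights.detach(), features)  # (num_rays, feat_dim)
    photometric = ((rendered_rgb - gt_rgb) ** 2).mean()        # standard NeRF MSE
    distillation = (rendered_feat - teacher_feat).abs().mean() # L1 to teacher
    return photometric + lam * distillation
```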
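
The Experiment Setup row pins the backbone down precisely enough to sketch it. The PyTorch rendering below follows that description; the skip position after the fourth layer and the 512-dimensional feature head (matching LSeg's embedding size) are assumptions drawn from common NeRF practice, since the excerpt does not state them.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs):
    # NeRF-style sin/cos encoding: the raw input plus num_freqs octaves.
    feats = [x]
    for i in range(num_freqs):
        feats += [torch.sin(2.0 ** i * x), torch.cos(2.0 ** i * x)]
    return torch.cat(feats, dim=-1)

class DistilledFeatureNeRF(nn.Module):
    # Eight 256-d ReLU layers with a skip connection, then a linear density
    # head, a three-layer color head, and a three-layer feature head, per
    # the Experiment Setup row. Skip location and head widths are assumed.
    def __init__(self, feat_dim=512, width=256):
        super().__init__()
        pos_dim = 3 * (1 + 2 * 10)  # encoding of length 10 for coordinates
        dir_dim = 3 * (1 + 2 * 4)   # encoding of length 4 for view directions
        self.pos_dim, self.dir_dim = pos_dim, dir_dim
        self.trunk_a = nn.Sequential(
            nn.Linear(pos_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
        )
        self.trunk_b = nn.Sequential(  # skip: re-concatenate the encoded input
            nn.Linear(width + pos_dim, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
        )
        self.density_head = nn.Linear(width, 1)
        self.color_head = nn.Sequential(  # three layers, view-conditioned
            nn.Linear(width + dir_dim, width // 2), nn.ReLU(),
            nn.Linear(width // 2, width // 2), nn.ReLU(),
            nn.Linear(width // 2, 3),
        )
        self.feature_head = nn.Sequential(  # three layers, view-independent
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, feat_dim),
        )

    def forward(self, xyz, view_dir):
        x = positional_encoding(xyz, num_freqs=10)
        d = positional_encoding(view_dir, num_freqs=4)
        h = self.trunk_a(x)
        h = self.trunk_b(torch.cat([h, x], dim=-1))
        sigma = self.density_head(h)
        rgb = torch.sigmoid(self.color_head(torch.cat([h, d], dim=-1)))
        feat = self.feature_head(h)
        return sigma, rgb, feat
```

A forward pass takes per-sample 3D points and unit view directions and returns density, RGB, and the distilled feature, which are then composited along each ray as in the loss sketch above.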
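
The quoted optimizer settings (Adam, learning rate decaying linearly from 5e-4 to 8e-5 over 200K iterations) map onto a `LambdaLR` schedule as follows; the use of `LambdaLR` is an implementation choice, not something the paper specifies.

```python
import torch

model = DistilledFeatureNeRF()  # from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)

num_iters = 200_000
final_ratio = 8e-5 / 5e-4  # decay endpoint relative to the initial rate

# Linear decay from 5e-4 at step 0 to 8e-5 at step 200K; call
# scheduler.step() once per training iteration.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: 1.0 - (1.0 - final_ratio) * min(step, num_iters) / num_iters,
)
```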