Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes

Authors: Suhani Vora, Noha Radwan, Klaus Greff, Henning Meyer, Kyle Genova, Mehdi S. M. Sajjadi, Etienne Pot, Andrea Tagliasacchi, Daniel Duckworth

TMLR 2022

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental
LLM Response: "Our empirical analysis demonstrates comparable quality to competitive 2D and 3D semantic segmentation baselines on complex, realistically-rendered scenes and significantly outperforms a comparable neural radiance field-based method on a series of tasks requiring 3D reasoning."
Researcher Affiliation: Industry
LLM Response: "Suhani Vora EMAIL Google Research; Noha Radwan EMAIL Google Research; Klaus Greff EMAIL Google Research; Henning Meyer EMAIL Google Research; Kyle Genova EMAIL Google Research; Mehdi S. M. Sajjadi EMAIL Google Research; Etienne Pot EMAIL Google Research; Andrea Tagliasacchi EMAIL Google Research, Simon Fraser University; Daniel Duckworth EMAIL Google Research"
Pseudocode: No
LLM Response: "The paper describes the methodology in detailed prose and through figures like Figure 2, which illustrates the architecture. However, it does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks."
Open Source Code: No
LLM Response: "We release these datasets along with code to reproduce them to the community upon publication. ... We intend to release the code, datasets and pre-trained models upon publication."
Open Datasets: Yes
LLM Response: "As large scale datasets of 3D semantically annotated scenes with sufficient high quality RGB views are scarce, we propose three novel datasets of increasing complexity: KLEVR, ToyBox5, and ToyBox13. ... We release these datasets along with code to reproduce them to the community upon publication. ... These datasets, along with accompanying code and pretrained NeRF models, are publicly available on our project website (to be linked upon publication)."
Dataset Splits: Yes
LLM Response: "To enable evaluation from novel views within the same scene, we randomly partition each scene's frames into train cameras and test cameras; the latter representing the set typically used to evaluate methods in novel view synthesis (Mildenhall et al., 2020). For evaluation across scenes, we further partition scenes into train scenes and novel scenes. Table 1 depicts the statistics for each of the proposed datasets. ... We train NeRF models on all train cameras from all train scenes. We provide NeSF with supervision from semantic maps corresponding to 9 randomly-chosen cameras per scene. ... For 2D evaluation, we randomly select 4 cameras from each novel scene's train cameras."
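The per-scene camera partition quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, frame IDs, and split sizes are hypothetical, and a fixed seed stands in for whatever randomization the paper actually used.

```python
import random

def split_cameras(frame_ids, n_test, seed=0):
    """Randomly partition one scene's frames into train and test cameras.

    A hypothetical sketch of the split described in the paper; the real
    pipeline's frame identifiers and split sizes may differ.
    """
    rng = random.Random(seed)
    test = sorted(rng.sample(frame_ids, n_test))
    test_set = set(test)
    train = [f for f in frame_ids if f not in test_set]
    return train, test

# Example: hold out some frames of a scene as novel-view test cameras.
train, test = split_cameras(list(range(300)), n_test=30)
```

The same primitive applies one level up as well: partitioning scene IDs (rather than frame IDs) into train scenes and novel scenes for cross-scene evaluation.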
Hardware Specification: Yes
LLM Response: "While neural radiance fields are acknowledged to be slow to train, we find that we are able to fit a single model to sufficient quality in 20 minutes on eight TPUv3 cores on the Google Cloud Platform. ... Our models are trained on 32 TPUv3 cores. ... We train DeepLab v3 with WideResNet (Wu et al., 2019) for 55k steps on 16 TPUv3 chips. ... We train SparseConvNet asynchronously on 20 NVIDIA V100 GPUs with momentum using a base learning rate of 1.5e-2 and decaying to 0 over the final 250k steps of training."
Software Dependencies: No
LLM Response: "The paper mentions several software components and frameworks used (e.g., Adam optimizer, DeepLab v3, SparseConvNet, NeRF, UNet, Kubric, Blender) but does not provide specific version numbers for any of these, which is required for a reproducible description of software dependencies."
Experiment Setup: Yes
LLM Response: "Each scene is preprocessed by training an independent NeRF for 25k steps with Adam using an initial learning rate of 1e-3 decaying to 5.4e-4 according to a cosine rule. ... NeSF is trained for 5k steps using Adam with an initial learning rate of 1e-3 decaying to 4e-4. As input for NeSF, we discretize density fields by densely probing with ε=1/32, resulting in 64³ evenly-spaced points in [-1, +1]³. This density grid is then processed by the 3D UNet architecture of Çiçek et al. (2016) with 32, 64, and 128 channels at each stage of downsampling. The semantic latent vector is processed by a multilayer perceptron consisting of 2 hidden layers of 128 units. ... We train DeepLab v3 with WideResNet (Wu et al., 2019) for 55k steps on 16 TPUv3 chips. ... We train SparseConvNet asynchronously on 20 NVIDIA V100 GPUs with momentum using a base learning rate of 1.5e-2 and decaying to 0 over the final 250k steps of training."
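Two pieces of the setup quoted above are easy to make concrete: the cosine learning-rate decay (1e-3 down to 5.4e-4 over 25k steps) and the 64³ density probe grid over [-1, +1]³. The sketch below is an assumption-laden reading, not the authors' implementation: the exact cosine formula and the grid's endpoint handling are guesses consistent with the quoted numbers.

```python
import math
import numpy as np

def cosine_lr(step, total_steps=25_000, lr_init=1e-3, lr_final=5.4e-4):
    # Cosine decay from lr_init (at step 0) to lr_final (at total_steps).
    # The precise schedule used in the paper is assumed, not confirmed.
    t = min(step, total_steps) / total_steps
    return lr_final + 0.5 * (lr_init - lr_final) * (1.0 + math.cos(math.pi * t))

# 64^3 evenly spaced probe points in [-1, +1]^3.
# Endpoint handling is assumed; with both endpoints included the spacing
# is 2/63, close to the quoted epsilon = 1/32.
xs = np.linspace(-1.0, 1.0, 64)
grid = np.stack(np.meshgrid(xs, xs, xs, indexing="ij"), axis=-1)  # (64, 64, 64, 3)
```

Densities probed at `grid` would then form the input volume consumed by the 3D UNet mentioned in the quote.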