TransformerFusion: Monocular RGB Scene Reconstruction using Transformers

Authors: Aljaž Božič, Pablo Palafox, Justus Thies, Angela Dai, Matthias Nießner

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experiments. Metrics. To evaluate our monocular scene reconstruction, we use several measures of reconstruction performance. Table 1: Quantitative comparison with baselines and ablations on the test set of the ScanNet dataset [8].
Researcher Affiliation | Academia | 1 Technical University of Munich, 2 Max Planck Institute for Intelligent Systems, Tübingen, Germany
Pseudocode | No | The paper describes the method in text and diagrams (Figure 2) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper includes a project page URL (aljazbozic.github.io/transformerfusion) but does not contain an explicit statement about releasing source code for the method or a direct link to a code repository.
Open Datasets | Yes | To train our approach we use the ScanNet dataset [8], an RGB-D dataset of indoor apartments.
Dataset Splits | Yes | We follow the established train-val-test split.
Hardware Specification | Yes | Training takes about 30 hours using an Intel Xeon 6242R processor and an Nvidia RTX 3090 GPU.
Software Dependencies | No | The paper mentions the PyTorch library [31] but does not provide a specific version number for it or for other software dependencies.
Experiment Setup | Yes | During training, a batch size of 4 chunks is used with an Adam [23] optimizer with β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸ and weight regularization of 10⁻⁴. We use a learning rate of 10⁻⁴ with 5k warm-up steps at initialization, and square-root learning rate decay afterwards. When computing the losses of coarse and fine surface filtering predictions, a higher weight of 2.0 is applied to near-surface voxels, to increase recall and improve overall robustness.
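Since the quoted setup combines several pieces (Adam hyperparameters, a 5k-step warm-up with square-root decay, and a 2.0 loss weight on near-surface voxels), a minimal PyTorch sketch of how such a configuration could be wired up is given below. The placeholder model, the mask and loss-helper names, and the exact normalization of the square-root decay are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

# Minimal sketch of the quoted training setup (not the authors' released code).
# Placeholder module standing in for the actual reconstruction network.
model = torch.nn.Linear(8, 1)

# Adam with beta1=0.9, beta2=0.999, eps=1e-8 and weight regularization of 1e-4,
# base learning rate 1e-4.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=1e-4,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=1e-4,
)

WARMUP_STEPS = 5_000

def lr_lambda(step: int) -> float:
    """Linear warm-up for 5k steps, then decay proportional to 1/sqrt(step).

    The exact normalization of the decay is an assumption; the paper only
    states "square root learning rate decay afterwards".
    """
    step = max(step, 1)
    if step < WARMUP_STEPS:
        return step / WARMUP_STEPS
    return (WARMUP_STEPS / step) ** 0.5

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

def surface_filtering_loss(pred_logits, target_occupancy, near_surface_mask):
    """Per-voxel BCE with a 2.0 weight on near-surface voxels, 1.0 elsewhere."""
    per_voxel = F.binary_cross_entropy_with_logits(
        pred_logits, target_occupancy, reduction="none"
    )
    weights = 1.0 + near_surface_mask.float()  # 2.0 near surface, 1.0 otherwise
    return (weights * per_voxel).mean()
```

In this sketch, scheduler.step() would be called once per training iteration (after optimizer.step()) so that the warm-up and decay apply per step rather than per epoch; the same loss helper would be evaluated for both the coarse and the fine surface filtering predictions.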