Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation

Authors: Jinwoo Bae, Sungho Moon, Sunghoon Im

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first evaluate state-of-the-art models on diverse public datasets, which have never been seen during the network training. Next, we investigate the effects of texture-biased and shape-biased representations using the various texture-shifted datasets that we generated. Extensive experiments show that the proposed method achieves state-of-the-art performance with various public datasets. Our method also shows the best generalization ability among the competitive methods.
Researcher Affiliation | Academia | Jinwoo Bae, Sungho Moon, and Sunghoon Im, Department of Electrical Engineering and Computer Science, DGIST, Daegu, Korea
Pseudocode | No | The paper provides architectural diagrams and mathematical formulations for its components, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | We evaluate state-of-the-art models trained on KITTI using six public depth datasets (SUN3D, RGBD, MVS, Scenes11, ETH3D, and Oxford Robotcar). We use the KITTI Eigen split (Geiger et al. 2013; Eigen and Fergus 2015) consisting of 39,810 training, 4,424 validation, and 697 test images. We test the models using public depth datasets consisting of indoor scenes (SUN3D (Xiao, Owens, and Torralba 2013), RGBD (Sturm et al. 2012)), synthetic scenes from graphics tools (Scenes11 (Ummenhofer et al. 2017)), outdoor building-focused scenes (MVS (Ummenhofer et al. 2017)), and night driving scenes (Oxford Robotcar (Maddern et al. 2016)). We also use ETH3D (Schops et al. 2017), which contains both indoor and outdoor scenes.
Dataset Splits | Yes | We use the KITTI Eigen split (Geiger et al. 2013; Eigen and Fergus 2015) consisting of 39,810 training, 4,424 validation, and 697 test images (see the split-loading sketch after the table).
Hardware Specification | No | The paper describes the datasets used and experimental settings, but does not provide specific details about the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions various models and architectures (e.g., 'ResNet50', 'Transformers', 'ViT'), but it does not specify any software dependencies (like programming languages, libraries, or frameworks) with their version numbers required to reproduce the experiments.
Experiment Setup | Yes | We use an input image size of 640 x 192. We use ResNet50 (He et al. 2016) as the CNN backbone (E(θ) in Fig. 1), followed by L Transformer layers; in this work, we set L = 4.
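The quoted setup fixes only the backbone (ResNet50), the number of Transformer layers (L = 4), and the 640 x 192 input resolution. The sketch below is a minimal PyTorch/torchvision reading of that description, not the authors' implementation: the embedding width, head count, and the way backbone features are flattened into tokens are assumptions made for illustration.

```python
# Minimal sketch of the reported setup (assumptions noted inline), not the paper's code.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class DepthEncoderSketch(nn.Module):
    def __init__(self, num_transformer_layers: int = 4,  # L = 4, as stated in the paper
                 embed_dim: int = 256, num_heads: int = 8):  # width/heads are assumptions
        super().__init__()
        backbone = resnet50(weights=None)  # random init; torchvision >= 0.13 API
        # Keep the convolutional stages only; drop average-pool and classifier.
        self.cnn = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, H/32, W/32)
        self.proj = nn.Conv2d(2048, embed_dim, kernel_size=1)      # project to Transformer width
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=num_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=num_transformer_layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.proj(self.cnn(x))             # (B, C, h, w)
        b, c, h, w = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)  # (B, h*w, C) token sequence
        tokens = self.transformer(tokens)          # L Transformer layers
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    model = DepthEncoderSketch()
    x = torch.randn(1, 3, 192, 640)  # 640 x 192 input (width x height)
    print(model(x).shape)            # torch.Size([1, 256, 6, 20])
```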
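As a minimal illustration of the reported KITTI Eigen split sizes, the sketch below assumes the split is distributed as plain-text file lists (one sample per line); the directory layout and file names (e.g., train_files.txt) are assumptions for illustration, not taken from the paper.

```python
# Sanity-check split file lists against the sizes reported in the paper.
from pathlib import Path

# Split sizes quoted above for the KITTI Eigen split.
EXPECTED = {"train": 39_810, "val": 4_424, "test": 697}


def load_split(split_dir: str) -> dict:
    """Read one file list per split and check its size against the reported counts."""
    splits = {}
    for name, expected in EXPECTED.items():
        # Assumed file naming: <split>_files.txt with one sample path per line.
        lines = Path(split_dir, f"{name}_files.txt").read_text().splitlines()
        samples = [line for line in lines if line.strip()]
        assert len(samples) == expected, (
            f"{name}: expected {expected} samples, got {len(samples)}")
        splits[name] = samples
    return splits
```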