Learning Monocular Depth in Dynamic Environment via Context-aware Temporal Attention
Authors: Zizhang Wu, Zhuozheng Li, Zhi-Gang Fan, Yunzhe Wu, Yuanzhu Gan, Jian Pu
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments and Results): We conduct extensive experiments on three challenging benchmarks to validate the effectiveness of our pipeline against state-of-the-art models. |
| Researcher Affiliation | Collaboration | Zizhang Wu¹, Zhuozheng Li¹, Zhi-Gang Fan¹, Yunzhe Wu¹, Yuanzhu Gan¹ and Jian Pu²; ¹Zongmu Tech, ²Fudan University. wuzizhang87@gmail.com, {zhuozheng.li, zhigang.fan, nelson.wu, yuanzhu.gan}@zongmutech.com, jianpu@fudan.edu.cn |
| Pseudocode | No | The paper describes the proposed modules and their processes, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | KITTI. KITTI dataset [Geiger et al., 2012] is a popular benchmark for the task of autonomous driving, which provides over 93,000 depth maps with corresponding raw LiDAR scans and RGB images aligned with raw data. ...Virtual KITTI 2. VKITTI2 dataset [Gaidon et al., 2016] is widely used for video understanding tasks... nuScenes. nuScenes dataset [Caesar et al., 2020] is a large-scale multi-modal autonomous driving dataset... |
| Dataset Splits | Yes | In experiments, we follow the widely-used KITTI Eigen split [Eigen et al., 2014] for network training, which is composed of 22,600 images from 32 scenes for training and 697 images from 29 scenes for testing. |
| Hardware Specification | Yes | Given the same Nvidia RTX A6000 GPU on the KITTI dataset |
| Software Dependencies | No | The paper states 'We implement our CTA-Depth in PyTorch' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We implement our CTA-Depth in PyTorch and train it for 100 epochs with a mini-batch size of 4. The learning rate is 2×10⁻⁴ for both depth and pose refinement, which is decayed by a constant step (gamma=0.5 and step size=30). We set β₁ = 0.9 and β₂ = 0.999 in the Adam optimizer. We resize the input images to 320×960 for training, and set the number of sequential images to 2 for CTA-Refiner by balancing both computation efficiency and prediction accuracy. For long-range geometry embedding, the number of temporally adjacent images is set to N = 3. We fix m at 3 and n at 4 in experiments. (A hedged PyTorch sketch of this configuration follows the table.) |
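
The quoted setup maps directly onto standard PyTorch optimizer and scheduler APIs. Below is a minimal sketch of that configuration; since the authors released no code, the model here is a hypothetical placeholder, and only the hyperparameters (Adam betas, learning rate, StepLR-style decay, epoch count, batch size, input resolution) come from the quote above.

```python
import torch

# Hypothetical stand-in for the paper's CTA-Depth network (no official code
# is available); the real architecture is not reproduced here.
model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)

# Adam with beta1=0.9, beta2=0.999 and a learning rate of 2e-4,
# as stated for both depth and pose refinement.
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4, betas=(0.9, 0.999))

# "Decayed by a constant step (gamma=0.5 and step size=30)" maps naturally
# onto PyTorch's StepLR scheduler.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

# 100 epochs, mini-batch size 4, inputs resized to 320x960 (H x W).
num_epochs, batch_size, input_size = 100, 4, (320, 960)

for epoch in range(num_epochs):
    # ... per-batch forward/backward passes over 320x960 frames would go here ...
    scheduler.step()  # halve the learning rate every 30 epochs
```

Note that this assumes the decay is applied per epoch, which matches the stated step size of 30 over a 100-epoch schedule; the paper does not spell out the scheduler implementation.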