Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

Authors: Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through extensive experiments conducted on the KITTI and Cityscapes dataset, our framework is shown to outperform the state-of-the-art depth and motion estimation methods. Our code, dataset, and models are publicly available."
Researcher Affiliation | Collaboration | Seokju Lee (1), Sunghoon Im (2), Stephen Lin (3), In So Kweon (1); affiliations: (1) Korea Advanced Institute of Science and Technology (KAIST), (2) Daegu Gyeongbuk Institute of Science and Technology (DGIST), (3) Microsoft Research
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code, dataset, and models are publicly available." (https://github.com/SeokjuLee/Insta-DM)
Open Datasets | Yes | "Through extensive experiments conducted on the KITTI and Cityscapes dataset, our framework is shown to outperform the state-of-the-art depth and motion estimation methods. Our code, dataset, and models are publicly available."
Dataset Splits | No | The paper mentions using the "KITTI Eigen split" and the Cityscapes dataset for testing, but it does not give explicit train/validation split details (sizes or percentages) in the text; it only references the standard benchmarks, which is insufficient for reproduction from scratch.
Hardware Specification | Yes | "We train our networks using the ADAM optimizer (Kingma and Ba 2015) with β1 = 0.9 and β2 = 0.999 on 4 Nvidia RTX 2080 GPUs."
Software Dependencies | No | The paper states "Our system is implemented in PyTorch (Paszke et al. 2019)" but does not specify a version number for PyTorch or any other software dependency; the citation year (2019) is not a version number.
Experiment Setup | Yes | "The image resolution is set to 832 × 256 and the video data is augmented with random scaling, cropping, and horizontal flipping. We set the mini-batch size to 4 and train the networks over 200 epochs with 1,000 randomly sampled batches in each epoch... The initial learning rate is set to 10⁻⁴ and is decreased by half every 50 epochs. The loss weights are set to λp = 2.0, λg = 1.0, λs = 0.1, λt = 0.1, and λh = 0.02."
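
Taken together, the Hardware Specification and Experiment Setup rows fully pin down the optimizer configuration. Below is a minimal sketch of how those reported hyperparameters could be wired up in PyTorch (the framework the paper names), assuming only standard torch.optim APIs. The two Conv2d modules and the per-term dummy losses are hypothetical placeholders: the actual networks and the photometric/geometric/smoothness/translation/height loss terms are defined in the authors' repository, not reconstructed here.

```python
import torch

# Hypothetical stand-ins for the paper's depth and motion networks;
# the real architectures are in https://github.com/SeokjuLee/Insta-DM.
depth_net = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)
pose_net = torch.nn.Conv2d(6, 6, kernel_size=3, padding=1)
params = list(depth_net.parameters()) + list(pose_net.parameters())

# ADAM with beta1 = 0.9, beta2 = 0.999 and initial learning rate 1e-4,
# halved every 50 epochs (StepLR with gamma = 0.5), as quoted above.
optimizer = torch.optim.Adam(params, lr=1e-4, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

# Loss weights quoted in the Experiment Setup row.
weights = {"p": 2.0, "g": 1.0, "s": 0.1, "t": 0.1, "h": 0.02}

for epoch in range(200):                  # 200 epochs total
    for _ in range(1000):                 # 1,000 sampled batches per epoch
        x = torch.rand(4, 3, 256, 832)    # mini-batch of 4 at 832 x 256 (NCHW)
        depth = depth_net(x)
        # Dummy per-term losses standing in for the paper's actual terms.
        losses = {k: depth.abs().mean() for k in weights}
        total = sum(w * losses[k] for k, w in weights.items())
        optimizer.zero_grad()
        total.backward()
        optimizer.step()
    scheduler.step()
```

The 4-GPU setup from the Hardware Specification row would typically be layered on top of this with torch.nn.DataParallel or DistributedDataParallel; it is omitted here to keep the sketch single-device. The random scaling, cropping, and flipping augmentations are likewise left out, since for this method they must be paired with matching camera-intrinsics adjustments that only the authors' custom data pipeline provides.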