Unsupervised Learning of Geometry From Videos With Edge-Aware Depth-Normal Consistency

Authors: Zhenheng Yang, Peng Wang, Wei Xu, Liang Zhao, Ramakant Nevatia

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted experiments on both outdoor (KITTI) and indoor (NYUv2) datasets, and showed that our algorithm vastly outperforms state-of-the-art, which demonstrates the benefits of our approach.
Researcher Affiliation | Collaboration | Zhenheng Yang (1), Peng Wang (2), Wei Xu (2), Liang Zhao (2), Ramakant Nevatia (1); zhenheny@usc.edu, {wangpeng54,wei.xu,zhaoliang07}@baidu.com, nevatia@usc.edu; (1) University of Southern California, (2) Baidu Research
Pseudocode | No | The paper describes computational steps (e.g., 'Formally, the solver for normals is written as...') but does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not provide a specific repository link or explicitly state that the source code for their methodology is publicly available.
Open Datasets | Yes | To better compare with other methods, we evaluate on the popular KITTI 2015 (Geiger, Lenz, and Urtasun 2012) dataset. [...] Besides the outdoor dataset, we also explore applying our framework on the indoor scenes: NYUv2 dataset (Silberman et al. 2012).
Dataset Splits | Yes | This results in 40,109 training sequences and 4,431 validation sequences. [...] (1) Eigen split contains 697 test images proposed by (Eigen, Puhrsch, and Fergus 2014); (2) KITTI split contains 200 high-quality disparity images provided as part of the official KITTI training set.
Hardware Specification | Yes | With a Nvidia Titan X (Pascal), the training process takes around 6 hours.
Software Dependencies | No | Our framework is implemented with publicly available Tensorflow (Abadi et al. 2016) platform... The paper mentions 'Tensorflow' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | Adam optimizer is applied with parameters β1 = 0.9, β2 = 0.999, ϵ = 10^-8. Learning rate and batch size are set to be 2 × 10^-3 and 4 respectively. [...] We set λn = 1 and λg = λs. The length of input sequence is fixed to be 3 and the input frames are resized to 128 × 416.
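
For readers attempting to reproduce the training configuration quoted in the Experiment Setup row, the following is a minimal sketch assuming a TensorFlow 1.x-style API (the paper names TensorFlow but not a version). The helper name build_train_op and the value of LAMBDA_S are illustrative assumptions, not taken from the paper.

```python
import tensorflow as tf  # TF 1.x-style API assumed; the paper does not state a version

# Input settings quoted in the paper.
IMG_HEIGHT, IMG_WIDTH = 128, 416   # input frames resized to 128 x 416
SEQ_LENGTH = 3                     # 3-frame input sequences
BATCH_SIZE = 4

# Loss weights quoted in the paper (lambda_n = 1; lambda_g tied to lambda_s).
LAMBDA_N = 1.0
LAMBDA_S = 0.5                     # value not given in the quoted text; placeholder assumption
LAMBDA_G = LAMBDA_S

def build_train_op(total_loss):
    """Hypothetical helper: applies the quoted Adam settings to a given loss tensor."""
    optimizer = tf.train.AdamOptimizer(
        learning_rate=2e-3,        # quoted learning rate
        beta1=0.9,
        beta2=0.999,
        epsilon=1e-8)
    return optimizer.minimize(total_loss)
```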
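
The Pseudocode row quotes a fragment about a "solver for normals", i.e., deriving per-pixel surface normals from the predicted depth map. Since the paper's exact solver is not reproduced here, the sketch below only illustrates one common way to obtain normals from depth (cross products of back-projected neighbor differences), not necessarily the paper's formulation; the function name depth_to_normals and the intrinsics arguments are illustrative.

```python
import numpy as np

def depth_to_normals(depth, fx, fy, cx, cy):
    """Illustrative only: estimate surface normals from an (H, W) depth map by
    back-projecting pixels to 3D and crossing finite-difference tangent vectors."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel to camera coordinates using the pinhole model.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    points = np.stack([x, y, depth], axis=-1)            # (H, W, 3)

    # Finite differences along image axes approximate local tangent vectors.
    dx = points[:, 1:, :] - points[:, :-1, :]            # (H, W-1, 3)
    dy = points[1:, :, :] - points[:-1, :, :]            # (H-1, W, 3)
    dx = dx[:-1, :, :]                                   # crop both to (H-1, W-1, 3)
    dy = dy[:, :-1, :]

    normals = np.cross(dx, dy)                           # un-normalized normals
    norm = np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals / np.maximum(norm, 1e-8)              # unit normals
```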