Self-Supervised Monocular Depth Estimation in the Dark: Towards Data Distribution Compensation
Authors: Haolin Yang, Chaoqiang Zhao, Lu Sheng, Yang Tang
IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | qualitative and quantitative results demonstrate that our method achieves SoTA depth estimation results on the challenging nuScenes-Night and RobotCar-Night compared with existing methods. Our presented method achieves SoTA performance on the challenging nuScenes-Night and RobotCar-Night datasets, though no nighttime images are used in our training framework. We construct the self-supervised training process on the widely used KITTI dataset... For evaluation, we use the nuScenes-Night test split... and the RobotCar-Night test split... Table 2 shows the contributions of each pre-processing part. The comparison of different self-supervised training approaches is left to the supplementary material to further prove the effectiveness of our training framework. On nuScenes-Night, compared with the baseline, BPG improves Abs Rel by 19.3% and RMSE by 14.0%, and ING improves Abs Rel by 18.0% and RMSE by 9.1%. Moreover, the joint application of BPG and ING achieves the best scores, with a 20.8% improvement in Abs Rel, 20.1% in RMSE and 35.3% in δ1. On RobotCar-Night, the joint application of BPG and ING still yields a significant boost of 13.2% in Abs Rel and 9.7% in RMSE. |
| Researcher Affiliation | Academia | Haolin Yang¹, Chaoqiang Zhao¹, Lu Sheng², and Yang Tang¹; ¹East China University of Science and Technology, ²Beihang University |
| Pseudocode | No | The paper describes the components and their interactions, but does not provide formal pseudocode or algorithm blocks. |
| Open Source Code | No | In addition, we will release the code upon acceptance. |
| Open Datasets | Yes | We construct the self-supervised training process on the widely used KITTI dataset, and following [Godard et al., 2019; Zhao et al., 2022b], the KITTI Eigen training split [Eigen et al., 2014] is used as the training set due to its high-quality images, and we also regard it as our basic day-image distribution. |
| Dataset Splits | No | The paper mentions training and test sets but does not explicitly describe a separate validation split or its size. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a 'Transformer-CNN hybrid network in MonoViT [Zhao et al., 2022b] as our DepthNet' and 'Illuminating Change Net (ICN)', but does not provide specific software versions for libraries, frameworks, or languages (e.g., PyTorch version, Python version, CUDA version). |
| Experiment Setup | Yes | We use the same Transformer-CNN hybrid network in MonoViT [Zhao et al., 2022b] as our DepthNet because of its good performance. Besides, inspired by [Vankadari et al., 2023], we further consider the potential illuminating changes between images, and the Illuminating Change Net (ICN) is introduced to predict the linear per-pixel illuminating changes $C_{t,t+n}$ and $B_{t,t+n}$. The sampled image at the current stage, $I^L_t$, will be $\left((s_d I_t)^{\gamma_f} + \sum_{i} s_s(L_S, s_F, p_i)^{\gamma_f}\right)^{1/\gamma_f}$. The illumination scale rate is set to follow a uniform distribution, i.e. $s_d \sim U(0.4, 1)$; $\log s_F \sim U(\log s_F^{min}, \log s_F^{max})$, $\log F \sim U(\log F^{min}, \log F^{max})$. Following Flare7K [Dai et al., 2022], BPG adds the light source within the gamma range $\gamma_f \sim U(1.8, 2.2)$. We set $\gamma_n = 1/2.2$. To simulate the low photon count $C$ in the dark, a light scale factor $s_n \sim U(100, 300)$ is proposed [Wei et al., 2020]. We also provide the G (Generalize to test set) version result of each DA method for further comparisons. All methods use the same DepthNet backbone unless marked. Max depth here indicates the upper range of the ground-truth depth. Note that... the applied resolution 768×256 is slightly smaller than 640×320. (A hedged code sketch of this pre-processing pipeline follows the table.) |
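
The Experiment Setup row above describes a two-stage day-to-night pre-processing: gamma-space blending of a dimmed day image with bright light sources (BPG), followed by low-photon noise simulation (ING). The snippet below is a minimal sketch of that pipeline in NumPy, assuming a lot: the function names (`add_light_sources`, `simulate_low_photon`), the pre-rendered flare patches, and the Poisson shot-noise model are illustrative choices, not the authors' released code; only the sampled ranges for $s_d$, $\gamma_f$, $\gamma_n$, and $s_n$ come from the quoted setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_light_sources(day_img, flares, gamma_f=None):
    """BPG-style step: blend a dimmed day image with light-source patches
    in gamma space, mirroring the formula quoted in the table.

    day_img : HxWx3 float array in [0, 1]
    flares  : list of HxWx3 float arrays in [0, 1]; how these patches are
              rendered/positioned is an assumption (the paper follows Flare7K).
    """
    if gamma_f is None:
        gamma_f = rng.uniform(1.8, 2.2)      # gamma_f ~ U(1.8, 2.2)
    s_d = rng.uniform(0.4, 1.0)              # illumination scale s_d ~ U(0.4, 1)
    blended = (s_d * day_img) ** gamma_f
    for flare in flares:
        blended = blended + flare ** gamma_f  # sum of light sources in gamma space
    return np.clip(blended, 0.0, 1.0) ** (1.0 / gamma_f)

def simulate_low_photon(img, gamma_n=1.0 / 2.2):
    """ING-style step: scale to a low photon count, apply Poisson shot noise,
    rescale, then gamma-encode. The exact noise model here is an assumption
    loosely based on [Wei et al., 2020]."""
    s_n = rng.uniform(100.0, 300.0)          # light scale factor s_n ~ U(100, 300)
    photons = rng.poisson(img * s_n)          # shot noise at low photon count
    noisy = photons / s_n
    return np.clip(noisy, 0.0, 1.0) ** gamma_n

# Usage: a toy 8x8 "day" image with one synthetic light-source patch.
day = rng.uniform(0.2, 0.8, size=(8, 8, 3))
flare = np.zeros_like(day)
flare[2:4, 2:4] = 0.9
night_like = simulate_low_photon(add_light_sources(day, [flare]))
print(night_like.shape, float(night_like.min()), float(night_like.max()))
```

In this reading, the gamma exponent keeps the additive light-source mixing physically plausible (blending happens in an approximately linear-light domain), while the photon-count scaling is what makes the resulting image statistics resemble nighttime capture; both are drawn fresh per sample from the uniform ranges quoted in the setup.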