Hierarchical Object-Aware Dual-Level Contrastive Learning for Domain Generalized Stereo Matching

Authors: Yikun Miao, Meiqing Wu, Siew-Kei Lam, Changsheng Li, Thambipillai Srikanthan

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments conducted on four widely used realistic stereo matching datasets using multiple network architectures underscore the effectiveness and intrinsic generality of HODC in finding semantically and structurally driven matching for generalizable stereo matching networks."
Researcher Affiliation | Academia | Yikun Miao (Beijing Institute of Technology, joshmiao233@gmail.com); Meiqing Wu (Nanyang Technological University, meiqingwu@ntu.edu.sg); Siew-Kei Lam (Nanyang Technological University, assklam@ntu.edu.sg); Changsheng Li (Beijing Institute of Technology, lcs@bit.edu.cn); Thambipillai Srikanthan (Nanyang Technological University, astsrikan@ntu.edu.sg)
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | Project page: https://joshmiao.github.io/HODC/.
Open Datasets | Yes | "We train all models with synthetic dataset Scene Flow [27] and evaluate their generalization performance on the training set of four realistic datasets: KITTI-2012 [12], KITTI-2015 [28], Middlebury [32] and ETH3D [33]."
Dataset Splits | Yes | "Scene Flow [27] is a large-scale synthetic dataset consisting of three subsets: FlyingThings3D, Driving, and Monkaa. In all, Scene Flow provides 35,454 training stereo image pairs and 4,370 testing image pairs with a resolution of 960 × 540, with dense ground-truth disparity and object indices. All our models are trained on the Scene Flow training set only."
Hardware Specification | Yes | "We use a single NVIDIA RTX 3090 graphics card (with 24 GiB memory) and the batch size is set to 2 for the experiment."
Software Dependencies | No | The paper states 'All models are implemented by Pytorch' but does not specify a version number for PyTorch or any other software dependency.
Experiment Setup | Yes | "All models are implemented by PyTorch and trained with the Adam optimizer (β1 = 0.9, β2 = 0.999). We train the models from scratch... with a batch size of 8 for 45 epochs on Scene Flow [27]. The learning rate is set to 0.001, which decreases by half after epoch 15 and 30... The input images are normalized with the mean ([0.485, 0.456, 0.406]) and standard deviation ([0.229, 0.224, 0.225]) of ImageNet [9]. The maximum disparity D for training and evaluation is set to D = 192 for PSMNet, GwcNet, IGEV, and D = 256 for CFNet."
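The quoted setup can be condensed into a few plain-Python constants and helpers. This is a minimal sketch of the reported hyperparameters only; the helper names, and the exact milestone boundary (halving at versus strictly after epochs 15 and 30), are assumptions, not taken from the authors' code:

```python
# Hyperparameters as quoted in the Experiment Setup row above.
ADAM_BETAS = (0.9, 0.999)
BASE_LR = 1e-3
MILESTONES = (15, 30)   # epochs at which the learning rate is halved (boundary assumed)
EPOCHS = 45

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

# Maximum disparity D per architecture, as reported.
MAX_DISPARITY = {"PSMNet": 192, "GwcNet": 192, "IGEV": 192, "CFNet": 256}


def lr_at_epoch(epoch: int) -> float:
    """Learning rate at a 0-indexed epoch: halved once per passed milestone."""
    lr = BASE_LR
    for m in MILESTONES:
        if epoch >= m:  # assumption: the halving takes effect from the milestone epoch
            lr *= 0.5
    return lr


def normalize_pixel(rgb):
    """Per-channel ImageNet normalization of an RGB value in [0, 1]."""
    return [(c - m) / s for c, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]
```

In a PyTorch training script this schedule would typically be expressed with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[15, 30], gamma=0.5)` and the normalization with `torchvision.transforms.Normalize`; the pure-Python form above just makes the reported numbers explicit.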