IINet: Implicit Intra-inter Information Fusion for Real-Time Stereo Matching
Authors: Ximeng Li, Chen Zhang, Wanjuan Su, Wenbing Tao
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results affirm the superiority of our network in terms of both speed and accuracy compared to all other fast methods." The paper also reports experiments on four datasets. |
| Researcher Affiliation | Collaboration | ¹National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, China; ²Tuke Research. {ximengli, zhangchen, suwanjuan, wenbingtao}@hust.edu.cn |
| Pseudocode | No | The paper describes procedures and architectures (e.g., in Figure 2 and 5) but does not include structured pseudocode or algorithm blocks with explicit labels like "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper does not provide any explicit statement about releasing the source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | Datasets We use four datasets in our experiments. The Scene Flow (Mayer et al. 2016) is a large synthetic dataset... The Spring (Mehl et al. 2023)... The KITTI (Geiger, Lenz, and Urtasun 2012)... Lastly, the Middlebury2014 (Scharstein et al. 2014)... |
| Dataset Splits | Yes | The Scene Flow (Mayer et al. 2016) is a large synthetic dataset consisting of 35,454 training pairs and 4,370 testing pairs. The KITTI (Geiger, Lenz, and Urtasun 2012)... comprising 394 training pairs and 395 testing pairs. |
| Hardware Specification | Yes | Our model is developed using PyTorch and executed on an RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions "PyTorch" but does not provide a specific version number. No other software dependencies are listed with version numbers. |
| Experiment Setup | Yes | For pretraining on the Scene Flow, we incorporate image augmentation with a probability of 0.2. The augmentation includes color adjustments, brightness enhancements, and contrast modifications to simulate varying exposure conditions. Furthermore, random vertical and rotation shifts are applied to the right image, followed by random cropping to a resolution of 512×384. We employ the Adam optimizer and train for 105 epochs. The first 13 epochs use the pretrain loss to train the FMSV. Subsequently, the full network is trained using the total loss for the remaining epochs. The initial learning rate is set to 0.001, then reduced to 0.0001 at epoch 10, and halved at epochs 50, 70, 85, 95, and 100. For the Spring, we fine-tune the pretrained model using the training set for 65 epochs. The initial learning rate is set to 0.00005 and reduced by half at epochs 40 and 55. |
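The learning-rate schedule quoted in the Experiment Setup row can be sketched as a plain function. This is a reconstruction from the reported milestones only, not the authors' code; the helper name `lr_at` and the keyword arguments are ours:

```python
def lr_at(epoch, drop_epoch=10, milestones=(50, 70, 85, 95, 100)):
    """Reconstructed Scene Flow pretraining schedule (assumption,
    based on the paper's description): start at 1e-3, drop to 1e-4
    at epoch 10, then halve at each later milestone epoch."""
    lr = 1e-3
    if epoch >= drop_epoch:
        lr = 1e-4
    for m in milestones:
        if epoch >= m:
            lr *= 0.5
    return lr

# The Spring fine-tuning schedule (5e-5, halved at epochs 40 and 55)
# fits the same shape if one starts from 5e-5 and skips the early drop.
```

In a PyTorch training loop this behavior could be obtained with `torch.optim.lr_scheduler.LambdaLR` driven by a multiplicative version of the same function, though the paper does not say how the schedule was implemented.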