Semi-supervised Deep Large-Baseline Homography Estimation with Progressive Equivalence Constraint
Authors: Hai Jiang, Haipeng Li, Yuhang Lu, Songchen Han, Shuaicheng Liu
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method achieves state-of-the-art performance in large-baseline scenes while keeping competitive performance in small-baseline scenes. |
| Researcher Affiliation | Collaboration | Hai Jiang1,3*, Haipeng Li2,3*, Yuhang Lu4, Songchen Han1, Shuaicheng Liu2,3 — 1Sichuan University, 2University of Electronic Science and Technology of China, 3Megvii Technology, 4University of South Carolina |
| Pseudocode | No | The paper describes its methodology in narrative text and diagrams (Figure 2, Figure 3) but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and dataset are available at https://github.com/megviiresearch/LBHomo. |
| Open Datasets | Yes | We introduce a large-scale dataset for large-baseline homography estimation, given the lack of a dedicated dataset for this task. Our dataset contains 5 categories, including regular (RE-L), low-texture (LT-L), low-light (LL-L), small-foregrounds (SF-L), and large-foregrounds (LF-L) scenes. We select image pairs from real-world scenes and ensure the average non-overlap rate between the source and target images is from 20% to 50%. Our dataset contains 78k image pairs in total, and 1.8k image pairs are randomly chosen from all categories as the evaluation data. For each evaluation image pair, we manually labeled more than 6 uniformly distributed matching points for quantitative comparisons. Some examples of our dataset are illustrated in Fig. 4. Code and dataset are available at https://github.com/megviiresearch/LBHomo. |
| Dataset Splits | No | Our dataset contains 78k image pairs in total, and 1.8k image pairs are randomly chosen from all categories as the evaluation data. The paper mentions data used for the "training stage" and "evaluation data" but does not provide specific details or percentages for distinct training, validation, and test splits. |
| Hardware Specification | Yes | The training is performed on four NVIDIA RTX 2080Ti GPUs. |
| Software Dependencies | No | Our network is implemented with PyTorch, but no specific version number for PyTorch or any other software dependencies is provided. |
| Experiment Setup | Yes | In the training stage, we randomly crop patches of size 320 × 480 near the center of the initial images as input, and the resolution of resized images is set to (H, W) = (256, 256). The number of inserted images is set to n = 2, and the non-overlap rate of the two intermediate images is less than 20%. We empirically set the λ_i in Eq. (7) to 10^i. The Adam optimizer (Kingma and Ba 2015) is adopted with an initial learning rate of 5 × 10⁻⁴ for model optimization, and it decays by a factor of 0.8 after every epoch. The batch size is set to 16. |
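The reported hyperparameters can be summarized as a minimal sketch. This is not the authors' released training code (see their repository for that); the constant names and the `learning_rate` helper below are illustrative, assuming the stated per-epoch exponential decay of the Adam learning rate.

```python
# Hedged sketch of the reported training setup; names are hypothetical.
CROP_H, CROP_W = 320, 480      # random crop near the center of the input images
RESIZE_H, RESIZE_W = 256, 256  # (H, W) of the resized network input
BATCH_SIZE = 16
INITIAL_LR = 5e-4              # initial Adam learning rate
DECAY_FACTOR = 0.8             # applied after every epoch

def learning_rate(epoch: int) -> float:
    """Learning rate at the start of the given 0-indexed epoch."""
    return INITIAL_LR * DECAY_FACTOR ** epoch

if __name__ == "__main__":
    for epoch in range(3):
        print(f"epoch {epoch}: lr = {learning_rate(epoch):.6f}")
```

In PyTorch this schedule would typically be expressed with `torch.optim.Adam` plus `torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.8)`, stepped once per epoch.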