Deep Homography Estimation for Visual Place Recognition

Authors: Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on benchmark datasets show that our method can outperform several state-of-the-art methods, and it is more than one order of magnitude faster than the mainstream hierarchical VPR methods using RANSAC. The code is released at https://github.com/Lu-Feng/DHE-VPR.
Researcher Affiliation | Academia | Feng Lu (1,2), Shuting Dong (1,2), Lijun Zhang (3), Bingxi Liu (2,4), Xiangyuan Lan (2)*, Dongmei Jiang (2), Chun Yuan (1,2)*. (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) Peng Cheng Laboratory; (3) Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences; (4) Southern University of Science and Technology. {lf22@mails, dst21@mails, yuanc@sz}.tsinghua.edu.cn, zhanglijun@cigit.ac.cn, {liubx, lanxy, jiangdm}@pcl.ac.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is released at https://github.com/Lu-Feng/DHE-VPR.
Open Datasets | Yes | We conduct experiments using multiple VPR datasets: MSLS (Warburg et al. 2020), Pitts30k (Torii et al. 2013), Nordland (downsampled test set with 224x224 image size) (Olid et al. 2018), and St. Lucia (Berton et al. 2022).
Dataset Splits | Yes | We conduct several ablation experiments on the Pitts30k and MSLS (val) datasets to validate the design of our DHE network and training strategy.
Hardware Specification | Yes | Experiments are implemented using PyTorch on an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version or any other software dependencies with version numbers.
Experiment Setup | Yes | The re-projection error threshold θ of the inlier is set to 1.5 times the patch size for RANSAC, and 3 times the patch size for geometric verification using DHE (in inference). The margin m in Eq. 9 is set to 0.1, and the weight λ in Eq. 11 is 100. Experiments are implemented using PyTorch on an NVIDIA GeForce RTX 3090 GPU. For the initialization of the DHE network, the Adam optimizer is used with learning rate = 0.0001 (multiplied by 0.8 after every 5 epochs) and batch size = 16. We train the network for 100 epochs (2k iterations per epoch) on MSLS-train. The implementation of the backbone initialization and the fine-tuning of the entire model basically follows the benchmark (Berton et al. 2022), with learning rate = 0.00001 and batch size = 4. For the backbone initialization, we train CCT-14 on MSLS-train for MSLS, Nordland, and St. Lucia, and further train it on Pitts30k-train for Pitts30k. For fine-tuning, the DHE network and the last 2 encoder layers in the backbone are updatable. The model for Pitts30k is fine-tuned on Pitts30k-train for 40 epochs (5k iterations per epoch), while the model for the others is fine-tuned on MSLS-train for 2 epochs (10k iterations per epoch). We use 2 hard negative images in a triplet.
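
The inlier thresholds quoted in the Experiment Setup row (θ = 1.5 × patch size for RANSAC, 3 × patch size for DHE at inference) imply a re-projection inlier test of roughly the following form. This is a minimal sketch, not the released code: `count_inliers` is a hypothetical helper, and the patch size of 16 is an illustrative assumption, since the paper only states θ relative to the patch size.

```python
import torch

def count_inliers(pts_src, pts_dst, H, theta):
    """Count matches whose re-projection error under homography H is below theta.

    pts_src, pts_dst: (N, 2) matched keypoint coordinates; H: (3, 3) homography.
    """
    ones = torch.ones(pts_src.shape[0], 1)
    proj = torch.cat([pts_src, ones], dim=1) @ H.T  # project source points
    proj = proj[:, :2] / proj[:, 2:3]               # back to inhomogeneous coords
    err = torch.linalg.norm(proj - pts_dst, dim=1)  # per-match re-projection error
    return int((err < theta).sum())

patch_size = 16                   # assumption for illustration only
theta_ransac = 1.5 * patch_size   # tighter threshold used with RANSAC
theta_dhe = 3.0 * patch_size      # looser threshold used with DHE at inference
```

Candidate places would then be re-ranked by their inlier counts; the paper's exact scoring rule is not restated in this report.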
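Eq. 9 and Eq. 11 themselves are not quoted here, so the sketch below only shows how the stated hyperparameters (margin m = 0.1, weight λ = 100, 2 hard negatives per triplet) would slot into a generic triplet margin loss plus a weighted auxiliary term; the structure is inferred from the quoted values, and the paper's actual formulation may differ.

```python
import torch
import torch.nn.functional as F

def combined_loss(anchor, positive, negatives, aux_term, m=0.1, lam=100.0):
    """Generic triplet margin loss over hard negatives plus a weighted extra term."""
    d_pos = F.pairwise_distance(anchor, positive)
    loss = torch.zeros(())
    for neg in negatives:  # the paper uses 2 hard negative images per triplet
        d_neg = F.pairwise_distance(anchor, neg)
        loss = loss + F.relu(d_pos - d_neg + m).mean()
    return loss + lam * aux_term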
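Likewise, the optimizer settings for the DHE-network initialization (Adam, learning rate 1e-4 multiplied by 0.8 every 5 epochs, batch size 16, 100 epochs of 2k iterations on MSLS-train) map directly onto standard PyTorch components. In the sketch below the model, inputs, and loss are dummy placeholders; only the schedule reflects the paper.

```python
import torch
import torch.nn as nn

model = nn.Linear(768, 8)  # dummy stand-in for the DHE network; shapes are illustrative
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# StepLR multiplies the learning rate by 0.8 after every 5 epochs, as reported.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.8)

for epoch in range(100):               # 100 epochs on MSLS-train
    for _ in range(2000):              # 2k iterations per epoch
        x = torch.randn(16, 768)       # batch size 16 (dummy inputs)
        loss = model(x).pow(2).mean()  # placeholder loss; see Eqs. 9 and 11 in the paper
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```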