ResMatch: Residual Attention Learning for Feature Matching
Authors: Yuxin Deng, Kaining Zhang, Shihua Zhang, Yansheng Li, Jiayi Ma
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments, including feature matching, pose estimation and visual localization, confirm the superiority of the proposed method. |
| Researcher Affiliation | Academia | Yuxin Deng (1), Kaining Zhang (1), Shihua Zhang (1), Yansheng Li (2), Jiayi Ma (1). (1) Electronic Information School, Wuhan University, Wuhan 430072, China; (2) School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China. {dyx_acuo, zkn19961212, yansheng.li}@whu.edu.cn, {suhzhang001, jyma2010}@gmail.com |
| Pseudocode | No | The paper describes formulas and operations but does not include structured pseudocode or algorithm blocks (e.g., labeled 'Algorithm' or formatted as distinct steps). |
| Open Source Code | Yes | Our codes are available at https://github.com/ACuOoOoO/ResMatch. |
| Open Datasets | Yes | Following SGMNet (Chen et al. 2021), we train the network on the GL3D (Shen et al. 2018). We evaluate our methods for two-view image matching on three datasets, including YFCC100M (Thomee et al. 2016), ScanNet (Dai et al. 2017), and FM-Bench (Bian et al. 2019). |
| Dataset Splits | No | The paper reports training on GL3D and evaluating on YFCC100M, ScanNet, and FM-Bench, but it does not specify train/validation/test splits for GL3D or how validation was performed during training (e.g., percentages, sample counts, or a citation to a standard split). |
| Hardware Specification | Yes | Computational efficiency of feature matching networks on an NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using a 'Sinkhorn algorithm' and integrating with 'Hierarchical Localization (HLoc)', but it does not specify any software names with version numbers (e.g., Python, PyTorch, specific library versions) that are necessary to reproduce the experiment. |
| Experiment Setup | Yes | Our networks consist of 9 sequential blocks of 4-head hybrid attention, whose feature dimension is consistent with that of the input descriptors. During training, 1k features are extracted per image, and 10 iterations of the Sinkhorn algorithm are performed to obtain the assignment matrix. A cross-entropy loss, the same as in SuperGlue (Sarlin et al. 2020), is applied to the final matching probability. The networks are trained for 450,000 iterations with a batch size of 16. k is empirically set to a constant 64. |
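To make the architecture figures in the setup row concrete, the sketch below stacks 9 interleaved self-/cross-attention blocks with 4 heads each, keeping the feature dimension equal to that of the input descriptors. It is a generic illustration under assumptions, not ResMatch itself: the paper's hybrid attention learns residuals over descriptor and positional similarity, which plain `nn.MultiheadAttention` does not model, and the class name `AttentionalStack` and the default `dim=128` are ours.

```python
import torch
import torch.nn as nn

class AttentionalStack(nn.Module):
    """Generic stack of interleaved self-/cross-attention blocks.

    Dimensions follow the setup row above: 9 blocks, 4 heads, and a feature
    dimension equal to that of the input descriptors (dim=128 is an assumed
    placeholder). The paper's residual "hybrid" attention is NOT reproduced
    here; this is only a plain attention baseline of the same shape.
    """

    def __init__(self, dim: int = 128, heads: int = 4, blocks: int = 9):
        super().__init__()
        self.self_attn = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(blocks)]
        )
        self.cross_attn = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(blocks)]
        )

    def forward(self, desc0: torch.Tensor, desc1: torch.Tensor):
        # desc0, desc1: (batch, num_keypoints, dim) descriptor sets of two images.
        for sa, ca in zip(self.self_attn, self.cross_attn):
            # Self-attention within each image, with residual connections.
            desc0 = desc0 + sa(desc0, desc0, desc0)[0]
            desc1 = desc1 + sa(desc1, desc1, desc1)[0]
            # Cross-attention between the two images (queries from one image,
            # keys/values from the other), also residual.
            d0 = desc0 + ca(desc0, desc1, desc1)[0]
            d1 = desc1 + ca(desc1, desc0, desc0)[0]
            desc0, desc1 = d0, d1
        return desc0, desc1
```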
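The setup row also mentions 10 iterations of the Sinkhorn algorithm to obtain the assignment matrix. Below is a minimal log-domain sketch of that normalization. It is an assumption-laden illustration, not the authors' code: the actual pipeline, following SuperGlue, augments the score matrix with dustbin rows/columns for unmatched features, which this sketch omits, and the function name `log_sinkhorn` is hypothetical.

```python
import torch

def log_sinkhorn(scores: torch.Tensor, n_iters: int = 10) -> torch.Tensor:
    """Log-domain Sinkhorn normalization of a matching score matrix.

    Alternately normalizes rows and columns so the result approaches a
    doubly-stochastic assignment matrix. Working in log space keeps the
    iterations numerically stable.
    """
    log_p = scores
    for _ in range(n_iters):
        # Row normalization: each row sums to 1 in probability space.
        log_p = log_p - torch.logsumexp(log_p, dim=-1, keepdim=True)
        # Column normalization: each column sums to 1 in probability space.
        log_p = log_p - torch.logsumexp(log_p, dim=-2, keepdim=True)
    return log_p.exp()

# Example matching the setup row: 1k features per image, 10 iterations.
scores = torch.randn(1000, 1000)
assignment = log_sinkhorn(scores, n_iters=10)
```

With the dustbin augmentation omitted here, every feature is forced to match; the real matcher uses the learned dustbin scores to let occluded or unrepeatable features remain unmatched.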