U-Match: Two-view Correspondence Learning with Hierarchy-aware Local Context Aggregation

Authors: Zizhuo Li, Shihua Zhang, Jiayi Ma

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on different visual tasks prove that our method significantly surpasses the state-of-the-arts.
Researcher Affiliation | Academia | Electronic Information School, Wuhan University, Wuhan 430072, China
Pseudocode | No | The paper provides network architecture diagrams and mathematical formulations but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is publicly available at https://github.com/ZizhuoLi/U-Match.
Open Datasets | Yes | Datasets. As in the previous work [Zhang et al., 2019], we resort to two popular datasets, YFCC100M [Thomee et al., 2016] and SUN3D [Xiao et al., 2013], to demonstrate the correspondence learning ability of our method in outdoor and indoor scenes, respectively.
Dataset Splits | Yes | YFCC100M contains 100 million images from the Internet, which are split into 72 sequences according to different tourist spots. We choose 68 sequences as training and validation data, and the remaining sequences are used for testing. SUN3D is a large-scale RGB-D video dataset with relative camera motions retrieved by generalized bundle adjustment. It is comprised of 254 indoor image sequences with poor texture, repetitive elements, and self-occlusions, where 239 sequences are adopted for training and validation, and the rest of the sequences are used for testing.
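The sequence-level splits quoted above can be summarized numerically; a minimal sketch (the constant names are illustrative, and the actual sequence assignments follow Zhang et al., 2019):

```python
# Sequence-level train/val vs. test splits described in the paper.
YFCC_TOTAL, YFCC_TRAINVAL = 72, 68     # YFCC100M: outdoor tourist-spot sequences
SUN3D_TOTAL, SUN3D_TRAINVAL = 254, 239 # SUN3D: indoor RGB-D sequences

# Held-out test sequences are simply the remainder in each dataset.
yfcc_test = YFCC_TOTAL - YFCC_TRAINVAL    # 4 outdoor test sequences
sun3d_test = SUN3D_TOTAL - SUN3D_TRAINVAL # 15 indoor test sequences
```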
Hardware Specification | Yes | All experiments are conducted on Ubuntu 18.04 with GeForce RTX 3090 GPUs.
Software Dependencies | No | The paper mentions implementing the model with PyTorch and using the Adam optimizer but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | In our implementation, the input to our model is an N × 4 set of putative correspondences established by an NN matcher with SIFT detector-descriptors, typically N = 2000, unless otherwise specified. The number of levels is set to L = 4, i.e., each HRGA module contains three LCPool layers with sampling ratios of 0.125, 0.5, 0.5, respectively. We use 4-head attention in the context aggregation layer. We implement our model with PyTorch and adopt the Adam optimizer with a learning rate of 10⁻⁴ and a batch size of 32 in optimization. Weight α is set to 0 at the start and to 0.5 after the first 20k iterations.
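The hyperparameters quoted above can be collected into a small configuration sketch. This is not the authors' code; the names (`pool_sizes`, `alpha_at`) are hypothetical, and the learning rate follows the (extraction-damaged) figure in the text:

```python
# Hypothetical configuration sketch for the reported U-Match setup.
N = 2000                     # putative correspondences per image pair (SIFT + NN matcher)
L = 4                        # hierarchy levels, i.e. L - 1 = 3 LCPool layers per HRGA module
POOL_RATIOS = [0.125, 0.5, 0.5]  # LCPool sampling ratios, coarsest-to-finest as listed
NUM_HEADS = 4                # attention heads in the context aggregation layer
LR = 1e-4                    # Adam learning rate as read from the paper's "10^-4"
BATCH_SIZE = 32

def pool_sizes(n: int, ratios: list[float]) -> list[int]:
    """Number of clusters remaining after each successive LCPool layer."""
    sizes = []
    for r in ratios:
        n = int(round(n * r))
        sizes.append(n)
    return sizes

def alpha_at(iteration: int, warmup: int = 20_000) -> float:
    """Loss weight alpha: 0 during the first `warmup` iterations, 0.5 afterwards."""
    return 0.0 if iteration < warmup else 0.5
```

For the default N = 2000, the three LCPool layers would reduce the correspondence set to 250, 125, and then roughly 62 clusters, and α switches from 0 to 0.5 at iteration 20k.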