Visual Correspondence Hallucination
Authors: Hugo Germain, Vincent Lepetit, Guillaume Bourmaud
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally demonstrate that this network is indeed able to hallucinate correspondences on pairs of images captured in scenes that were not seen at training-time. We also apply this network to an absolute camera pose estimation problem and find it is significantly more robust than state-of-the-art local feature matching-based competitors. [Sec. 4, Experiments] In these experiments, we seek to answer two questions: 1) "Is the proposed NeurHal approach presented in Sec. 3 capable of hallucinating correspondences?" and 2) "In the context of absolute camera pose estimation, does the ability to hallucinate correspondences bring further robustness?" |
| Researcher Affiliation | Academia | Hugo Germain¹, Vincent Lepetit¹ and Guillaume Bourmaud²; ¹LIGM, École des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France; ²IMS, University of Bordeaux, Bordeaux INP, CNRS, Bordeaux, France |
| Pseudocode | No | The paper provides an architectural diagram (Figure 2) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide the NeurHal model architecture and weights in the supplementary material. We also release a simple evaluation script that generates qualitative results, and show in a notebook the results obtained on an image pair captured indoors using a smartphone. |
| Open Datasets | Yes | We evaluate the ability of our network to hallucinate correspondences on four datasets: the indoor datasets ScanNet (Dai et al., 2017) and NYU (Nathan Silberman & Fergus, 2012), and the outdoor datasets MegaDepth (Li & Snavely, 2018) and ETH-3D (Schöps et al., 2017). |
| Dataset Splits | Yes | For the indoor setting (outdoor setting, respectively), we train NeurHal on ScanNet (MegaDepth, respectively) on the training scenes as described in Sec. 3.4, and evaluate it on the disjoint set of validation scenes. For testing images, we sample 2,500 image pairs with overlaps between 2% and 80% from the ScanNet testing scenes, using several bins to ensure the sampling is close to being uniform. (A sketch of this binned sampling appears below the table.) |
| Hardware Specification | Yes | For an indoor sample with 2000 keypoints it has an average throughput of 8.84 images/s on an NVIDIA RTX 3070 GPU. We apply the linear scaling rule and use a batch size of 8 over 8 NVIDIA V100 GPUs. (A throughput-measurement sketch follows the table.) |
| Software Dependencies | No | The model is implemented in PyTorch (Paszke et al., 2017). No specific version numbers for PyTorch or other libraries are provided. |
| Experiment Setup | Yes | We use an initial learning rate of 10⁻³, with a linear learning rate warm-up in 3 epochs from 0.1 of the initial learning rate. As in Sun et al. (2021), we decay the learning rate by 0.5 every 8 epochs starting from the 8th epoch. We apply the linear scaling rule and use a batch size of 8 over 8 NVIDIA V100 GPUs. We use the AdamW (Loshchilov & Hutter, 2019) optimizer, with a weight decay of 0.1. (A sketch of this optimizer and schedule follows the table.) |
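
The binned pair sampling quoted under Dataset Splits can be made concrete with a short sketch. This is not the authors' released code; the bin count, the exact bin edges, and the `pairs`/`overlaps` inputs are assumptions made for illustration.

```python
# Sketch (ours, not the authors' release) of near-uniform pair sampling
# by overlap, as described for the ScanNet test split. The bin count,
# the exact bin edges, and the `pairs`/`overlaps` inputs are assumptions.
import numpy as np

def sample_pairs_by_overlap(pairs, overlaps, n_samples=2500,
                            lo=0.02, hi=0.80, n_bins=10, seed=0):
    """Draw ~n_samples pairs with overlap in [lo, hi), close to
    uniformly across n_bins equal-width overlap bins."""
    rng = np.random.default_rng(seed)
    overlaps = np.asarray(overlaps)
    edges = np.linspace(lo, hi, n_bins + 1)
    per_bin = n_samples // n_bins
    chosen = []
    for b in range(n_bins):
        in_bin = np.flatnonzero(
            (overlaps >= edges[b]) & (overlaps < edges[b + 1]))
        k = min(per_bin, in_bin.size)  # a bin may hold fewer pairs than asked
        chosen.extend(rng.choice(in_bin, size=k, replace=False).tolist())
    return [pairs[i] for i in chosen]
```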
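
The 8.84 images/s figure under Hardware Specification is an average throughput. A minimal way to measure such a number in PyTorch is shown below; the `model` and `images` arguments are placeholders, and this timing harness is ours rather than the paper's.

```python
# Sketch of how an average-throughput figure (images/s) can be measured
# on a CUDA GPU. `model` and `images` are placeholders; this harness is
# ours, not the paper's.
import time
import torch

@torch.no_grad()
def throughput(model, images, n_iters=50, warmup=10):
    model = model.eval().cuda()
    images = images.cuda()
    for _ in range(warmup):        # warm-up so lazy CUDA init is excluded
        model(images)
    torch.cuda.synchronize()       # make sure queued kernels have finished
    t0 = time.perf_counter()
    for _ in range(n_iters):
        model(images)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - t0
    return n_iters * images.shape[0] / elapsed  # images per second
```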
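
The hyper-parameters quoted under Experiment Setup translate directly into a PyTorch optimizer and learning-rate schedule. A minimal sketch follows, assuming the schedule is stepped once per epoch (the quote does not say) and using a placeholder module in place of the NeurHal network.

```python
# Sketch of the quoted optimizer and schedule in PyTorch: AdamW with
# weight decay 0.1, base LR 1e-3, 3-epoch linear warm-up from 0.1x the
# base LR, then 0.5x decay every 8 epochs from the 8th epoch. The
# placeholder module and the per-epoch stepping are assumptions.
import torch

model = torch.nn.Linear(256, 256)  # placeholder for the NeurHal network
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.1)

def lr_lambda(epoch):
    if epoch < 3:                        # linear warm-up: 0.1 -> 1.0
        return 0.1 + 0.9 * epoch / 3
    if epoch >= 8:                       # halve every 8 epochs from epoch 8
        return 0.5 ** ((epoch - 8) // 8 + 1)
    return 1.0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
# Call scheduler.step() once per epoch: the LR runs 1e-4 -> 1e-3 over
# the first three epochs, then drops to 5e-4 at epoch 8, 2.5e-4 at 16.
```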