GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs
Authors: Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that GIFT outperforms state-of-the-art methods on several benchmark datasets and practically improves the performance of relative pose estimation. |
| Researcher Affiliation | Collaboration | State Key Lab of CAD&CG, ZJU-Sensetime Joint Lab of 3D Vision, Zhejiang University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Corresponding authors: {xzhou,bao}@cad.zju.edu.cn. Project page: https://zju3dv.github.io/GIFT. |
| Open Datasets | Yes | The proposed GIFT is trained on a synthetic dataset. We randomly sample images from MS-COCO [31] and warp images with reasonable homographies defined in SuperPoint [11] to construct image pairs for training. [...] we further finetune GIFT on the GL3D [50] dataset (a hedged sketch of such warp-based pair construction appears after the table). |
| Dataset Splits | No | The paper mentions using well-known datasets like MS-COCO and GL3D for training and HPSequences and SUN3D for evaluation. However, it does not explicitly provide the specific percentages or counts for training, validation, and test splits used in their own experimental setup, nor does it refer to a standard split by name (e.g., 'we use the standard MS-COCO train/val/test split'). |
| Hardware Specification | Yes | Given a 480×360 image and randomly-distributed 1024 interest points in the image, the PyTorch [46] implementation of GIFT-6 costs about 65.2 ms on a desktop with an Intel i7 3.7GHz CPU and a GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch [46]' as the implementation framework but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The output feature dimension n₀ of the vanilla CNN is 32. In both group CNNs, H defined in Eq. (2) is {r, r⁻¹, s, s⁻¹, rs, rs⁻¹, r⁻¹s, r⁻¹s⁻¹, e}, where e is the identity transformation. [...] The output feature dimensions nα and nβ of two group CNNs are 8 and 16 respectively, which results in a 128-dimensional descriptor after bilinear pooling. [...] The margin γ is set to 0.5 in all experiments. |
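
To make the dimensions quoted in the last row concrete, the following is a minimal PyTorch sketch of a bilinear pooling step that fuses an 8-channel and a 16-channel group feature (one vector per transformation in H) into a 128-dimensional descriptor. The function name, tensor layout, and L2 normalization here are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def bilinear_group_pooling(feat_a, feat_b):
    """Hypothetical sketch of bilinear pooling over group features.

    feat_a: (N, |H|, n_alpha) features from the first group CNN
    feat_b: (N, |H|, n_beta)  features from the second group CNN
    Returns (N, n_alpha * n_beta) descriptors.
    """
    # Outer product per group element, summed (pooled) over the group dimension.
    pooled = torch.einsum('nga,ngb->nab', feat_a, feat_b)
    desc = pooled.flatten(start_dim=1)
    # L2-normalize so descriptors can be compared under a margin-based loss.
    return F.normalize(desc, dim=1)

# With the reported settings (n_alpha = 8, n_beta = 16, |H| = 9 transformations),
# 1024 interest points yield 1024 descriptors of dimension 8 * 16 = 128.
feat_a = torch.randn(1024, 9, 8)
feat_b = torch.randn(1024, 9, 16)
print(bilinear_group_pooling(feat_a, feat_b).shape)  # torch.Size([1024, 128])
```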
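
The training-data row notes that MS-COCO images are warped with "reasonable homographies" in the style of SuperPoint to form training pairs. The snippet below is a hypothetical OpenCV sketch of that idea: jitter the image corners, fit a homography, and warp. The jitter range and helper names are assumptions for illustration, not the paper's actual sampling scheme.

```python
import numpy as np
import cv2

def random_homography(h, w, max_offset=0.2):
    """Sample a homography by randomly perturbing the four image corners.
    max_offset is the maximum corner displacement as a fraction of image size
    (an assumed range, not the authors' parameters)."""
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = (np.random.rand(4, 2) - 0.5) * 2 * max_offset * np.float32([w, h])
    dst = (src + jitter).astype(np.float32)
    return cv2.getPerspectiveTransform(src, dst)

def make_training_pair(image):
    """Warp an image with a random homography to form a synthetic training pair.
    Pixel correspondences between the two views follow directly from H."""
    h, w = image.shape[:2]
    H = random_homography(h, w)
    warped = cv2.warpPerspective(image, H, (w, h))
    return image, warped, H
```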