Learning Token-Based Representation for Image Retrieval
Authors: Hui Wu, Min Wang, Wengang Zhou, Yang Hu, Houqiang Li (pp. 2703-2711)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to evaluate our approach, which outperforms the state-of-the-art methods on the Revisited Oxford and Paris datasets. |
| Researcher Affiliation | Academia | 1 CAS Key Laboratory of GIPAS, University of Science and Technology of China 2 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center wh241300@mail.ustc.edu.cn, wangmin@iai.ustc.edu.cn, {zhwg, eeyhu, lihq}@ustc.edu.cn |
| Pseudocode | No | The paper describes its method using text and mathematical equations, and includes a block diagram (Figure 2), but no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | The clean version of Google landmarks dataset V2 (GLDv2-clean) (Weyand et al. 2020) is used for training. |
| Dataset Splits | Yes | We randomly divide it into two subsets, train/val, with an 80%/20% split. The train split is used for training the model, and the val split is used for validation. (A minimal split sketch follows the table.) |
| Hardware Specification | Yes | We use a batch size of 128 to train our model on 4 NVIDIA RTX 3090 GPUs for 30 epochs... on a single-thread GPU (RTX 3090) / CPU (Intel Xeon CPU E5-2640 v4 @ 2.40GHz). |
| Software Dependencies | No | The paper mentions using SGD as the optimizer but does not specify version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. |
| Experiment Setup | Yes | We use a batch size of 128 to train our model on 4 NVIDIA RTX 3090 GPUs for 30 epochs... SGD is used to optimize the model, with an initial learning rate of 0.01, a weight decay of 0.0001, and a momentum of 0.9. ... The dimension d of the global feature is set as 1024. For the ArcFace margin loss, we empirically set the margin m as 0.2 and the scale γ as 32.0. Refinement block number N is set to 2. (A configuration sketch follows the table.) |
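
Since the paper describes the 80%/20% train/val protocol only in prose, the following is a minimal sketch of one way to reproduce it. The function name `split_train_val`, the fixed seed, and the use of Python's `random` module are assumptions; the paper reports no seed or tooling, so exact splits will differ.

```python
import random

def split_train_val(image_ids, val_fraction=0.2, seed=0):
    """Randomly split training image ids into train/val subsets.

    Mirrors the 80%/20% protocol described in the paper. The seed is an
    assumption: the paper does not report one.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * (1 - val_fraction))
    return ids[:cut], ids[cut:]
```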
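
The Experiment Setup row lists enough hyperparameters to reconstruct the training configuration. Below is a minimal PyTorch sketch of an ArcFace margin loss and SGD optimizer wired up with the reported values (d = 1024, m = 0.2, γ = 32.0, lr 0.01, weight decay 0.0001, momentum 0.9). The paper releases no code, so the class name `ArcFaceMargin`, the `num_classes` value, the numerical-stability clamp, and the `model` placeholder are assumptions; this illustrates the stated setup rather than the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceMargin(nn.Module):
    """ArcFace margin loss with the paper's reported settings.

    margin m = 0.2 and scale gamma = 32.0 are quoted from the paper;
    num_classes is the number of landmark labels in the training set.
    """

    def __init__(self, dim, num_classes, margin=0.2, scale=32.0):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, dim))
        nn.init.xavier_uniform_(self.weight)
        self.margin, self.scale = margin, scale

    def forward(self, features, labels):
        # Cosine similarity between L2-normalized features and class weights.
        cos = F.linear(F.normalize(features), F.normalize(self.weight))
        # Clamp before acos for numerical stability (common practice, an assumption).
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # Apply the angular margin only to the ground-truth class logit.
        target = F.one_hot(labels, num_classes=cos.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.margin), cos)
        return F.cross_entropy(self.scale * logits, labels)

# SGD with the reported hyperparameters; `model` stands in for the token-based
# retrieval network, which the paper does not release.
criterion = ArcFaceMargin(dim=1024, num_classes=81313)  # GLDv2-clean class count
optimizer = torch.optim.SGD(
    [
        {"params": criterion.parameters()},
        # {"params": model.parameters()},  # hypothetical backbone parameters
    ],
    lr=0.01, weight_decay=0.0001, momentum=0.9,
)
```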