Coded Residual Transform for Generalizable Deep Metric Learning

Authors: Shichao Kan, Yixiong Liang, Min Li, Yigang Cen, Jianxin Wang, Zhihai He

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental Our extensive experimental results and ablation studies demonstrate that the proposed CRT method outperforms the state-of-the-art deep metric learning methods by large margins, improving upon the current best method by up to 4.28% on the CUB dataset.
Researcher Affiliation Academia Shichao Kan1, Yixiong Liang1, Min Li1, Yigang Cen2,3, Jianxin Wang1, Zhihai He4,5. 1School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China; 2Institute of Information Science, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; 3Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China; 4Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen, China; 5Pengcheng Lab, Shenzhen 518066, China
Pseudocode No The paper describes the proposed method in text and with diagrams (Figure 1, Figure 2), but does not include any formal pseudocode or algorithm blocks.
Open Source Code Yes Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Supplemental material
Open Datasets Yes Four datasets, i.e., CUB-200-2011 [57], Cars-196 [58], Stanford Online Products (SOP) [32], and In-Shop Clothes Retrieval (In-Shop) [59], are used in our experiments. We use the same training and test split as in existing papers.
Dataset Splits Yes Hyperparameters were determined prior to the result runs using an 80-20 training and validation split.
Hardware Specification No The paper does not provide specific details about the hardware used, such as GPU or CPU models. It only mentions 'our available computing resources' which is not specific enough.
Software Dependencies No The paper mentions software components and frameworks like 'Mix Transformer-B2 (MiT-B2)' and 'GELU nonlinear activation layer', but it does not specify exact version numbers for these or other libraries/dependencies.
Experiment Setup Yes The initial learning rate is 3e-5. For all images, the MS loss weight λ1 in the first embedding branch is set as 1.0. In the second embedding branch, it is set as 0.1 for the CUB and Cars datasets, and 0.9 for the SOP and In-Shop datasets. The consistency loss weight λ2 is set to 0.9. The backbone network is pre-trained on the ImageNet-1K dataset. The batch size is set to 80 on the CUB and Cars datasets, and 180 on the SOP and In-Shop datasets. We apply random cropping with random flipping and resizing to 227×227 for all training images.
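For readers attempting reproduction, the hyperparameters quoted above can be collected into a single configuration sketch. This is not the authors' code: the key names, structure, and helper function below are our own assumptions; only the numeric values come from the paper's experiment setup.

```python
# Hedged sketch of the reported CRT training setup.
# Key names are hypothetical; values are quoted from the paper.

BASE = {
    "initial_lr": 3e-5,            # initial learning rate
    "lambda1_branch1": 1.0,        # MS loss weight, first embedding branch
    "lambda2": 0.9,                # consistency loss weight
    "crop_size": (227, 227),       # random crop + flip + resize target
    "backbone_pretraining": "ImageNet-1K",
}

PER_DATASET = {
    "CUB":     {"lambda1_branch2": 0.1, "batch_size": 80},
    "Cars":    {"lambda1_branch2": 0.1, "batch_size": 80},
    "SOP":     {"lambda1_branch2": 0.9, "batch_size": 180},
    "In-Shop": {"lambda1_bran2": 0.9, "batch_size": 180} if False else
               {"lambda1_branch2": 0.9, "batch_size": 180},
}

def config_for(dataset: str) -> dict:
    """Merge shared and dataset-specific hyperparameters."""
    return {**BASE, **PER_DATASET[dataset]}
```

Note that without reported hardware details or library versions (flagged "No" above), matching these values alone may not reproduce the published numbers exactly.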