Compressed Self-Attention for Deep Metric Learning with Low-Rank Approximation

Authors: Ziye Chen, Mingming Gong, Lingjuan Ge, Bo Du

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed CSALR on person re-identification, which is a typical metric learning task. Extensive experiments show the effectiveness and efficiency of CSALR in deep metric learning and its superiority over the baselines.
Researcher Affiliation | Academia | Ziye Chen (1), Mingming Gong (2), Lingjuan Ge (1), and Bo Du (1). (1) School of Computer Science, Institute of Artificial Intelligence, and National Engineering Research Center for Multimedia Software, Wuhan University, China; (2) School of Mathematics and Statistics, University of Melbourne, Australia.
Pseudocode | No | The paper does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We use three datasets for evaluation, i.e., Market-1501 [Zheng et al., 2015], DukeMTMC-reID [Ristani et al., 2016; Zheng et al., 2017], and CUHK03-NP [Zhong et al., 2017].
Dataset Splits | No | The paper describes training and testing sets but does not explicitly mention a separate validation set or how hyperparameters were tuned. The datasets define training identities and a query/gallery split for testing.
Hardware Specification | No | The paper states only that "the numerical calculations in this paper have been done on the supercomputing system in the Supercomputing Center of Wuhan University", without specifying hardware details such as GPU or CPU models.
Software Dependencies | No | The paper mentions software components like "ResNet-50" and "Stochastic Gradient Descent (SGD)" but does not provide specific version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, Python version).
Experiment Setup | Yes | The backbone is ResNet-50 with pre-trained weights from ImageNet. We apply the CSALR module to each residual block of the backbone except the last one, leading to 3 CSALR modules in total. In Eq. (9), the loss LF is the same as in PCB and PCB-RPP, and the loss LDS is cross-entropy. The balance factor λ is set to 1.0. We implement the multi-head form of CSALR. For each CSALR module, the number of sampled points in each group is set to 96, and the number of groups is set to 2. ... The input images are all resized to 384×128. We apply random horizontal flipping with probability 0.5 and normalization as data augmentation. For training, we set the batch size to 64 and the total number of training epochs to 100. The optimizer is Stochastic Gradient Descent (SGD) with momentum of 0.9 and weight decay of 10^-4. The base learning rate is initialized to 0.1 and decayed by 0.1 after every 40 epochs, and the learning rate for the backbone network is set to 0.1× the base learning rate. For PCB-RPP, we first train PCB for 40 epochs, and then train RPP for another 60 epochs with weights initialized from PCB.
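
Since no code is released, the quoted setup above is the only record of the training configuration. The following is a minimal sketch of that configuration, assuming PyTorch (the paper names no framework); `model.backbone`, the two-group parameter split, and the `loss_f`/`loss_ds` placeholders are hypothetical, and the ImageNet normalization statistics are an assumption.

```python
# Hedged sketch of the reported training setup. PyTorch is assumed; the paper
# does not name a framework or release code. `model.backbone` and the two
# loss terms below are hypothetical placeholders.
import torch
import torchvision.transforms as T

# Reported data augmentation: resize to 384x128, random horizontal flip with
# probability 0.5, and normalization (ImageNet statistics assumed here).
train_transform = T.Compose([
    T.Resize((384, 128)),
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def build_optimizer(model, base_lr=0.1, momentum=0.9, weight_decay=1e-4):
    """SGD with two parameter groups: the pretrained ResNet-50 backbone trains
    at 0.1x the base learning rate, all other parameters at the base rate."""
    backbone_params = list(model.backbone.parameters())  # hypothetical attribute
    backbone_ids = {id(p) for p in backbone_params}
    new_params = [p for p in model.parameters() if id(p) not in backbone_ids]
    optimizer = torch.optim.SGD(
        [{"params": backbone_params, "lr": 0.1 * base_lr},
         {"params": new_params, "lr": base_lr}],
        momentum=momentum, weight_decay=weight_decay)
    # Decay both learning rates by a factor of 0.1 after every 40 epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)
    return optimizer, scheduler

# Combined objective from Eq. (9): L = LF + lambda * LDS with lambda = 1.0,
# where LF follows PCB/PCB-RPP and LDS is a cross-entropy term.
lam = 1.0
def total_loss(loss_f, loss_ds):
    return loss_f + lam * loss_ds
```

Under the quoted schedule, training would then run for 100 epochs at batch size 64, calling `scheduler.step()` once per epoch; for PCB-RPP, PCB is trained for 40 epochs before RPP is trained for another 60 epochs with weights initialized from PCB.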