Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
It Takes Two to Tango: Mixup for Deep Metric Learning
Authors: Shashanka Venkataramanan, Bill Psomas, Ewa Kijak, Laurent Amsaleg, Konstantinos Karantzalos, Yannis Avrithis
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the effect of improved representations, we show that mixing inputs, intermediate representations or embeddings along with target labels significantly outperforms state-of-the-art metric learning methods on four benchmark deep metric learning datasets. |
| Researcher Affiliation | Academia | 1Inria, Univ Rennes, CNRS, IRISA 2Athena RC 3National Technical University of Athens |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found. |
| Open Source Code | No | The paper adapts the 'official code' of another work (https://github.com/navervision/proxy-synthesis) but provides no statement or link for open-source code of its own method (Metrix). |
| Open Datasets | Yes | We experiment on Caltech-UCSD Birds (CUB200) (Wah et al., 2011), Stanford Cars (Cars196) (Krause et al., 2013), Stanford Online Products (SOP) (Oh Song et al., 2016) and In-Shop Clothing retrieval (In-Shop) (Liu et al., 2016) image datasets. |
| Dataset Splits | No | The paper provides statistics for training and testing images (e.g., Table 4: '# training images 5,894', '# testing images 5,894' for CUB200), but does not explicitly mention a separate validation set or its split details. |
| Hardware Specification | Yes | On CUB200 dataset, using a batch size of 100 on an NVIDIA RTX 2080 Ti GPU, the average training time in ms/batch is 586 for MS and 817 for MS+Metrix. |
| Software Dependencies | No | The paper mentions using the 'AdamW (Loshchilov & Hutter, 2019) optimizer' but does not specify versions for any programming languages, libraries, or other software components. |
| Experiment Setup | Yes | We train R-50 using the AdamW (Loshchilov & Hutter, 2019) optimizer for 100 epochs with a batch size of 100. The initial learning rate per dataset is shown in Table 4. The learning rate is decayed by 0.1 for Cont and by 0.5 for MS and PA on CUB200 and Cars196. For SOP and In-Shop, we decay the learning rate by 0.25 for all losses. The weight decay is set to 0.0001. |
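The 'Research Type' excerpt above refers to mixing inputs, intermediate representations, or embeddings together with target labels. As a minimal illustrative sketch only (this is standard mixup-style interpolation, not the authors' Metrix implementation; `mixup_batch` and its signature are assumptions for illustration), mixing a batch with a shuffled copy of itself looks like:

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """Mix a batch (inputs, representations, or embeddings) with a
    permuted copy of itself.

    Returns the mixed batch, the two label sets, and the interpolation
    weight lam, so a downstream loss can be combined as
    lam * loss(x_mix, y_a) + (1 - lam) * loss(x_mix, y_b).
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Interpolation weight drawn from Beta(alpha, alpha), as in mixup.
    lam = float(rng.beta(alpha, alpha))
    # Pair each example with a randomly chosen partner from the batch.
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    return x_mix, y, y[perm], lam
```

Each mixed row is a convex combination of two batch elements; in the metric-learning setting the same interpolation can be applied at the input, intermediate-feature, or embedding level, with the loss weighted by `lam` accordingly.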