Towards Improved Proxy-Based Deep Metric Learning via Data-Augmented Domain Adaptation

Authors: Li Ren, Chen Chen, Liqiang Wang, Kien Hua

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on benchmarks, including the popular CUB-200-2011, CARS196, Stanford Online Products, and In-Shop Clothes Retrieval, show that our learning algorithm significantly improves the existing proxy losses and achieves superior results compared to the existing methods.
Researcher Affiliation | Academia | University of Central Florida {Li.Ren, Chen.Chen, Liqiang.Wang, Kien.Hua}@ucf.edu
Pseudocode | Yes | Algorithm 1: Data-Augmented Domain Adaptation (DADA) for Proxy-based Deep Metric Learning (an illustrative sketch of such a training step is given below the table).
Open Source Code | Yes | The code and Appendix are available at: https://github.com/Noahsark/DADA
Open Datasets | Yes | We use the standard benchmarks CUB-200-2011 (CUB200) (Wah et al. 2011) with 11,788 bird images and 200 classes, and CARS196 (Krause et al. 2013), which contains 16,185 car images and 196 classes. We also evaluate our method on the larger Stanford Online Products (SOP) (Oh Song et al. 2016) benchmark, which includes 120,053 images with 22,634 product classes, and the In-Shop Clothes Retrieval (In-Shop) (Liu et al. 2016) dataset with 25,882 images and 7,982 classes.
Dataset Splits | Yes | We follow the data split that is consistent with the standard settings of existing DML works (Teh, De Vries, and Taylor 2020; Kim et al. 2020; Venkataramanan et al. 2022; Zheng et al. 2021b; Roth, Vinyals, and Akata 2022; Lim et al. 2022; Zhang et al. 2022).
Hardware Specification | Yes | We train our model on a machine with a single RTX 3090 GPU with 24 GB of memory.
Software Dependencies | No | The paper mentions using the Adam optimizer and RDML backbones but does not provide specific version numbers for software libraries or dependencies such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | Our optimization is done using Adam (β1 = 0.5, β2 = 0.999) (Kingma and Ba 2015) with a decay of 1e-3. We set the learning rate to 1.2e-4 for the feature generator f^G(·) and 5e-4 for our discriminators. We adopt the learning rate 4e-2 for the proxies, as suggested in (Roth, Vinyals, and Akata 2022). For most of the experiments, we fixed the batch size to 90 as a default setting... For all experiments, the first layer of f^C(·) is set to 512. For the second layer, we assigned 128 dimensions to the CUB200 and CARS196 datasets, 8192 dimensions to the SOP dataset, and 4096 dimensions to the In-Shop dataset. We set {η = 0.005, γ = 0.0075} for CUB200 and {η = 0.01, γ = 0.0075} for CARS196. We select {η = 0.01, γ = 0.005} for both the SOP and In-Shop datasets. (A configuration sketch based on these settings is given below the table.)
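
To make the reported training configuration concrete, the following is a minimal PyTorch sketch of how the optimizer groups and the two-layer head f^C(·) could be wired up. The module shapes, the backbone output size, and the reading of "a decay of 1e-3" as weight decay are assumptions for illustration; they are not taken from the released code at https://github.com/Noahsark/DADA.

```python
import torch
import torch.nn as nn

# Stand-ins for the paper's components: f^G(.) is the feature generator
# (e.g., a pretrained CNN backbone) and f^C(.) is the two-layer head whose
# first layer is 512-d and whose second layer is dataset dependent
# (128 for CUB200/CARS196, 8192 for SOP, 4096 for In-Shop).
backbone_dim = 2048                               # assumed backbone output size
embed_dim = 128                                   # CUB200 / CARS196 setting
head = nn.Sequential(
    nn.Linear(backbone_dim, 512),
    nn.ReLU(inplace=True),
    nn.Linear(512, embed_dim),
)
generator = head                                  # placeholder: backbone omitted
discriminator = nn.Linear(embed_dim, 1)           # placeholder discriminator

num_classes = 200                                 # CUB200 example
proxies = nn.Parameter(torch.randn(num_classes, embed_dim))

# Adam with (beta1, beta2) = (0.5, 0.999); "a decay of 1e-3" is read here as
# weight decay. Per-group learning rates follow the quoted values:
# 1.2e-4 for the generator, 5e-4 for the discriminators, 4e-2 for the proxies.
optimizer = torch.optim.Adam(
    [
        {"params": generator.parameters(), "lr": 1.2e-4},
        {"params": discriminator.parameters(), "lr": 5e-4},
        {"params": [proxies], "lr": 4e-2},
    ],
    betas=(0.5, 0.999),
    weight_decay=1e-3,
)

batch_size = 90                                   # default batch size from the table
```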
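
The table names Algorithm 1 (DADA) but does not reproduce its steps. Purely as an illustration of the general pattern suggested by the title and the quoted setup (a feature generator, discriminators, learnable proxies, and weights η and γ), the sketch below shows one alternating adversarial step that combines a Proxy-NCA-style stand-in loss with a generic domain discriminator over original and augmented views. The loss choices, the role of the discriminator, the use of separate optimizers (opt_g covering generator and proxies, opt_d covering the discriminator), and the way η is applied are all assumptions; this is not the authors' Algorithm 1, and γ (a second weight in the paper) is not modelled here.

```python
import torch
import torch.nn.functional as F

def proxy_nca_loss(embeddings, labels, proxies):
    """Proxy-NCA-style stand-in for the proxy loss: cross-entropy over
    cosine similarities between L2-normalised embeddings and proxies."""
    emb = F.normalize(embeddings, dim=1)
    prx = F.normalize(proxies, dim=1)
    logits = emb @ prx.t()                       # (batch, num_classes)
    return F.cross_entropy(logits, labels)

def adversarial_dml_step(generator, discriminator, proxies,
                         opt_g, opt_d, images, aug_images, labels,
                         eta=0.005):
    """One alternating step: the discriminator separates embeddings of
    original vs. augmented views; the generator (and proxies, assumed to
    sit in opt_g's parameter groups) minimises the proxy loss while trying
    to fool the discriminator, weighted by eta."""
    # --- discriminator update ---
    with torch.no_grad():
        z_real = generator(images)
        z_aug = generator(aug_images)
    real_logit = discriminator(z_real).squeeze(1)
    aug_logit = discriminator(z_aug).squeeze(1)
    d_loss = (F.binary_cross_entropy_with_logits(real_logit, torch.ones_like(real_logit))
              + F.binary_cross_entropy_with_logits(aug_logit, torch.zeros_like(aug_logit)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- generator / proxy update ---
    z_real = generator(images)
    z_aug = generator(aug_images)
    metric_loss = proxy_nca_loss(z_real, labels, proxies)
    # Adversarial term: push augmented embeddings to look "real" to the discriminator.
    fool_logit = discriminator(z_aug).squeeze(1)
    adv_loss = F.binary_cross_entropy_with_logits(fool_logit, torch.ones_like(fool_logit))
    g_loss = metric_loss + eta * adv_loss
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()
```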