Simultaneous Similarity-based Self-Distillation for Deep Metric Learning
Authors: Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments and ablations across different objectives and standard benchmarks show S2SD offers notable improvements of up to 7% in Recall@1, while also setting a new state-of-the-art. In Tab. 1 (full table in Supp. Tab. 4), we show that under fair experimental protocol, utilizing S2SD and its ablations gives an objective and benchmark independent, significant boost in performance by up to 7% opposing the existing DML objective performance plateau. |
| Researcher Affiliation | Academia | 1) University of Toronto, Vector Institute; 2) Heidelberg University, IWR; 3) Mila, Université de Montréal; 4) MIT. |
| Pseudocode | Yes | Pseudocode and detailed results are available in Supp. F, G, and I. |
| Open Source Code | Yes | Code available at https://github.com/MLforHealth/S2SD. |
| Open Datasets | Yes | In all experiments, we evaluate on standard DML benchmarks: CUB200-2011 (Wah et al., 2011), CARS196 (Krause et al., 2013) and Stanford Online Products (SOP) (Oh Song et al., 2016). |
| Dataset Splits | Yes | All benchmarks only offer a train/test split. As such, we use a 80-20 train/validation split of the original training split to determine hyperparameters (e.g. Roth et al. (2020b) and Kim et al. (2020)), and use those for training on the full training dataset and evaluation on the test set used throughout literature (see Sec. 4). |
| Hardware Specification | No | The paper discusses training parameters and models (e.g., 'ResNet50') and mentions runtime, but does not specify the hardware used for experiments (e.g., specific GPU or CPU models). |
| Software Dependencies | No | The paper mentions software components like 'Batch Normalization', 'Adam', and 'ResNet50', but does not provide specific version numbers for these or the underlying deep learning framework (e.g., PyTorch version, TensorFlow version). |
| Experiment Setup | Yes | For S2SD, unless noted otherwise (s.a. in 5.4), we set γ = 50, T = 1 for all objectives on CUB200 and CARS196, and γ = 5, T = 1 on SOP. DSD uses target-dim. d = 2048 and MSD uses target-dims. d ∈ {512, 1024, 1536, 2048}. |
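The reported experiment setup and data-splitting protocol can be sketched as a small configuration snippet. This is a minimal illustration, not the authors' actual code: the dictionary keys, function name, and seed are hypothetical, while the hyperparameter values (γ, T, target dimensions) and the 80-20 train/validation split come from the rows above.

```python
import random

# S2SD hyperparameters as reported per benchmark (structure is illustrative).
S2SD_CONFIG = {
    "CUB200-2011": {"gamma": 50, "T": 1},
    "CARS196": {"gamma": 50, "T": 1},
    "SOP": {"gamma": 5, "T": 1},
}
DSD_TARGET_DIM = 2048                     # DSD target dimension
MSD_TARGET_DIMS = [512, 1024, 1536, 2048]  # MSD target dimensions

def split_train_val(indices, val_fraction=0.2, seed=0):
    """80-20 train/validation split of the original training split,
    used only to tune hyperparameters before retraining on the full
    training set (seed choice is an assumption, not from the paper)."""
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    cut = int(len(idx) * (1 - val_fraction))
    return idx[:cut], idx[cut:]
```

Keeping the benchmark-specific values in one mapping makes it easy to verify that tuning happens on the held-out 20% and that the test split used in the literature is touched only for final evaluation.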