reproducibilityindex.ai

Threshold-Consistent Margin Loss for Open-World Deep Metric Learning

Authors: Qin ZHANG, Linghan Xu, Jun Fang, Qingming Tang, Ying Nian Wu, Joseph Tighe, Yifan Xing

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate TCM's effectiveness in enhancing threshold consistency while preserving accuracy, simplifying the threshold selection process in practical DML settings. 5 EXPERIMENTS
Researcher Affiliation	Collaboration	Qin Zhang¹, Linghan Xu¹, Qingming Tang², Jun Fang¹, Ying Nian Wu¹, Joe Tighe¹, Yifan Xing¹; ¹AWS AI Labs, ²Alexa AI; {qzaamz, linghax, qmtang, junfa, wunyin, yifax}@amazon.com, jtighe@cs.unc.edu
Pseudocode	Yes	Algorithm 1 Computation for OPIS metric and Algorithm 2 Training with TCM regularization
Open Source Code	No	The paper does not contain an explicit statement about releasing the source code for the described methodology or a link to a code repository.
Open Datasets	Yes	For training and evaluation, we use four commonly-used image retrieval benchmarks, namely iNaturalist-2018 (Horn et al., 2017), Stanford Online Product (Song et al., 2015), CUB-200-2011 (Wah et al., 2011) and Cars-196 (Krause et al., 2013).
Dataset Splits	Yes	The margin parameters (m+, m−) are tuned using grid search on 10% of the training data for each benchmark.
Hardware Specification	Yes	The ViT-B/16 backbone is utilized with 8 Tesla V100 GPUs and a batch size of 392.
Software Dependencies	No	The paper mentions using the “timm library (Wightman, 2019)” but does not specify its version number or the version numbers for other crucial software dependencies like PyTorch or Python.
Experiment Setup	Yes	During training, mini-batches are generated by randomly sampling 4 images per class following previous works (Brown et al., 2020; Patel et al., 2022). For TCM, we set λ+ = λ− = 1. For OPIS, the calibration range is set to 1e-2 < FAR < 1e-1 for all benchmarks. The margin parameters (m+, m−) are tuned using grid search on 10% of the training data for each benchmark.
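
The sampling scheme quoted under Experiment Setup (4 images per class) combined with the batch size quoted under Hardware Specification (392) implies 392 / 4 = 98 classes per mini-batch. The sketch below is one possible way to build such batches in plain Python. It is not the authors' code, which was not released: the function name `class_balanced_batches`, the `seed` argument, the choice to skip classes with fewer than 4 images, and the choice to drop the last incomplete batch are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def class_balanced_batches(labels, batch_size=392, images_per_class=4, seed=0):
    """Yield lists of dataset indices holding `images_per_class` samples per class.

    Hypothetical helper: the defaults mirror the values quoted in the table,
    everything else is an illustrative assumption.
    """
    if batch_size % images_per_class != 0:
        raise ValueError("batch_size must be a multiple of images_per_class")
    classes_per_batch = batch_size // images_per_class  # 392 // 4 = 98

    rng = random.Random(seed)

    # Group dataset indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Assumption: classes with too few images are skipped rather than resampled.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= images_per_class]
    if len(eligible) < classes_per_batch:
        raise ValueError("not enough eligible classes to fill one batch")
    rng.shuffle(eligible)

    # Walk the shuffled classes in chunks; the last incomplete chunk is dropped.
    for start in range(0, len(eligible) - classes_per_batch + 1, classes_per_batch):
        batch = []
        for c in eligible[start:start + classes_per_batch]:
            # Draw distinct images of this class without replacement.
            batch.extend(rng.sample(by_class[c], images_per_class))
        yield batch
```

Each yielded list can be passed to a data loader as one batch of indices. Because every class contributes 4 images, each batch contains both same-class and different-class pairs, which a pair-based margin regularizer such as TCM needs in order to compare similarities against (m+, m−).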