Deep Causal Metric Learning

Authors: Xiang Deng, Zhongfei Zhang

ICML 2022

Reproducibility assessment. Each entry below gives the reproducibility variable, the result, and the supporting LLM response.

Research Type: Experimental
LLM Response: "Extensive experiments on several benchmark datasets demonstrate the superiority of DCML over the existing methods." "4. Experiments. In this section, we first introduce the experimental settings, then report the comparison results between DCML and the state-of-the-art (SOTA) approaches, and finally present the ablation studies and qualitative results."

Researcher Affiliation: Academia
LLM Response: "Department of Computer Science, State University of New York at Binghamton, NY, US. Correspondence to: Xiang Deng <xdeng7@binghamton.edu>."

Pseudocode: Yes
LLM Response: Algorithm 1 (DCML)
    Input: training data D, encoder f, attention MLP Tθ
    for i = 1 to N epochs do
        if i % e == 0 then
            for j = 1 to M do
                Update the sample weights (environments) with Eq. (11)
            end for
        end if
        Update the model parameters by minimizing Eq. (10)
    end for
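
For a concrete picture of the training loop, the following is a minimal PyTorch-style sketch of Algorithm 1. The helpers `update_sample_weights` and `dcml_loss` are hypothetical stand-ins for Eq. (11) and Eq. (10); the actual implementations are in the authors' repository.

```python
def train_dcml(loader, encoder, attention_mlp, optimizer,
               num_epochs, e, M, update_sample_weights, dcml_loss):
    """Sketch of Algorithm 1 (DCML).

    `update_sample_weights` and `dcml_loss` are hypothetical callbacks that
    stand in for Eq. (11) and Eq. (10) of the paper.
    """
    for epoch in range(1, num_epochs + 1):
        # Every e epochs, re-estimate the sample weights (environments), Eq. (11).
        if epoch % e == 0:
            for _ in range(M):
                update_sample_weights(encoder, attention_mlp, loader)
        # Update the encoder and attention net by minimizing the objective, Eq. (10).
        for images, labels in loader:
            optimizer.zero_grad()
            loss = dcml_loss(encoder, attention_mlp, images, labels)
            loss.backward()
            optimizer.step()
```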

Open Source Code: Yes
LLM Response: "Code: https://github.com/Xiang-Deng-DL/DCML." "Their final values on each dataset are given in the Github repository: https://github.com/Xiang-Deng-DL/DCML."

Open Datasets: Yes
LLM Response: "Datasets. Following the existing literature, we adopt the three widely used metric learning benchmark datasets, i.e., CUB-200 (Wah et al., 2011), Cars-196 (Krause et al., 2013), and Stanford Online Products (SOP) (Oh Song et al., 2016)."

Dataset Splits: Yes
LLM Response: "4-fold cross validation on the first half of the classes in each dataset is used for training the model. Specifically, the first half of classes are divided into 4 partitions deterministically. 3 of the 4 partitions are used as the training dataset and the remaining 1 as the validation dataset for tuning the hyper-parameters."
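
This split protocol is simple enough to reconstruct. The sketch below follows the quoted description under the assumption that "deterministically" means partitioning the sorted class list without shuffling; it is an illustration, not the authors' code.

```python
def four_fold_splits(all_classes):
    """Deterministic 4-fold splits over the first half of the classes,
    following the quoted protocol (no shuffling is assumed)."""
    classes = sorted(all_classes)
    first_half = classes[: len(classes) // 2]          # second half held out for testing
    partitions = [first_half[i::4] for i in range(4)]  # 4 deterministic partitions
    folds = []
    for k in range(4):
        val_classes = set(partitions[k])               # 1 partition for validation
        train_classes = set(first_half) - val_classes  # 3 partitions for training
        folds.append((train_classes, val_classes))
    return folds

# Example with CUB-200: 200 classes -> 4 folds over the first 100 classes.
folds = four_fold_splits(range(200))
```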

Hardware Specification: No
LLM Response: The paper does not provide specific hardware details, such as the GPU models, CPU types, or memory used for the experiments.

Software Dependencies: No
LLM Response: The paper mentions the optimizer (RMSprop) and the backbone architecture (BN-Inception), but does not give version numbers for software dependencies such as the programming language, libraries (e.g., PyTorch or TensorFlow), or CUDA.

Experiment Setup: Yes
LLM Response: "DCML has 3 hyper-parameters, i.e., α, β, and γ. Instead of using grid search, which is time-consuming, we do a very simple search. We first fix α and γ to 0, and tune β in [0, 1]. After the optimal β is obtained, we fix β and α and search the optimal γ in [0, 1]. Finally, we search α while fixing β and γ. Their final values on each dataset are given in the Github repository: https://github.com/Xiang-Deng-DL/DCML. The hyper-parameters in the proxy loss are set to the values searched by Musgrave et al. (2020). The model is trained with the RMSprop optimizer. The learning rates for the backbone and the attention net are set to 1e-6 and 2e-6, respectively. The learning rates for the class proxy vectors are set to the values searched by Musgrave et al. (2020), i.e., 2.53e-3, 7.41e-3, and 2.16e-3 on CUB-200, Cars-196, and SOP, respectively."
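
The hedged sketch below captures the sequential hyper-parameter search and the optimizer configuration described in the quote. The `evaluate` callback, the candidate grid, and the placeholder module shapes are assumptions for illustration; they are not taken from the paper or the repository.

```python
import torch
import torch.nn as nn

def sequential_search(evaluate, candidates):
    """One-at-a-time search over (alpha, beta, gamma) as quoted above:
    fix alpha = gamma = 0 and pick beta, then pick gamma, then pick alpha.
    `evaluate(alpha, beta, gamma)` is a hypothetical callback returning the
    validation metric from the 4-fold protocol; `candidates` is an assumed
    grid of values in [0, 1]."""
    alpha, beta, gamma = 0.0, 0.0, 0.0
    beta = max(candidates, key=lambda b: evaluate(alpha, b, gamma))
    gamma = max(candidates, key=lambda g: evaluate(alpha, beta, g))
    alpha = max(candidates, key=lambda a: evaluate(a, beta, gamma))
    return alpha, beta, gamma

# RMSprop with the per-module learning rates quoted above (proxy lr shown for
# CUB-200; 7.41e-3 on Cars-196 and 2.16e-3 on SOP). The module shapes are
# placeholders, not the paper's BN-Inception backbone.
backbone = nn.Linear(512, 512)
attention_net = nn.Linear(512, 512)
proxies = nn.Parameter(torch.randn(100, 512))
optimizer = torch.optim.RMSprop([
    {"params": backbone.parameters(), "lr": 1e-6},
    {"params": attention_net.parameters(), "lr": 2e-6},
    {"params": [proxies], "lr": 2.53e-3},
])
```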