Learning Relative Similarity by Stochastic Dual Coordinate Ascent

Authors: Pengcheng Wu, Yi Ding, Peilin Zhao, Chunyan Miao, Steven Hoi

AAAI 2014

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirically, we conduct extensive experiments on both standard and large-scale data sets to validate the effectiveness of the proposed algorithm for retrieval tasks. Theoretically, we prove the optimal linear convergence rate for the proposed SDCA algorithm, beating the well-known sublinear convergence rate by the previous best metric learning algorithms." |
| Researcher Affiliation | Academia | "School of Computer Engineering, Nanyang Technological University, 639798, Singapore" and "Department of Statistics, Rutgers University, Piscataway, NJ, 08854, USA" |
| Pseudocode | Yes | "Algorithm 1 SDCA: Stochastic Dual Coordinate Ascent for Relative Similarity Learning" |
| Open Source Code | No | The paper does not include an explicit statement about releasing the source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | "We first conduct experiments of similarity/distance metric learning on five standard machine learning datasets publicly available at LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/), as shown in Table 1." The paper also uses the Caltech256 dataset and ImageCLEF (http://www.imageclef.org/) as the ground truth. |
| Dataset Splits | Yes | "For each dataset in Table 1, data instances from each class were split into training set (70%) and test set (30%). We adopt cross-validation to choose parameters for all algorithms, in which models were learned on 80% of the training set and validated on the rest 20%." |
| Hardware Specification | No | The paper describes the experimental setup, including datasets and algorithms, but does not provide any specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using LIBSVM datasets and various algorithms, but does not specify any software dependencies or their version numbers used in the implementation or for conducting the experiments. |
| Experiment Setup | Yes | "The parameters set by cross validation include: the λ parameter for SDCA (λ ∈ {0.0025, 0.005, 0.01}), and the η parameter for ITML and LEGO (η ∈ {0.01, 0.125, 0.5}). To obtain side information in the form of triplets for learning similarity function, we generate a triplet instance by randomly sampling two instances sharing the same class and another one instance from any other different class. In total, we provide 10K triplet instances for each standard data set, 100K triplets for Caltech256 Dataset and 500K triplets for the large-scale experiment." |
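The per-class 70%/30% train/test split quoted in the Dataset Splits row could be reproduced with a sketch like the following. This is a minimal illustration, not the authors' code: the function name `split_per_class`, the shuffling, and the rounding of the 30% fraction are all assumptions.

```python
import random

def split_per_class(labels, test_frac=0.3, seed=0):
    """Split instance indices into train/test within each class,
    mirroring the per-class 70%/30% split described in the paper.
    (Shuffling and rounding details are assumptions.)"""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idxs in by_class.values():
        idxs = idxs[:]                       # avoid mutating the index lists
        rng.shuffle(idxs)
        n_test = int(len(idxs) * test_frac)  # 30% of this class to the test set
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return train, test

labels = [0] * 10 + [1] * 10
train, test = split_per_class(labels)
```

The paper's hyperparameter selection would then run on a further 80%/20% split of `train` for model fitting and validation.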
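The triplet-generation scheme quoted in the Experiment Setup row (two instances from the same class, one from any other class) can be sketched as follows. The function name `sample_triplets` and the uniform sampling choices are assumptions; only the sampling scheme itself comes from the paper.

```python
import random

def sample_triplets(labels, num_triplets, seed=0):
    """Generate (anchor, positive, negative) index triplets: the anchor and
    positive share a class, and the negative comes from a different class,
    as described in the paper. Uniform sampling is an assumption."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    # anchors can only come from classes with at least two instances
    anchor_classes = [c for c, idxs in by_class.items() if len(idxs) >= 2]
    triplets = []
    while len(triplets) < num_triplets:
        c = rng.choice(anchor_classes)
        anchor, positive = rng.sample(by_class[c], 2)
        neg_class = rng.choice([k for k in by_class if k != c])
        negative = rng.choice(by_class[neg_class])
        triplets.append((anchor, positive, negative))
    return triplets
```

For the paper's standard datasets this would be called with `num_triplets=10_000` (10K triplets per dataset), 100K for Caltech256, and 500K for the large-scale experiment.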