Efficient Stochastic Optimization for Low-Rank Distance Metric Learning

Authors: Jie Zhang, Lijun Zhang

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present empirical studies of the proposed method. The main purpose is to illustrate two characteristics of our method: Memory-efficient: The proposed algorithm has an O(dr) space complexity, where r is an upper bound of the rank of the intermediate solution. To verify our algorithm is memory-efficient, we need to show that all the intermediate solutions are indeed low-rank. Computation-efficient: By taking advantage of the low-rank structure of intermediate solutions and stochastic gradients, the time complexity is O(dr^2) per iteration. We will examine the convergence behavior of our method. We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995).
Researcher Affiliation | Academia | Jie Zhang, Lijun Zhang, National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; zhangj@lamda.nju.edu.cn, zhanglj@lamda.nju.edu.cn
Pseudocode | Yes | Algorithm 1: Efficient SGD for low-rank DML
Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link or an explicit statement of code release for the methodology described.
Open Datasets | Yes | We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995).
Dataset Splits | No | The paper mentions generating training triplets but does not provide specific dataset split information such as exact percentages, sample counts, or a detailed splitting methodology for training, validation, and test sets.
Hardware Specification | Yes | All algorithms are tested on a computer with 3.1GHz CPU and 8GB RAM.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | We set the step size η_t = c/t where c is searched in {1e-7, 1e-6, ..., 1, 10}, and we choose the one that leads to the largest decrement of the objective value. In each iteration, we randomly sample 100 triplets to construct the low-rank stochastic gradient. For SGD-Batch, we set the batch size to 10, according to the suggestion in (Qian et al. 2015a).
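The Research Type row quotes the paper's claims of O(dr) memory (all intermediate solutions stay low-rank) and O(dr^2) time per iteration. Below is a minimal sketch of how one might verify that an intermediate solution is indeed low-rank, assuming it is stored as a d x r factor U with W = U U^T; the function name, tolerance, and problem sizes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def numerical_rank(U, tol=1e-6):
    """Numerical rank of W = U @ U.T, computed from the d x r factor U alone.

    Storing only U keeps memory at O(d*r); the thin SVD below costs O(d*r^2),
    on the order of the per-iteration budget quoted above.
    """
    s = np.linalg.svd(U, compute_uv=False)   # the r singular values of U
    return int(np.sum(s > tol * s[0]))       # rank(W) equals rank(U)

d, r = 5000, 20                              # illustrative sizes only
U = np.random.randn(d, r)
print(numerical_rank(U))                     # at most r, so W stays low-rank
```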
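The Pseudocode row refers to Algorithm 1 ("Efficient SGD for low-rank DML"). The sketch below is not a reproduction of that algorithm; it only illustrates the general shape of a stochastic-gradient loop for triplet-based metric learning under assumed details: a triplet hinge loss, a metric W = U U^T updated through its factor U (which may differ from the paper's update on the matrix itself), and a step size c/t with an assumed constant c.

```python
import numpy as np

def triplet_hinge_grad_factor(U, xi, xj, xk, margin=1.0):
    """Gradient w.r.t. the factor U of the hinge loss on one triplet (i, j, k),
    where the metric is W = U @ U.T and we want dist(i, j) + margin <= dist(i, k)."""
    a, b = xi - xj, xi - xk
    loss = margin + a @ (U @ (U.T @ a)) - b @ (U @ (U.T @ b))
    if loss <= 0:
        return np.zeros_like(U)
    # d/dU of (a^T U U^T a - b^T U U^T b) = 2 (a a^T - b b^T) U, kept in O(d*r) memory
    return 2.0 * (np.outer(a, a @ U) - np.outer(b, b @ U))

d, r, n = 500, 10, 1000                      # toy sizes, for illustration only
X = np.random.randn(n, d)
U = 0.01 * np.random.randn(d, r)
c = 0.1                                      # step-size constant (assumed value)
for t in range(1, 101):
    i, j, k = np.random.choice(n, 3, replace=False)
    U -= (c / t) * triplet_hinge_grad_factor(U, X[i], X[j], X[k])
```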
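The Experiment Setup row reports η_t = c/t with c chosen from {1e-7, ..., 1, 10} by the largest decrement of the objective, and 100 randomly sampled triplets per iteration. The following is a hedged sketch of that tuning procedure; the triplet-sampling helper and the `objective_after_run` callable are placeholders introduced here for illustration, not part of the paper's code.

```python
import numpy as np

C_GRID = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]  # searched values of c
TRIPLETS_PER_ITER = 100                                           # as reported

def sample_triplets(labels, size=TRIPLETS_PER_ITER, rng=np.random):
    """Uniformly sample (anchor, positive, negative) index triplets."""
    triplets = []
    while len(triplets) < size:
        i, j, k = rng.choice(len(labels), 3, replace=False)
        if labels[i] == labels[j] and labels[i] != labels[k]:
            triplets.append((i, j, k))
    return triplets

def pick_step_constant(initial_objective, objective_after_run):
    """Choose the c that yields the largest decrement of the objective value.

    `objective_after_run(c)` is assumed to run SGD with step size c / t and
    return the final objective; it stands in for the actual training routine.
    """
    drops = {c: initial_objective - objective_after_run(c) for c in C_GRID}
    return max(drops, key=drops.get)
```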