Efficient Stochastic Optimization for Low-Rank Distance Metric Learning
Authors: Jie Zhang, Lijun Zhang
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present empirical studies of the proposed method. The main purpose is to illustrate two characteristics of our method: Memory-efficient: The proposed algorithm has an O(dr) space complexity, where r is an upper bound of the rank of the intermediate solution. To verify our algorithm is memory-efficient, we need to show that all the intermediate solutions are indeed low-rank. Computation-efficient: By taking advantage of the low-rank structure of intermediate solutions and stochastic gradients, the time complexity is O(dr²) per iteration. We will examine the convergence behavior of our method. We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995). |
| Researcher Affiliation | Academia | Jie Zhang, Lijun Zhang; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; zhangj@lamda.nju.edu.cn, zhanglj@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1: Efficient SGD for low-rank DML (an illustrative, non-authoritative sketch of such a low-rank update is given below this table) |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link or an explicit statement of code release for the methodology described. |
| Open Datasets | Yes | We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995). |
| Dataset Splits | No | The paper mentions generating training triplets but does not provide specific dataset split information such as exact percentages, sample counts, or a detailed splitting methodology for training, validation, and test sets. |
| Hardware Specification | Yes | All algorithms are tested on a computer with 3.1GHz CPU and 8GB RAM. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We set the step size ηₜ = c/t where c is searched in {1e-7, 1e-6, ..., 1, 10}, and we choose the one that leads to the largest decrement of the objective value. In each iteration, we randomly sample 100 triplets to construct the low-rank stochastic gradient. For SGD-Batch, we set the batch size to 10, according to the suggestion in (Qian et al. 2015a). (A sketch of this step-size search also appears below the table.) |
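
The "Pseudocode" row refers to Algorithm 1 (Efficient SGD for low-rank DML), whose quoted characteristics are O(dr) memory and O(dr²) time per iteration. As a rough illustration only, here is a minimal Python/NumPy sketch of how a low-rank SGD update for distance metric learning can be organized: the metric M = LLᵀ is kept implicitly through a d×r factor L, and each step touches only the sampled triplets. The factorized parameterization, the hinge-style triplet loss, and all function names are assumptions for illustration, not the paper's actual Algorithm 1.

```python
# Minimal sketch (assumption, not the paper's Algorithm 1): SGD on a low-rank
# factor L of the metric M = L @ L.T, so memory stays O(d*r) and each update
# only uses the mini-batch of sampled triplets.
import numpy as np

def triplet_hinge_sgd_step(L, X, triplets, eta, margin=1.0):
    """One SGD step on triplets (i, j, k): x_i should be closer to x_j than
    to x_k under d_M(a, b) = (a - b)^T M (a - b) with M = L L^T."""
    grad = np.zeros_like(L)
    for i, j, k in triplets:
        p = (X[i] - X[j]) @ L                    # similar pair, projected to rank r
        q = (X[i] - X[k]) @ L                    # dissimilar pair, projected
        if margin + p @ p - q @ q > 0:           # hinge loss is active
            # d/dL of (a^T L L^T a) is 2 * outer(a, a^T L)
            grad += 2.0 * (np.outer(X[i] - X[j], p) - np.outer(X[i] - X[k], q))
    return L - eta * grad / max(len(triplets), 1)

# Toy usage with placeholder data and random (not semantically meaningful) triplets.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 100))              # n x d data matrix
L = 0.1 * rng.standard_normal((100, 10))         # d x r factor, r = 10
c = 0.1                                          # constant in eta_t = c / t
for t in range(1, 101):
    batch = rng.integers(0, 500, size=(100, 3))  # 100 sampled triplets per step
    L = triplet_hinge_sgd_step(L, X, batch, eta=c / t)
```

Working through the factor L rather than a full d×d metric is what keeps the memory footprint at O(dr); all distances are computed in the projected rank-r space.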
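
The "Experiment Setup" row describes a coarse search for the constant c in ηₜ = c/t, keeping the value that yields the largest decrement of the objective. Below is a small sketch of that selection rule; `run_sgd` and `objective` are hypothetical stand-ins for Algorithm 1 and the paper's triplet objective, not real APIs, and only the grid and the selection criterion follow the quoted text.

```python
# Sketch of the quoted step-size search: eta_t = c / t, with c chosen from a
# coarse grid by the largest decrement of the objective value.
CANDIDATE_C = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]

def pick_step_size_constant(L0, run_sgd, objective, n_steps=100):
    """Return the c in CANDIDATE_C whose run decreases the objective the most.
    `run_sgd` and `objective` are hypothetical callables supplied by the user."""
    base = objective(L0)
    best_c, best_drop = None, float("-inf")
    for c in CANDIDATE_C:
        L = run_sgd(L0.copy(), c=c, n_steps=n_steps)  # uses eta_t = c / t internally
        drop = base - objective(L)                    # decrement of the objective
        if drop > best_drop:
            best_c, best_drop = c, drop
    return best_c
```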