Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Efficient Stochastic Optimization for Low-Rank Distance Metric Learning
Authors: Jie Zhang, Lijun Zhang
AAAI 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present empirical studies of the proposed method. The main purpose is to illustrate two characteristics of our method: Memory-efficient: The proposed algorithm has an O(dr) space complexity, where r is an upper bound of the rank of the intermediate solution. To verify our algorithm is memory-efficient, we need to show that all the intermediate solutions are indeed low-rank. Computation-efficient: By taking advantage of the low-rank structure of intermediate solutions and stochastic gradients, the time complexity is O(dr^2) per iteration. We will examine the convergence behavior of our method. We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995). |
| Researcher Affiliation | Academia | Jie Zhang, Lijun Zhang National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1 Efficient SGD for low-rank DML |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link or an explicit statement of code release for the methodology described. |
| Open Datasets | Yes | We will use three real-world data sets in our experiments: Gisette (Chang and Lin 2011), Dexter (Guyon et al. 2004) and News20 (Lang 1995). |
| Dataset Splits | No | The paper mentions generating training triplets but does not provide specific dataset split information such as exact percentages, sample counts, or a detailed splitting methodology for training, validation, and test sets. |
| Hardware Specification | Yes | All algorithms are tested on a computer with 3.1GHz CPU and 8GB RAM. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We set the step size η_t = c/t where c is searched in {1e-7, 1e-6, ..., 1, 10}, and we choose the one that leads to the largest decrement of the objective value. In each iteration, we randomly sample 100 triplets to construct the low-rank stochastic gradient. For SGD-Batch, we set the batch size to 10, according to the suggestion in (Qian et al. 2015a). |
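
The step-size selection quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' code (the paper releases none). The function names, the `objective` and `gradient` callables, the iteration budget, and sampling triplet indices with replacement are all assumptions made for illustration; only the schedule η_t = c/t, the grid {1e-7, ..., 10}, the 100 triplets per iteration, and the "largest decrement" criterion come from the quoted text.

```python
import random


def step_size(c, t):
    """Step size eta_t = c / t for iteration t >= 1."""
    return c / t


def candidate_constants():
    """Grid {1e-7, 1e-6, ..., 1, 10} searched for the constant c."""
    return [10.0 ** k for k in range(-7, 2)]


def sample_triplet_indices(num_triplets, rng, batch_size=100):
    """Draw indices of triplets for one stochastic gradient.

    Assumption: sampling is uniform with replacement; the quoted
    text only says that 100 triplets are sampled at random.
    """
    return [rng.randrange(num_triplets) for _ in range(batch_size)]


def run_sgd(c, objective, gradient, w0, num_iters, rng):
    """Run plain SGD with eta_t = c / t and return the final objective.

    `objective(w)` and `gradient(w, rng)` are hypothetical callables
    supplied by the caller; they stand in for the paper's DML loss.
    """
    w = w0
    for t in range(1, num_iters + 1):
        w = w - step_size(c, t) * gradient(w, rng)
    return objective(w)


def pick_constant(objective, gradient, w0, num_iters=50, seed=0):
    """Return (c, decrement) with the largest objective decrement."""
    start = objective(w0)
    best_c, best_drop = None, float("-inf")
    for c in candidate_constants():
        rng = random.Random(seed)  # same randomness for every candidate
        drop = start - run_sgd(c, objective, gradient, w0, num_iters, rng)
        if drop > best_drop:
            best_c, best_drop = c, drop
    return best_c, best_drop
```

For a toy one-dimensional problem, `pick_constant(lambda w: (w - 3.0) ** 2, lambda w, rng: 2.0 * (w - 3.0), 0.0)` returns the grid value that reduces the objective the most within the assumed budget of 50 iterations. A real replication would substitute the triplet-based loss and the low-rank stochastic gradient described in the paper.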