Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stochastic DCA with Variance Reduction and Applications in Machine Learning
Authors: Hoai An Le Thi, Hoang Phuc Hau Luu, Hoai Minh Le, Tao Pham Dinh
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To study the efficiency of our algorithms, we apply them to three important problems in machine learning: nonnegative principal component analysis, group variable selection in multiclass logistic regression, and sparse linear regression. Numerical experiments have shown the merits of our proposed algorithms in comparison with other state-of-the-art stochastic methods for solving nonconvex large-sum problems. |
| Researcher Affiliation | Academia | Hoai An Le Thi, Université de Lorraine, LGIPM, Département IA, F-57000 Metz, France; Institut Universitaire de France (IUF) ... Tao Pham Dinh, Laboratory of Mathematics, INSA-Rouen, University of Normandie, 76801 Saint-Étienne-du-Rouvray Cedex, France |
| Pseudocode | Yes | Algorithm 1 DCA-SVRG. Initialization: x0 ∈ dom r1, inner-loop length M, minibatch size b, k = 0, option (either with replacement or without replacement). repeat ... Algorithm 2 DCA-SAGA ... Algorithm 3 DCA-SVRG applied to (Q') |
| Open Source Code | No | The paper does not contain any explicit statement about the release of source code or provide a link to a code repository. It refers to third-party tools or algorithms but not its own implementation code. |
| Open Datasets | Yes | We use standard machine learning data sets in LIBSVM, namely, a9a (32561 × 123), aloi (108000 × 128), cifar10 (50000 × 3072), SensIT Vehicle (78823 × 100), connect-4 (67557 × 126), letter (15000 × 16), mnist (60000 × 780), protein (17766 × 357), shuttle (43500 × 9), Year Prediction MSD (463715 × 90). The data sets can be downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvm/. |
| Dataset Splits | No | The paper mentions a 'training set' and dataset normalizations, but does not provide explicit training/test/validation splits (e.g., percentages, sample counts, or references to standard splits for the listed datasets). |
| Hardware Specification | Yes | All numerical experiments in this section are performed on a Processor Intel(R) core(TM) i7-8700, CPU @ 3.20GHz, RAM 16 GB. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with specific versions). |
| Experiment Setup | Yes | The minibatch size b is chosen as N^(2/3), N^(2/3), 2 * N^(5/4), and 2 * sqrt(N + 1) for DCA-SVRG-v1, DCA-SVRG-v2, DCA-SAGA-v1, and DCA-SAGA-v2, respectively. We set the inner-loop length M for DCA-SVRG-v1 and DCA-SVRG-v2 to be (1/4) * e * (1/b). The fixed budget of SFO calls is 15N. For prox-SGD, η = 1/(2L) (Ghadimi et al., 2016), and we choose a neutral minibatch size of 500. |
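The hyperparameter choices quoted in the Experiment Setup row can be sketched in code for a given training-set size N. This is a minimal helper, not from the paper: the function names `minibatch_sizes` and `sfo_budget` are ours, and the ceiling rounding of non-integer batch sizes is an assumption (the excerpt does not state a rounding convention).

```python
import math


def minibatch_sizes(N: int) -> dict:
    """Minibatch sizes b as quoted in the Experiment Setup row.

    Formulas are reproduced verbatim from the excerpt; rounding
    up to an integer is our assumption, not the paper's statement.
    """
    return {
        "DCA-SVRG-v1": math.ceil(N ** (2 / 3)),
        "DCA-SVRG-v2": math.ceil(N ** (2 / 3)),
        "DCA-SAGA-v1": math.ceil(2 * N ** (5 / 4)),
        "DCA-SAGA-v2": math.ceil(2 * math.sqrt(N + 1)),
    }


def sfo_budget(N: int) -> int:
    """Fixed budget of stochastic first-order oracle (SFO) calls: 15N."""
    return 15 * N


# Example for the a9a data set (N = 32561 samples, per the Open Datasets row).
sizes = minibatch_sizes(32561)
budget = sfo_budget(32561)
```

Note that the quoted DCA-SAGA-v1 formula 2 * N^(5/4) exceeds N for the listed data sets, so it may be an extraction artifact in the excerpt; the helper reproduces it as written.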
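The Pseudocode row quotes the initialization of Algorithm 1 (DCA-SVRG) but not its body. As a rough illustration of the general DCA-with-SVRG pattern (minimize g(x) - h(x), with h an average of N components, using an SVRG-style variance-reduced estimate of h's gradient in each DCA linearization), here is a generic skeleton. It is our reconstruction, not the paper's Algorithm 1: the names `dca_svrg`, `grad_h_i`, and `solve_subproblem` are ours, and the toy problem at the end is an assumed example.

```python
import numpy as np


def dca_svrg(grad_h_i, solve_subproblem, x0, N, M, b, outer_iters, rng):
    """Illustrative DCA-SVRG-style skeleton (not the paper's exact algorithm).

    Minimizes g(x) - h(x) with h = (1/N) sum_i h_i.  Each outer iteration
    takes a snapshot and its full gradient mu; each of the M inner steps
    forms the SVRG estimator
        v = mu + mean_{i in batch}(grad_h_i(x, i) - grad_h_i(x_snap, i))
    and performs a DCA step, i.e. solves argmin_x g(x) - <v, x>
    (delegated to the caller via solve_subproblem).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(outer_iters):
        x_snap = x.copy()
        # Full gradient of h at the snapshot.
        mu = np.mean([grad_h_i(x_snap, i) for i in range(N)], axis=0)
        for _ in range(M):
            # Minibatch sampled with replacement (one of the two options
            # mentioned in the quoted initialization).
            batch = rng.choice(N, size=b, replace=True)
            corr = np.mean(
                [grad_h_i(x, i) - grad_h_i(x_snap, i) for i in batch], axis=0
            )
            v = mu + corr
            x = solve_subproblem(v)  # convex DCA subproblem
    return x


# Toy check (assumed example): g(x) = 0.5 * ||x||^2 and h_i(x) = <c_i, x>,
# so argmin g - h is mean(c), and solve_subproblem(v) = v exactly.
rng = np.random.default_rng(0)
c = rng.normal(size=(20, 3))
x_hat = dca_svrg(
    grad_h_i=lambda x, i: c[i],
    solve_subproblem=lambda v: v,
    x0=np.zeros(3), N=20, M=5, b=4, outer_iters=2, rng=rng,
)
```

The subproblem solver is passed in as a callback because, in DCA, its form depends entirely on g (e.g., a proximal map when g contains a regularizer).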