Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods
Authors: Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive empirical analysis and demonstrate its strong performance compared with its classical counterparts and other state-of-the-art first-order methods in solving convex machine learning problems. |
| Researcher Affiliation | Collaboration | Zheng Shi (IBM, United States); Abdurakhmon Sadiev (King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia); Nicolas Loizou (Johns Hopkins University, Baltimore, United States); Peter Richtárik (King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia); Martin Takáč (Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Masdar City, Abu Dhabi, United Arab Emirates) |
| Pseudocode | Yes | Algorithm 1 Theoretical-AI-SARAH [...] Algorithm 2 AI-SARAH |
| Open Source Code | Yes | See code at https://github.com/shizheng-rlfresh/ai_sarah. |
| Open Datasets | Yes | The datasets chosen for the experiments are ijcnn1, rcv1, real-sim, news20 and covtype. Table 1 shows the basic statistics of the datasets. More details and additional datasets can be found in Appendix B. LIBSVM datasets are available at https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/. |
| Dataset Splits | Yes | Table 1: Summary of Datasets from Chang & Lin (2011). For example, ijcnn1: 22 features, 49,990 training samples, 91,701 test samples, 40.91% sparsity [...] dataset is randomly split into 75% training and 25% testing. |
| Hardware Specification | Yes | In this section, we present the extended results of our empirical study on the performance of AI-SARAH. For the experiments, we used NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions 'Pytorch' multiple times, including 'These algorithms were implemented in Pytorch, where ADAM and SGD w/m are built-in optimizers of Pytorch.' However, no specific version numbers for PyTorch or other libraries are provided. |
| Experiment Setup | Yes | The paper provides specific hyperparameters: '0 < γ < 1 (default 1/32), β = 0.999' in Algorithm 2 parameters. Section 5 also states: 'We perform an extensive search on hyper-parameters: (1) ADAM and SGD with Momentum (SGD w/m) are tuned with different values of the (initial) step-size and schedules to reduce the step-size; (2) SARAH and SVRG are tuned with different values of the (constant) step-size and inner loop size; (3) SARAH+ is tuned with different values of the (constant) step-size and early stopping parameter. (See Appendix B for detailed tuning plan and the selected hyper-parameters.)' Table 3 further details the tuning plan with specific ranges for step-size, inner loop size, and early stopping parameters. |
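The pseudocode and hyper-parameter rows above reference the SARAH recursive gradient estimator that AI-SARAH builds on. As a minimal sketch (not the paper's Algorithm 2), a plain SARAH outer/inner loop on a small regularized logistic-regression problem could look like the following; the synthetic data, constant step-size `eta`, and loop sizes are illustrative assumptions. AI-SARAH's contribution is to replace the constant step-size with one computed implicitly and adaptively, governed by the quoted parameters γ (default 1/32) and β = 0.999.

```python
# Sketch of the plain SARAH estimator (variance-reduced recursive gradients)
# on synthetic l2-regularized logistic regression. Data, eta, and loop sizes
# are illustrative assumptions, not the paper's experimental settings.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.where(X @ w_true >= 0, 1.0, -1.0)   # linearly separable labels
lam = 1e-3                                  # l2 regularization (convex problem)

def grad_i(w, i):
    """Gradient of the i-th regularized logistic loss."""
    z = y[i] * (X[i] @ w)
    return -y[i] * X[i] / (1.0 + np.exp(z)) + lam * w

def full_grad(w):
    z = y * (X @ w)
    return (-(y / (1.0 + np.exp(z)))[:, None] * X).mean(axis=0) + lam * w

def loss(w):
    return np.log1p(np.exp(-y * (X @ w))).mean() + 0.5 * lam * (w @ w)

w = np.zeros(d)
eta, outer, inner = 0.1, 5, 50
for _ in range(outer):
    v = full_grad(w)                 # full gradient at each outer iteration
    w_prev, w = w, w - eta * v
    for _ in range(inner):
        i = rng.integers(n)
        # recursive estimator: v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - eta * v
```

The inner-loop size and step-size here are exactly the quantities the paper reports tuning for SARAH/SVRG baselines; AI-SARAH's adaptive step removes that tuning burden.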