Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent

Authors: Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, in Section 5, we evaluate the performance of different step-size schemes on strongly-convex supervised learning problems. We show that (A)SGD consistently outperform existing noise-adaptive algorithms.
Researcher Affiliation | Collaboration | (1) Simon Fraser University; (2) DI ENS, École normale supérieure, Université PSL, CNRS, INRIA, 75005 Paris, France; (3) SAIT AI Lab, Montreal.
Pseudocode | No | The paper describes algorithms and updates using mathematical equations (e.g., 'w_{k+1} = w_k − γ_k α_k ∇f_{i_k}(w_k)' in Eq. (2), and Eqs. (4) and (5) for ASGD), but it does not provide any explicitly labeled pseudocode or algorithm blocks. A minimal sketch of this update appears after the table.
Open Source Code | Yes | The code to reproduce our experiments is available here: https://github.com/R3za/expsls
Open Datasets | Yes | We use three standard datasets from LIBSVM (Chang and Lin, 2011): mushrooms, ijcnn and rcv1, and use λ = 0.01. A loading sketch appears after the table.
Dataset Splits | No | The paper mentions training examples and iterations but does not provide specific details on how the datasets were split into training, validation, and test sets, nor does it specify the use of cross-validation.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions using LIBSVM but does not specify version numbers for any software components, libraries, or programming languages used in the experiments.
Experiment Setup | Yes | For each dataset, we fix T = 10n, use a batch-size of 1 and compare the performance of the following optimization strategies: ... We use λ = 0.01 ... To set ρ, we use a grid search over {10, 100, 1000}. Similarly, to set p and C for M-ASG, we use a grid search over {1, 2, 4} and {2, 10, 100} respectively. A grid-search sketch appears after the table.
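To make the update quoted in the Pseudocode row concrete, here is a minimal sketch of the SGD step w_{k+1} = w_k − γ_k α_k ∇f_{i_k}(w_k) on a toy regularized least-squares problem. The exponential schedule α_k = (1/T)^{k/T}, the constant γ, and the synthetic data are illustrative assumptions, not the paper's exact choices.

```python
import numpy as np

# Sketch of the SGD update from Eq. (2):
#   w_{k+1} = w_k - gamma_k * alpha_k * grad f_{i_k}(w_k)
# on a toy regularized least-squares problem. The schedule
# alpha_k = (1/T)**(k/T) is one exponentially decreasing choice;
# the paper's exact schedules and constants may differ.

rng = np.random.default_rng(0)
n, d = 100, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)
lam = 0.01  # regularization strength, matching the lambda in the experiments

def stochastic_grad(w, i):
    # Gradient of the i-th regularized least-squares loss f_i(w)
    return (A[i] @ w - b[i]) * A[i] + lam * w

T = 10 * n        # horizon T = 10n with batch-size 1, as in the setup
gamma = 0.1       # illustrative constant step-size
w = np.zeros(d)
for k in range(T):
    alpha_k = (1.0 / T) ** (k / T)   # decays exponentially from 1 to 1/T
    i_k = rng.integers(n)
    w = w - gamma * alpha_k * stochastic_grad(w, i_k)
```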
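Since the datasets come from LIBSVM, they can be read with scikit-learn's svmlight loader. The file names below are assumptions (LIBSVM distributes them as ijcnn1 and rcv1_train.binary); the files must first be downloaded from the LIBSVM dataset page.

```python
from sklearn.datasets import load_svmlight_file

# Load the three LIBSVM-format datasets used in the experiments.
# File names/paths are assumptions; download the files beforehand from
# https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
for name in ["mushrooms", "ijcnn1", "rcv1_train.binary"]:
    X, y = load_svmlight_file(name)  # sparse feature matrix, label vector
    print(name, X.shape, y.shape)
```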
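The hyper-parameter grids in the Experiment Setup row can be searched exhaustively. Only the grids themselves come from the paper; the helper run_experiment below is a hypothetical stand-in for training with the given hyper-parameters and returning the final training loss.

```python
from itertools import product

def run_experiment(**hyperparams):
    # Hypothetical placeholder: in the real code this would train
    # (A)SGD or M-ASG with the given hyper-parameters and return
    # the achieved loss. Here it just returns a dummy score.
    return sum(hyperparams.values())

rho_grid = [10, 100, 1000]                  # grid for rho
p_grid, C_grid = [1, 2, 4], [2, 10, 100]    # grids for M-ASG's p and C

# Pick the value of rho with the lowest (dummy) loss.
best_rho = min(rho_grid, key=lambda r: run_experiment(rho=r))

# Joint grid search over (p, C) for M-ASG.
best_p, best_C = min(product(p_grid, C_grid),
                     key=lambda pc: run_experiment(p=pc[0], C=pc[1]))
print(best_rho, best_p, best_C)
```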