Non-convex Stochastic Composite Optimization with Polyak Momentum

Authors: Yuan Gao, Anton Rodomanov, Sebastian U. Stich

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we provide numerical experiments to validate our theoretical results."
Researcher Affiliation | Academia | "¹CISPA, Saarbrücken, Germany; ²Universität des Saarlandes"
Pseudocode | Yes | "Algorithm 1: Proximal Gradient Method with Polyak Momentum" (a code sketch follows the table)
Open Source Code | No | The paper does not provide any concrete statement or link regarding open-sourcing of the code for the methodology it describes.
Open Datasets | Yes | "We evaluate the performances of Algorithm 1 and the vanilla stochastic proximal gradient method on the Cifar-10 dataset (Krizhevsky et al., 2014) with the Resnet-18 (He et al., 2016)."
Dataset Splits | No | The paper mentions the Cifar-10 dataset and reports training loss and test accuracy, but it does not explicitly describe training/validation/test splits or how the data was partitioned.
Hardware Specification | No | The paper does not provide hardware details such as exact GPU/CPU models, processor types, or memory amounts used for its experiments.
Software Dependencies | No | The paper mentions components such as ResNet-18 and SGD, but it does not specify version numbers for any software dependencies or libraries needed to replicate the experiments.
Experiment Setup | Yes | "The parameter M is tuned by a grid search in {10^0, 10^1, 10^2, 10^3, 10^4} for all methods, and the momentum parameter γ is tuned by a grid search in {10^-1, 10^-2, 10^-3, 10^-4, 10^-5}. We set the maximum number of iterations to be 10^4, and the tolerance is 0.02. We use a batch size of 256 and run 300 epochs. We use the standard step size parameter M = 10 (corresponding to a learning rate of 0.1) for the experiment." (the sketches below reconstruct this setup)
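
For the Pseudocode row: the paper states Algorithm 1 (Proximal Gradient Method with Polyak Momentum) but, per the Open Source Code row, releases no implementation. Below is a minimal sketch of one standard form of that update, stochastic proximal gradient with a Polyak-style momentum buffer, using an L1 term as a stand-in composite regularizer. The function names, the prox choice, and the exact averaging convention are illustrative assumptions, not the authors' code.

    import numpy as np

    def prox_l1(v, lam):
        # Soft-thresholding: proximal operator of lam * ||x||_1.
        return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

    def prox_sgd_polyak(stoch_grad, x0, M=10.0, gamma=1e-2, lam=1e-3,
                        max_iters=10_000, tol=0.02):
        # Sketch of stochastic proximal gradient with Polyak momentum.
        # stoch_grad(x) returns a stochastic gradient of the smooth part f;
        # the composite term here is lam * ||x||_1, handled via its prox.
        # Defaults mirror the quoted setup: step size 1/M with M = 10,
        # at most 10^4 iterations, stopping tolerance 0.02 (assumed to be
        # measured on the iterate movement).
        x = x0.astype(float)
        z = stoch_grad(x)            # momentum buffer (moving gradient estimate)
        eta = 1.0 / M                # step size 1/M (M = 10 -> learning rate 0.1)
        for _ in range(max_iters):
            # Polyak-style momentum: exponential averaging of stochastic gradients.
            z = (1.0 - gamma) * z + gamma * stoch_grad(x)
            # Composite (proximal) step along the momentum direction.
            x_next = prox_l1(x - eta * z, eta * lam)
            if np.linalg.norm(x_next - x) <= tol:
                break
            x = x_next
        return x

Under this exponential-averaging convention a smaller gamma means heavier momentum, which is consistent with the paper tuning γ over {10^-1, ..., 10^-5}.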
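
For the Open Datasets and Experiment Setup rows: a hedged reconstruction of the deep-learning experiment (CIFAR-10 with ResNet-18, batch size 256, learning rate 0.1 from M = 10, 300 epochs) using PyTorch/torchvision. Since no code or software versions are released, the data transform, the momentum value 0.9, and the use of torch.optim.SGD's built-in momentum as a stand-in for the paper's method are all assumptions.

    import torch
    import torchvision
    from torchvision import transforms

    # Quoted setup: CIFAR-10, ResNet-18, batch size 256, 300 epochs, lr 0.1 (M = 10).
    transform = transforms.ToTensor()
    train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                             download=True, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=256,
                                               shuffle=True)

    model = torchvision.models.resnet18(num_classes=10)  # ResNet-18 (He et al., 2016)
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

    model.train()
    for epoch in range(300):
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

To reproduce the paper's method rather than this baseline, the optimizer step would be replaced by the proximal-momentum update sketched above.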