On Convergence of Incremental Gradient for Non-convex Smooth Functions
Authors: Anastasia Koloskova, Nikita Doikov, Sebastian U. Stich, Martin Jaggi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present illustrative numerical experiments comparing different strategies for selecting stochastic gradients: SGD (sampling gradients with replacement), Single Shuffle (SS, using one random permutation for all epochs), and Random Reshuffling (RR, generating a new permutation for each epoch). We demonstrate that both shuffle strategies are not only beneficial due to their simpler and faster implementations, but also achieve comparable or even better convergence than plain SGD. (An illustrative sketch of these three sampling strategies follows the table.) |
| Researcher Affiliation | Academia | ¹Machine Learning and Optimization Laboratory (MLO), EPFL, Lausanne, Switzerland; ²CISPA Helmholtz Center for Information Security, Saarbrücken, Germany. |
| Pseudocode | No | The paper presents 'Algorithm (2)' only as a mathematical equation for the update rule and does not lay it out in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Logistic regression on the Australian dataset (Chang & Lin, 2011); logistic regression models trained on machine learning datasets from (Chang & Lin, 2011); the MNIST dataset; the CIFAR dataset. |
| Dataset Splits | No | The paper does not explicitly state training/validation/test dataset splits using percentages, absolute counts, or references to predefined splits. It mentions using training and testing data for some experiments, but gives no details on the splits and makes no explicit mention of a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory specifications used for running the experiments. It only generally refers to the 'computational environment' in the Appendix. |
| Software Dependencies | No | The paper states 'The methods are implemented in Python 3.' but does not provide specific version numbers for Python libraries, frameworks (like PyTorch or TensorFlow), or other software dependencies. |
| Experiment Setup | Yes | We tune the stepsize over a fixed grid separately for each method and for each n; we apply all the methods starting from x0 = 0 and using a constant stepsize γ > 0, varying several values of γ; a three-layer neural network (one convolutional layer and two fully-connected layers with tanh activation functions) with a total number of parameters d = 140697; sampling batches of a fixed size 256. (A hedged sketch of such a network follows the table.) |
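
The three gradient-selection strategies quoted in the Research Type row (SGD with replacement, Single Shuffle, Random Reshuffling) and the constant-stepsize incremental-gradient update from the Experiment Setup row can be summarised in a few lines of code. The sketch below is an illustration written for this report, not the authors' implementation (the paper releases no code); the helper names `index_stream` and `run`, the use of NumPy, and the least-squares example are all assumptions made for illustration.

```python
import numpy as np

def index_stream(n, epochs, strategy, seed=0):
    """Yield component indices i for the incremental-gradient loop under the
    three samplers compared in the paper: SGD (i.i.d. with replacement),
    Single Shuffle (SS, one permutation reused every epoch), and
    Random Reshuffling (RR, a fresh permutation each epoch)."""
    rng = np.random.default_rng(seed)
    fixed_perm = rng.permutation(n)              # drawn once, reused by SS
    for _ in range(epochs):
        if strategy == "SGD":
            yield from rng.integers(0, n, size=n)
        elif strategy == "SS":
            yield from fixed_perm
        elif strategy == "RR":
            yield from rng.permutation(n)
        else:
            raise ValueError(f"unknown strategy: {strategy}")

def run(grad, x0, n, epochs, gamma, strategy):
    """Constant-stepsize incremental gradient: x <- x - gamma * grad_i(x)."""
    x = x0.copy()
    for i in index_stream(n, epochs, strategy):
        x = x - gamma * grad(i, x)
    return x

# Toy usage: least-squares components f_i(x) = 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(50, 10)), rng.normal(size=50)
grad = lambda i, x: A[i] * (A[i] @ x - b[i])
for strategy in ("SGD", "SS", "RR"):
    x = run(grad, np.zeros(10), n=50, epochs=200, gamma=0.01, strategy=strategy)
    print(strategy, 0.5 * np.mean((A @ x - b) ** 2))
```

In a stepsize sweep like the one described in the Experiment Setup row, `gamma` would be varied over a fixed grid separately for each strategy.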
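
The neural network in the Experiment Setup row is described only at a high level (one convolutional layer, two fully-connected layers, tanh activations, d = 140697 parameters in total). Below is a hedged PyTorch-style sketch of such a model, assuming an MNIST-shaped input (1×28×28); the framework choice, channel width, kernel size, and hidden size are guesses made for illustration, so the parameter count will not match the reported d = 140697.

```python
import torch
import torch.nn as nn

class ThreeLayerNet(nn.Module):
    """Hypothetical reconstruction of the described model: one convolutional
    layer and two fully-connected layers with tanh activations. Layer sizes
    are assumptions, so the total parameter count differs from d = 140697."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # assumes 1x28x28 input
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(8 * 14 * 14, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = torch.tanh(self.pool(self.conv(x)))
        x = torch.tanh(self.fc1(x.flatten(1)))
        return self.fc2(x)

print(sum(p.numel() for p in ThreeLayerNet().parameters()))  # ~101k with these guessed sizes
```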