Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

New Convergence Aspects of Stochastic Gradient Algorithms

Authors: Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč, Marten van Dijk

JMLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For our numerical experiments, we consider the finite sum minimization problem in (2). We consider ℓ2-regularized logistic regression problems with f_i(w) = log(1 + exp(−y_i⟨x_i, w⟩)) + (λ/2)‖w‖². ... We conducted experiments on a single core for Algorithm 2 on two popular data sets ijcnn1 (n = 91,701 training data) and covtype (n = 406,709 training data) from LIBSVM (Chang and Lin, 2011) data sets. ... For the two data sets, Figures 1 and 3 plot the training loss for each fraction with τ = 10. The top plots have t, the number of coordinate updates, for the horizontal axis. The bottom plots have the number of epochs, each epoch counting n iterations, for the horizontal axis. The results show that each fraction shows a sublinear expected convergence rate of O(1/t); the smaller fractions exhibit larger deviations but do seem to converge faster to the minimum solution.
Researcher Affiliation | Collaboration | IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA; Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06268, USA; Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, KSA; School of Operations Research and Information Engineering, Cornell University, Ithaca, NY 14850, USA; Department of Industrial and Systems Engineering, Lehigh University, Bethlehem, PA 18015, USA
Pseudocode | Yes | Algorithm 1: Stochastic Gradient Descent (SGD) Method; Algorithm 2: Hogwild! general recursion
Open Source Code | No | The paper does not provide an explicit statement about releasing its own implementation code, nor a link to a code repository.
Open Datasets | Yes | We conducted experiments on a single core for Algorithm 2 on two popular data sets ijcnn1 (n = 91,701 training data) and covtype (n = 406,709 training data) from LIBSVM (Chang and Lin, 2011) data sets.
Dataset Splits | No | The paper mentions 'ijcnn1 (n = 91,701 training data)' and 'covtype (n = 406,709 training data)' but does not provide specific train/test/validation split ratios, sample counts, or splitting methods for reproducibility.
Hardware Specification | No | The paper states 'We conducted experiments on a single core for Algorithm 2' but does not specify a particular CPU model, GPU, memory size, or other detailed hardware specifications.
Software Dependencies | No | The paper cites 'LIBSVM (Chang and Lin, 2011)' as the source of its datasets, which is a software library, but it does not specify any software dependencies with version numbers for the authors' own implementation.
Experiment Setup | Yes | For our numerical experiments, we consider the finite sum minimization problem in (2). ... The penalty parameter λ is set to 1/n, a widely-used value in the literature (Le Roux et al., 2012). ... We choose the step size based on Theorem 6, i.e., η_t = 4/(µ(t + E)) with E = max{2τ, 16LD/µ}. For each fraction v ∈ {1, 3/4, 2/3, 1/2, 1/3, 1/4} we performed the following experiment: ... In addition we use τ = 10.
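The experiment setup quoted above can be illustrated with a minimal single-threaded SGD sketch on ℓ2-regularized logistic regression using the diminishing step size η_t = 4/(µ(t + E)). This is not the authors' implementation: the constants L and D are placeholder assumptions (the paper derives them from problem-specific smoothness and gradient-bound constants), and the data here is synthetic.

```python
import numpy as np

def sgd_logreg(X, y, n_epochs=10, tau=10, L=0.25, D=1.0, seed=0):
    """Illustrative SGD for f_i(w) = log(1 + exp(-y_i <x_i, w>)) + (lam/2)||w||^2,
    with the step size eta_t = 4 / (mu * (t + E)), E = max{2*tau, 16*L*D/mu}.
    L and D are placeholder constants, not the paper's derived values."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    lam = 1.0 / n            # penalty lambda = 1/n, as in the quoted setup
    mu = lam                 # strong-convexity constant of the regularized objective
    E = max(2 * tau, 16 * L * D / mu)
    w = np.zeros(d)
    t = 0
    for _ in range(n_epochs):
        # one epoch counts n sampled iterations, matching the quoted convention
        for i in rng.integers(0, n, size=n):
            eta = 4.0 / (mu * (t + E))
            margin = y[i] * X[i].dot(w)
            # stochastic gradient: -y_i * sigmoid(-margin) * x_i + lam * w
            g = -y[i] * X[i] / (1.0 + np.exp(margin)) + lam * w
            w -= eta * g
            t += 1
    return w
```

On linearly separable synthetic data this schedule steadily drives the regularized training loss below its value at w = 0, consistent with the O(1/t) expected rate the report quotes.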