Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
New Convergence Aspects of Stochastic Gradient Algorithms
Authors: Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč, Marten van Dijk
JMLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For our numerical experiments, we consider the finite sum minimization problem in (2). We consider ℓ2-regularized logistic regression problems with f_i(w) = log(1 + exp(−y_i⟨x_i, w⟩)) + (λ/2)‖w‖². ... We conducted experiments on a single core for Algorithm 2 on two popular data sets ijcnn1 (n = 91,701 training data) and covtype (n = 406,709 training data) from LIBSVM (Chang and Lin, 2011) data sets. ... For the two data sets, Figures 1 and 3 plot the training loss for each fraction with τ = 10. The top plots have t', the number of coordinate updates, for the horizontal axis. The bottom plots have the number of epochs, each epoch counting n iterations, for the horizontal axis. The results show that each fraction shows a sublinear expected convergence rate of O(1/t'); the smaller fractions exhibit larger deviations but do seem to converge faster to the minimum solution. |
| Researcher Affiliation | Collaboration | IBM Research, Thomas J. Watson Research Center Yorktown Heights, NY 10598, USA; Department of Electrical and Computer Engineering University of Connecticut, Storrs, CT 06268, USA; Computer, Electrical and Math. Sciences and Engineering Division King Abdullah University of Science and Technology, Thuwal, KSA; School of Operations Research and Information Engineering Cornell University, Ithaca, NY 14850, USA; Department of Industrial and Systems Engineering Lehigh University, Bethlehem, PA 18015, USA |
| Pseudocode | Yes | Algorithm 1 Stochastic Gradient Descent (SGD) Method; Algorithm 2 Hogwild! general recursion |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own implementation code or a link to a code repository. |
| Open Datasets | Yes | We conducted experiments on a single core for Algorithm 2 on two popular data sets ijcnn1 (n = 91,701 training data) and covtype (n = 406,709 training data) from LIBSVM (Chang and Lin, 2011) data sets. |
| Dataset Splits | No | The paper mentions 'ijcnn1 (n = 91,701 training data)' and 'covtype (n = 406,709 training data)' but does not provide specific train/test/validation split ratios, sample counts, or methods for reproducibility. |
| Hardware Specification | No | The paper states 'We conducted experiments on a single core for Algorithm 2' but does not specify any particular CPU model, GPU, memory, or other detailed hardware specifications. |
| Software Dependencies | No | The paper cites 'LIBSVM (Chang and Lin, 2011)' as the source for datasets, which is a software library, but it does not specify any software dependencies with version numbers for the authors' own implementation. |
| Experiment Setup | Yes | For our numerical experiments, we consider the finite sum minimization problem in (2). ... The penalty parameter λ is set to 1/n, a widely-used value in literature (Le Roux et al., 2012). ... We choose the step size based on Theorem 6, i.e., η_t = 4/(µ(t + E)) and E = max{2τ, 16LD/µ}. For each fraction v ∈ {1, 3/4, 2/3, 1/2, 1/3, 1/4} we performed the following experiment: ... In addition we use τ = 10. |
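As a rough illustration of the experiment setup quoted above, the sketch below runs plain SGD on an ℓ2-regularized logistic regression objective with λ = 1/n and the diminishing step size η_t = 4/(µ(t + E)) from Theorem 6. This is a minimal sketch under stated assumptions, not the authors' code: synthetic data stands in for the LIBSVM ijcnn1/covtype sets, µ is taken to be λ (the strong-convexity constant contributed by the regularizer), and E is a hand-picked illustrative constant, since the paper's E = max{2τ, 16LD/µ} depends on problem quantities L, D, τ not reproduced here.

```python
import numpy as np

# Synthetic stand-in for the LIBSVM data sets used in the paper.
rng = np.random.default_rng(0)
n, d = 1000, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true)  # labels in {-1, +1}

lam = 1.0 / n   # penalty parameter lambda = 1/n, as in the paper
mu = lam        # assumed strong-convexity constant (from the l2 term)
E = 4.0 / mu    # illustrative; the paper uses E = max{2*tau, 16*L*D/mu}

def grad_i(w, i):
    """Stochastic gradient of f_i(w) = log(1 + exp(-y_i <x_i, w>)) + (lam/2)||w||^2."""
    margin = np.clip(y[i] * X[i].dot(w), -30.0, 30.0)  # clip for numerical safety
    return -y[i] * X[i] / (1.0 + np.exp(margin)) + lam * w

def full_loss(w):
    """Full finite-sum objective (mean logistic loss + l2 penalty)."""
    margins = y * (X @ w)
    return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * lam * w.dot(w)

w = np.zeros(d)
for t in range(20 * n):  # 20 epochs, each epoch counting n iterations
    i = rng.integers(n)
    eta_t = 4.0 / (mu * (t + E))  # diminishing step size from Theorem 6
    w -= eta_t * grad_i(w, i)
```

With this schedule the training loss decays sublinearly, consistent with the O(1/t) expected convergence rate the paper reports; the single-worker loop here omits the Hogwild!-style parallel updates (Algorithm 2) that the fractions v and delay bound τ parameterize.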