Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Better Theory for SGD in the Nonconvex World
Authors: Ahmed Khaled, Peter Richtárik
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We corroborate our theoretical results with experiments on real and synthetic data. |
| Researcher Affiliation | Academia | Ahmed Khaled (Princeton University); Peter Richtárik (King Abdullah University of Science and Technology) |
| Pseudocode | No | The paper describes algorithms and methods mathematically and in paragraph text but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about the release of source code or links to a code repository. |
| Open Datasets | Yes | We run experiments on the a9a dataset (n = 32561 and d = 123) from LIBSVM (Chang & Lin, 2011). |
| Dataset Splits | No | The paper mentions using the 'a9a dataset' but does not specify any training, testing, or validation splits (e.g., percentages, sample counts, or references to predefined splits). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as CPU or GPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions 'LIBSVM (Chang & Lin, 2011)' but does not specify a version number for this or any other software dependency, which is required for reproducibility. |
| Experiment Setup | Yes | We use n = 1000 and d = 50 and initialize x = 0. We sample minibatches of size τ = 10 with replacement and use γ = 0.1/LAK, where K = 5000 is the number of iterations and A is as in Proposition 3. [...] We fix τ = 1, and run SGD for K = 500 iterations with a stepsize γ = 1/LAK as in the previous experiment. |
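The synthetic setup quoted in the Experiment Setup row (n = 1000, d = 50, x initialized to 0, minibatches of size τ = 10 sampled with replacement, K = 5000 iterations) can be sketched roughly as follows. This is a minimal illustration, not the authors' code, since the paper releases none. The least-squares objective, the planted vector `w_true`, the noise level, and the helper names are all hypothetical. The stepsize in the paper depends on the constants L and A (A as in Proposition 3), which are not reproduced in this summary, so `gamma` is passed in directly and the value used below is only a placeholder.

```python
import random


def make_least_squares(n, d, noise=0.1, seed=0):
    """Hypothetical synthetic data: targets = features . w_true + Gaussian noise."""
    rng = random.Random(seed)
    w_true = [1.0] * d  # assumed planted solution, not from the paper
    features = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    targets = [
        sum(a * w for a, w in zip(row, w_true)) + rng.gauss(0.0, noise)
        for row in features
    ]
    return features, targets


def loss(features, targets, x):
    """Average of f_i(x) = 0.5 * (a_i . x - b_i)^2 over all n samples."""
    total = 0.0
    for row, b in zip(features, targets):
        residual = sum(a * xj for a, xj in zip(row, x)) - b
        total += 0.5 * residual * residual
    return total / len(features)


def minibatch_sgd(features, targets, gamma, tau, K, seed=0):
    """SGD from x = 0 with minibatches of size tau drawn with replacement."""
    rng = random.Random(seed)
    n, d = len(features), len(features[0])
    x = [0.0] * d
    for _ in range(K):
        # Sampling with replacement: the same index may appear more than once.
        batch = [rng.randrange(n) for _ in range(tau)]
        grad = [0.0] * d
        for i in batch:
            row = features[i]
            residual = sum(a * xj for a, xj in zip(row, x)) - targets[i]
            for j in range(d):
                grad[j] += residual * row[j] / tau
        x = [xj - gamma * gj for xj, gj in zip(x, grad)]
    return x


if __name__ == "__main__":
    # Sizes follow the quoted setup; gamma is a placeholder, not the paper's value.
    features, targets = make_least_squares(n=1000, d=50)
    x = minibatch_sgd(features, targets, gamma=1e-3, tau=10, K=5000)
    print("initial loss:", loss(features, targets, [0.0] * 50))
    print("final loss:  ", loss(features, targets, x))
```

To mirror the second quoted configuration, the same function could be called with `tau=1` and `K=500`, again with a stepsize chosen from the paper's formula once L and A are known.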