Adaptive Gradient Descent without Descent
Authors: Yura Malitsky, Konstantin Mishchenko
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We examine its performance on a range of convex and nonconvex problems, including logistic regression and matrix factorization. |
| Researcher Affiliation | Academia | ¹EPFL, Lausanne, Switzerland; ²KAUST, Thuwal, Saudi Arabia. |
| Pseudocode | Yes | Algorithm 1 Adaptive gradient descent |
| Open Source Code | Yes | See https://github.com/ymalitsky/adaptive_gd |
| Open Datasets | Yes | We use mushrooms and covtype datasets to run the experiments. For the experiments we used the Movielens 100K dataset (Harper & Konstan, 2016); [we] train them to classify images from the Cifar10 dataset (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper uses standard datasets such as Cifar10 but does not provide train/validation/test split percentages, sample counts, or the methodology used to create these splits in the main text. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU, GPU models, memory, cloud instance types) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2017)' as an implementation framework but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We use batch size 128 for all methods. For our method, we observed that 1/L_k works better than 1/(2L_k). We ran it with √(1 + γθ_k) in the other factor, with values of γ from {1, 0.1, 0.05, 0.02, 0.01}; γ = 0.02 performed the best. |
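
Since the table only quotes the pseudocode header ("Algorithm 1 Adaptive gradient descent"), the sketch below illustrates the adaptive step-size rule that algorithm describes, which may help when checking reproducibility. It assumes a NumPy gradient oracle; the function `adgd`, its arguments, and the quadratic example are illustrative names only, not the authors' released implementation (see the repository linked in the Open Source Code row).

```python
import numpy as np

def adgd(grad_f, x0, lam0=1e-6, n_iters=1000):
    """Gradient descent with the adaptive step size
    lambda_k = min( sqrt(1 + theta_{k-1}) * lambda_{k-1},
                    ||x_k - x_{k-1}|| / (2 ||grad f(x_k) - grad f(x_{k-1})||) )."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad_f(x_prev)
    x = x_prev - lam0 * g_prev           # one plain gradient step to initialize
    lam_prev, theta_prev = lam0, np.inf  # theta_0 = +inf, as in Algorithm 1
    for _ in range(n_iters):
        g = grad_f(x)
        denom = 2.0 * np.linalg.norm(g - g_prev)
        if denom > 0:
            # local curvature estimate 1/(2 L_k); per the Experiment Setup row,
            # the authors observed the looser 1/L_k variant works better on Cifar10
            lam = min(np.sqrt(1.0 + theta_prev) * lam_prev,
                      np.linalg.norm(x - x_prev) / denom)
        else:
            lam = lam_prev               # gradient unchanged: keep previous step
        x_prev, g_prev = x, g
        x = x - lam * g
        theta_prev, lam_prev = lam / lam_prev, lam
    return x

# Hypothetical usage: minimize an ill-conditioned quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0, 100.0])
x_min = adgd(lambda x: A @ x, x0=np.ones(3))
```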