Matrix-Free Preconditioning in Online Learning
Authors: Ashok Cutkosky, Tamas Sarlos
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude by benchmarking our algorithm on synthetic data and deep learning tasks. In Section 6 we provide an empirical evaluation of our algorithm. We implemented RECURSIVEOPTIMIZER in TensorFlow (Abadi et al., 2016) and ran benchmarks on both synthetic data as well as several deep learning tasks (see Appendix D for full details). |
| Researcher Affiliation | Industry | 1Google Research, California, USA. |
| Pseudocode | Yes | Algorithm 1 Recursive Optimizer; Algorithm 2 Diagonal Betting Algorithm (an illustrative coin-betting sketch follows the table) |
| Open Source Code | Yes | Code available at: https://github.com/google-research/google-research/tree/master/recursive_optimizer |
| Open Datasets | Yes | We test RECURSIVEOPTIMIZER on benchmark deep learning models. Specifically, we test performance with the ResNet-32 (He et al., 2016) model on the CIFAR-10 image recognition dataset (Krizhevsky & Hinton, 2009) and the Transformer model (Vaswani et al., 2017; 2018) on LM1B (Chelba et al., 2013) and other textual datasets. |
| Dataset Splits | No | The paper mentions 'train and test error' and 'holdout set' but does not provide specific details on how the datasets were split into training, validation, and test sets, such as exact percentages or sample counts. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions implementing the algorithm in 'TensorFlow (Abadi et al., 2016)' but does not specify the version number of TensorFlow or any other software dependencies with version numbers. |
| Experiment Setup | No | The paper states 'For full implementation details, see Appendix D.' and 'See Appendix D.2 for details on the momentum heuristic.' However, these appendices are not provided in the given text. The main text describes the synthetic data generation and the models used (ResNet-32, Transformer) but does not provide concrete hyperparameters (e.g., learning rate, batch size, epochs) or system-level training settings. |
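
The Pseudocode row above points to Algorithm 2, a diagonal betting scheme. As a rough, non-authoritative illustration of the coin-betting family those algorithms build on, here is a minimal per-coordinate Krichevsky-Trofimov bettor in NumPy. The class name `DiagonalKTBettor`, the initial-wealth parameter, and the clipped-gradient demo are all our own assumptions for illustration; the paper's actual RECURSIVEOPTIMIZER updates are those in its Algorithms 1-2 and the linked repository.

```python
import numpy as np

class DiagonalKTBettor:
    """Per-coordinate Krichevsky-Trofimov coin bettor (illustrative only).

    This is the classic diagonal coin-betting scheme from the parameter-free
    online learning literature, NOT the paper's RECURSIVEOPTIMIZER; see
    Algorithms 1-2 and the official repository for the real updates.
    Assumes per-coordinate gradients are bounded in [-1, 1].
    """

    def __init__(self, dim, initial_wealth=1.0):
        self.t = 0                                   # rounds observed so far
        self.neg_grad_sum = np.zeros(dim)            # running sum of -g_i
        self.wealth = np.full(dim, initial_wealth)   # per-coordinate wealth
        self.w = np.zeros(dim)                       # current iterate (bet)

    def step(self, grad):
        # Settle the previous bet: each coordinate's wealth changes by -g * w.
        self.wealth -= grad * self.w
        self.neg_grad_sum -= grad
        self.t += 1
        # KT betting fraction for the next round: (sum of -g_i) / (t + 1).
        beta = self.neg_grad_sum / (self.t + 1)
        # Bet that fraction of the current wealth on each coordinate.
        self.w = beta * self.wealth
        return self.w

# Minimal demo on a synthetic 1-d objective with minimizer at w = 3;
# gradients are clipped to [-1, 1] to respect the bettor's assumption.
opt = DiagonalKTBettor(dim=1)
w = np.zeros(1)
for _ in range(2000):
    grad = np.clip(w - 3.0, -1.0, 1.0)
    w = opt.step(grad)
print(w)  # drifts toward the minimizer at 3 without any step-size tuning
```

Note that no learning rate appears anywhere in the sketch: the betting fraction adapts on its own, which is the usual appeal of this family and plausibly why the reproducibility gap in the Experiment Setup row centers on system-level settings rather than per-task learning-rate sweeps.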