Efficient Second Order Online Learning by Sketching
Authors: Haipeng Luo, Alekh Agarwal, Nicolò Cesa-Bianchi, John Langford
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate our algorithm using the sparse Oja sketch (called Oja-SON) against first order methods such as diagonalized ADAGRAD [6, 25] on both ill-conditioned synthetic and a suite of real-world datasets. As Fig. 1 shows for a synthetic problem, we observe substantial performance gains as data conditioning worsens. On the real-world datasets, we find improvements in some instances, while observing no substantial second-order signal in the others. |
| Researcher Affiliation | Collaboration | Haipeng Luo, Princeton University, Princeton, NJ, USA (haipengl@cs.princeton.edu); Alekh Agarwal, Microsoft Research, New York, NY, USA (alekha@microsoft.com); Nicolò Cesa-Bianchi, Università degli Studi di Milano, Italy (nicolo.cesa-bianchi@unimi.it); John Langford, Microsoft Research, New York, NY, USA (jcl@microsoft.com) |
| Pseudocode | Yes | Algorithm 1: Sketched Online Newton (SON); Algorithm 4: Sparse Sketched Online Newton with Oja's Algorithm; Algorithm 5: Sparse Oja's Sketch (a minimal illustrative sketch of the Oja update follows the table). |
| Open Source Code | No | The paper mentions implementing their algorithm in Vowpal Wabbit ('We implemented the sparse version of Oja-SON in Vowpal Wabbit', with a footnote describing it as an open source machine learning toolkit available at http://hunch.net/~vw), but it does not state that *their* implementation's source code is being released or provided. |
| Open Datasets | Yes | Table 1 describes the 23 publicly available datasets used in our evaluation. All the datasets are from UCI and LIBSVM repository. |
| Dataset Splits | No | For each dataset, we randomly split 80% of data for training and 20% for testing. An 80/20 train/test split is described, but there is no explicit mention of a separate validation split (a generic split sketch follows the table). |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using Vowpal Wabbit ('We implemented the sparse version of Oja-SON in Vowpal Wabbit'), but it does not provide specific version numbers for Vowpal Wabbit or any other software dependencies. |
| Experiment Setup | Yes | Hyperparameters. For ADAGRAD, we sweep the stepsize parameter over {2^-3, 2^-2, ..., 2^6}. ... We keep the stepsize matrix in Oja-SON fixed as Γ_t = (1/t)·I_m throughout. (A sketch of this stepsize grid follows the table.) |
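
As a companion to the pseudocode row, below is a minimal sketch of an Oja-style update for an m × d sketch matrix in Python/NumPy. It is not the paper's Algorithm 5, which additionally maintains eigenvalue estimates and exploits sparsity; the function name `oja_sketch_update`, the dense QR re-orthonormalization, and the synthetic gradient stream are assumptions made for illustration only.

```python
import numpy as np

def oja_sketch_update(S, g, step):
    """One Oja-style update of an m x d sketch matrix S using gradient g.

    Illustrative only: the paper's Algorithm 5 also tracks eigenvalue
    estimates and uses sparse updates, both omitted here.
    """
    # Oja's rule: push the sketch toward the direction of the new gradient.
    S = S + step * np.outer(S @ g, g)
    # Re-orthonormalize the rows with a thin QR decomposition.
    Q, _ = np.linalg.qr(S.T)
    return Q.T

# Hypothetical usage: track the top-m directions of a gradient stream.
rng = np.random.default_rng(0)
d, m = 100, 5
S = np.linalg.qr(rng.standard_normal((d, m)))[0].T  # random orthonormal m x d start
for t in range(1, 1001):
    g = rng.standard_normal(d)                 # stand-in for a loss gradient
    S = oja_sketch_update(S, g, step=1.0 / t)  # stepsize 1/t, mirroring Γ_t = (1/t)·I_m
```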
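
For the dataset-splits row, a generic 80/20 random split looks like the sketch below. The paper states only the ratio, so the seed and the index-based implementation are assumptions.

```python
import numpy as np

def random_split(n_examples, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) for a random 80/20 split.

    Generic sketch: the paper states only the 80%/20% ratio, not the
    seeding or shuffling procedure used.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_examples)
    n_test = int(round(test_frac * n_examples))
    return idx[n_test:], idx[:n_test]
```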
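
Finally, the ADAGRAD stepsize grid quoted in the experiment-setup row can be enumerated as below. The exponent range -3..6 follows the reconstructed set {2^-3, ..., 2^6} and should be treated as an assumption about the original text.

```python
# Stepsize grid for the ADAGRAD sweep, assuming the quoted set is {2^-3, ..., 2^6}.
adagrad_stepsizes = [2.0 ** k for k in range(-3, 7)]
# -> [0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```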