Efficient Second Order Online Learning by Sketching

Authors: Haipeng Luo, Alekh Agarwal, Nicolò Cesa-Bianchi, John Langford

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we evaluate our algorithm using the sparse Oja sketch (called Oja-SON) against first order methods such as diagonalized ADAGRAD [6, 25] on both ill-conditioned synthetic and a suite of real-world datasets. As Fig. 1 shows for a synthetic problem, we observe substantial performance gains as data conditioning worsens. On the real-world datasets, we find improvements in some instances, while observing no substantial second-order signal in the others.
Researcher Affiliation | Collaboration | Haipeng Luo, Princeton University, Princeton, NJ, USA (haipengl@cs.princeton.edu); Alekh Agarwal, Microsoft Research, New York, NY, USA (alekha@microsoft.com); Nicolò Cesa-Bianchi, Università degli Studi di Milano, Italy (nicolo.cesa-bianchi@unimi.it); John Langford, Microsoft Research, New York, NY, USA (jcl@microsoft.com)
Pseudocode | Yes | Algorithm 1: Sketched Online Newton (SON); Algorithm 4: Sparse Sketched Online Newton with Oja's Algorithm; Algorithm 5: Sparse Oja's Sketch. (An illustrative Oja-update code sketch appears after this table.)
Open Source Code | No | The paper mentions implementing their algorithm in Vowpal Wabbit ('We implemented the sparse version of Oja-SON in Vowpal Wabbit', an open source machine learning toolkit available at http://hunch.net/~vw), but it does not state that *their* implementation's source code is being released or provided.
Open Datasets | Yes | Table 1 describes the 23 publicly available datasets used in our evaluation. All the datasets are from UCI and LIBSVM repository.
Dataset Splits | No | For each dataset, we randomly split 80% of data for training and 20% for testing. While training and testing splits are provided, there is no explicit mention of a separate validation dataset split. (An illustrative split snippet appears after this table.)
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using Vowpal Wabbit ('We implemented the sparse version of Oja-SON in Vowpal Wabbit'), but it does not provide specific version numbers for Vowpal Wabbit or any other software dependencies.
Experiment Setup | Yes | Hyperparameters. For ADAGRAD, we sweep the stepsize parameter from {2^-3, 2^-2, ..., 2^6}. ... We keep the stepsize matrix in Oja-SON fixed as Γ_t = (1/t) I_m throughout. (The stepsize grid is enumerated in a snippet after this table.)
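
The sparse Oja's sketch listed in the Pseudocode row builds on Oja's algorithm for streaming eigenvector estimation. Below is a minimal, dense Python illustration of an Oja-style update; it is not the paper's sparse Algorithm 5, and the function name, the QR-based re-orthonormalization, and the toy data are assumptions made here for readability.

```python
import numpy as np

def oja_sketch_update(V, x, eta):
    """One dense Oja-style update of an m-column sketch V for a new vector x.

    V   : (d, m) current orthonormal estimate of a top-m eigenspace
    x   : (d,)   newly observed vector
    eta : float  step size for this update
    """
    # Oja's rule: push V toward the new direction, then re-orthonormalize.
    V = V + eta * np.outer(x, x @ V)
    Q, _ = np.linalg.qr(V)  # QR keeps the columns orthonormal
    return Q

# Toy usage: maintain a rank-2 sketch of 1000 synthetic 10-dimensional vectors,
# with a 1/t step size mirroring the fixed stepsize matrix quoted above.
rng = np.random.default_rng(0)
d, m = 10, 2
V = np.linalg.qr(rng.standard_normal((d, m)))[0]
for t in range(1, 1001):
    x = rng.standard_normal(d)
    V = oja_sketch_update(V, x, eta=1.0 / t)
```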
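
The 80/20 split quoted in the Dataset Splits row amounts to a random permutation of example indices; the snippet below is an illustrative sketch only (function name, seed, and sizes are assumptions, not taken from the paper).

```python
import numpy as np

def random_split(n_examples, train_frac=0.8, seed=0):
    """Randomly assign example indices to train/test sets (80/20 by default)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_examples)
    cut = int(train_frac * n_examples)
    return perm[:cut], perm[cut:]

train_idx, test_idx = random_split(1000)  # 800 training and 200 test indices
```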
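
The ADAGRAD stepsize sweep in the Experiment Setup row is a geometric grid. The one-liner below enumerates the reconstructed set {2^-3, 2^-2, ..., 2^6}; the variable name is ours, not the paper's.

```python
# Geometric stepsize grid {2^-3, 2^-2, ..., 2^6}, reconstructed from the quoted sweep.
stepsizes = [2.0 ** k for k in range(-3, 7)]
print(stepsizes)  # [0.125, 0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```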