Stabilizing Linear Passive-Aggressive Online Learning with Weighted Reservoir Sampling

Authors: Skyler Wu, Fred Lu, Edward Raff, James Holt

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We demonstrate our WRS approach on the Passive-Aggressive Classifier (PAC) and First-Order Sparse Online Learning (FSOL), where our method consistently and significantly outperforms the unmodified approach.
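The paper's base learner, the Passive-Aggressive Classifier (PAC), uses the standard PA-I update: a hinge-loss-driven step whose size is clipped by an aggressiveness parameter C (the paper's Cerr). A minimal stdlib-only sketch of that textbook update, not the authors' full WRS-augmented method:

```python
def pa_update(w, x, y, C=1.0):
    """One PA-I step on example (x, y), with label y in {-1, +1}.

    Hinge loss l = max(0, 1 - y * <w, x>); step size
    tau = min(C, l / ||x||^2); update w <- w + tau * y * x.
    """
    dot = sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - y * dot)
    if loss == 0.0:
        return w  # passive: example already classified with margin >= 1
    tau = min(C, loss / sum(xi * xi for xi in x))  # aggressive, clipped by C
    return [wi + tau * y * xi for wi, xi in zip(w, x)]
```

With a zero initial weight vector and example ([1.0, 0.0], +1), one update moves w to [1.0, 0.0], after which the same example incurs zero loss.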
Researcher Affiliation Collaboration Skyler Wu (Booz Allen Hamilton; Harvard University; Stanford University) wu_skyler@bah.com; Fred Lu (Booz Allen Hamilton; University of Maryland, Baltimore County) lu_fred@bah.com; Edward Raff (Booz Allen Hamilton; University of Maryland, Baltimore County) raff_edward@bah.com; James Holt (Laboratory for Physical Sciences) holt@lps.umd.edu
Pseudocode Yes Algorithm 1 WRS-Augmented Training (WAT)
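Algorithm 1 (WAT) maintains a weighted reservoir of past weight vectors. A minimal sketch of the classic Efraimidis-Spirakis A-Res scheme, the standard way to keep a size-K weighted sample from a stream; this illustrates weighted reservoir sampling generally, not the authors' exact algorithm:

```python
import heapq
import random

def weighted_reservoir(stream, k, seed=0):
    """Keep a weighted sample of size k from an iterable of (item, weight) pairs.

    Efraimidis-Spirakis A-Res: each item draws u ~ Uniform(0, 1) and gets
    key u ** (1 / w); the k items with the largest keys form the reservoir,
    so higher-weight items are more likely to be retained.
    """
    rng = random.Random(seed)
    heap = []  # min-heap of (key, item); heap[0] is the smallest key kept
    for item, w in stream:
        if w <= 0:
            continue  # non-positive weights can never be sampled
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))  # evict the weakest member
    return [item for _, item in heap]
```

In WAT the stream items would be model snapshots and the weights a function of their observed errors; here they are stand-ins.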
Open Source Code Yes Code available at https://github.com/FutureComputing4AI/Weighted-Reservoir-Sampling-Augmented-Training
Open Datasets Yes Table 1: Sizes, dimensions, and sparsities of all datasets used for numerical experiments. ... Avazu (App) [23] (via LIBSVM) ... [23] Steve Wang and Will Cukierski. Click-Through Rate Prediction. Kaggle. https://kaggle.com/competitions/avazu-ctr-prediction, 2014.
Dataset Splits No We perform a random 70/30 train-test split. The paper specifies the train and test splits but does not mention a separate validation split, nor how hyperparameters were validated without one.
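The paper's random 70/30 train-test split can be reproduced with a simple index shuffle; a stdlib-only sketch (the seed is an illustrative assumption, not one reported by the authors):

```python
import random

def split_70_30(n_samples, seed=0):
    """Return (train_idx, test_idx) for a random 70/30 train-test split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    cut = int(0.7 * n_samples)  # first 70% of the shuffled indices train
    return idx[:cut], idx[cut:]
```

In practice one would fix and report the seed so the split is reproducible across runs.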
Hardware Specification Yes All experiments were run on a Linux computing cluster with 32 nodes, each with 40 Intel Xeon E5-2650 CPU cores and a total of 500 GB of RAM per node, managed using SLURM.
Software Dependencies No The paper mentions using libraries like scikit-learn implicitly through citations (e.g., 'Count Vectorizer from [31]'). However, it does not provide specific version numbers for these software components or other key dependencies required for replication.
Experiment Setup Yes We begin by tuning the Cerr, η, and λ hyperparameters for base PAC and FSOL, with details located in Appendix B. For PAC-WRS and FSOL-WRS, we use the hyperparameters of the corresponding base models, but try all possible WAT variants of weighting scheme (standard or exponential), averaging scheme (simple or weighted), voting-based zeroing (True or False), and reservoir size K ∈ {1, 4, 16, 64}. Appendix B.2 details the ranges for Cerr (10^-3 to 10^3) and η, λ (2^-3 to 2^9 for η, 0 to 10^3 for λ).
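The WAT variant grid described above is a full Cartesian product of four axes, giving 2 x 2 x 2 x 4 = 32 configurations per tuned base model; a sketch of enumerating it:

```python
from itertools import product

# The four WAT axes reported in the experiment setup.
weighting_schemes = ["standard", "exponential"]
averaging_schemes = ["simple", "weighted"]
voting_based_zeroing = [True, False]
reservoir_sizes = [1, 4, 16, 64]

wat_variants = list(product(weighting_schemes, averaging_schemes,
                            voting_based_zeroing, reservoir_sizes))
# 2 * 2 * 2 * 4 = 32 configurations to run per base model (PAC or FSOL)
```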