Online Active Learning with Surrogate Loss Functions

Authors: Giulia DeSalvo, Claudio Gentile, Tobias Sommer Thune

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical analysis shows that the algorithm attains favorable generalization and label complexity bounds, while our empirical study on 18 real-world datasets demonstrates that the algorithm outperforms standard baselines, including the Margin Algorithm (or Uncertainty Sampling), a high-performing active learning algorithm favored by practitioners. We then complement our theoretical findings with a thorough experimental investigation on 18 real-world datasets using a class of neural networks, Multi-layer Perceptrons. In this section, we present our experimental results in the streaming active learning setting that test the ALPS algorithm, the IWAL algorithm, the margin algorithm (or uncertainty sampling), and a passive learning algorithm. (A minimal sketch of a streaming uncertainty-sampling loop appears after this table.)
Researcher Affiliation | Collaboration | Giulia DeSalvo (Google Research, New York, NY 10011, giuliad@google.com); Claudio Gentile (Google Research, New York, NY 10011, cgentile@google.com); Tobias Sommer Thune (University of Copenhagen, Copenhagen, DK, tobias.thune@gmail.com)
Pseudocode | Yes | See pseudo-code in Algorithm 1. Algorithm 1: Actively Learning over Pseudo-labels for Surrogate losses (ALPS)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. There is no explicit statement of code release or a repository link.
Open Datasets | Yes | We tested these algorithms on 18 publicly available datasets where we used the logistic loss as the surrogate loss ℓ and used feedforward artificial neural networks as our model class. For the datasets, we used 18 publicly available datasets from OpenML [Vanschoren et al., 2013]... We also included the CIFAR-10 [Krizhevsky, 2009] and MNIST [Le Cun and Cortes, 2010] datasets. (A data-loading and splitting sketch appears after this table.)
Dataset Splits | No | The paper mentions splitting data into training and test sets (2/3 training, 1/3 test) but does not explicitly specify a separate validation split needed to reproduce the experiment. (The 2/3 training, 1/3 test split is included in the data-loading sketch after this table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types) used for running its experiments.
Software Dependencies | No | The paper mentions using the "Multi-layer Perceptron algorithm in the scikit-learn library" but does not specify a version number for scikit-learn or any other software dependencies, which is required for reproducibility.
Experiment Setup | Yes | For each type of neural network, we constructed a diverse finite set of hypotheses by pre-training on small random subsets while varying the ℓ2 regularization parameter and initial weights. For both types of networks, we varied the ℓ2 regularization parameter from {0.0001, 0.001, 0.01, 0.1, 1} and repeated the pretraining 5 times for each regularization parameter using different random initial weights. (A hypothesis-set construction sketch appears after this table.)
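
For the OpenML data handling described in the Open Datasets and Dataset Splits rows, the following is a minimal sketch using scikit-learn's fetch_openml together with the paper's 2/3 training, 1/3 test split. The dataset name "phoneme" is a placeholder, since this report does not enumerate the 18 datasets, and the random seed is an illustrative assumption.

# Minimal sketch: load one public OpenML dataset and split it 2/3 train, 1/3 test.
# "phoneme" is a placeholder dataset name; the paper's 18 datasets are not listed
# in this report. The random_state is an illustrative assumption.
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

X, y = fetch_openml("phoneme", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1 / 3, random_state=0)  # 2/3 training, 1/3 test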
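
The Experiment Setup row maps naturally onto scikit-learn's MLPClassifier, whose alpha parameter is the ℓ2 penalty. Below is a sketch of the 5 x 5 hypothesis-set construction; the pre-training subset size of 500 and max_iter value are assumptions not stated in this report, and X_train/y_train come from the loading sketch above.

# Sketch: build a finite hypothesis set by pre-training MLPs on small random
# subsets, varying the l2 penalty (alpha) and the initial weights (random_state).
# Subset size 500 and max_iter=500 are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
hypotheses = []
for alpha in [0.0001, 0.001, 0.01, 0.1, 1]:   # regularization grid from the paper
    for seed in range(5):                     # 5 repeats per regularization value
        idx = rng.choice(len(X_train), size=500, replace=False)
        h = MLPClassifier(alpha=alpha, random_state=seed, max_iter=500)
        hypotheses.append(h.fit(X_train[idx], y_train[idx]))
# 'hypotheses' now holds 25 pre-trained networks forming the finite hypothesis class.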
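
Finally, as a point of reference for the margin/uncertainty-sampling baseline discussed in the Research Type row, here is a minimal sketch of a streaming active-learning loop. This is a generic margin sampler, not the paper's ALPS algorithm, and the query threshold and refit schedule are illustrative assumptions.

# Sketch of streaming margin-based active learning (uncertainty sampling).
# This illustrates the baseline, NOT the paper's ALPS algorithm. The query
# threshold and refit schedule are illustrative assumptions; binary labels assumed.
import numpy as np
from sklearn.neural_network import MLPClassifier

def stream_margin_sampling(stream, seed_X, seed_y, threshold=0.2, refit_every=50):
    """Query the oracle whenever the predicted margin |p(y=1|x) - 0.5| is small."""
    model = MLPClassifier(max_iter=500).fit(seed_X, seed_y)
    X_lab, y_lab = list(seed_X), list(seed_y)
    n_queries = 0
    for x, oracle_label in stream:                  # (features, true label) pairs
        p = model.predict_proba(np.asarray(x).reshape(1, -1))[0, 1]
        if abs(p - 0.5) < threshold:                # uncertain point: query its label
            X_lab.append(x)
            y_lab.append(oracle_label)
            n_queries += 1
            if n_queries % refit_every == 0:        # periodically refit on labeled set
                model.fit(np.array(X_lab), np.array(y_lab))
    return model, n_queries

In the streaming setting the true label is only revealed when queried; the sketch carries it in the stream purely for simulation convenience.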