Adaptive Region-Based Active Learning

Authors: Corinna Cortes, Giulia Desalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also report the results of an extensive suite of experiments on several real-world datasets demonstrating substantial empirical benefits over existing single-region and non-adaptive region-based active learning baselines." "In Section 5, we report the results of a series of experiments on multiple datasets, demonstrating the substantial benefits of ARBAL over existing non-region-based active learning algorithms, such as IWAL and margin-based uncertainty sampling, and over the non-adaptive region-based active learning baseline ORIWAL (Cortes et al., 2019b)."
Researcher Affiliation | Collaboration | Google Research, New York, NY; Courant Institute of Mathematical Sciences, New York, NY; Hudson River Trading, New York, NY.
Pseudocode | Yes | "The pseudocode of ARBAL is given in Algorithm 1." "The pseudocode of SPLIT is given in Algorithm 2."
Open Source Code | No | No explicit statement or link regarding the release of source code for the described methodology was found.
Open Datasets | Yes | "We tested 24 binary classification datasets from the UCI and openml repositories, and also the MNIST dataset with 3 and 5 as the two classes, which is a standard binary classification task extracted from the MNIST dataset (e.g., (Crammer et al., 2009))."
Dataset Splits | Yes | "For each experiment, we randomly shuffled the dataset, ran the algorithms on the first half of the data (so that the number of active learning rounds T equals N/2), and tested the classifier returned on the remaining half to measure misclassification loss."
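The quoted evaluation protocol (shuffle, run active learning on the first half so T = N/2, test on the second half) can be sketched as follows. This is a minimal illustration in NumPy; the function names and the seed handling are assumptions, not taken from the paper.

```python
import numpy as np

def shuffle_and_split(X, y, seed=0):
    """Shuffle the dataset, then split it in half: the first half is the
    pool for the active learner (T = N/2 rounds), the second half is the
    held-out test set. Names here are illustrative, not from the paper."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2                       # number of active-learning rounds T
    return (X[idx[:half]], y[idx[:half]],    # active-learning pool
            X[idx[half:]], y[idx[half:]])    # held-out test half

def misclassification_loss(y_true, y_pred):
    """Fraction of held-out points the returned classifier gets wrong."""
    return float(np.mean(y_true != y_pred))
```

A fresh shuffle per experiment (a new seed each run) reproduces the "for each experiment, we randomly shuffled the dataset" step.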
Hardware Specification | No | No specific hardware details (such as GPU/CPU models, processor types, or memory amounts) used for running the experiments were mentioned.
Software Dependencies | No | No specific software dependencies with version numbers were explicitly mentioned (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | "We chose κ = 20 and allow the first phase to run at most τ = 800 rounds so as to make ARBAL fully split into the desired number of regions on almost all datasets. Since the slack terms σ_T derived from high-probability analyses are typically overly conservative, we simply use 0.01/√T_k in the SPLIT subroutine. ... We set ρ = 0.01 in our experiments. ... We use the logistic loss function ℓ defined for all (x, y) ∈ X × Y and hypotheses h: X → ℝ by ℓ(h(x), y) = log(1 + e^(−y h(x))), which we then rescale to [0, 1]. The initial hypothesis set H consists of 3,000 randomly drawn hyperplanes with bounded norms."
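The two concrete ingredients of this setup, the logistic loss rescaled to [0, 1] and a pool of 3,000 bounded-norm random hyperplanes, can be sketched as below. The rescaling constant and the sampling scheme are assumptions: the paper states only "rescale to [0, 1]" and "bounded norms" without giving the exact construction.

```python
import numpy as np

def rescaled_logistic_loss(score, y, score_bound=1.0):
    """ℓ(h(x), y) = log(1 + exp(-y * h(x))), divided by its maximum
    value log(1 + exp(score_bound)) so the loss lies in [0, 1].
    Assumes |h(x)| <= score_bound; the bound is an illustrative choice."""
    raw = np.log1p(np.exp(-np.asarray(y) * np.asarray(score)))
    return raw / np.log1p(np.exp(score_bound))

def random_hyperplanes(n_hypotheses=3000, dim=10, norm_bound=1.0, seed=0):
    """Draw weight vectors with Gaussian entries, then normalize and scale
    so every hyperplane has norm exactly norm_bound (hence bounded norm).
    One plausible reading of "randomly drawn hyperplanes with bounded norms"."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_hypotheses, dim))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    return norm_bound * W
```

With unit-norm hyperplanes and inputs normalized to the unit ball, |h(x)| ≤ 1 holds by Cauchy–Schwarz, so score_bound=1.0 keeps the rescaled loss in [0, 1].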