Beyond Bandit Feedback in Online Multiclass Classification

Authors: Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic data show that for various feedback graphs our algorithm is competitive against known baselines.
Researcher Affiliation | Academia | Dirk van der Hoeven (dirk@dirkvanderhoeven.com), Dept. of Computer Science, Università degli Studi di Milano, Italy; Federico Fusco (fuscof@diag.uniroma1.it), Dept. of Computer, Control and Management Engineering, Sapienza Università di Roma, Italy; Nicolò Cesa-Bianchi (nicolo.cesa-bianchi@unimi.it), DSRC & Dept. of Computer Science, Università degli Studi di Milano, Italy
Pseudocode | Yes | Algorithm 1: GAPPLETRON
Open Source Code | Yes | (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | No | We empirically evaluated the performance of GAPPLETRON on synthetic data in the bandit, multiclass filtering, and full information settings. Similarly to the SynSep and SynNonSep datasets described in (Kakade et al., 2008), we generated synthetic datasets with d ∈ {80, 120, 160}, K ∈ {6, 9, 12}, and the label noise rate in {0, 0.05, 0.1}. (A hypothetical data-generation sketch in this spirit appears below the table.)
Dataset Splits | No | No explicit details on training, validation, or test dataset splits (percentages, counts, or cross-validation) were provided.
Hardware Specification | No | No specific hardware details (like GPU/CPU models or memory) were explicitly provided for the experimental setup within the given text.
Software Dependencies | No | The paper mentions 'Online Gradient Descent' but does not specify software names with version numbers for reproducibility.
Experiment Setup | Yes | We used three surrogate losses for GAPPLETRON: the logistic loss ℓ_t(W_t) = −log_K q(W_t, x_t, y_t), where q is the softmax, the hinge loss defined in (5), and the smooth hinge loss (Rennie and Srebro, 2005), denoted by GAPLOG, GAPHIN, and GAPSMH respectively. The OCO algorithm used with all losses is Online Gradient Descent, with learning rate η_t = (10^{-8} + Σ_{j=1}^{t} ‖∇ℓ̂_j(W_t)‖_2^2)^{-1/2} and no projections. (A hedged sketch of this setup appears below the table.)
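
The synthetic data in the Open Datasets row is only described at the level of the parameters d, K, and the label-noise rate, so the snippet below is a hypothetical reconstruction rather than the paper's code: it builds a (near-)linearly-separable K-class problem in d dimensions with a configurable noise rate, loosely in the spirit of the SynSep / SynNonSep datasets of Kakade et al. (2008). The function name make_synthetic and the prototype-based construction are assumptions made here for illustration.

```python
import numpy as np

def make_synthetic(n=10_000, d=120, K=9, noise=0.05, separable=True, seed=0):
    """Hypothetical generator loosely following the SynSep / SynNonSep recipe of
    Kakade et al. (2008): draw feature vectors, label each one by the class
    prototype with the largest inner product, then flip a fraction of the
    labels uniformly at random.  The paper's exact construction is not given."""
    rng = np.random.default_rng(seed)
    prototypes = rng.standard_normal((K, d))        # one direction per class
    X = rng.standard_normal((n, d))
    y = np.argmax(X @ prototypes.T, axis=1)         # linearly separable labels
    if not separable:
        X = X + 0.5 * rng.standard_normal((n, d))   # blur the margin (non-separable variant)
    flip = rng.random(n) < noise                    # label-noise rate in {0, 0.05, 0.1}
    y[flip] = rng.integers(0, K, size=int(flip.sum()))
    return X, y

# Example: one of the (d, K) grid points from the table, with 10% label noise.
X, y = make_synthetic(d=80, K=6, noise=0.1)
```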
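The Experiment Setup row names the surrogate losses and the Online Gradient Descent configuration, but not the full GAPPLETRON algorithm. The sketch below shows only that OCO component under full information: a base-K logistic loss (our reading of the extracted formula, including the minus sign) and OGD with the adaptive learning rate quoted above and no projections. The function names and the omission of the feedback-graph exploration step are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                         # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def logistic_grad(W, x, y, K):
    """Gradient of the base-K logistic loss -log_K q_y, with q = softmax(Wx).
    The minus sign and the base-K logarithm are our reading of the table's formula."""
    q = softmax(W @ x)                      # class probabilities, shape (K,)
    g = q.copy()
    g[y] -= 1.0                             # derivative of -log q_y w.r.t. the scores
    return np.outer(g, x) / np.log(K)       # base-K log rescales the gradient by 1/ln K

def ogd_adaptive(stream, d, K):
    """Online Gradient Descent with learning rate
    eta = (1e-8 + sum of squared gradient norms so far) ** (-1/2)
    and no projection, as in the Experiment Setup row.  The feedback-graph /
    exploration machinery of GAPPLETRON is deliberately omitted (full
    information is assumed), so this is only the OCO component."""
    W = np.zeros((K, d))
    grad_sq = 0.0
    mistakes = 0
    for x, y in stream:
        mistakes += int(np.argmax(W @ x) != y)   # predict with the current weights
        g = logistic_grad(W, x, y, K)
        grad_sq += float((g ** 2).sum())
        W -= (1e-8 + grad_sq) ** -0.5 * g        # adaptive step, no projection
    return W, mistakes
```

With the synthetic generator above, something like W, m = ogd_adaptive(zip(X, y), d=X.shape[1], K=int(y.max()) + 1) would run the learner over one pass of the stream.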