reproducibilityindex.ai

On Weak Regret Analysis for Dueling Bandits

Authors: El Mehdi Saad, Alexandra Carpentier, Tomáš Kocák, Nicolas Verzelen

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we perform a numerical evaluation of WR-EXP3-IX and WR-TINF algorithms in three different scenarios that favor different algorithms according to the prior theoretical results. As a benchmark for our experiments, we utilize the state-of-the-art algorithm for weak regret, WS-W [4]. Additionally, we include one of the best-performing algorithms for strong regret, Versatile-DB [14], to demonstrate that optimizing for strong regret does not necessarily translate into optimal weak regret performance. For each of the experiments, we plot the mean regret over 20 iterations together with 0.2 and 0.8 quantiles. All the experiments in this section use theoretical values of parameters for the algorithms.
Researcher Affiliation	Academia	El Mehdi Saad (KAUST, mehdi.saad@kaust.edu.sa); Alexandra Carpentier (Institut für Mathematik, Universität Potsdam, carpentier@uni-potsdam.de); Tomáš Kocák (Institut für Mathematik, Universität Potsdam, kocak@uni-potsdam.de); Nicolas Verzelen (INRAE, MISTEA, Univ. Montpellier, nicolas.verzelen@inrae.fr)
Pseudocode	Yes	Algorithm 1 WR-TINF
Open Source Code	Yes	Section 6 provides all the details needed to reproduce the simulations presented in our paper. The code is provided as well.
Open Datasets	No	In our experiments, we used data generated synthetically. The description of the distributions of the duels considered is provided in Section 6. The paper does not provide concrete access information for a publicly available or open dataset, as it uses synthetically generated data.
Dataset Splits	No	The paper mentions 'mean regret over 20 iterations together with 0.2 and 0.8 quantiles' in the experiments section but does not specify training, validation, or test splits. The problem is a sequential game, not a typical supervised learning task with train/val/test splits.
Hardware Specification	No	The paper states only: "The runtime of each algorithm and iteration is in terms of minutes on a personal computer." This is too vague and gives no specific hardware details (e.g., CPU/GPU models, memory, or processor types).
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library or solver names with version numbers.
Experiment Setup	No	All the experiments in this section use theoretical values of parameters for the algorithms. The paper does not provide specific hyperparameter values, training configurations, or system-level settings in the main text that would allow for concrete replication of the experiment setup.
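
The evaluation protocol quoted in the Research Type row (mean regret over 20 iterations together with 0.2 and 0.8 quantiles) can be illustrated with a minimal sketch. It is not the authors' code and does not implement WR-EXP3-IX, WR-TINF, WS-W, or Versatile-DB. The preference matrix `P`, the uniform-random placeholder policy, the horizon of 1000 rounds, and all function names are assumptions made for illustration. Weak regret is taken in its usual sense for dueling bandits: the smaller of the two chosen arms' gaps to the best arm, so it is zero whenever the best arm takes part in the duel.

```python
import random
import statistics

# Hypothetical preference matrix (an assumption, not from the paper):
# P[i][j] is the probability that arm i beats arm j.
# Arm 0 beats every other arm with probability > 1/2, so it is the best arm.
P = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5],
]
BEST = 0
GAPS = [P[BEST][j] - 0.5 for j in range(len(P))]  # gap of each arm to the best arm


def weak_regret(i, j):
    """Instantaneous weak regret: the smaller gap of the two dueling arms."""
    return min(GAPS[i], GAPS[j])


def run_uniform_policy(horizon, rng):
    """Placeholder policy that duels a uniformly random pair each round.

    Returns the cumulative weak regret after each round.
    """
    total, curve = 0.0, []
    for _ in range(horizon):
        i, j = rng.randrange(len(P)), rng.randrange(len(P))
        total += weak_regret(i, j)
        curve.append(total)
    return curve


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def summarize(curves, t):
    """Mean and 0.2/0.8 quantiles of cumulative regret at round t (0-indexed)."""
    at_t = [c[t] for c in curves]
    return statistics.mean(at_t), quantile(at_t, 0.2), quantile(at_t, 0.8)


rng = random.Random(0)  # fixed seed so the 20 runs are repeatable
curves = [run_uniform_policy(1000, rng) for _ in range(20)]
mean, q20, q80 = summarize(curves, 999)
print(f"mean={mean:.2f}  q0.2={q20:.2f}  q0.8={q80:.2f}")
```

Replacing `run_uniform_policy` with an actual algorithm and calling `summarize` at every round would give the mean curve and quantile band described in the table. Recording the seed alongside the results is one way to address the missing setup details flagged above.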