Dueling Bandits with Team Comparisons

Authors: Lee Cohen, Ulrike Schmidt-Kraepelin, Yishay Mansour

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We introduce the dueling teams problem, a new online-learning setting in which the learner observes noisy comparisons of disjoint pairs of k-sized teams from a universe of n players. The goal of the learner is to minimize the number of duels required to identify, with high probability, a Condorcet winning team... For the stochastic setting, we provide a reduction to the classical dueling bandits setting, yielding an algorithm that identifies a Condorcet winning team within O((n + k log(k)) max(log log n, log k)^2) duels... For deterministic feedback, we additionally present a gap-independent algorithm that identifies a Condorcet winning team within O(nk log(k) + k^5) duels. (A toy simulation of this feedback model is sketched after the table.)
Researcher Affiliation | Collaboration | Lee Cohen (Tel Aviv University, leecohencs@gmail.com); Ulrike Schmidt-Kraepelin (Technische Universität Berlin, u.schmidt-kraepelin@tu-berlin.de); Yishay Mansour (Tel Aviv University and Google Research, mansour.yishay@gmail.com)
Pseudocode | No | While the paper describes several algorithms (e.g., Uncover, Reduce Players, Condorcet Winning), it does not present them in formal pseudocode blocks or figures labeled 'Algorithm' within the main text or appendices. For example, Section 5 states 'While all algorithms are formalized in Appendix C, we give sketches thereof and theorems stating their input and output below', implying textual descriptions rather than pseudocode.
Open Source Code | No | The paper does not contain any statements about open-source code availability or links to code repositories for the described methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets; therefore it does not mention the use or availability of training data.
Dataset Splits | No | The paper is theoretical and does not involve empirical evaluation with datasets, so it does not provide information on training/validation/test dataset splits.
Hardware Specification | No | The paper is purely theoretical and does not describe any experiments that would require hardware specifications.
Software Dependencies | No | The paper focuses on theoretical algorithm design and analysis. It does not describe any specific software implementations or dependencies with version numbers that would be required for reproducibility.
Experiment Setup | No | The paper describes theoretical algorithms and their properties; it does not include details of an experimental setup such as hyperparameters or system-level training settings, as no empirical experiments are reported.
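
To make the feedback model quoted in the Research Type row more concrete, the following is a minimal, illustrative Python sketch of the dueling-teams setting: n players, noisy duels between disjoint k-sized teams, and a brute-force check for a Condorcet winning team. The class and function names (TeamDuelEnv, duel, is_condorcet_winning) and the additive team-strength model are assumptions made purely for illustration; they are not taken from the paper and do not implement the paper's algorithms.

```python
# Illustrative sketch only (not from the paper): a toy simulator for the
# dueling-teams feedback model and a brute-force Condorcet-winner check.
# The additive team-strength model and all names here are assumptions.
import itertools
import math
import random


class TeamDuelEnv:
    """Noisy duels between disjoint k-sized teams drawn from n players.

    Assumption: player i has latent strength s[i], and team A beats team B
    with probability sigmoid(sum(s[A]) - sum(s[B])). The paper's model is
    more general; this additive form is used here only for illustration.
    """

    def __init__(self, strengths, k, seed=0):
        self.s = list(strengths)
        self.n = len(self.s)
        self.k = k
        self.rng = random.Random(seed)

    def win_prob(self, team_a, team_b):
        margin = sum(self.s[i] for i in team_a) - sum(self.s[i] for i in team_b)
        return 1.0 / (1.0 + math.exp(-margin))

    def duel(self, team_a, team_b):
        """One noisy comparison; returns True iff team_a wins this duel."""
        assert len(team_a) == len(team_b) == self.k
        assert set(team_a).isdisjoint(team_b), "teams must be disjoint"
        return self.rng.random() < self.win_prob(team_a, team_b)


def is_condorcet_winning(env, team):
    """Brute-force check that `team` beats every disjoint k-sized team
    with probability strictly greater than 1/2."""
    others = [p for p in range(env.n) if p not in team]
    return all(env.win_prob(team, rival) > 0.5
               for rival in itertools.combinations(others, env.k))


if __name__ == "__main__":
    env = TeamDuelEnv(strengths=[0.1, 0.4, 0.9, 1.3, 2.0, 2.5], k=2)  # n=6, k=2
    candidate = (4, 5)  # the two strongest players under this toy model
    print("Condorcet winning team?", is_condorcet_winning(env, candidate))
    print("Noisy duel (4,5) vs (0,1):", env.duel((4, 5), (0, 1)))
```

In this toy additive model with distinct player strengths, the team of the k strongest players is always a Condorcet winning team; a learner, however, only observes the outcomes of noisy team-level duels such as those returned by duel above, which is the kind of feedback the paper's algorithms work from.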