reproducibilityindex.ai

Combinatorial Pure Exploration for Dueling Bandit

Authors: Wei Chen, Yihan Du, Longbo Huang, Haoyu Zhao

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we study combinatorial pure exploration for dueling bandits (CPE-DB): we have multiple candidates for multiple positions as modeled by a bipartite graph, and in each round we sample a duel of two candidates on one position and observe who wins in the duel, with the goal of ﬁnding the best candidate-position matching with high probability after multiple rounds of samples. [...] For Borda winner, we establish a reduction of the problem to the original CPE-MAB setting and design PAC and exact algorithms that achieve both the sample complexity similar to that in the CPE-MAB setting (which is nearly optimal for a subclass of problems) and polynomial running time per round. For Condorcet winner, we ﬁrst design a fully polynomial time approximation scheme (FPTAS) for the ofﬂine problem of ﬁnding the Condorcet winner with known winning probabilities, and then use the FPTAS as an oracle to design a novel pure exploration algorithm CAR-Cond with sample complexity analysis.
Researcher Affiliation	Collaboration	1Microsoft Research, Beijing, China 2IIIS, Tsinghua University, Beijing, China.
Pseudocode	Yes	Algorithm 1 CLUCB-Borda-PAC (Page 4), Algorithm 2 CAR-Cond (Page 6), Algorithm 3 CAR-Parallel (Page 7), Algorithm 4 CAR-Verify (Page 8).
Open Source Code	No	The paper does not provide any statement or link indicating that source code for the methodology is openly available.
Open Datasets	No	The paper is theoretical and does not use or refer to any specific publicly available datasets for training or evaluation. The concept of 'sampling duels' is part of their theoretical model, not an external dataset.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments with training, validation, or test dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe empirical experiments, therefore no hardware specifications are provided.
Software Dependencies	No	The paper is theoretical and does not mention specific software names with version numbers used for implementation or experimentation.
Experiment Setup	No	The paper is theoretical and does not describe empirical experiments, therefore no experimental setup details like hyperparameters or training configurations are provided.