Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem

Authors: Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In addition, our empirical results using real data from an information retrieval application show that it greatly outperforms the state of the art. Finally, we evaluate our method empirically using real data from an information retrieval application.
Researcher Affiliation | Collaboration | (1) ISLA, University of Amsterdam, Netherlands; (2) INRIA Lille Nord Europe / MSR-NE
Pseudocode | Yes | Algorithm 1 Relative Upper Confidence Bound
Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code for the described method is publicly available.
Open Datasets | Yes | We evaluated RUCB, Condorcet SAVAGE and BTM using randomly chosen subsets from the pool of 64 rankers provided by LETOR, a standard IR dataset (see 8.4 for more details of the experimental setup).
Dataset Splits | No | The paper does not specify exact percentages, sample counts, or explicit methods for training, validation, and test splits.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or memory used for experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks).
Experiment Setup | Yes | For RUCB we set α = 0.51, which approaches the limit set by our high-probability result.
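
To illustrate where the exploration parameter from the Experiment Setup row enters, the following is a minimal sketch of a relative upper confidence bound computed from pairwise win counts. It is not the authors' code (none is released, per the table above). The function name `rucb_upper_bounds`, the `wins` matrix layout, and the placeholder value of 1.0 for pairs that have never been compared are assumptions made for this example; the default `alpha=0.51` matches the value quoted above.

```python
import math


def rucb_upper_bounds(wins, t, alpha=0.51):
    """Return a K x K matrix of optimistic estimates of P(arm i beats arm j).

    wins[i][j] is the number of comparisons in which arm i beat arm j,
    t is the current time step (t >= 1), and alpha scales the width of
    the confidence interval. All names here are hypothetical.
    """
    k = len(wins)
    u = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                # An arm compared with itself is a coin flip.
                u[i][j] = 0.5
                continue
            n = wins[i][j] + wins[j][i]
            if n == 0:
                # Assumption of this sketch: pairs never compared get an
                # optimistic placeholder so that they are explored.
                u[i][j] = 1.0
            else:
                # Empirical win rate plus a confidence bonus that shrinks
                # as the pair is compared more often.
                u[i][j] = wins[i][j] / n + math.sqrt(alpha * math.log(t) / n)
    return u


# Example: arm 0 has beaten arm 1 three times out of four; arm 2 is untested.
u = rucb_upper_bounds([[0, 3, 0], [1, 0, 0], [0, 0, 0]], t=10)
```

A smaller `alpha` narrows the confidence bonus and so reduces exploration, which is why a value just above 0.5 sits close to the limit mentioned in the table.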