Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
Authors: Masrour Zoghi, Shimon Whiteson, Remi Munos, Maarten de Rijke
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition, our empirical results using real data from an information retrieval application show that it greatly outperforms the state of the art. Finally, we evaluate our method empirically using real data from an information retrieval application. |
| Researcher Affiliation | Collaboration | ¹ISLA, University of Amsterdam, Netherlands; ²INRIA Lille Nord Europe / MSR-NE |
| Pseudocode | Yes | Algorithm 1 Relative Upper Confidence Bound |
| Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code for the described method is publicly available. |
| Open Datasets | Yes | We evaluated RUCB, Condorcet SAVAGE and BTM using randomly chosen subsets from the pool of 64 rankers provided by LETOR, a standard IR dataset (see Section 8.4 for more details of the experimental setup). |
| Dataset Splits | No | The paper does not specify exact percentages, sample counts, or explicit methods for training, validation, and test splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or memory used for experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks). |
| Experiment Setup | Yes | For RUCB we set α = 0.51, which approaches the limit set by our high-probability result. |
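
To make the role of the α = 0.51 setting concrete, the following is a minimal sketch of how a relative upper confidence bound of the kind used in Algorithm 1 could be computed from pairwise win counts. It assumes the commonly stated form of the bound (empirical win rate plus `sqrt(α ln t / n)`, where `n` is the number of comparisons of the pair). The function names, the `wins` matrix layout, and the value 1.0 assigned to never-compared pairs are illustrative assumptions, not taken from the paper's text quoted above.

```python
import math


def rucb_upper_bounds(wins, t, alpha=0.51):
    """Return a K x K matrix of optimistic estimates of P(arm i beats arm j).

    wins[i][j] is the number of duels arm i has won against arm j (assumed layout).
    t is the current time step (t >= 1); alpha is the exploration parameter.
    """
    k = len(wins)
    u = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                u[i][j] = 0.5  # an arm ties with itself
                continue
            n = wins[i][j] + wins[j][i]
            if n == 0:
                # Assumption of this sketch: never-compared pairs get an optimistic bound.
                u[i][j] = 1.0
            else:
                u[i][j] = wins[i][j] / n + math.sqrt(alpha * math.log(t) / n)
    return u


def candidate_champions(u):
    """Arms whose upper bound against every arm is at least 1/2."""
    return [i for i, row in enumerate(u) if all(x >= 0.5 for x in row)]


# Hypothetical win counts for three arms after 300 comparisons.
wins = [[0, 80, 70],
        [20, 0, 60],
        [30, 40, 0]]
u = rucb_upper_bounds(wins, t=300)
champions = candidate_champions(u)
```

A smaller α shrinks the confidence term, so arms that lose often are ruled out as potential champions sooner; the value 0.51 sits just above the 1/2 threshold that the quoted sentence refers to as the limit of the high-probability result.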