Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
Authors: Masrour Zoghi, Shimon Whiteson, Rémi Munos, Maarten de Rijke
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition, our empirical results using real data from an information retrieval application show that it greatly outperforms the state of the art. Finally, we evaluate our method empirically using real data from an information retrieval application. |
| Researcher Affiliation | Collaboration | (1) ISLA, University of Amsterdam, Netherlands; (2) INRIA Lille Nord Europe / MSR-NE |
| Pseudocode | Yes | Algorithm 1 Relative Upper Confidence Bound |
| Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code for the described method is publicly available. |
| Open Datasets | Yes | We evaluated RUCB, Condorcet SAVAGE and BTM using randomly chosen subsets from the pool of 64 rankers provided by LETOR, a standard IR dataset (see 8.4 for more details of the experimental setup) |
| Dataset Splits | No | The paper does not specify exact percentages, sample counts, or explicit methods for training, validation, and test splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models or memory used for experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries, frameworks). |
| Experiment Setup | Yes | For RUCB we set α = 0.51, which approaches the limit set by our high-probability result. |
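
The Experiment Setup row refers to the exploration parameter that RUCB sets to 0.51. The sketch below shows how such a parameter typically enters a pairwise upper confidence bound of the form `wins / plays + sqrt(alpha * ln(t) / plays)`. It is an illustration under stated assumptions, not the authors' code. The function name `rucb_upper_bounds`, the `wins` matrix layout, and the optimistic value of 1.0 for pairs that have never been compared are all choices made for this example.

```python
import math


def rucb_upper_bounds(wins, t, alpha=0.51):
    """Return a k x k matrix of optimistic estimates that arm i beats arm j.

    wins[i][j] is the number of comparisons in which arm i beat arm j.
    t is the current time step (t >= 1).
    alpha is the exploration parameter; 0.51 is the value quoted in the table.
    """
    k = len(wins)
    bounds = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                # An arm compared with itself wins with probability 1/2.
                bounds[i][j] = 0.5
                continue
            plays = wins[i][j] + wins[j][i]
            if plays == 0:
                # Assumption for this sketch: unplayed pairs are treated optimistically.
                bounds[i][j] = 1.0
            else:
                mean = wins[i][j] / plays
                bonus = math.sqrt(alpha * math.log(t) / plays)
                bounds[i][j] = mean + bonus
    return bounds


# Example: three arms, where arms 0 and 2 have never been compared.
example_wins = [
    [0, 3, 0],
    [1, 0, 2],
    [0, 2, 0],
]
example_bounds = rucb_upper_bounds(example_wins, t=10)
```

A smaller `alpha` shrinks the confidence bonus, so the method explores less and commits to a candidate winner sooner.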