Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Double Thompson Sampling for Dueling Bandits
Authors: Huasen Wu, Xin Liu
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments based on both synthetic and real-world data demonstrate that D-TS and D-TS+ significantly improve the overall performance, in terms of regret and robustness. |
| Researcher Affiliation | Academia | Huasen Wu University of California, Davis EMAIL Xin Liu University of California, Davis EMAIL |
| Pseudocode | Yes | Algorithm 1 D-TS for Copeland Dueling Bandits |
| Open Source Code | Yes | Source codes are available at https://github.com/HuasenWu/DuelingBandits. |
| Open Datasets | Yes | Here we present the results for experiments based on the Microsoft Learning to Rank (MSLR) dataset [24], which provides the relevance for queries and ranked documents. ... [24] Microsoft Research, Microsoft Learning to Rank Datasets. http://research.microsoft.com/en-us/projects/mslr/, 2010. |
| Dataset Splits | No | The paper uses the Microsoft Learning to Rank (MSLR) dataset and refers to 'two 5-armed submatrices in [6]' but does not provide specific percentages or counts for training, validation, or test splits for their experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | For BTM, we set the relaxed factor γ = 1.3 as [16]. For algorithms using RUCB and RLCB, including D-TS and D-TS+, we set the scale factor α = 0.51. For RMED1, we use the same settings as [5], and for ECW-RMED, we use the same setting as [7]. |
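
To illustrate how the scale factor α = 0.51 from the Experiment Setup row typically enters RUCB/RLCB-style confidence bounds, here is a minimal sketch. It assumes the commonly used form, empirical win rate plus or minus sqrt(α · log t / n), where n is the number of comparisons between two arms. The function name `confidence_bounds`, the clipping to [0, 1], and the handling of the zero-comparison case are illustrative assumptions, not taken from the paper or its released code.

```python
import math


def confidence_bounds(wins_ij, wins_ji, t, alpha=0.51):
    """Return (lower, upper) bounds on the probability that arm i beats arm j.

    wins_ij: number of times arm i beat arm j so far
    wins_ji: number of times arm j beat arm i so far
    t:       current time step (t >= 1)
    alpha:   scale factor; 0.51 is the value quoted in the table above
    """
    n = wins_ij + wins_ji
    if n == 0:
        # Assumption: with no comparisons yet, the bounds carry no information.
        return 0.0, 1.0
    mean = wins_ij / n
    radius = math.sqrt(alpha * math.log(t) / n)
    # Assumption: clip to [0, 1] so the bounds remain valid probabilities.
    return max(0.0, mean - radius), min(1.0, mean + radius)


if __name__ == "__main__":
    lower, upper = confidence_bounds(wins_ij=30, wins_ji=10, t=100)
    print(f"lower={lower:.4f}, upper={upper:.4f}")
```

A larger α widens the interval and encourages more exploration; the interval narrows as the number of comparisons n grows relative to log t.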