Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Dueling Bandits with Qualitative Feedback

Authors: Liyuan Xu, Junya Honda, Masashi Sugiyama (pp. 5549-5556)

AAAI 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We test the empirical performance of the proposed algorithms through experiments based on both synthetic setting and real-world data.
Researcher Affiliation	Academia	Liyuan Xu (1, 2), Junya Honda (1, 2), Masashi Sugiyama (1, 2); (1) The University of Tokyo, (2) RIKEN
Pseudocode	Yes	Algorithm 1: Thompson Condorcet sampling
Open Source Code	No	The paper mentions '1The longer version including all appendices is available at https://arxiv.org/abs/1809.05274' which is a link to an arXiv preprint, but does not explicitly state that the source code for the described methodology is available at this link or elsewhere.
Open Datasets	Yes	We used two web search datasets. The first is the MSLR-WEB10K dataset (Qin et al. 2010), which consists of 10,000 search queries over the documents from search results. ... The other is the MQ2008 dataset (Qin and Liu 2013)...
Dataset Splits	No	The paper mentions using 'MSLR-WEB10K dataset' and 'MQ2008 dataset' and that it 'repeat[s] 100 runs for each instance', but it does not provide specific details on train/validation/test splits (percentages, sample counts, or explicit references to predefined splits).
Hardware Specification	No	No specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running experiments are provided in the paper.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment.
Experiment Setup	Yes	We set t0 = 10, and the Figure 1 is the experimental result when the number of rankers is K = 5.