Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Preference-based Online Learning with Dueling Bandits: A Survey
Authors: Viktor Bengs, Róbert Busa-Fekete, Adil El Mesaoudi-Paul, Eyke Hüllermeier
JMLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The aim of this paper is to provide a survey of the state of the art in this field, referred to as preference-based multi-armed bandits or dueling bandits. To this end, we provide an overview of problems that have been considered in the literature as well as methods for tackling them. Our taxonomy is mainly based on the assumptions made by these methods about the data-generating process and, related to this, the properties of the preference-based feedback. |
| Researcher Affiliation | Collaboration | Viktor Bengs EMAIL Heinz Nixdorf Institute and Department of Computer Science Paderborn University, Germany Róbert Busa-Fekete EMAIL Google Research New York, NY, USA Adil El Mesaoudi-Paul EMAIL Heinz Nixdorf Institute and Department of Computer Science Paderborn University, Germany Eyke Hüllermeier EMAIL Heinz Nixdorf Institute and Department of Computer Science Paderborn University, Germany |
| Pseudocode | No | The paper is a survey discussing various algorithms, but it does not present its own pseudocode or algorithm blocks. It describes algorithms in narrative form. |
| Open Source Code | No | An attempt to address this issue for the programming language Python is made by the duelpy package. https://gitlab.com/duelpy/duelpy |
| Open Datasets | No | This paper is a survey and does not report on experiments using datasets. It discusses datasets used in other research, but not in its own methodology. |
| Dataset Splits | No | This paper is a survey and does not contain experimental results that would require dataset splits. |
| Hardware Specification | No | This paper is a survey and does not describe experimental methodology that would involve hardware specifications. |
| Software Dependencies | No | This paper is a survey and does not specify software dependencies with version numbers for its own methodology. |
| Experiment Setup | No | This paper is a survey and does not describe an experimental setup or hyperparameter details for its own research. |