Choice Bandits
Authors: Arpit Agarwal, Nicholas Johnson, Shivani Agarwal
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that for the special case of k = 2, our algorithm is competitive with previous dueling bandit algorithms, and for the more general case k > 2, outperforms the recently proposed Max Min UCB algorithm designed for the MNL model. |
| Researcher Affiliation | Academia | Arpit Agarwal, University of Pennsylvania, Philadelphia, PA 19104, USA (aarpit@seas.upenn.edu); Nicholas Johnson, University of Minnesota, Minneapolis, MN 55455, USA (njohnson@cs.umn.edu); Shivani Agarwal, University of Pennsylvania, Philadelphia, PA 19104, USA (ashivani@seas.upenn.edu) |
| Pseudocode | Yes | Algorithm 1 Winner Beats All (WBA) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | Yes | Sushi: Choice model extracted from the Sushi dataset [47] |
| Dataset Splits | No | The paper does not specify dataset splits for training, validation, or testing, nor does it mention cross-validation. It refers generally to 'datasets' for experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., programming languages, libraries, or solvers). |
| Experiment Setup | Yes | The parameter C in our algorithm was set to 1. ... We set α = 0.51 for RUCB and DTS, f(K) = 0.3K^1.01 for RMED, and γ = 1.3 for BTM. ... We set the parameter α to be 0.51 for Max Min UCB. |
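The hyperparameters quoted in the Experiment Setup row can be gathered into a single configuration mapping for quick reference. This is a minimal sketch: the key names and the mapping itself are hypothetical (the paper defines no such structure); only the numeric values come from the quoted text.

```python
# Hyperparameters reported in the paper's experiment setup.
# Key names are hypothetical labels for each baseline; only the
# numeric values are taken from the paper's quoted settings.
EXPERIMENT_CONFIG = {
    "WBA": {"C": 1},                              # the authors' algorithm
    "RUCB": {"alpha": 0.51},
    "DTS": {"alpha": 0.51},
    "RMED": {"f": lambda K: 0.3 * K ** 1.01},     # exploration function f(K)
    "BTM": {"gamma": 1.3},
    "MaxMinUCB": {"alpha": 0.51},
}

# Example: evaluate RMED's f(K) at K = 10 arms.
f_at_10 = EXPERIMENT_CONFIG["RMED"]["f"](10)
```

Collecting the settings this way makes it easy to check that the same α = 0.51 is shared by RUCB, DTS, and Max Min UCB, as the paper states.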