Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Blocking Bandits

Authors: Soumya Basu, Rajat Sen, Sujay Sanghavi, Sanjay Shakkottai

NeurIPS 2019 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental Evaluation: Synthetic Experiments: We first validate our results on synthetic experiments, where we use K = 20 arms."
Researcher Affiliation | Collaboration | Soumya Basu, Sujay Sanghavi (UT Austin, Amazon), Sanjay Shakkottai
Pseudocode | Yes | "Algorithm 1 Upper Confidence Bound Greedy"
Open Source Code | No | The paper does not provide any explicit statement or link for open-source code.
Open Datasets | Yes | "We perform jokes recommendation experiment using the Jester joke dataset [14]."
Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits.
Hardware Specification | No | The paper mentions running experiments but does not provide specific hardware details (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "The gaps in mean rewards of the arms are fixed with ∆_{i(i+1)}, chosen uniformly at random (u.a.r.) from [0.01, 0.05] for all i = 1 to 19. We also fix µ_K = 0. The rewards are distributed as Bernoulli random variables with mean µ_i. The delays are fixed either 1) by sampling all delays u.a.r. from [1, 10] (small delay instances), or 2) u.a.r. from [11, 20] (large delay instances), or 3) by fixing all the delays to a single value. ... We rescale the ratings to [0, 1] using x → (x + 10)/20."
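The synthetic setup quoted above (K = 20 arms, consecutive mean-reward gaps drawn u.a.r. from [0.01, 0.05], µ_K = 0, Bernoulli rewards, per-arm blocking delays) and the paper's Algorithm 1 (UCB Greedy) can be sketched as follows. This is a hedged reconstruction from the table's excerpts, not the authors' code: the function names, the confidence radius sqrt(2 ln t / n), and the convention that an arm pulled at round t with delay D is unavailable for the next D − 1 rounds are our assumptions.

```python
import math
import random

def make_instance(K=20, delay_range=(1, 10), seed=0):
    """Synthetic blocking-bandit instance mirroring the quoted setup:
    gaps between consecutive means drawn u.a.r. from [0.01, 0.05],
    mu_K = 0, and each arm's delay drawn u.a.r. from delay_range."""
    rng = random.Random(seed)
    gaps = [rng.uniform(0.01, 0.05) for _ in range(K - 1)]
    mus = [0.0] * K
    for i in range(K - 2, -1, -1):       # mu_i = mu_{i+1} + gap_i, mu_{K-1} = 0
        mus[i] = mus[i + 1] + gaps[i]
    delays = [rng.randint(*delay_range) for _ in range(K)]
    return mus, delays

def ucb_greedy(mus, delays, T=10000, seed=1):
    """UCB Greedy sketch: each round, play the *available* arm with the
    highest UCB index; a pulled arm i is then blocked for delays[i] - 1
    subsequent rounds. Rewards are Bernoulli with means mus."""
    rng = random.Random(seed)
    K = len(mus)
    counts = [0] * K
    sums = [0.0] * K
    free_at = [1] * K                    # earliest round each arm is playable
    total_reward = 0.0
    for t in range(1, T + 1):
        available = [i for i in range(K) if free_at[i] <= t]
        if not available:                # every arm blocked: skip this round
            continue
        def index(i):
            if counts[i] == 0:           # force one exploratory pull per arm
                return float("inf")
            return sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
        a = max(available, key=index)
        r = 1.0 if rng.random() < mus[a] else 0.0
        counts[a] += 1
        sums[a] += r
        total_reward += r
        free_at[a] = t + delays[a]       # blocked for delays[a] - 1 more rounds
    return total_reward
```

Sampling delays from [1, 10] versus [11, 20] reproduces the paper's small- and large-delay instance families; passing a degenerate range such as (d, d) gives the third, fixed-delay variant.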