The Blinded Bandit: Learning with Adaptive Feedback

Authors: Ofer Dekel, Elad Hazan, Tomer Koren

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We develop efficient online learning algorithms for this problem and prove that they guarantee the same asymptotic regret as the optimal algorithms for the standard multi-armed bandit problem. In this paper, we present a new algorithm for the blinded bandit setting and prove that it guarantees a regret of O(√T) on any oblivious sequence of loss values. (The regret notion being referred to is recalled below the table.)
Researcher Affiliation | Collaboration | Ofer Dekel, Microsoft Research (oferd@microsoft.com); Elad Hazan, Technion (ehazan@ie.technion.ac.il); Tomer Koren, Technion (tomerk@technion.ac.il)
Pseudocode | Yes | Algorithm 1: BLINDED EXP3 and Algorithm 2: BLINDED GEOMETRICHEDGE (a rough illustrative sketch of the blinded-feedback idea follows the table)
Open Source Code | No | The paper does not mention providing access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not conduct experiments on datasets, thus no dataset access information is provided.
Dataset Splits | No | The paper is theoretical and does not involve empirical evaluation with dataset splits.
Hardware Specification | No | The paper is theoretical and does not conduct experiments, therefore no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not conduct experiments, therefore no specific software dependencies with version numbers are mentioned.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations.
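For context on the Research Type row: the regret being referred to is the standard adversarial-bandit regret against the best fixed arm in hindsight. The display below restates the usual textbook definition rather than quoting the paper; the symbols T (horizon), k (number of arms), ℓ_t (losses in [0, 1]), and x_t (the arm played on round t) are the customary ones, not notation lifted from this paper.

\[
R_T \;=\; \mathbb{E}\!\left[\sum_{t=1}^{T} \ell_t(x_t)\right] \;-\; \min_{x \in \{1,\dots,k\}} \sum_{t=1}^{T} \ell_t(x) \;=\; O\!\left(\sqrt{T}\right).
\]

The row's claim is that this rate remains achievable even when feedback is withheld on rounds where the player switches arms, matching the optimal rate in T for the standard multi-armed bandit.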
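To make the Pseudocode row concrete, here is a minimal Python sketch of an EXP3-style strategy adapted to blinded feedback. It is an assumption-laden illustration, not the paper's BLINDED EXP3 pseudocode: the two-rounds-per-sample pairing, the function name blinded_exp3_sketch, and the loss_fn oracle are hypothetical choices made for this example. The only idea taken from the setting itself is that feedback from rounds on which a switch may have occurred is never used for learning.

```python
import numpy as np

def blinded_exp3_sketch(loss_fn, num_arms, horizon, eta=0.1, seed=0):
    """EXP3-style sketch for bandits with blinded (switch-dependent) feedback.

    Each sampled arm is played for two consecutive rounds: the first round
    of the pair may follow a switch, so its feedback is treated as withheld
    and discarded; the second round repeats the same arm (no switch), so its
    loss is observed and drives an importance-weighted exponential update.
    `loss_fn(t, arm)` is a hypothetical oracle returning a loss in [0, 1]
    from an oblivious (fixed-in-advance) loss sequence.
    """
    rng = np.random.default_rng(seed)
    weights = np.ones(num_arms)
    t = 0
    while t < horizon:
        probs = weights / weights.sum()
        arm = rng.choice(num_arms, p=probs)

        # Round 1 of the pair: a switch may have happened here, so any
        # feedback is considered blinded and is simply not used.
        t += 1

        if t < horizon:
            # Round 2 of the pair: same arm, no switch, the loss is observed.
            loss = loss_fn(t, arm)
            t += 1
            # Standard EXP3 importance-weighted update on the observed loss.
            estimate = np.zeros(num_arms)
            estimate[arm] = loss / probs[arm]
            weights *= np.exp(-eta * estimate)
    return weights / weights.sum()

# Example usage with a toy oblivious loss sequence (arm 0 is slightly better).
if __name__ == "__main__":
    gen = np.random.default_rng(1)
    losses = gen.uniform(0.0, 1.0, size=(10_000, 3))
    losses[:, 0] *= 0.8
    final_probs = blinded_exp3_sketch(lambda t, a: losses[t, a], 3, 10_000)
    print(final_probs)  # probability mass should concentrate on arm 0
```

In this sketch the player learns from only about half of the rounds, which affects the regret by a constant factor only and is therefore consistent with the table's claim that the asymptotic √T rate matches the standard bandit setting.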