Balanced Linear Contextual Bandits

Authors: Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens (pp. 3445-3453)

AAAI 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data. |
| Researcher Affiliation | Academia | 1. Department of Management Science & Engineering, Stanford University; 2. Department of Electrical Engineering, Stanford University; 3. Graduate School of Business, Stanford University |
| Pseudocode | Yes | Algorithm 1: Balanced Linear Thompson Sampling; Algorithm 2: Balanced Linear UCB |
| Open Source Code | No | The paper does not provide any links to, or statements about, open-source code for the described methodology. |
| Open Datasets | Yes | We use 300 multiclass datasets from the Open Machine Learning platform (OpenML). |
| Dataset Splits | No | The paper mentions cross-validation for parameter tuning but does not specify explicit train/validation/test splits for the datasets, nor does it cite predefined splits. It states only that each dataset is randomly shuffled. |
| Hardware Specification | No | The paper does not provide any details about the hardware used to run the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper does not provide version numbers for any software dependencies used in the experiments. |
| Experiment Setup | Yes | The regularization parameter λ, present in all algorithms, is chosen via cross-validation every time the model is updated. The constant α, present in all algorithms, is optimized over the values 0.25, 0.5, 1 for the Thompson sampling bandits... and over the values 1, 2, 4 for the UCB bandits... The propensity threshold γ for BLTS and BLUCB is optimized over the values 0.01, 0.05, 0.1, 0.2. |
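The Pseudocode and Experiment Setup rows name Algorithm 1 (Balanced Linear Thompson Sampling) and its hyperparameters: the ridge penalty λ, the posterior-inflation constant α, and the propensity threshold γ. A minimal sketch of the balancing idea is below; the class name, method names, and per-arm update details are illustrative assumptions, not the paper's reference implementation. The core mechanism, as described in the paper, is to weight each observation by its clipped inverse propensity before the per-arm ridge regression, then choose arms by sampling coefficients from the inflated posterior.

```python
import numpy as np


class BalancedLinearTS:
    """Hypothetical sketch of Balanced Linear Thompson Sampling (BLTS).

    Each observation is weighted by 1 / max(propensity, gamma) before
    updating a per-arm weighted ridge regression; arms are selected by
    Thompson sampling from the posterior inflated by alpha.
    """

    def __init__(self, n_arms, dim, lam=1.0, alpha=0.25, gamma=0.1, seed=0):
        self.lam, self.alpha, self.gamma = lam, alpha, gamma
        self.rng = np.random.default_rng(seed)
        # Per-arm sufficient statistics of the weighted ridge regression.
        self.V = [lam * np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def choose(self, x):
        """Sample coefficients per arm and play the highest-scoring arm."""
        scores = []
        for V, b in zip(self.V, self.b):
            V_inv = np.linalg.inv(V)
            theta_hat = V_inv @ b
            # Posterior covariance inflated by alpha**2 (Thompson sampling).
            theta = self.rng.multivariate_normal(theta_hat, self.alpha**2 * V_inv)
            scores.append(x @ theta)
        return int(np.argmax(scores))

    def update(self, arm, x, reward, propensity):
        """Balancing step: clip the propensity below at gamma, weight by its inverse."""
        w = 1.0 / max(propensity, self.gamma)
        self.V[arm] += w * np.outer(x, x)
        self.b[arm] += w * reward * x
```

A hyperparameter sweep of the kind described in the Experiment Setup row would then instantiate this class over the listed grids for α (0.25, 0.5, 1) and γ (0.01, 0.05, 0.1, 0.2).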