Balanced Linear Contextual Bandits
Authors: Maria Dimakopoulou, Zhengyuan Zhou, Susan Athey, Guido Imbens
Pages: 3445-3453
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data. |
| Researcher Affiliation | Academia | (1) Department of Management Science & Engineering, Stanford University; (2) Department of Electrical Engineering, Stanford University; (3) Graduate School of Business, Stanford University |
| Pseudocode | Yes | Algorithm 1 Balanced Linear Thompson Sampling; Algorithm 2 Balanced Linear UCB |
| Open Source Code | No | The paper does not provide any specific links or statements about releasing open-source code for the described methodology. |
| Open Datasets | Yes | We use 300 multiclass datasets from the Open Media Library (OpenML). |
| Dataset Splits | No | The paper mentions 'cross-validation' for parameter tuning but does not specify explicit train/validation/test splits for the datasets, nor does it cite predefined splits. It only states that 'Each dataset is randomly shuffled.' |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, memory). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments. |
| Experiment Setup | Yes | The regularization parameter λ, which is present in all algorithms, is chosen via cross-validation every time the model is updated. The constant α, which is present in all algorithms, is optimized among values 0.25, 0.5, 1 in the Thompson sampling bandits... and among values 1, 2, 4 in the UCB bandits... The propensity threshold γ for BLTS and BLUCB is optimized among the values 0.01, 0.05, 0.1, 0.2. |
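
The search space quoted in the Experiment Setup row can be written out as a small grid. The following is a minimal sketch, not the authors' code: the names `ALPHA_GRID`, `GAMMA_GRID`, and `candidate_configs` are hypothetical, and the sketch assumes that γ is only searched for the balanced variants (BLTS and BLUCB), as the row states. The regularization parameter λ is left out because the row says it is chosen by cross-validation at every model update rather than from a fixed list.

```python
from itertools import product

# Candidate values as quoted in the Experiment Setup row.
ALPHA_GRID = {
    "thompson": (0.25, 0.5, 1.0),  # Thompson sampling bandits
    "ucb": (1.0, 2.0, 4.0),        # UCB bandits
}
GAMMA_GRID = (0.01, 0.05, 0.1, 0.2)  # propensity threshold, balanced variants only


def candidate_configs(family, balanced):
    """Enumerate (alpha, gamma) candidates for one algorithm family.

    `family` is "thompson" or "ucb" (hypothetical labels for this sketch).
    `balanced` selects the balanced variant (BLTS or BLUCB); unbalanced
    variants have no propensity threshold, so gamma is None for them.
    """
    if family not in ALPHA_GRID:
        raise ValueError(f"unknown family: {family!r}")
    gammas = GAMMA_GRID if balanced else (None,)
    return [
        {"alpha": alpha, "gamma": gamma}
        for alpha, gamma in product(ALPHA_GRID[family], gammas)
    ]


if __name__ == "__main__":
    for family in ALPHA_GRID:
        for balanced in (False, True):
            configs = candidate_configs(family, balanced)
            print(family, "balanced" if balanced else "unbalanced", len(configs))
```

Under these assumptions each balanced variant has 3 × 4 = 12 candidate configurations and each unbalanced variant has 3. How the best configuration is selected is not specified in the table beyond the word "optimized".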