Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
Authors: Chicheng Zhang, Alekh Agarwal, Hal Daumé III, John Langford, Sahand Negahban
ICML 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we evaluate some of these algorithms on a large selection of datasets, showing that our approach is both feasible, and helpful in practice. |
| Researcher Affiliation | Collaboration | ¹Microsoft Research, ²University of Maryland, ³Yale University. |
| Pseudocode | Yes | Algorithm 1: Adaptive Reweighting for Robustly Warmstarting Contextual Bandits (ARROW-CB) |
| Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found. |
| Open Datasets | Yes | We compare these approaches on 524 binary and multiclass classification datasets from Bietti et al. (2018), which in turn are from openml.org. |
| Dataset Splits | Yes | Partition $S$ to $E+1$ equally sized sets $S^{\mathrm{tr}}, S^{\mathrm{val}}_1, \ldots, S^{\mathrm{val}}_E$. ... where a separate validation set is used to pick λ. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments were provided. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment were provided. |
| Experiment Setup | Yes | All the algorithms (other than SUP-ONLY and MAJORITY, which do not explore) use ϵ-greedy exploration, with most of the results presented using ϵ = 0.0125. We additionally present the results for ϵ = 0.1 and ϵ = 0.0625 in Appendix J. ... We vary the number of warm-start examples $n_s$ in {0.005n, 0.01n, 0.02n, 0.04n}, and the number of CB examples $n_b$ in {0.92n, 0.46n, 0.23n, 0.115n}. |
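
The split and setup rows above can be illustrated with a minimal Python sketch. It is not the authors' code, since no source release was found. The function names, the shuffling seed, the dropping of leftover examples, and the pairing of every warm-start size with every CB size are all assumptions made for illustration. The exploration rule is the textbook ϵ-greedy distribution, which may differ in detail from the paper's implementation.

```python
import random


def partition(samples, num_val_sets, seed=0):
    """Split samples into one training set and num_val_sets validation sets.

    All num_val_sets + 1 sets have equal size; leftover examples are
    dropped (an assumption, the quoted text does not say how they are handled).
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    size = len(shuffled) // (num_val_sets + 1)
    chunks = [shuffled[i * size:(i + 1) * size] for i in range(num_val_sets + 1)]
    return chunks[0], chunks[1:]


def epsilon_greedy_probs(greedy_action, num_actions, epsilon):
    """Textbook epsilon-greedy: spread epsilon uniformly over all actions,
    and put the remaining 1 - epsilon on the greedy action."""
    probs = [epsilon / num_actions] * num_actions
    probs[greedy_action] += 1.0 - epsilon
    return probs


def experiment_grid(n):
    """Pair every warm-start size with every CB size (the Cartesian
    product is an assumption; the fractions come from the table above)."""
    warm_fractions = [0.005, 0.01, 0.02, 0.04]
    cb_fractions = [0.92, 0.46, 0.23, 0.115]
    return [(round(w * n), round(b * n))
            for w in warm_fractions for b in cb_fractions]


if __name__ == "__main__":
    train, val_sets = partition(range(100), num_val_sets=4)
    print(len(train), [len(v) for v in val_sets])
    print(epsilon_greedy_probs(greedy_action=1, num_actions=4, epsilon=0.0125))
    print(experiment_grid(1000)[:4])
```

Returning the full action distribution, rather than a sampled action, keeps the probability of the chosen action available for importance-weighted updates on bandit feedback.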