Practical Contextual Bandits with Regression Oracles

Authors: Dylan Foster, Alekh Agarwal, Miroslav Dudik, Haipeng Luo, Robert Schapire

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In an extensive empirical evaluation, we find that our approach typically matches or outperforms both realizability-based and agnostic baselines.
Researcher Affiliation | Collaboration | 1. Cornell University (work performed while the author was an intern at Microsoft Research); 2. Microsoft Research; 3. University of Southern California.
Pseudocode | Yes | Algorithm 1 REGCB.ELIMINATION ... Algorithm 2 REGCB.OPTIMISTIC ... Algorithm 3 BINSEARCH
Open Source Code | No | The paper references an implementation for the baselines ("We use an implementation available at https://github.com/akshaykr/oracle_cb"), but does not state that the code for its own proposed methods (RegCB) is open-source or available.
Open Datasets | Yes | We use two large-scale learning-to-rank datasets, Microsoft MSLR-WEB30k (mslr) (Qin & Liu, 2010) and Yahoo! Learning to Rank Challenge V2.0 (yahoo) (Chapelle & Chang, 2011)... We also use eight classification datasets from the UCI repository (Lichman, 2013).
Dataset Splits | Yes | Each dataset is split into training data, for which the algorithm receives one example at a time and must predict online, and a holdout validation set.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., CPU or GPU models, memory).
Software Dependencies | No | The paper mentions software used for the baselines (e.g., scikit-learn is cited only in general terms), but does not list specific software dependencies or version numbers needed to replicate the experiments for its own methods.
Experiment Setup | Yes | Parameter Tuning: For ε-Greedy we tune the constant ε, and for ILTCB we tune a certain smoothing parameter (see Appendix B). For Algorithms 1 and 2 we set β_m = β for all m and tune β. For Algorithm 2 we use a warm start of 0. We tune a confidence parameter similar to β for Bootstrap-TS.
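
The Dataset Splits and Experiment Setup rows describe an online evaluation protocol: the learner receives one example at a time, commits to a single action, and observes only that action's reward, with exploration and confidence parameters (ε for ε-Greedy, β for RegCB) tuned per run. The sketch below is a minimal illustration of such a protocol for a classification dataset treated as a bandit problem, assuming an ε-greedy policy over per-action ridge-regression oracles; the function name, the ridge oracle, and the 0/1 reward encoding are illustrative assumptions, not the paper's RegCB implementation.

```python
# Hedged sketch: an online epsilon-greedy bandit loop over a classification
# dataset, with a per-action ridge-regression reward oracle. All names and
# modeling choices here are illustrative assumptions.
import numpy as np

def run_epsilon_greedy(X, y, num_actions, epsilon=0.05, reg=1.0, seed=0):
    """Process examples one at a time; return the progressive average reward."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Per-action ridge statistics: A_a = reg*I + sum(x x^T), b_a = sum(r x).
    A = [reg * np.eye(d) for _ in range(num_actions)]
    b = [np.zeros(d) for _ in range(num_actions)]
    total_reward = 0.0
    for x, label in zip(X, y):
        # Greedy action according to the current regression estimates.
        scores = [x @ np.linalg.solve(A[a], b[a]) for a in range(num_actions)]
        action = int(np.argmax(scores))
        if rng.random() < epsilon:            # explore uniformly w.p. epsilon
            action = int(rng.integers(num_actions))
        reward = float(action == label)       # bandit feedback: only the chosen arm's reward is seen
        total_reward += reward
        A[action] += np.outer(x, x)           # update only the chosen arm's oracle
        b[action] += reward * x
    return total_reward / len(y)

# Example usage with synthetic data (10 classes as 10 actions, 20 features):
# X = np.random.randn(5000, 20); y = np.random.randint(10, size=5000)
# print(run_epsilon_greedy(X, y, num_actions=10, epsilon=0.05))
```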