Post-Contextual-Bandit Inference
Authors: Aurélien Bibaut, Maria Dimakopoulou, Nathan Kallus, Antoine Chambaz, Mark van der Laan
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive numerical experiments using 57 OpenML datasets demonstrate that confidence intervals based on CADR uniquely provide correct coverage. |
| Researcher Affiliation | Collaboration | Aurélien Bibaut (Netflix, abibaut@netflix.com); Maria Dimakopoulou (Netflix, mdimakopoulou@netflix.com); Nathan Kallus (Cornell University and Netflix, kallus@cornell.edu); Antoine Chambaz (Université de Paris, antoine.chambaz@u-paris.fr); Mark van der Laan (University of California, Berkeley, laan@stat.berkeley.edu) |
| Pseudocode | Yes | Algorithm 1: The CADR Estimator and Confidence Interval (a hedged illustrative sketch of such an estimator appears after this table). |
| Open Source Code | Yes | The code can be found at https://github.com/mdimakopoulou/post-contextual-bandit-inference. |
| Open Datasets | Yes | We use the public OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18; BSD 3-Clause license) [Bischl et al., 2017] |
| Dataset Splits | Yes | 8 different training procedures (sequential cross-fitting vs. cross-time cross-fitting in Figures 3 and 3; misspecified vs. well-specified outcome model family in Figures 4 and 5; weighted vs. unweighted outcome model fitting in Figures 6 and 7; large data vs. small data in Figures 3 and 8). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'linear regression or decision-tree regression, both using default sklearn parameters' but does not specify version numbers for `sklearn` or any other software dependencies. |
| Experiment Setup | Yes | To generate our data, we set T = 10000 and use the following ϵ-greedy procedure. We pull arms uniformly at random until each arm has been pulled at least once. Then at each subsequent round t, we fit Q̂_{t−1} using the data up to that time in the same fashion as used for the DM estimator above using decision-tree regressions. We set A_x(t) = argmax_{a=1,…,K} Q̂_{t−1}(a, X(t)) and ϵ_t = 0.01 ∨ t^(−1/3). We then let g_t(a \| x) = ϵ_t/K for a ≠ A_x(t) and g_t(A_x(t) \| x) = 1 − ϵ_t + ϵ_t/K. (An illustrative sketch of this logging procedure appears after this table.) |
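
To make the quoted setup concrete, the sketch below replays an ϵ-greedy logging policy of the kind described: uniform arm pulls until every arm has been tried once, a per-arm decision-tree outcome model refit on the data collected so far, a greedy arm with exploration rate ϵ_t = 0.01 ∨ t^(−1/3), and the stated propensities. The synthetic reward function, problem sizes, and helper names are illustrative assumptions of this sketch, not the authors' released code (linked in the "Open Source Code" row).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
T, K, d = 2_000, 3, 5            # smaller than the paper's T = 10000, for a quick run

# Illustrative environment (not from the paper): linear rewards with Gaussian noise.
theta = rng.normal(size=(K, d))
def reward(a, x):
    return float(x @ theta[a] + 0.1 * rng.normal())

contexts, arms, rewards, props = [], [], [], []
for t in range(1, T + 1):
    x = rng.normal(size=d)
    pulled = np.bincount(np.asarray(arms, dtype=int), minlength=K)
    if (pulled == 0).any():
        # Pull arms uniformly at random until each arm has been pulled at least once.
        g = np.full(K, 1.0 / K)
    else:
        # Fit Q-hat_{t-1} on the data gathered so far: one decision-tree regression per arm.
        Xs, As, Ys = np.array(contexts), np.array(arms), np.array(rewards)
        q = np.array([
            DecisionTreeRegressor().fit(Xs[As == a], Ys[As == a]).predict(x[None, :])[0]
            for a in range(K)
        ])
        eps = max(0.01, t ** (-1 / 3))              # assumed schedule: eps_t = 0.01 v t^(-1/3)
        g = np.full(K, eps / K)                     # g_t(a | x) = eps_t / K for non-greedy arms
        g[int(np.argmax(q))] = 1 - eps + eps / K    # g_t(A_x(t) | x) = 1 - eps_t + eps_t / K
    a = int(rng.choice(K, p=g))
    contexts.append(x); arms.append(a); rewards.append(reward(a, x)); props.append(g[a])
```

The lists `contexts`, `arms`, `rewards`, and `props` are the ingredients consumed by the estimator sketch below.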
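
For the "Pseudocode" row above: the paper's Algorithm 1 specifies the CADR estimator and its confidence interval. The function below is only a hedged sketch in the spirit of that algorithm, assuming a doubly robust score built from a cross-time-fitted outcome model Q̂_{t−1} and the known logging propensities g_t, inverse-conditional-standard-deviation weights, and a martingale-CLT Wald interval; the name `cadr_style_ci`, the callable `q_hat`, and the crude running standard-deviation estimate are inventions of this sketch, so consult Algorithm 1 and the linked repository for the authors' exact construction.

```python
import numpy as np
from scipy.stats import norm

def cadr_style_ci(contexts, arms, rewards, props, q_hat, target_policy, alpha=0.05):
    """Hedged sketch: estimate a deterministic target policy's value with a
    variance-stabilized doubly robust score and report a Wald-style interval.

    props[t] : logging probability g_t(arms[t] | contexts[t]), known by design.
    q_hat    : q_hat(t, a, x) -> prediction of Q-hat_{t-1}(a, x), fit on data before round t.
    """
    T = len(rewards)
    scores = np.empty(T)
    for t in range(T):
        x, a, y, g = contexts[t], arms[t], rewards[t], props[t]
        pi_a = target_policy(x)
        # Doubly robust score: direct-model term plus importance-weighted residual.
        scores[t] = q_hat(t, pi_a, x) + (a == pi_a) / g * (y - q_hat(t, a, x))

    # Crude running estimate of the score's conditional std. dev. using past rounds only
    # (an assumption of this sketch; Algorithm 1 specifies its own estimate).
    sd_prev = np.ones(T)
    for t in range(10, T):                      # short burn-in with unit weights
        sd_prev[t] = max(np.std(scores[:t]), 1e-3)
    w = 1.0 / sd_prev

    psi_hat = float(np.sum(w * scores) / np.sum(w))
    # Martingale-CLT heuristic: sum_t w_t * (D_t - psi) is approximately N(0, T),
    # so psi_hat +/- z * sqrt(T) / sum(w) gives a (1 - alpha) interval.
    half = norm.ppf(1 - alpha / 2) * np.sqrt(T) / float(np.sum(w))
    return psi_hat, (psi_hat - half, psi_hat + half)
```

With the logged data from the previous sketch, `q_hat` could be implemented by replaying the per-round tree fits (or by caching each round's predictions while logging); coverage guarantees for data collected by an adaptive, possibly vanishing-exploration policy are what the paper's theory establishes for the actual CADR construction.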