Efficient Contextual Bandits with Continuous Actions
Authors: Maryam Majzoubi, Chicheng Zhang, Rajan Chari, Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that it works in a general sense and verify the new functionality with large-scale experiments. We implement our algorithms in Vowpal Wabbit (vowpalwabbit.org), and compare with baselines on real datasets. Experiments demonstrate the efficacy and efficiency of our approach (Section 5). |
| Researcher Affiliation | Collaboration | Maryam Majzoubi (New York University); Chicheng Zhang (University of Arizona); Rajan Chari (Microsoft Research); Akshay Krishnamurthy (Microsoft Research); John Langford (Microsoft Research); Aleksandrs Slivkins (Microsoft Research) |
| Pseudocode | Yes | Algorithm 1 CATS: continuous action tree with smoothing; Algorithm 2 Tree training: Train tree; Algorithm 3 CATS Off (a minimal sketch of the smoothing step follows the table). |
| Open Source Code | No | The paper implements its algorithms in Vowpal Wabbit (vowpalwabbit.org), an existing open-source platform, but it does not provide a direct link to the authors' own implementation of the specific methods described in the paper. |
| Open Datasets | Yes | We evaluate our approach on six large-scale regression datasets... five are selected from OpenML with the criterion of having millions of samples with unique regression values (See Appendix F for more details). The OpenML datasets include: Microsoft (ID: 235127), Yandex (ID: 235128), Epsilon (ID: 235129), Helena (ID: 235130), Higgs (ID: 235131); a fetch-and-split sketch follows the table. |
| Dataset Splits | No | The paper states: 'We create an 80-20% split of training and test sets.' and mentions 'progressive validation [15] for online evaluation', but it does not specify an explicit train/validation/test split or a distinct validation-set percentage/count for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper states 'We implement our algorithms in Vowpal Wabbit (vowpalwabbit.org)', but it does not provide specific version numbers for Vowpal Wabbit or any other software dependency needed to replicate the experiments (an illustrative VW invocation follows the table). |
| Experiment Setup | Yes | All algorithms use ϵ = 0.05; see Appendix F for additional experimental details. With the training set, we first collect interaction log tuples of (x_t, a_t, P_t(a_t | x_t), ℓ_t(a_t)) using CATS with initial discretization and smoothing parameters (K_init, h_init) = (4, 1/4) and greedy parameter ϵ = 0.05. We then run CATS Off over the logged data using J, defined in (1), as the set of parameters; an off-policy evaluation sketch follows the table. |
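
To make the Pseudocode row concrete: below is a minimal Python sketch of the action-selection step of Algorithm 1 (CATS), i.e., an ϵ-greedy choice over K discretized bins followed by uniform smoothing of half-width h around the chosen bin center, which also yields the logged propensity P_t(a_t | x_t). The `tree_predict` helper is a hypothetical stand-in for the paper's learned tree policy, the action range is assumed to be [0, 1], and boundary handling is omitted; this is an illustration, not the authors' implementation.

```python
import random

def tree_predict(x, K):
    """Hypothetical stand-in for the learned tree routing context x to a bin."""
    return hash(tuple(x)) % K  # placeholder: any deterministic context-to-bin map

def cats_act(x, K=4, h=0.25, eps=0.05):
    """eps-greedy over K bins, then uniform smoothing of half-width h (CATS-style)."""
    greedy = tree_predict(x, K)
    k = random.randrange(K) if random.random() < eps else greedy
    center = (k + 0.5) / K  # bin centers of a uniform K-way discretization of [0, 1]
    a = random.uniform(center - h, center + h)  # boundary clipping omitted for brevity
    # Logged propensity: density of the eps-greedy smoothed policy at action a,
    # summing each bin's uniform density weighted by its selection probability.
    density = sum(
        ((1 - eps) * (j == greedy) + eps / K) / (2 * h)
        for j in range(K)
        if abs(a - (j + 0.5) / K) <= h
    )
    return a, density

action, propensity = cats_act([0.3, 1.2], K=4, h=0.25, eps=0.05)
```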
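
For the Open Datasets and Dataset Splits rows, here is a hedged sketch of fetching one of the listed datasets and making the paper's 80-20% train/test split. It assumes the IDs in the table are OpenML dataset IDs that scikit-learn's `fetch_openml` can retrieve (if they are task or run IDs, the call would need adjusting), and the `random_state` is arbitrary since the paper pins no seed.

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Assumption: 235127 ("Microsoft" in the table above) is an OpenML dataset ID.
data = fetch_openml(data_id=235127, as_frame=True)

# The paper's 80-20% split of training and test sets (no validation set is specified).
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)
```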
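
Because the Software Dependencies row notes that no Vowpal Wabbit version is pinned, a reproduction has to guess at the exact invocation. The sketch below shells out to the `vw` binary with the continuous-action (CATS) flags documented for recent VW releases; the flag names, the action range [0, 1], and the file name `train.dat` are assumptions, while the parameter values mirror the (K_init, h_init) = (4, 1/4) and ϵ = 0.05 quoted above.

```python
import subprocess

# Illustrative only: flag names follow VW's documented CATS options and may
# differ across versions; "train.dat" is a hypothetical logged-interaction file.
cmd = [
    "vw",
    "--cats", "4",          # number of discretized actions (K_init = 4)
    "--bandwidth", "0.25",  # smoothing half-width (h_init = 1/4)
    "--min_value", "0.0",   # assumed action range [0, 1]
    "--max_value", "1.0",
    "--epsilon", "0.05",    # greedy parameter from the table above
    "--data", "train.dat",
]
subprocess.run(cmd, check=True)
```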
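
Finally, the Experiment Setup row describes logging tuples (x_t, a_t, P_t(a_t | x_t), ℓ_t(a_t)) and then running CATS Off over them with the parameter set J. Below is a generic importance-weighted off-policy estimate in that spirit, not the authors' exact estimator; `candidate_density` and `make_density` are hypothetical helpers giving the density a candidate (K, h) policy places on the logged action.

```python
def off_policy_value(logged, candidate_density):
    """Importance-weighted average loss of a candidate policy over logged data.

    `logged` holds tuples (x, a, p, loss) with p the logged density P_t(a | x);
    `candidate_density(x, a)` is the candidate policy's density at (x, a).
    """
    return sum(
        loss * candidate_density(x, a) / p for x, a, p, loss in logged
    ) / len(logged)

# Selecting the best (K, h) from a parameter set J by minimizing estimated loss;
# `make_density` would build the candidate policy's density function.
# best_K, best_h = min(J, key=lambda kh: off_policy_value(logged, make_density(*kh)))
```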