reproducibilityindex.ai

Weighted Linear Bandits for Non-Stationary Environments

Authors: Yoan Russac, Claire Vernade, Olivier Cappé

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also illustrate the empirical performance of D-LinUCB and compare it with recently proposed alternatives in simulated environments. This section is devoted to the evaluation of the empirical performance of D-LinUCB. We first consider two simulated low-dimensional environments that illustrate the behavior of the algorithms when confronted to either abrupt changes or slow variations of the parameters.
Researcher Affiliation	Collaboration	Yoan Russac (CNRS, Inria, ENS, Université PSL; yoan.russac@ens.fr); Claire Vernade (DeepMind; vernade@google.com); Olivier Cappé (CNRS, Inria, ENS, Université PSL; olivier.cappe@cnrs.fr)
Pseudocode	Yes	Algorithm 1: D-LinUCB
Open Source Code	No	The paper does not provide any explicit statements about the release of source code or links to a code repository for the described methodology.
Open Datasets	Yes	For this experiment, a dataset providing a sample of 30 days of Criteo live traffic data [13] was used.
Dataset Splits	No	The paper describes the use of synthetic data and a real dataset but does not provide explicit details about train/validation/test splits by percentages, counts, or specific predefined splits from cited sources.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments, such as CPU or GPU models, memory specifications, or cloud computing instance types.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or specific frameworks).
Experiment Setup	Yes	For D-LinUCB the discount parameter is chosen as γ = 1 − (B_T / (dT))^{2/3}. For SW-LinUCB the window's length is set to l = (dT / B_T)^{2/3}, where d = 2 in the experiment. Those values are theoretically supposed to minimize the asymptotic regret. For the Dynamic Linear UCB algorithm, the badness is estimated from τ = 200 steps, as in the experimental section of [29]. The number of rounds is set to T = 6000. [...] with 1-subgaussian random noise and Gaussian noise of variance σ² = 0.15.
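
The tuning rules quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the value of the variation budget B_T is a placeholder because the excerpt above does not report it, and rounding the window length up to an integer is an assumption.

```python
import math


def discount_factor(variation_budget: float, dim: int, horizon: int) -> float:
    """Discount parameter gamma = 1 - (B_T / (d * T))^(2/3)."""
    return 1.0 - (variation_budget / (dim * horizon)) ** (2.0 / 3.0)


def window_length(variation_budget: float, dim: int, horizon: int) -> int:
    """Sliding-window length l = (d * T / B_T)^(2/3).

    Rounding up to an integer is an assumption made for this sketch.
    """
    return math.ceil((dim * horizon / variation_budget) ** (2.0 / 3.0))


# d and T are taken from the Experiment Setup row above.
d, T = 2, 6000
# Hypothetical variation budget: the excerpt does not state B_T.
B_T = 1.0

gamma = discount_factor(B_T, d, T)
window = window_length(B_T, d, T)
print(f"gamma = {gamma:.6f}, window length = {window}")
```

Both rules depend on the same quantity: 1 − γ is the reciprocal of the unrounded window length, so a larger variation budget B_T shortens the effective memory of both algorithms.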