Rotting Bandits

Authors: Nir Levine, Koby Crammer, Shie Mannor

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present algorithms, accompanied by simulations, and derive theoretical guarantees.
Researcher Affiliation | Academia | Nir Levine, Electrical Engineering Department, The Technion, Haifa 32000, Israel (levin.nir1@gmail.com); Koby Crammer, Electrical Engineering Department, The Technion, Haifa 32000, Israel (koby@ee.technion.ac.il); Shie Mannor, Electrical Engineering Department, The Technion, Haifa 32000, Israel (shie@ee.technion.ac.il)
Pseudocode | Yes | Pseudo algorithm for SWA is given by Algorithm 1.
Open Source Code | No | The paper does not provide any specific links to source code repositories or explicit statements about the release of code for the described methodology.
Open Datasets | No | Setups: for all the simulations we use Normal distributions with σ² = 0.2, and T = 30,000. Non-Parametric: K = 2. As for the expected rewards: µ1(n) = 0.5 for all n, and µ2(n) = 1 for its first 7,500 pulls and 0.4 afterwards.
Dataset Splits | No | The paper conducts simulations by generating data based on defined reward distributions and parameters, rather than using pre-existing datasets with explicit training, validation, or test splits.
Hardware Specification | No | The paper describes simulation parameters and algorithmic details but does not specify any hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions implementing standard benchmark algorithms, but it does not provide specific software names along with their version numbers required for replication.
Experiment Setup | Yes | Setups: for all the simulations we use Normal distributions with σ² = 0.2, and T = 30,000. Non-Parametric: K = 2. As for the expected rewards: µ1(n) = 0.5 for all n, and µ2(n) = 1 for its first 7,500 pulls and 0.4 afterwards. Parametric AV & ANV: K = 10. The rotting models are of the form µ(j; θ) = (int(j/100) + 1)^(-θ), where int(·) is the lower rounded integer, and Θ = {0.1, 0.15, .., 0.4}. The parameters that were chosen by the grid search are as follows: γ = 0.999 for the non-parametric case, and 0.999999 for the parametric cases. τ = 4e3, 8e3, and 16e3 for the non-parametric, AV, and ANV cases, respectively. α = 0.2 was chosen for all cases.
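
The reward models in the experiment setup can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code (the paper releases none). The function names, the convention that pull counts start at 1, and the negative sign of the exponent in the parametric model are assumptions; the negative exponent is chosen because it makes the expected reward decay as an arm is pulled, which is what "rotting" implies.

```python
import math
import random

SIGMA2 = 0.2   # reward variance, from the setup
T = 30_000     # horizon, from the setup

# Theta = {0.1, 0.15, ..., 0.4}; rounding avoids floating-point drift.
THETAS = [round(0.10 + 0.05 * i, 2) for i in range(7)]


def mu_nonparametric(arm, n):
    """Expected reward of `arm` (1 or 2) on its n-th pull.

    Assumption: n counts pulls of this arm starting at 1.
    """
    if arm == 1:
        return 0.5                       # constant for every pull
    return 1.0 if n <= 7_500 else 0.4    # drops after the first 7,500 pulls


def mu_parametric(j, theta):
    """Parametric rotting model (int(j / 100) + 1) ** (-theta).

    Assumption: the exponent is negative, so the mean is non-increasing in j.
    """
    return (j // 100 + 1) ** (-theta)


def pull(mean, rng):
    """Draw one Normal reward with the given mean and variance SIGMA2."""
    return rng.gauss(mean, math.sqrt(SIGMA2))
```

For example, `rng = random.Random(0)` followed by `[pull(mu_nonparametric(2, n), rng) for n in range(1, 11)]` draws the first ten noisy rewards of the second non-parametric arm. The seed is arbitrary and only makes the draw repeatable.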