Phase Transitions and Cyclic Phenomena in Bandits with Switching Constraints

Authors: David Simchi-Levi, Yunzong Xu

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We consider the classical stochastic multi-armed bandit problem with a constraint on the total cost incurred by switching between actions. Under the unit switching cost structure, where the constraint limits the total number of switches, we prove matching upper and lower bounds on regret and provide near-optimal algorithms for this problem. Surprisingly, we discover phase transitions and cyclic phenomena of the optimal regret.
Researcher Affiliation Academia David Simchi-Levi Institute for Data, Systems and Society Massachusetts Institute of Technology Cambridge, MA 02139 dslevi@mit.edu Yunzong Xu Institute for Data, Systems and Society Massachusetts Institute of Technology Cambridge, MA 02139 yxu@mit.edu
Pseudocode Yes Algorithm 1 S-Switch Successive Elimination (SS-SE) Input: Number of arms k, Switching budget S, Horizon T ... Algorithm 2 Hamiltonian-Switching Successive Elimination (HS-SE) Input: Switching Graph G, Switching budget S, Horizon T
Open Source Code No For the full version of this paper (containing additional results and missing proofs), see [14]. [14] D. Simchi-Levi and Y. Xu. Phase transitions and cyclic phenomena in bandits with switching constraints. ar Xiv preprint ar Xiv:1905.10825, 2019. URL https://arxiv.org/abs/1905. 10825. This citation refers to an arXiv preprint, which typically hosts paper versions, not source code for the methodology.
Open Datasets No This paper is theoretical and does not involve experimental evaluation on datasets. Therefore, it does not mention publicly available or open datasets for training.
Dataset Splits No This paper is theoretical and does not involve experimental evaluation on datasets. Therefore, it does not provide training/test/validation dataset splits.
Hardware Specification No The paper is theoretical and does not report on experiments requiring specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies No The paper describes theoretical algorithms and proofs. It does not report on experiments and thus does not list any specific software dependencies with version numbers.
Experiment Setup No The paper focuses on theoretical analysis and algorithm design rather than empirical experiments. As such, it does not detail any experimental setup, hyperparameters, or system-level training settings.