reproducibilityindex.ai

Last Switch Dependent Bandits with Monotone Payoff Functions

Authors: Ayoub Foussoul, Vineet Goyal, Orestis Papadigenopoulos, Assaf Zeevi

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this work, we take a step towards understanding the approximability of planning LSD bandits, namely, the (NP-hard) problem of computing an optimal arm-pulling strategy under complete knowledge of the model. In particular, we design the ﬁrst efﬁcient constant approximation algorithm for the problem and show that, under a natural monotonicity assumption on the payoffs, its approximation guarantee (almost) matches the state-of-the-art for the special and well-studied class of recharging bandits (also known as delaydependent). In this attempt, we develop new tools and insights for this class of problems, including a novel higher-dimensional relaxation and the technique of mirroring the evolution of virtual states. We believe that these novel elements could potentially be used for approaching richer classes of action-induced nonstationary bandits (e.g., special instances of restless bandits). In the case where the model parameters are initially unknown, we develop an online learning adaptation of our algorithm for which we provide sublinear regret guarantees against its full-information counterpart.
Researcher Affiliation	Academia	1Industrial Engineering and Operations Research, Columbia University, New York, USA 2Data Science Institute, Columbia University, New York, USA 3Graduate School of Business, Columbia University, New York, USA. Correspondence to: Ayoub Foussoul <af3209@columbia.edu>, Orestis Papadigenopoulos <opapadig@columbia.edu>.
Pseudocode	Yes	Algorithm 1: Planning k-MLSD bandits.
Open Source Code	No	The paper does not provide any statements about the availability of open-source code or links to a code repository.
Open Datasets	No	The paper is theoretical and focuses on algorithm design and analysis. It does not mention or use any specific public or open datasets for training or evaluation.
Dataset Splits	No	The paper is theoretical and does not involve empirical validation with dataset splits.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for computations or experiments.
Software Dependencies	No	The paper does not specify any software names with version numbers for reproducibility.
Experiment Setup	No	The paper describes theoretical algorithm design and analysis, but it does not include details about an empirical experimental setup such as hyperparameters or training configurations.