Multinomial Logit Bandit with Low Switching Cost

Authors: Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We propose two measures of adaptivity: the assortment switching cost and the more fine-grained item switching cost. We present an anytime algorithm (AT-DUCB) with O(N log T) assortment switches, almost matching the lower bound ( N log T log log T ). In the fixed-horizon setting, our algorithm FH-DUCB incurs O(N log log T) assortment switches, matching the asymptotic lower bound. We also present the ESUCB algorithm with item switching cost O(N log2 T). ... Theorem 2. For any time horizon T, the expect regret incurred by Algorithm 2 is E [Reg T ] . and the expected number of assortment switches E[ (asst) T ] is O(N log T). ... The lower bound. We complement our algorithmic result with the following almost matching lower bound.
Researcher Affiliation Academia 1Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China. 2Department of Computer Science, Northwestern University, Evanston, Illinois, USA. 3Computer Science Department, Indiana University, Bloomington, Indiana, USA. 4Department of ISE, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
Pseudocode Yes Algorithm 1: EXPLORATION(S), Algorithm 2: Anytime Deferred Update UCB (AT-DUCB), Algorithm 3: UPDATE(i), Algorithm 4: Deferred Update UCB for Fixed Time Horizon (FH-DUCB), Algorithm 5: The Exponential Stride UCB algorithm (ESUCB) for MNL-Bandit, Algorithm 6: CHECK( l, r, tmax)
Open Source Code No The paper does not provide any statements or links indicating that source code for the described methodology is publicly available.
Open Datasets No This paper focuses on theoretical contributions and does not mention the use of any datasets, public or otherwise.
Dataset Splits No This paper is theoretical and does not involve experimental data splits like training, validation, or testing.
Hardware Specification No This paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies No This paper is theoretical and does not mention any specific software dependencies with version numbers.
Experiment Setup No This paper presents theoretical algorithms and does not include details on experimental setup, hyperparameters, or training configurations.