Multinomial Logit Bandit with Low Switching Cost
Authors: Kefan Dong, Yingkai Li, Qin Zhang, Yuan Zhou
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We propose two measures of adaptivity: the assortment switching cost and the more fine-grained item switching cost. We present an anytime algorithm (AT-DUCB) with O(N log T) assortment switches, almost matching the lower bound ( N log T log log T ). In the fixed-horizon setting, our algorithm FH-DUCB incurs O(N log log T) assortment switches, matching the asymptotic lower bound. We also present the ESUCB algorithm with item switching cost O(N log2 T). ... Theorem 2. For any time horizon T, the expect regret incurred by Algorithm 2 is E [Reg T ] . and the expected number of assortment switches E[ (asst) T ] is O(N log T). ... The lower bound. We complement our algorithmic result with the following almost matching lower bound. |
| Researcher Affiliation | Academia | 1Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China. 2Department of Computer Science, Northwestern University, Evanston, Illinois, USA. 3Computer Science Department, Indiana University, Bloomington, Indiana, USA. 4Department of ISE, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA. |
| Pseudocode | Yes | Algorithm 1: EXPLORATION(S), Algorithm 2: Anytime Deferred Update UCB (AT-DUCB), Algorithm 3: UPDATE(i), Algorithm 4: Deferred Update UCB for Fixed Time Horizon (FH-DUCB), Algorithm 5: The Exponential Stride UCB algorithm (ESUCB) for MNL-Bandit, Algorithm 6: CHECK( l, r, tmax) |
| Open Source Code | No | The paper does not provide any statements or links indicating that source code for the described methodology is publicly available. |
| Open Datasets | No | This paper focuses on theoretical contributions and does not mention the use of any datasets, public or otherwise. |
| Dataset Splits | No | This paper is theoretical and does not involve experimental data splits like training, validation, or testing. |
| Hardware Specification | No | This paper is theoretical and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | This paper is theoretical and does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | This paper presents theoretical algorithms and does not include details on experimental setup, hyperparameters, or training configurations. |