Multinomial Logit Contextual Bandits: Provable Optimality and Practicality

Authors: Min-hwan Oh, Garud Iyengar

AAAI 2021, pp. 9205-9213

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate the performance of our proposed algorithms, UCB-MNL (Algorithm 1) and DBL-MNL (Algorithm 2), in numerical experiments. In our evaluations, we report the cumulative regret for each round t ∈ {1, ..., T}. For each experimental configuration, we evaluate the algorithms on 20 independent instances and report average performances.
Researcher Affiliation | Academia | Min-hwan Oh, Seoul National University, Seoul, South Korea; Garud Iyengar, Columbia University, New York, USA
Pseudocode | Yes | Algorithm 1 UCB-MNL; Algorithm 2 DBL-MNL
Open Source Code | No | The paper does not provide a direct link to source code or an explicit statement about its release.
Open Datasets | No | In each instance, the underlying parameter θ is sampled from the d-dimensional uniform distribution, with each element of θ uniformly distributed in [0, 1]. The underlying parameters are fixed during each problem instance but not known to the algorithms. For efficient evaluations, we consider uniform revenues, i.e., r_ti = 1 for all i and t.
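The synthetic setup described above (θ drawn uniformly from [0, 1]^d, uniform revenues) can be sketched as follows. This is a minimal illustration, not the authors' code: the number of items N, the dimension d, and the Gaussian item features are assumptions, since the report does not specify how features are generated. Under the standard multinomial logit (MNL) model with an outside (no-purchase) option and r_ti = 1, the expected revenue of an assortment equals its total purchase probability.

```python
import numpy as np

rng = np.random.default_rng(0)

d, N = 5, 10  # assumed dimension and number of candidate items

# Underlying parameter theta: each element uniform in [0, 1], fixed per instance
# but unknown to the algorithms.
theta = rng.uniform(0.0, 1.0, size=d)

# Item feature vectors (assumed standard Gaussian; not specified in the report).
X = rng.normal(size=(N, d))

def mnl_choice_probs(assortment, X, theta):
    """MNL choice probabilities over an assortment, with an outside option."""
    utilities = np.exp(X[assortment] @ theta)
    denom = 1.0 + utilities.sum()  # the 1.0 is the no-purchase option
    return utilities / denom

# With uniform revenues r_ti = 1, expected revenue is the total purchase probability.
probs = mnl_choice_probs([0, 1, 2], X, theta)
expected_revenue = probs.sum()
```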
Dataset Splits | No | The paper does not provide specific train/validation/test dataset splits. It describes generating synthetic data and evaluating algorithms over a time horizon T.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | In our evaluations, we report the cumulative regret for each round t ∈ {1, ..., T}. For each experimental configuration, we evaluate the algorithms on 20 independent instances and report average performances. In each instance, the underlying parameter θ is sampled from the d-dimensional uniform distribution, with each element of θ uniformly distributed in [0, 1]. The underlying parameters are fixed during each problem instance but not known to the algorithms. For efficient evaluations, we consider uniform revenues, i.e., r_ti = 1 for all i and t. We compare the performances of the proposed algorithms with those of the state-of-the-art Thompson sampling based algorithms, TS-MNL and optimistic TS-MNL, proposed in Oh and Iyengar (2019). Additionally, we evaluate the performance of the provably optimal but impractical algorithm, sup CB-MNL (see Algorithm 5 in the appendix), which is based on the Auer framework. Figure 1 shows that the performance of UCB-MNL and DBL-MNL is superior or comparable to the state-of-the-art Thompson sampling methods. Moreover, the runtime evaluation shows that DBL-MNL is significantly faster than the other methods due to its logarithmic number of parameter updates.
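The evaluation protocol above (cumulative regret per round, averaged over 20 independent instances) can be sketched as below. This is a hedged illustration of the regret bookkeeping only: a placeholder random policy stands in for UCB-MNL, DBL-MNL, and the Thompson sampling baselines, and the horizon T, item count N, assortment size K, and Gaussian features are all assumptions not stated in the report. The oracle assortment is found by brute force, which is feasible only for small N and K.

```python
import numpy as np
from itertools import combinations

def expected_revenue(S, X, theta):
    """Expected revenue of assortment S under MNL with uniform revenues r_ti = 1."""
    u = np.exp(X[list(S)] @ theta)
    return u.sum() / (1.0 + u.sum())

def run_instance(T, policy, d=5, N=10, K=3, seed=0):
    """One problem instance: theta fixed throughout; cumulative regret per round."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 1.0, size=d)   # unknown to the algorithm
    X = rng.normal(size=(N, d))
    assortments = list(combinations(range(N), K))
    # Oracle benchmark: best fixed assortment under the true theta (brute force).
    opt = max(expected_revenue(S, X, theta) for S in assortments)
    regret, cum = np.empty(T), 0.0
    for t in range(T):
        S = policy(rng, assortments)        # placeholder for UCB-MNL / DBL-MNL
        cum += opt - expected_revenue(S, X, theta)
        regret[t] = cum
    return regret

def random_policy(rng, assortments):
    """Illustrative baseline: pick an assortment uniformly at random."""
    return assortments[rng.integers(len(assortments))]

# Average cumulative regret over 20 independent instances, as in the report.
T = 100
curves = np.stack([run_instance(T, random_policy, seed=i) for i in range(20)])
avg_regret = curves.mean(axis=0)
```

A real UCB-MNL implementation would replace `random_policy` with an assortment chosen to maximize an optimistic revenue estimate built from an MLE of θ plus a confidence bonus; the regret accounting is unchanged.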