Factored Bandits
Authors: Julian Zimmert, Yevgeny Seldin
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our theoretical results with the first paper because it matches our problem assumptions. In our experiments, we provide a comparison to both the original algorithm and the KL version. The number of arms is set to 16 in both sets. We always fix ū − u = v̄ − v = 0.2 and vary the absolute values of ū and v̄. As expected, Rank1ElimKL has an advantage when the Bernoulli random variables are strongly biased towards one side. When the bias is close to 1/2, we clearly see the better constants of TEA. In the evaluation we clearly outperform Rank1Elim over different parameter settings and even beat the KL-optimized version if the means are not too close to zero or one. This supports that our algorithm not only provides a more practical anytime version of elimination, but also improves on the constant factors in the regret. Figure 2: Comparison of Rank1Elim, Rank1ElimKL, and TEA for K = L = 16. The results are averaged over 20 repetitions of the experiment. Figure 3: Comparison of dueling bandit algorithms with identical gaps of 0.4. The results are averaged over 20 repetitions of the experiment. |
| Researcher Affiliation | Academia | Julian Zimmert, University of Copenhagen (zimmert@di.ku.dk); Yevgeny Seldin, University of Copenhagen (seldin@di.ku.dk) |
| Pseudocode | Yes | Algorithm 1: Factored Bandit TEA; Algorithm 2: Dueling Bandit TEA; Algorithm 3: Temporary Elimination Module (TEM) Implementation (a generic elimination sketch follows the table) |
| Open Source Code | No | The paper does not provide any link or explicit statement about the release of its source code for the methodology described. |
| Open Datasets | No | The paper describes experimental comparisons with different settings (e.g., 'number of arms is set to 16', 'winning probability...set to 0.7'), and mentions using 'the framework provided by Komiyama et al. [9]', but does not specify any publicly available datasets used for training or provide access information for such datasets. |
| Dataset Splits | No | The paper describes empirical comparisons and experimental results, but it does not specify any training/validation/test dataset splits (e.g., percentage splits, sample counts, or references to predefined standard splits). |
| Hardware Specification | No | The paper does not explicitly describe any specific hardware (e.g., GPU models, CPU models, memory details) used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | The number of arms is set to 16 in both sets. We always fix ū − u = v̄ − v = 0.2 and vary the absolute values of ū and v̄. We have used the framework provided by Komiyama et al. [9]. We use the same utility for all sub-optimal arms. In Figure 3, the winning probability of the optimal arm over sub-optimal arms is always set to 0.7, and we run the experiment for different numbers of arms K. To show that there also exists a regime where the improved constants gain an advantage over RMED, we conducted a second experiment in Figure 4 (in the Appendix), where we set the winning probability to 0.952 and significantly increase the number of arms. (Minimal simulation sketches of both experimental setups follow the table.) |
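
As a concrete reading of the rank-1 setup quoted above, here is a minimal simulation sketch. It assumes the standard rank-1 bandit model (pulling arm pair (i, j) yields a Bernoulli reward with mean u[i]·v[j]); the class and parameter names (`RankOneEnv`, `u_bar`, `gap`, `seed`) are hypothetical, while K = L = 16 and the fixed per-factor gap of 0.2 come from the paper.

```python
import numpy as np

class RankOneEnv:
    """Rank-1 bandit: pulling arm pair (i, j) yields a Bernoulli(u[i] * v[j]) reward."""

    def __init__(self, K=16, L=16, u_bar=0.7, v_bar=0.7, gap=0.2, seed=0):
        self.rng = np.random.default_rng(seed)
        # One optimal entry per factor; every sub-optimal mean sits `gap`
        # below it, matching the fixed gap of 0.2 described above.
        self.u = np.full(K, u_bar - gap)
        self.u[0] = u_bar
        self.v = np.full(L, v_bar - gap)
        self.v[0] = v_bar

    def pull(self, i, j):
        """Sample a Bernoulli reward for the arm pair (i, j)."""
        return float(self.rng.random() < self.u[i] * self.v[j])

    def gap_of(self, i, j):
        """Instantaneous regret of playing (i, j) instead of the best pair."""
        return self.u.max() * self.v.max() - self.u[i] * self.v[j]
```

Sweeping `u_bar` and `v_bar` towards 0 or 1 reproduces the strongly biased regime where Rank1ElimKL is reported to win, while values near 0.5 correspond to the regime where TEA's better constants show.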
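For the dueling-bandit runs (Figure 3), a similarly minimal sketch under the stated assumptions: one optimal arm beats every sub-optimal arm with probability 0.7, and sub-optimal arms tie with each other since they share a single utility. `DuelingEnv` and its parameters are illustrative names, not the paper's.

```python
import numpy as np

class DuelingEnv:
    """Dueling bandit with one optimal arm (index 0) and identical
    sub-optimal arms, as in the Figure 3 setup described above."""

    def __init__(self, K=16, p_win=0.7, seed=0):
        self.K = K
        self.p_win = p_win
        self.rng = np.random.default_rng(seed)

    def duel(self, i, j):
        """Return True iff arm i wins the comparison against arm j."""
        if i == 0 and j != 0:
            p = self.p_win            # optimal arm wins with probability 0.7
        elif j == 0 and i != 0:
            p = 1.0 - self.p_win
        else:
            p = 0.5                   # identical utilities => fair coin
        return bool(self.rng.random() < p)
```

Setting `p_win=0.952` and increasing K would correspond to the Figure 4 regime mentioned in the Experiment Setup row.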
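The Pseudocode row above names Algorithms 1-3 but reproduces no detail, so the sketch below shows only the generic successive-elimination mechanic with Hoeffding-style confidence bounds. It is not the paper's Temporary Elimination Module, whose eliminations are only temporary and whose confidence levels are tuned for anytime guarantees; `successive_elimination` and its parameters are hypothetical.

```python
import numpy as np

def successive_elimination(pull, K, horizon, delta=0.05):
    """Generic elimination loop: play all active arms round-robin and drop
    any arm whose upper confidence bound falls below the best lower bound."""
    active = list(range(K))
    sums = np.zeros(K)
    counts = np.zeros(K)
    t = 0
    while t < horizon and len(active) > 1:
        for i in active:                      # one pull per active arm
            sums[i] += pull(i)
            counts[i] += 1
            t += 1
        mu = sums[active] / counts[active]
        # Hoeffding-style radius with a crude union bound over arms and time.
        rad = np.sqrt(np.log(4 * K * t ** 2 / delta) / (2 * counts[active]))
        keep = mu + rad >= np.max(mu - rad)
        active = [a for a, k in zip(active, keep) if k]
    return active
```

For example, `successive_elimination(lambda i: env.pull(i, 0), K=16, horizon=10_000)` would run it against one row of the rank-1 environment sketched above.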