Almost Optimal Algorithms for Linear Stochastic Bandits with Heavy-Tailed Payoffs

Authors: Han Shao, Xiaotian Yu, Irwin King, Michael R. Lyu

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our proposed algorithms are evaluated based on synthetic datasets, and outperform the state-of-the-art results." |
| Researcher Affiliation | Academia | Han Shao, Xiaotian Yu, Irwin King, Michael R. Lyu; Department of Computer Science and Engineering, The Chinese University of Hong Kong. {hshao,xtyu,king,lyu}@cse.cuhk.edu.hk |
| Pseudocode | Yes | Algorithm 1: Median of means under OFU; Algorithm 2: Truncation under OFU. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described algorithms (MENU and TOFU) is publicly available. |
| Open Datasets | No | "To show effectiveness of bandit algorithms, we will demonstrate cumulative payoffs with respect to number of rounds for playing bandits over a fixed finite-arm decision set. For verifications, we adopt four synthetic datasets (named as S1–S4) in the experiments, of which statistics are shown in Table 1." |
| Dataset Splits | No | The paper mentions running experiments over a total number of rounds (T) and independent repetitions, but it does not describe specific train/validation/test dataset splits or cross-validation procedures for reproducibility. The data used is synthetic and generated for the experiments. |
| Hardware Specification | Yes | "We run multiple independent repetitions for each dataset in a personal computer under Windows 7 with Intel CPU@3.70GHz and 16GB memory." |
| Software Dependencies | No | The paper mentions Windows 7 as the operating system, but does not specify any software dependencies with version numbers (e.g., programming languages, libraries, frameworks) crucial for reproducibility. |
| Experiment Setup | Yes | "For all algorithms, we set λ = 1.0, and δ = 0.1." |