Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards

Authors: Bo Xue, Yimu Wang, Yuanyu Wan, Jinfeng Yi, Lijun Zhang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experimental results confirm the merits of our algorithms. This section demonstrates the improvement of our algorithms by numerical experiments.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, City University of Hong Kong, Hong Kong, China; 2 The City University of Hong Kong Shenzhen Research Institute, Shenzhen, China; 3 Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada; 4 School of Software Technology, Zhejiang University, Ningbo, China; 5 JD AI Research, Beijing, China; 6 National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; 7 Peng Cheng Laboratory, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Confidence Region with Truncated Mean (CRTM)
Open Source Code | No | The paper does not explicitly state that the code for the described methods is open-sourced, nor does it link to a code repository.
Open Datasets | No | The paper describes synthetic data generation from specific distributions ("Student's t noise" and "Pareto noise") and parameters, but does not reference or provide access information for a pre-existing, publicly available dataset.
Dataset Splits | No | The paper studies a sequential decision-making (bandit) problem in which data is generated online, so it contains no training/validation/test splits in the conventional sense of static datasets.
Hardware Specification | Yes | All algorithms are implemented using PyCharm 2022 and tested on a laptop with a 2.5 GHz CPU and 32 GB of memory.
Software Dependencies | Yes | All algorithms are implemented using PyCharm 2022.
Experiment Setup | Yes | All algorithms are configured with ε = 1, δ = 0.01, and T = 10^6. The number of arms is K = 20 and the feature dimension is d = 10. We run 10 repetitions for each algorithm and report the average regret over time.
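The table notes that the experiments use synthetic heavy-tailed noise (Student's t and Pareto) with K = 20 arms and feature dimension d = 10, and that Algorithm 1 (CRTM) builds its confidence region from a truncated mean. The sketch below illustrates these ingredients under illustrative assumptions: a plain linear reward model (the paper treats generalized linear models), NumPy-based noise generation, and hypothetical names such as `truncated_mean`; it is not the authors' implementation.

```python
import numpy as np

# Sketch of the synthetic setup: K = 20 arms, d = 10 features.
# The paper runs T = 10^6 rounds; here we only draw one batch of rewards.
rng = np.random.default_rng(0)
K, d = 20, 10

# Random unit-norm arm features and an unknown unit-norm parameter
# (an assumed linear reward model for illustration).
X = rng.normal(size=(K, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)

def student_t_noise(n, df=3):
    # Heavy-tailed noise: Student's t with few degrees of freedom has
    # finite (1 + eps)-th moments only for small eps.
    return rng.standard_t(df, size=n)

def pareto_noise(n, shape=2.0):
    # NumPy's pareto draws from a Lomax distribution with mean
    # 1 / (shape - 1) for shape > 1; subtract it so the noise is centered.
    return rng.pareto(shape, size=n) - 1.0 / (shape - 1.0)

def truncated_mean(rewards, b):
    # The truncation trick behind CRTM-style estimators: zero out
    # observations whose magnitude exceeds the threshold b, then average,
    # so rare extreme rewards cannot dominate the estimate.
    r = np.asarray(rewards, dtype=float)
    return float(np.mean(np.where(np.abs(r) <= b, r, 0.0)))

# Rewards for one arm under Student's t noise; a handful of extreme
# draws barely move the truncated estimate.
rewards = X[0] @ theta + student_t_noise(1000)
print(truncated_mean(rewards, b=10.0))
```

The threshold `b` is chosen here for illustration only; in the paper it is set from the confidence level δ and the moment parameter ε to obtain the stated regret guarantees.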