Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards
Authors: Bo Xue, Yimu Wang, Yuanyu Wan, Jinfeng Yi, Lijun Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experimental results confirm the merits of our algorithms. This section demonstrates the improvement of our algorithms by numerical experiments. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, City University of Hong Kong, Hong Kong, China; (2) The City University of Hong Kong Shenzhen Research Institute, Shenzhen, China; (3) Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada; (4) School of Software Technology, Zhejiang University, Ningbo, China; (5) JD AI Research, Beijing, China; (6) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (7) Peng Cheng Laboratory, Shenzhen, China |
| Pseudocode | Yes | Algorithm 1 Confidence Region with Truncated Mean (CRTM) (an illustrative truncation sketch follows the table) |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code for the described methods or a link to a code repository. |
| Open Datasets | No | The paper describes synthetic data generation based on specific distributions ("Student's t-Noise" and "Pareto Noise") and parameters, but does not refer to or provide access information for a pre-existing, publicly available dataset. |
| Dataset Splits | No | The paper describes a sequential decision-making process (bandit problem) where data is generated online, and thus does not include explicit descriptions of training, validation, or test dataset splits in the conventional sense of static datasets. |
| Hardware Specification | Yes | All algorithms are implemented using PyCharm 2022 and tested on a laptop with a 2.5GHz CPU and 32GB of memory. |
| Software Dependencies | Yes | All algorithms are implemented using PyCharm 2022 |
| Experiment Setup | Yes | All algorithms are configured with ϵ = 1, δ = 0.01, and T = 10⁶. The number of arms is set to K = 20, and the feature dimension is d = 10. We run 10 repetitions for each algorithm and display the average regret with time evolution. (A simulation sketch based on this configuration follows the table.) |
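
The pseudocode row names Algorithm 1, Confidence Region with Truncated Mean (CRTM), but does not reproduce it. As a rough illustration of the truncation idea behind heavy-tailed mean estimation, the sketch below clips each reward at a growing threshold before averaging. The threshold form, the `moment_bound` parameter, and the function name `truncated_mean` are assumptions drawn from the general heavy-tailed bandit literature, not from the paper's Algorithm 1.

```python
import numpy as np

def truncated_mean(rewards, epsilon=1.0, delta=0.01, moment_bound=1.0):
    """Truncated empirical mean for samples with bounded (1 + epsilon)-th
    moments.  Illustrative only: the paper's CRTM applies truncation inside
    a generalized linear model fit, which is not reproduced here."""
    rewards = np.asarray(rewards, dtype=float)
    n = len(rewards)
    # Index-dependent truncation levels: the threshold B_i grows with i so
    # that the bias introduced by clipping vanishes as more data arrives.
    idx = np.arange(1, n + 1)
    B = (moment_bound * idx / np.log(2.0 / delta)) ** (1.0 / (1.0 + epsilon))
    truncated = np.where(np.abs(rewards) <= B, rewards, 0.0)
    return truncated.mean()
```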
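
The configuration reported in the last row (K = 20, d = 10, ϵ = 1, δ = 0.01, T = 10⁶, 10 repetitions) together with the two noise families named above can be reproduced with a minimal simulator such as the one below. The linear link, the unit-norm arm and parameter construction, and the Student's-t and Pareto shape parameters are illustrative assumptions; the paper's exact data generator is not specified in this summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Configuration quoted in the table: K = 20 arms, d = 10 features,
# epsilon = 1, delta = 0.01, T = 10^6 rounds, 10 repetitions.
K, d, T, n_runs = 20, 10, 10**6, 10

theta_star = rng.standard_normal(d)              # assumed unknown parameter
theta_star /= np.linalg.norm(theta_star)

arms = rng.standard_normal((K, d))               # assumed fixed arm set
arms /= np.linalg.norm(arms, axis=1, keepdims=True)

def sample_noise(size, kind="student_t"):
    """Heavy-tailed noise of the two kinds named in the table.  The degrees
    of freedom and Pareto shape below are illustrative, not paper values."""
    if kind == "student_t":
        return rng.standard_t(df=3, size=size)
    shape = 2.1
    # numpy's pareto draws Lomax samples with mean 1/(shape - 1); recenter
    # so the noise has zero mean.
    return rng.pareto(shape, size=size) - 1.0 / (shape - 1.0)

def step(chosen, kind="student_t"):
    """One bandit round with a linear link (the paper studies generalized
    linear models; a linear link keeps the sketch short)."""
    return arms[chosen] @ theta_star + sample_noise(1, kind)[0]

# A full experiment would loop over T rounds for n_runs repetitions and
# track cumulative regret; here we just draw a single noisy reward.
print(step(chosen=0))
```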