Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards
Authors: Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we conduct numerical experiments to demonstrate the effectiveness of our algorithms." "In this section, we provide numerical experiments to illustrate the performance of our proposed algorithms: ADTM and ADMM." |
| Researcher Affiliation | Collaboration | ¹National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; ²YouKu Cognitive and Intelligent Lab, Alibaba Group, Beijing 100102, China. |
| Pseudocode | Yes | Algorithm 1 Static Discretization with Truncated Mean (SDTM); Algorithm 2 Adaptive Discretization with Truncated Mean (ADTM); Algorithm 3 Median of Means Estimator (MME); Algorithm 4 Adaptive Discretization with Median of Means (ADMM) |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code or provide a link to a code repository. |
| Open Datasets | No | The paper describes a synthetic experimental setup: "Following Magureanu et al. (2014), we set X = [0, 1] with D being the Euclidean metric on it, and choose µ(x) = a − min(|x − 0.4|, |x − 0.8|) as the expected reward function". It does not use a publicly available or open dataset. |
| Dataset Splits | No | The paper uses a synthetic experimental setup and runs '40 independent repetitions'. It does not describe any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | "We consider two cases: a = 0 and a = 2. For each case, we run 40 independent repetitions and report the average cumulative regret of each tested algorithm in Figure 1." "Finally, following common practice (Zhang et al., 2016; Jun et al., 2017), we scale the confidence radius by a factor c searched within [1e−2, 1]." |
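
To make the Pseudocode row concrete, below is a minimal Python sketch of a median-of-means estimator in the spirit of the paper's Algorithm 3 (MME). The group count `k` and the Pareto test distribution are illustrative assumptions of this sketch; the paper derives the number of groups from the confidence level, which is not reproduced here.

```python
import numpy as np

def median_of_means(rewards, k):
    """Median-of-means estimate of E[reward] from i.i.d. samples:
    split the samples into k groups, average each group, and return
    the median of the group means (robust to heavy-tailed noise)."""
    rewards = np.asarray(rewards, dtype=float)
    k = max(1, min(int(k), len(rewards)))  # at most one group per sample
    groups = np.array_split(rewards, k)
    return float(np.median([g.mean() for g in groups]))

# Illustrative check on heavy-tailed data (the Pareto distribution here
# is an assumption of this sketch, not taken from the paper).
rng = np.random.default_rng(0)
samples = rng.pareto(1.5, size=10_000)  # finite mean, infinite variance
print(median_of_means(samples, k=15), samples.mean())
```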
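SDTM and ADTM instead rely on a truncated mean. The sketch below follows the standard truncated empirical mean for heavy-tailed bandits (a sample is zeroed out when its magnitude exceeds a threshold that grows with the sample index, then everything is averaged); the paper's exact threshold constants may differ, so treat this as an assumption-laden illustration with `eps` the moment parameter, `u` the raw-moment bound, and `delta` the confidence level.

```python
import numpy as np

def truncated_mean(rewards, eps, u, delta):
    """Truncated empirical mean for rewards whose (1+eps)-th raw moment
    is bounded by u: the i-th sample is kept only if its magnitude stays
    below a growing threshold, and the rest are zeroed before averaging.
    (Standard heavy-tailed-bandit form; the paper's constants may differ.)"""
    rewards = np.asarray(rewards, dtype=float)
    i = np.arange(1, len(rewards) + 1)
    thresh = (u * i / np.log(1.0 / delta)) ** (1.0 / (1.0 + eps))
    kept = np.where(np.abs(rewards) <= thresh, rewards, 0.0)
    return float(kept.mean())

rng = np.random.default_rng(0)
print(truncated_mean(rng.pareto(1.5, size=10_000), eps=0.5, u=5.0, delta=0.01))
```

Because the threshold is compared against the raw magnitude |X_i|, this estimator is not translation invariant, which is plausibly what the a = 0 versus a = 2 reward offsets in the Experiment Setup row are probing.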
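Finally, a sketch of the synthetic environment from the Open Datasets and Experiment Setup rows. The reward function, the two offset cases, and the 40 repetitions come from the quoted text; the uniform arm choice, the Student-t noise, and the grid over the scaling factor c are placeholders, since the excerpt specifies neither the noise distribution nor the search procedure.

```python
import numpy as np

def mu(x, a):
    """Expected reward on X = [0, 1], with peaks at x = 0.4 and x = 0.8."""
    return a - np.minimum(np.abs(x - 0.4), np.abs(x - 0.8))

N_REPS = 40                          # independent repetitions per case
CASES = (0.0, 2.0)                   # the two reward offsets a = 0 and a = 2
C_GRID = np.geomspace(1e-2, 1.0, 5)  # candidate confidence-radius scalings

rng = np.random.default_rng(42)
for a in CASES:
    for rep in range(N_REPS):
        # One noisy pull of a uniformly drawn arm; the paper runs ADTM
        # and ADMM here, which this sketch does not implement.
        x = rng.uniform(0.0, 1.0)
        reward = mu(x, a) + rng.standard_t(df=2)  # placeholder heavy tail
```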