Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards

Authors: Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we conduct numerical experiments to demonstrate the effectiveness of our algorithms." "In this section, we provide numerical experiments to illustrate the performance of our proposed algorithms: ADTM and ADMM."
Researcher Affiliation | Collaboration | ¹National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; ²Youku Cognitive and Intelligent Lab, Alibaba Group, Beijing 100102, China.
Pseudocode | Yes | Algorithm 1: Static Discretization with Truncated Mean (SDTM); Algorithm 2: Adaptive Discretization with Truncated Mean (ADTM); Algorithm 3: Median of Means Estimator (MME); Algorithm 4: Adaptive Discretization with Median of Means (ADMM). Generic sketches of the two underlying estimators appear after the table.
Open Source Code | No | The paper neither states that source code is released nor links to a code repository.
Open Datasets | No | The paper uses a synthetic setup: "Following Magureanu et al. (2014), we set X = [0, 1] with D being the Euclidean metric on it, and choose µ(x) = a·min(|x − 0.4|, |x − 0.8|) as the expected reward function". It does not use a publicly available or open dataset.
Dataset Splits | No | The experiments are synthetic and run "40 independent repetitions"; no training, validation, or test splits are described.
Hardware Specification | No | The paper gives no details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies or version numbers.
Experiment Setup | Yes | "We consider two cases: a = 0 and a = 2. For each case, we run 40 independent repetitions and report the average cumulative regret of each tested algorithm in Figure 1." "Finally, following common practice (Zhang et al., 2016; Jun et al., 2017), we scale the confidence radius by a factor c searched within [1e−2, 1]." A code sketch of this setup appears after the table.