Efficient-UCBV: An Almost Optimal Algorithm Using Variance Estimates
Authors: Subhojyoti Mukherjee, K. P. Naveen, Nandan Sudarsanam, Balaraman Ravindran
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through an extensive numerical study we show that EUCBV significantly outperforms the popular UCB variants (like MOSS, OCUCB, etc.) as well as Thompson sampling and Bayes-UCB algorithms. |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, Indian Institute of Technology Madras 2Department of Electrical Engineering, Indian Institute of Technology Tirupati 3Department of Management Studies, Indian Institute of Technology Madras 4 Robert Bosch Centre for Data Science and AI (RBC-DSAI), Indian Institute of Technology Madras |
| Pseudocode | Yes | Algorithm 1 EUCBV |
| Open Source Code | No | The paper does not provide a link or explicit statement about the open-sourcing of the EUCBV algorithm's code. It mentions code for other algorithms: 'The implementation for KLUCB, Bayes-UCB and DMED were taken from Cappe, Garivier, and Kaufmann (2012)'. |
| Open Datasets | No | The paper describes experiments using synthetically generated data (e.g., '20 Bernoulli distributed arms', '100 arms involving Gaussian reward distributions') rather than named public datasets with explicit access information. |
| Dataset Splits | No | The paper does not explicitly describe train/validation/test dataset splits. For multi-armed bandit problems, the algorithm learns sequentially rather than on pre-split datasets in the manner of supervised learning. |
| Hardware Specification | No | No specific hardware (e.g., GPU/CPU models, memory) used for running the experiments is mentioned in the paper. |
| Software Dependencies | No | The paper mentions that implementations for some baseline algorithms were taken from a citation ('Cappe, Garivier, and Kaufmann (2012)') but does not list specific software dependencies with version numbers for their own algorithm or experimental setup. |
| Experiment Setup | Yes | The parameters of EUCBV algorithm for all the experiments are set as follows: ψ = T K2 and ρ = 0.5 (as in Corollary 1). Experiment-1 (Bernoulli with uniform gaps): This experiment is conducted to observe the performance of EUCBV over a short horizon. The horizon T is set to 60000. The testbed comprises of 20 Bernoulli distributed arms with expected rewards of the arms as r1:19 = 0.07 and r 20 = 0.1... |