Anytime Exploration for Multi-armed Bandits using Confidence Information
Authors: Kwang-Sung Jun, Robert Nowak
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our analysis shows that the sample complexity of AT-LUCB is competitive to anytime variants of existing algorithms. Moreover, our empirical evaluation on AT-LUCB shows that AT-LUCB performs as well as or better than state-of-the-art baseline methods for anytime Explore-m. |
| Researcher Affiliation | Academia | Kwang-Sung Jun (KJUN@DISCOVERY.WISC.EDU), Wisconsin Institutes for Discovery, UW-Madison, 330 N. Orchard St., Madison, WI 53715 USA; Robert Nowak (RDNOWAK@WISC.EDU), Wisconsin Institutes for Discovery, UW-Madison, 330 N. Orchard St., Madison, WI 53715 USA |
| Pseudocode | Yes | Algorithm 1 AT-LUCB |
| Open Source Code | No | The paper does not provide an explicit link to open-source code for the methodology presented. |
| Open Datasets | Yes | We use the New Yorker dataset (see footnote 6). The data consists of n = 496 captions with 100K ratings. Footnote 6: Dataset number 499 from https://github.com/nextml/NEXT-data/. |
| Dataset Splits | No | The paper describes the datasets used (toy MAB instances and New Yorker dataset) but does not specify training, validation, or test splits in detail for reproducibility. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU, CPU models) used for running the experiments. |
| Software Dependencies | No | The paper does not mention specific software dependencies with version numbers used for the experiments. |
| Experiment Setup | Yes | We run AT-LUCB with δ1 = 1/2, α = .99, and ε = 0. We set the exploration parameter of UCB as 2. We run each method 200 times. |