Approximation Theory Based Methods for RKHS Bandits
Authors: Sho Takemori, Masahiro Sato
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In synthetic environments, we empirically show that APG-UCB has almost the same cumulative regret as IGP-UCB, while its running time is much shorter. |
| Researcher Affiliation | Industry | FUJIFILM Business Innovation, Kanagawa, Japan. |
| Pseudocode | Yes | Algorithm 1: Construction of Newton basis with the P-greedy algorithm (cf. Pazouki & Schaback (2011)); a minimal sketch follows the table. |
| Open Source Code | No | The paper does not explicitly state the release of source code or provide a link to a repository for the described methodology. |
| Open Datasets | No | The paper describes generating synthetic environments and reward functions for experiments, but does not refer to a publicly available dataset with concrete access information (link, DOI, or specific citation to an established dataset). |
| Dataset Splits | No | The paper describes synthetic environments and online learning simulations, but does not specify training/validation/test dataset splits for reproducibility. |
| Hardware Specification | Yes | Computation is done on an Intel Xeon E5-2630 v4 processor with 128 GB RAM. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper. |
| Experiment Setup | Yes | We take l = 0.3 for the RQ kernel and l = 0.2 for the SE kernel, because the diameter of the d-dimensional cube is √d. For each kernel, we generate 10 reward functions as above and evaluate our proposed method and the existing algorithm over the time horizon T = 5000 in terms of mean cumulative regret and total running time. |
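The Pseudocode row references Algorithm 1, the construction of a Newton basis via the P-greedy algorithm of Pazouki & Schaback (2011). The paper's own pseudocode is not reproduced here, so the following is a minimal sketch of the standard P-greedy procedure under assumed conventions: centers are chosen greedily to maximize the power function, and each Newton basis function is the kernel translate at the new center, orthonormalized against the earlier basis. The function names (`se_kernel`, `p_greedy_newton_basis`) and the candidate-grid interface are illustrative, not taken from the paper's code.

```python
import numpy as np

def se_kernel(X, Y, l=0.2):
    """Squared-exponential kernel k(x, y) = exp(-||x - y||^2 / (2 l^2)),
    with the lengthscale l = 0.2 quoted in the experiment setup."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * l ** 2))

def p_greedy_newton_basis(X, kernel, n_basis, tol=1e-10):
    """P-greedy selection of centers and Newton basis values on a
    candidate grid X of shape (m, d).

    Returns the selected center indices and an (m, n) matrix V whose
    j-th column holds the j-th Newton basis function evaluated on X.
    """
    m = X.shape[0]
    power2 = np.diag(kernel(X, X)).copy()   # P_0(x)^2 = k(x, x)
    V = np.zeros((m, n_basis))
    selected = []
    for n in range(n_basis):
        i = int(np.argmax(power2))          # greedy: maximize the power function
        if power2[i] < tol:                 # grid is (numerically) exhausted
            V = V[:, :n]
            break
        # Newton basis: k(., x_i) minus its projection onto earlier basis
        # functions, normalized by P_{n-1}(x_i) = sqrt(power2[i]).
        v = kernel(X, X[i:i + 1])[:, 0] - V[:, :n] @ V[i, :n]
        v /= np.sqrt(power2[i])
        V[:, n] = v
        power2 = np.maximum(power2 - v ** 2, 0.0)  # P_n^2 = P_{n-1}^2 - N_n^2
        selected.append(i)
    return selected, V
```

A typical call would be `selected, V = p_greedy_newton_basis(grid, se_kernel, n_basis=100)` on a discretized cube; the power function update makes each iteration linear in the grid size, which is what gives P-greedy-based methods their speed advantage.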
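The Experiment Setup row describes synthetic environments but defers the reward-generation procedure to the paper ("as above"), so only the outline of the evaluation can be reconstructed. Below is a hedged sketch under stated assumptions: the RQ kernel with the quoted lengthscale l = 0.3, a reward function drawn as a random linear combination of kernel translates (one plausible RKHS construction, not necessarily the paper's), and a generic bandit loop over T = 5000 rounds reporting cumulative regret. The `policy` object with `select`/`update` methods is a hypothetical interface standing in for APG-UCB or IGP-UCB.

```python
import numpy as np

def rq_kernel(X, Y, l=0.3, alpha=1.0):
    """Rational-quadratic kernel with the lengthscale l = 0.3 quoted above."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return (1.0 + d2 / (2 * alpha * l ** 2)) ** (-alpha)

def sample_rkhs_reward(X, kernel, n_centers=50, rng=None):
    """Random RKHS function f(x) = sum_i a_i k(x, c_i), evaluated on the
    grid X; a stand-in for the paper's unspecified reward generation."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), n_centers, replace=False)]
    coeffs = rng.normal(size=n_centers)
    return kernel(X, centers) @ coeffs

def run_bandit(reward, policy, T=5000, noise=0.1, rng=None):
    """Generic bandit loop over T rounds; returns the cumulative-regret
    curve. `policy.select()` picks a grid index, `policy.update(i, y)`
    feeds back the noisy observation (hypothetical interface)."""
    rng = np.random.default_rng(rng)
    best = reward.max()
    regret, history = 0.0, []
    for t in range(T):
        i = policy.select()
        y = reward[i] + noise * rng.normal()
        policy.update(i, y)
        regret += best - reward[i]
        history.append(regret)
    return np.array(history)
```

Averaging `run_bandit` over 10 independently sampled reward functions per kernel, as the quoted setup describes, yields the mean cumulative regret and total running time used to compare APG-UCB against IGP-UCB.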