Instance Dependent Regret Analysis of Kernelized Bandits

Authors: Shubhanshu Shekhar, Tara Javidi

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the problem of designing an adaptive strategy for querying a noisy zeroth-order oracle to efficiently learn about the optimizer of an unknown function f. To make the problem tractable, we assume that f lies in the reproducing kernel Hilbert space (RKHS) associated with a known kernel K, with its norm bounded by M < ∞. Prior results, working in a minimax framework, have characterized the worst-case (over all functions in the problem class) limits on regret achievable by any algorithm, and have constructed algorithms with matching (modulo polylogarithmic factors) worst-case performance for the Matérn family of kernels. These results suffer from two drawbacks. First, the minimax lower bound gives limited information about the limits of regret achievable by the commonly used algorithms on a specific problem instance f. Second, the existing upper bound analysis fails to adapt to easier problem instances within the function class. Our work takes steps to address both these issues. First, we derive instance-dependent regret lower bounds for algorithms with uniformly (over the function class) vanishing normalized cumulative regret. Our result, valid for several practically relevant kernelized bandits algorithms, such as GP-UCB, GP-TS and SupKernelUCB, identifies a fundamental complexity measure associated with every problem instance. We then address the second issue by proposing a new minimax near-optimal algorithm which also adapts to easier problem instances.
Researcher Affiliation | Academia | 1Carnegie Mellon University 2University of California, San Diego. Correspondence to: Shubhanshu Shekhar <shubhan2@andrew.cmu.edu>.
Pseudocode | Yes | Algorithm 1: Breadthwise Exploration with Adaptive Discretization (A1)
Open Source Code | No | The paper does not contain any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets | No | This paper is theoretical and focuses on deriving bounds and proposing algorithms. It does not mention or use specific public datasets for training, evaluation, or any experimental purposes.
Dataset Splits | No | The paper is theoretical and does not describe any experimental setup involving dataset splits for validation. No information on validation splits is provided.
Hardware Specification | No | The paper is theoretical and does not describe any computational experiments that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks) required to reproduce the work.
Experiment Setup | No | The paper presents theoretical analysis and an algorithm (Algorithm 1) but does not detail an experimental setup, hyperparameters, or training configurations. It focuses on theoretical properties and bounds rather than empirical implementation details.
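The abstract names GP-UCB among the kernelized bandit algorithms covered by the instance-dependent lower bound. For context only, below is a minimal GP-UCB sketch, not the paper's Algorithm 1 (Breadthwise Exploration with Adaptive Discretization); the RBF kernel, lengthscale, noise level, and confidence parameter beta are all illustrative assumptions, as is the discretized query domain.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel k(x, x') = exp(-(x - x')^2 / (2 l^2)) on 1-D inputs.
    d = a.reshape(-1, 1) - b.reshape(1, -1)
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_ucb(f, domain, rounds=30, beta=2.0, noise=0.1, seed=0):
    """Each round, query the noisy oracle f at the UCB-maximizing point of the domain."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for _ in range(rounds):
        if not xs:
            x = domain[rng.integers(len(domain))]  # first query: a random point
        else:
            X = np.array(xs)
            K = rbf_kernel(X, X) + noise**2 * np.eye(len(xs))
            k_star = rbf_kernel(domain, X)               # cross-covariances, shape (|domain|, t)
            alpha = np.linalg.solve(K, np.array(ys))
            mu = k_star @ alpha                          # GP posterior mean at each candidate
            v = np.linalg.solve(K, k_star.T)
            var = 1.0 - np.sum(k_star * v.T, axis=1)     # posterior variance (prior k(x,x) = 1)
            var = np.maximum(var, 1e-12)
            x = domain[np.argmax(mu + np.sqrt(beta * var))]  # optimistic acquisition
        xs.append(x)
        ys.append(f(x) + noise * rng.standard_normal())  # noisy zeroth-order observation
    return np.array(xs), np.array(ys)
```

On a smooth toy objective such as `f(x) = -(x - 0.7)**2` over a grid on [0, 1], the query sequence concentrates near the optimizer as the posterior tightens; the instance-dependent analysis in the paper quantifies how fast this can happen for a given f.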