Asymptotic Instance-Optimal Algorithms for Interactive Decision Making

Authors: Kefan Dong, Tengyu Ma

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper, we design the first asymptotic instance-optimal algorithm for general interactive decision making problems with finite number of decisions under mild conditions. Our results, instantiated on concrete problems, recover the classical gap-dependent bounds for multi-armed bandits (Lai et al., 1985) and prior works on linear bandits (Lattimore & Szepesvari, 2017), and improve upon the previous best instance-dependent upper bound (Xu et al., 2021) for reinforcement learning.
Researcher Affiliation Academia Kefan Dong Stanford University kefandong@stanford.edu Tengyu Ma Stanford University tengyuma@stanford.edu
Pseudocode Yes Algorithm 1 Test-to-Commit (T2C)
Open Source Code No The paper does not provide any explicit statements about releasing source code, nor does it include a link to a code repository.
Open Datasets No The paper does not conduct empirical studies or use datasets for training. It discusses problem types like multi-armed bandits and tabular RL in a theoretical context.
Dataset Splits No The paper does not conduct empirical studies or use datasets for validation. It discusses theoretical bounds and algorithm design.
Hardware Specification No The paper does not report empirical experiments, and therefore no specific hardware details are provided.
Software Dependencies No The paper does not report empirical experiments, and therefore no specific software dependencies with version numbers are listed.
Experiment Setup No The paper focuses on theoretical algorithm design and analysis. It does not include specific experimental setup details such as hyperparameters or training configurations for empirical evaluation.