Asymptotic Instance-Optimal Algorithms for Interactive Decision Making
Authors: Kefan Dong, Tengyu Ma
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we design the first asymptotic instance-optimal algorithm for general interactive decision making problems with finite number of decisions under mild conditions. Our results, instantiated on concrete problems, recover the classical gap-dependent bounds for multi-armed bandits (Lai et al., 1985) and prior works on linear bandits (Lattimore & Szepesvari, 2017), and improve upon the previous best instance-dependent upper bound (Xu et al., 2021) for reinforcement learning. |
| Researcher Affiliation | Academia | Kefan Dong Stanford University kefandong@stanford.edu Tengyu Ma Stanford University tengyuma@stanford.edu |
| Pseudocode | Yes | Algorithm 1 Test-to-Commit (T2C) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code, nor does it include a link to a code repository. |
| Open Datasets | No | The paper does not conduct empirical studies or use datasets for training. It discusses problem types like multi-armed bandits and tabular RL in a theoretical context. |
| Dataset Splits | No | The paper does not conduct empirical studies or use datasets for validation. It discusses theoretical bounds and algorithm design. |
| Hardware Specification | No | The paper does not report empirical experiments, and therefore no specific hardware details are provided. |
| Software Dependencies | No | The paper does not report empirical experiments, and therefore no specific software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper focuses on theoretical algorithm design and analysis. It does not include specific experimental setup details such as hyperparameters or training configurations for empirical evaluation. |