Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
Authors: Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret Õ(d√(H³K))... Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest. |
| Researcher Affiliation | Academia | Department of Computer Science, University of California, Los Angeles, CA 90095, USA. |
| Pseudocode | Yes | Algorithm 1 LSVI-UCB++ |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not report on experimental evaluation using specific datasets, thus no publicly available dataset is mentioned or linked for training. |
| Dataset Splits | No | The paper is theoretical and does not report on experimental evaluation, thus no dataset splits for training, validation, or testing are provided. |
| Hardware Specification | No | The paper is theoretical and does not report on experimental evaluation, thus no hardware specifications for running experiments are provided. |
| Software Dependencies | No | The paper is theoretical and does not report on experimental evaluation or implementation details, thus no software dependencies with version numbers are provided. |
| Experiment Setup | No | The paper is theoretical and does not report on experimental evaluation, thus no details about experimental setup, hyperparameters, or training settings are provided. |