Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Authors: Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3 K})$... Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
Researcher Affiliation | Academia | Department of Computer Science, University of California, Los Angeles, CA 90095, USA.
Pseudocode | Yes | The paper presents its method as pseudocode in Algorithm 1 (LSVI-UCB++); an illustrative sketch of this style of update appears after this table.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper is theoretical and does not report experimental evaluation on specific datasets, so no publicly available dataset is mentioned or linked for training.
Dataset Splits | No | The paper is theoretical and does not report experimental evaluation, so no training, validation, or test splits are provided.
Hardware Specification | No | The paper is theoretical and does not report experimental evaluation, so no hardware specifications for running experiments are provided.
Software Dependencies | No | The paper is theoretical and does not report implementation details requiring specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not report experimental evaluation, so no details about experimental setup, hyperparameters, or training settings are provided.
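For context, the regret guarantee quoted in the abstract row can be restated in the paper's standard notation. The notation glosses and the lower-bound remark below are our reading of the linear-MDP literature, not text quoted from the paper:

```latex
% Restatement of the abstract's guarantee (not a new result).
% Notation: d = feature dimension, H = episode length, K = number of episodes.
\mathrm{Regret}(K) \;=\; \widetilde{O}\!\left(d\sqrt{H^{3}K}\right)
% "Nearly minimax optimal" means this matches the \Omega(d\sqrt{H^{3}K})
% minimax lower bound up to logarithmic factors.
```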
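Since the paper's pseudocode (Algorithm 1, LSVI-UCB++) is not reproduced here, the following is a minimal sketch of the generic optimistic least-squares value iteration (LSVI-UCB) backbone that LSVI-UCB++ builds on. It is not the paper's algorithm: LSVI-UCB++ additionally uses variance-weighted ridge regression, coupled optimistic and pessimistic value estimates, and rare policy switching. The feature map `phi`, the finite action list `actions`, and the data layout are hypothetical placeholders for illustration.

```python
# A minimal sketch of optimistic least-squares value iteration in the style
# of LSVI-UCB, NOT the paper's LSVI-UCB++ (which adds variance-weighted
# regression, optimistic/pessimistic estimates, and rare policy switching).
import numpy as np

def lsvi_ucb_backup(data, phi, actions, H, d, lam=1.0, beta=1.0):
    """Recompute ridge-regression weights and Gram matrices, step H-1..0.

    data[h] = (states, acts, rewards, next_states) collected so far at step h
    phi(s, a) -> feature vector of shape (d,)  [hypothetical placeholder]
    actions  -> finite candidate action list   [hypothetical placeholder]
    """
    w = np.zeros((H, d))                       # per-step regression weights
    cov = [lam * np.eye(d) for _ in range(H)]  # per-step Gram matrices

    def q(h, s, a):
        # Optimistic Q: linear estimate + elliptical bonus, truncated at H.
        f = phi(s, a)
        bonus = beta * np.sqrt(f @ np.linalg.solve(cov[h], f))
        return min(float(w[h] @ f) + bonus, float(H))

    for h in reversed(range(H)):               # backward induction
        s_h, a_h, r_h, s_next = data[h]
        feats = np.stack([phi(s, a) for s, a in zip(s_h, a_h)])
        cov[h] = lam * np.eye(d) + feats.T @ feats
        # Regression target: reward + optimistic value of the next state
        # (zero beyond the last step of the episode).
        v_next = (np.zeros(len(r_h)) if h == H - 1 else
                  np.array([max(q(h + 1, s2, a2) for a2 in actions)
                            for s2 in s_next]))
        w[h] = np.linalg.solve(cov[h], feats.T @ (np.asarray(r_h) + v_next))
    return w, cov
```

The elliptical bonus term $\beta\sqrt{\phi^\top \Lambda_h^{-1}\phi}$ is what makes the estimate optimistic, and the truncation at $H$ reflects that episodic returns are bounded by the horizon.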