reproducibilityindex.ai

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

Authors: Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3K})$... Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
Researcher Affiliation	Academia	Department of Computer Science, University of California, Los Angeles, CA 90095, USA.
Pseudocode	Yes	Algorithm 1 LSVI-UCB++
Open Source Code	No	The paper does not provide concrete access to source code for the methodology described.
Open Datasets	No	The paper is theoretical and does not report on experimental evaluation using specific datasets, thus no publicly available dataset is mentioned or linked for training.
Dataset Splits	No	The paper is theoretical and does not report on experimental evaluation, thus no dataset splits for training, validation, or testing are provided.
Hardware Specification	No	The paper is theoretical and does not report on experimental evaluation, thus no hardware specifications for running experiments are provided.
Software Dependencies	No	The paper is theoretical and does not report on experimental evaluation or implementation details requiring specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not report on experimental evaluation, thus no details about experimental setup, hyperparameters, or training settings are provided.