Provably Efficient Reinforcement Learning with Multinomial Logit Function Approximation

Authors: Long-Fei Li, Yu-Jie Zhang, Peng Zhao, Zhi-Hua Zhou

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We study a new class of MDPs that employs multinomial logit (MNL) function approximation to ensure valid probability distributions over the state space. Despite its significant benefits, incorporating the non-linear function raises substantial challenges in both statistical and computational efficiency. [...] Finally, we establish the first lower bound for this problem, justifying the optimality of our results in d and K.
Researcher Affiliation	Academia	(1) National Key Laboratory for Novel Software Technology, Nanjing University, China; (2) School of Artificial Intelligence, Nanjing University, China; (3) The University of Tokyo, Chiba, Japan
Pseudocode	Yes	Algorithm 1 UCRL-MNL-LL
Open Source Code	No	The paper states it is a theoretical paper and does not include experiments. It does not provide any links to open-source code for its methodology.
Open Datasets	No	The paper is theoretical and does not report on experiments using datasets.
Dataset Splits	No	The paper is theoretical and does not report on experiments using datasets, so it provides no dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe any experimental hardware.
Software Dependencies	No	The paper is theoretical and does not describe any specific software dependencies with version numbers for experimental reproducibility.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings.
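
The MNL function approximation quoted in the Research Type row can be illustrated with a minimal sketch. It assumes the standard multinomial logit form, in which the probability of each reachable next state is a softmax over linear scores of a feature vector, so the outputs are positive and sum to one. The function name, feature values, and parameter vector below are hypothetical; they are not taken from the paper, which provides no code.

```python
import math


def mnl_transition_probs(features, theta):
    """Softmax over linear scores: one probability per reachable next state.

    features: list of feature vectors phi(s' | s, a), one per next state
    theta:    parameter vector of the same dimension d
    """
    # Linear score phi(s' | s, a)^T theta for each candidate next state.
    scores = [sum(f * t for f, t in zip(phi, theta)) for phi in features]
    # Subtract the maximum score before exponentiating for numerical stability.
    shift = max(scores)
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: three reachable next states, feature dimension d = 2.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
theta = [0.3, -0.2]  # hypothetical parameter vector, not from the paper
probs = mnl_transition_probs(features, theta)
print(probs)
```

Because the normalization is built into the model, any parameter vector yields a valid distribution over next states. A plain linear model of the transition probabilities does not guarantee this without extra constraints.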