Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Authors: Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The proof of this theorem can be found in the full version Li et al. (2021c). Additionally, the paper's own checklist states under '3. If you ran experiments...' that all sub-questions (a, b, c, d) are '[N/A]', indicating no empirical experiments were conducted.
Researcher Affiliation | Academia | Gen Li (Princeton), Laixi Shi (CMU), Yuxin Chen (Princeton), Yuantao Gu (Tsinghua), Yuejie Chi (CMU). Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544, USA. Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China.
Pseudocode | Yes | Algorithm 1: Q-EarlySettled-Advantage; Algorithm 2: Auxiliary functions
Open Source Code | No | The paper states under '3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'.
Open Datasets | No | The paper is theoretical and does not describe training on any specific dataset. It states under '3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'.
Dataset Splits | No | The paper is theoretical and does not describe dataset splits for validation. It states under '3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]'.
Hardware Specification | No | The paper is theoretical and does not describe specific hardware used for experiments. It states under '3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]'.
Software Dependencies | No | The paper is theoretical and does not describe specific software dependencies or versions. It states under '3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]'.
Experiment Setup | No | The paper is theoretical and does not provide specific experimental setup details like hyperparameters or training configurations for empirical runs. It states under '3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]'.