Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning

Authors: Adhyyan Narang, Andrew Wagenmaker, Lillian Ratliff, Kevin G. Jamieson

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The contributions of this paper are entirely theoretical.
Researcher Affiliation | Academia | Adhyyan Narang (University of Washington, adhyyan@uw.edu); Andrew Wagenmaker (University of California, Berkeley, ajwagen@berkeley.edu); Lillian J. Ratliff (University of Washington, ratliffl@uw.edu); Kevin Jamieson (University of Washington, jamieson@cs.washington.edu)
Pseudocode | Yes | Algorithm 1 "PERP: Policy Elimination with Reference Policy (informal)" ... Algorithm 2 "PERP: Policy Elimination with Reference Policy"
Open Source Code | No | The NeurIPS checklist states 'NA' for open access to data and code, justifying it with 'The contributions of this paper are entirely theoretical.' The paper does not provide any link or statement about open-source code availability.
Open Datasets | No | The contributions of this paper are entirely theoretical. Therefore, the paper does not discuss training on specific datasets or their availability.
Dataset Splits | No | The contributions of this paper are entirely theoretical. Therefore, the paper does not discuss dataset splits for training, validation, or testing.
Hardware Specification | No | The contributions of this paper are entirely theoretical. Therefore, the paper does not specify hardware used for experiments.
Software Dependencies | No | The contributions of this paper are entirely theoretical. Therefore, the paper does not specify software dependencies with version numbers for experiments.
Experiment Setup | No | The contributions of this paper are entirely theoretical. Therefore, the paper does not provide details about an experimental setup.
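The Pseudocode row refers to the paper's PERP procedure (Policy Elimination with Reference Policy). Since the paper releases no code, the sketch below is only a minimal, hypothetical illustration of the general idea suggested by that name and by the paper's title: run a policy-elimination loop over a tabular MDP, estimate each surviving candidate's value difference against a fixed reference policy, and discard candidates whose confidence interval falls below the best lower bound. The MDP, the Hoeffding-style sample sizes, and the elimination rule here are assumptions for illustration, not the paper's actual PERP algorithm or its guarantees.

```python
import numpy as np

# Hypothetical tabular MDP: S states, A actions, horizon H (not from the paper).
rng = np.random.default_rng(0)
S, A, H = 3, 2, 5
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = distribution over next states
R = rng.uniform(0.0, 1.0, size=(S, A))      # mean rewards in [0, 1]

def rollout_return(policy, rng):
    """Sample one episode's return for a deterministic policy of shape [H, S]."""
    s, total = 0, 0.0
    for h in range(H):
        a = policy[h, s]
        total += R[s, a]
        s = rng.choice(S, p=P[s, a])
    return total

def estimate_difference(policy, reference, n, rng):
    """Monte Carlo estimate of V(policy) - V(reference) from n rollouts of each."""
    d = [rollout_return(policy, rng) - rollout_return(reference, rng) for _ in range(n)]
    return float(np.mean(d))

# Arbitrary deterministic candidate policies; the first one doubles as the reference.
candidates = [rng.integers(0, A, size=(H, S)) for _ in range(8)]
reference = candidates[0]

# Illustrative elimination loop: halve the target accuracy each round, estimate every
# surviving candidate's value *difference* to the reference, and drop any candidate
# whose upper confidence bound falls below the best lower confidence bound.
active = list(range(len(candidates)))
for round_idx in range(1, 4):
    eps = 2.0 ** (-round_idx)                            # target accuracy this round
    n = int(np.ceil((2 * H**2 / eps**2) * np.log(10)))   # Hoeffding-style sample size; returns lie in [0, H]
    diffs = {i: estimate_difference(candidates[i], reference, n, rng) for i in active}
    best_lcb = max(diffs[i] - eps for i in active)
    active = [i for i in active if diffs[i] + eps >= best_lcb]
    print(f"round {round_idx}: eps={eps:.3f}, surviving candidates={active}")

print("returned policy index:", max(active, key=lambda i: diffs[i]))
```

The intended benefit of difference estimation, as the title suggests, is that comparing candidates against a reference can require fewer samples than estimating each candidate's value from scratch; the sketch above only mirrors the structural elimination loop and does not exploit that variance reduction.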