PAC Reinforcement Learning With an Imperfect Model

Authors: Nan Jiang

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | In this work we aim at better understanding the theoretical nature of this approach... We then propose two conceptually simple algorithms that enjoy polynomial sample complexity guarantees... and prove some foundational results to provide insights into this important problem. |
| Researcher Affiliation | Industry | Nan Jiang Microsoft Research New York, NY 10011 nanjiang@umich.edu |
| Pseudocode | Yes | Algorithm 1 MODEL REPAIR(M) and Algorithm 2 MODEL PENALIZE(M) |
| Open Source Code | No | The paper does not contain any statement or link regarding the release of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not involve training on a dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not detail an experimental setup, including hyperparameters or training configurations. |