PAC Reinforcement Learning With an Imperfect Model
Authors: Nan Jiang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work we aim at better understanding the theoretical nature of this approach... We then propose two conceptually simple algorithms that enjoy polynomial sample complexity guarantees... and prove some foundational results to provide insights into this important problem. |
| Researcher Affiliation | Industry | Nan Jiang, Microsoft Research, New York, NY 10011, nanjiang@umich.edu |
| Pseudocode | Yes | Algorithm 1 MODEL REPAIR(M) and Algorithm 2 MODEL PENALIZE(M) |
| Open Source Code | No | The paper does not contain any statement or link regarding the release of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not involve training on a dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe any dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not detail an experimental setup, including hyperparameters or training configurations. |