Iterative Value-Aware Model Learning
Authors: Amir-massoud Farahmand
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The paper theoretically analyzes Iterative VAML (IterVAML) and provides a finite-sample upper bound on its error. We theoretically analyze IterVAML (Section 4). We provide a finite-sample error upper bound guarantee for the model learning that shows the effect of the number of samples and complexity of the model on the error bound (Section 4.1). We also analyze how the errors in the learned model affect the quality of the outcome policy. This is in the form of an error propagation result (Section 4.2). |
| Researcher Affiliation | Collaboration | Amir-massoud Farahmand, Vector Institute, Toronto, Canada, farahmand@vectorinstitute.ai. Part of this work has been done when the author was affiliated with Mitsubishi Electric Research Laboratories (MERL), Cambridge, USA. |
| Pseudocode | Yes | Algorithm 1: Model-based Reinforcement Learning Algorithm with Iterative VAML |
| Open Source Code | No | The paper does not provide any concrete statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper refers to a generic 'dataset Dn' for theoretical analysis, but does not mention or provide access information for any specific publicly available or open dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments, therefore it does not provide specific dataset split information. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any software implementation details with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or training configurations. |