Basis refinement strategies for linear value function approximation in MDPs

Authors: Gheorghe Comanici, Doina Precup, Prakash Panangaden

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs). We provide a general algorithmic framework for computing basis function refinements which respect the dynamics of the environment, and we derive approximation error bounds that apply for any algorithm respecting this general framework. The theoretical results show that any algorithmic scheme of this type satisfies strong bounds on the quality of the value function that can be obtained. The focus of this paper was to establish the theoretical underpinnings of the algorithm.
Researcher Affiliation | Academia | Gheorghe Comanici, School of Computer Science, McGill University, Montreal, Canada (gcoman@cs.mcgill.ca); Doina Precup, School of Computer Science, McGill University, Montreal, Canada (dprecup@cs.mcgill.ca); Prakash Panangaden, School of Computer Science, McGill University, Montreal, Canada (prakash@cs.mcgill.ca)
Pseudocode | Yes | Algorithm 1 Prototype refinement
Open Source Code | No | The paper is theoretical and does not mention or provide access to open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not involve empirical studies with datasets, therefore it does not mention public dataset availability or access.
Dataset Splits | No | The paper is theoretical and does not involve empirical studies with datasets, therefore it does not specify data splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup requiring specific hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe any experimental setup requiring specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings.
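
The Research Type summary above mentions basis refinements that respect the dynamics of the environment. The sketch below illustrates that general idea only; it is not the paper's Algorithm 1, whose details are not reproduced on this page. It assumes a small Markov reward process (a fixed policy), uses block-indicator features, and applies a bisimulation-style partition refinement chosen here purely for illustration. All names (`refine_once`, `refine_to_fixpoint`, `evaluate`, `block_projection_error`) and the four-state example are hypothetical.

```python
# Illustrative sketch, NOT the paper's Algorithm 1.
# Assumption: features are indicators of partition blocks, and a block is
# split when its states differ in reward or in the probability mass they
# send to the current blocks (a bisimulation-style refinement).

def refine_once(blocks, P, r, ndigits=9):
    """Split every block by (reward, transition mass into each current block)."""
    new_blocks = []
    for block in blocks:
        groups = {}
        for s in block:
            mass = tuple(round(sum(P[s][t] for t in b), ndigits) for b in blocks)
            groups.setdefault((r[s], mass), []).append(s)
        new_blocks.extend(groups.values())
    return new_blocks

def refine_to_fixpoint(P, r):
    """Start from one block and refine until no block is split."""
    blocks = [list(range(len(r)))]
    while True:
        new_blocks = refine_once(blocks, P, r)
        if len(new_blocks) == len(blocks):  # refinement only splits, so this means stable
            return new_blocks
        blocks = new_blocks

def evaluate(P, r, gamma, sweeps=2000):
    """Iterative policy evaluation: v <- r + gamma * P v."""
    n = len(r)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v

def block_projection_error(blocks, v):
    """Max-norm error of the least-squares fit with block-indicator features.

    For indicator features the unweighted least-squares fit is the block mean.
    """
    err = 0.0
    for b in blocks:
        mean = sum(v[s] for s in b) / len(b)
        err = max(err, max(abs(v[s] - mean) for s in b))
    return err

# Hypothetical 4-state example: states 0 and 1 behave identically.
P = [
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
r = [0.0, 0.0, 1.0, 0.0]
gamma = 0.9

v = evaluate(P, r, gamma)
coarse = [list(range(len(r)))]
blocks = refine_to_fixpoint(P, r)

print("refined blocks:", blocks)
print("error with one block:", block_projection_error(coarse, v))
print("error after refinement:", block_projection_error(blocks, v))
```

In this toy case the refinement stops with three blocks, and the value function is constant on each of them, so three indicator features represent it without error. The paper's bounds concern a more general family of refinement schemes than this single example.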