Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Basis refinement strategies for linear value function approximation in MDPs
Authors: Gheorghe Comanici, Doina Precup, Prakash Panangaden
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs). We provide a general algorithmic framework for computing basis function refinements which respect the dynamics of the environment, and we derive approximation error bounds that apply for any algorithm respecting this general framework. The theoretical results show that any algorithmic scheme of this type satisfies strong bounds on the quality of the value function that can be obtained. The focus of this paper was to establish the theoretical underpinnings of the algorithm. |
| Researcher Affiliation | Academia | Gheorghe Comanici School of Computer Science McGill University Montreal, Canada EMAIL Doina Precup School of Computer Science McGill University Montreal, Canada EMAIL Prakash Panangaden School of Computer Science McGill University Montreal, Canada EMAIL |
| Pseudocode | Yes | Algorithm 1 Prototype refinement |
| Open Source Code | No | The paper is theoretical and does not mention or provide access to open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not involve empirical studies with datasets, therefore it does not mention public dataset availability or access. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical studies with datasets, therefore it does not specify data splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup requiring specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup requiring specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. |