Policies that Generalize: Solving Many Planning Problems with the Same Policy
Authors: Blai Bonet, Hector Geffner
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We establish conditions under which memoryless policies and finite-state controllers that solve one partially observable non-deterministic problem (PONDP) generalize to other problems; namely, problems that have a similar structure and share the same action and observation space. Our aim in this paper is to provide a characterization of the common structure that allows for policy generalization. We use a logical setting where uncertainty is represented by sets of states and the goal is to be achieved with certainty. In this setting, the notions of solution policy and generalization become very crisp: a policy solves a problem or not, and it generalizes to another problem or not. We thus ignore considerations of costs, quality or rewards, and do not consider probabilities explicitly. Our results, however, do apply to the probabilistic setting where goals are to be achieved with probability 1. |
| Researcher Affiliation | Academia | Blai Bonet, Universidad Simón Bolívar, Caracas, Venezuela, bonet@ldc.usb.ve; Hector Geffner, ICREA & Universitat Pompeu Fabra, Barcelona, Spain, hector.geffner@upf.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper uses conceptual examples to illustrate theoretical points, such as 'Countdown' and 'Dust-Cleaning Robot', but it does not mention or use any publicly available datasets in the context of experiments or training. |
| Dataset Splits | No | The paper does not describe empirical experiments, and therefore does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments, thus no hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and does not describe any empirical experiments, thus no software dependencies with version numbers are provided. |
| Experiment Setup | No | The paper is theoretical and focuses on defining models and proving theorems, therefore it does not provide details about an experimental setup, hyperparameters, or training configurations. |