Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning
Authors: Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander T. Toshev, Sergey Levine, Brian Ichter
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations for maze-solving and robotic manipulation tasks demonstrate that our approach improves long-horizon performance and enables better zero-shot generalization than alternative model-free and model-based methods. |
| Researcher Affiliation | Collaboration | Google Research, Robotics @ Google; Berkeley AI Research, UC Berkeley |
| Pseudocode | No | The paper describes algorithms but does not provide formal pseudocode blocks or figures explicitly labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | We plan to release more information about the proprietary environments and tasks used in a public release of this article at a later date. |
| Open Datasets | Yes | We use the versatile MiniGrid environment (Chevalier-Boisvert et al., 2018) in a fully observable setting, where the agent receives a top-down view of the environment. |
| Dataset Splits | No | The paper does not explicitly provide information on validation dataset splits, only training and test/generalization setups. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments (e.g., GPU/CPU models, cloud instances). |
| Software Dependencies | No | The paper refers to various algorithms and frameworks (e.g., DQN, DDQN, MT-Opt) but does not provide version numbers for any software libraries or dependencies. |
| Experiment Setup | No | The paper defers to Appendix A.1 and A.2 for 'further implementation details' and 'further details about these skills', implying that specific hyperparameter values and detailed training configurations are not fully present in the main text. |