Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning

Authors: Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander T Toshev, Sergey Levine, Brian Ichter

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Empirical evaluations for maze-solving and robotic manipulation tasks demonstrate that our approach improves long-horizon performance and enables better zero-shot generalization than alternative model-free and model-based methods." |
| Researcher Affiliation | Collaboration | Google Research, Robotics @ Google; Berkeley AI Research, UC Berkeley |
| Pseudocode | No | The paper describes algorithms but does not provide formal pseudocode blocks or figures explicitly labeled "Algorithm" or "Pseudocode". |
| Open Source Code | No | "We plan to release more information about the proprietary environments and tasks used in a public release of this article at a later date." |
| Open Datasets | Yes | "We use the versatile MiniGrid environment (Chevalier-Boisvert et al., 2018) in a fully observable setting, where the agent receives a top-down view of the environment." |
| Dataset Splits | No | The paper does not explicitly provide information on validation dataset splits, only training and test/generalization setups. |
| Hardware Specification | No | The paper does not specify the hardware used to run experiments (e.g., GPU/CPU models, cloud instances). |
| Software Dependencies | No | The paper refers to various algorithms and frameworks (e.g., DQN, DDQN, MT-Opt) but does not provide version numbers for any software libraries or dependencies. |
| Experiment Setup | No | The paper defers to Appendix A.1 and A.2 for "further implementation details" and "further details about these skills", implying that specific hyperparameter values and detailed training configurations are not fully present in the main text. |
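As context for the table above: the paper's central construction, per its title and the quoted evaluation, is to represent a state by the vector of learned skill value functions, so that states with similar skill affordances map to nearby points. A minimal sketch of this idea follows; the toy 1-D "skills" and their hand-written value functions are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def value_function_space(state, skill_value_fns):
    """Embed a state as the vector of per-skill value estimates.

    Each entry reflects how favorable `state` is for one skill, so the
    resulting vector serves as a skill-centric state abstraction.
    """
    return np.array([v(state) for v in skill_value_fns])

# Hypothetical 1-D toy: two "skills" whose values peak at opposite ends
# of the unit interval (purely for illustration).
skills = [
    lambda s: 1.0 - abs(s - 0.0),  # value of a "reach position 0" skill
    lambda s: 1.0 - abs(s - 1.0),  # value of a "reach position 1" skill
]

z = value_function_space(0.0, skills)
print(z)  # state 0.0 is ideal for skill 0, poor for skill 1
```

In this sketch, downstream long-horizon reasoning would operate on `z` rather than on the raw state, which is the abstraction the paper evaluates on maze-solving and manipulation tasks.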