Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning
Authors: Dhruv Shah, Peng Xu, Yao Lu, Ted Xiao, Alexander T Toshev, Sergey Levine, Brian Ichter
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations for maze-solving and robotic manipulation tasks demonstrate that our approach improves long-horizon performance and enables better zero-shot generalization than alternative model-free and model-based methods. |
| Researcher Affiliation | Collaboration | Google Research, Robotics @ Google; Berkeley AI Research, UC Berkeley |
| Pseudocode | No | The paper describes algorithms but does not provide formal pseudocode blocks or figures explicitly labeled 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | We plan to release more information about the proprietary environments and tasks used in a public release of this article at a later date. |
| Open Datasets | Yes | We use the versatile MiniGrid environment (Chevalier-Boisvert et al., 2018) in a fully observable setting, where the agent receives a top-down view of the environment. |
| Dataset Splits | No | The paper does not explicitly provide information on validation dataset splits, only training and test/generalization setups. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments (e.g., GPU/CPU models, cloud instances). |
| Software Dependencies | No | The paper refers to various algorithms and frameworks (e.g., DQN, DDQN, MT-Opt) but does not provide specific software version numbers for any libraries or dependencies. |
| Experiment Setup | No | The paper refers to Appendix A.1 and A.2 for 'further implementation details' and 'further details about these skills', implying that specific hyperparameter values and detailed training configurations are not fully present in the main text. |