Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
Authors: Christopher Hoang, Sungryull Sohn, Jongwook Choi, Wilka Carvalho, Honglak Lee
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces and outperforms state-of-the-art baselines on long-horizon GCRL tasks. We evaluate SFL against current graph-based methods in long-horizon goal-reaching RL and visual navigation on MiniGrid [6], a 2D gridworld, and ViZDoom [37], a visual 3D first-person view environment with large mazes. We observe that SFL outperforms state-of-the-art navigation baselines, most notably when goals are furthest away. In a setting where exploration is needed to collect training experience, SFL significantly outperforms the other methods which struggle to scale in ViZDoom's high-dimensional state space. In our experiments, we evaluate the benefits of SFL for exploration and long-horizon GCRL. |
| Researcher Affiliation | Collaboration | Christopher Hoang¹, Sungryull Sohn¹˒², Jongwook Choi¹, Wilka Carvalho¹, Honglak Lee¹˒²; ¹University of Michigan, ²LG AI Research |
| Pseudocode | Yes | Algorithm 1 Training; Algorithm 2 Graph-Update (4.2) |
| Open Source Code | Yes | The demo video and code can be found at https://2016choang.github.io/sfl. |
| Open Datasets | Yes | We use mazes from SPTM in our experiments, with one example shown in Figure 3 [29]. MiniGrid [6], a 2D gridworld. https://github.com/maximecb/gym-minigrid, 2018. |
| Dataset Splits | No | The paper describes experimental setups (random spawn, fixed spawn) and goal sampling strategies, but it does not specify explicit train/validation/test dataset splits with percentages or sample counts for static datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'rlpyt codebase' and 'pretrained ResNet-18 backbone from SPTM' but does not provide specific version numbers for any software components. |
| Experiment Setup | Yes | See Appendix C for more details on feature learning, edge formation, and hyperparameters. |