Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning

Authors: Christopher Hoang, Sungryull Sohn, Jongwook Choi, Wilka Carvalho, Honglak Lee

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show in our experiments on MiniGrid and ViZDoom that SFL enables efficient exploration of large, high-dimensional state spaces and outperforms state-of-the-art baselines on long-horizon GCRL tasks. We evaluate SFL against current graph-based methods in long-horizon goal-reaching RL and visual navigation on MiniGrid [6], a 2D gridworld, and ViZDoom [37], a visual 3D first-person view environment with large mazes. We observe that SFL outperforms state-of-the-art navigation baselines, most notably when goals are furthest away. In a setting where exploration is needed to collect training experience, SFL significantly outperforms the other methods which struggle to scale in ViZDoom's high-dimensional state space. In our experiments, we evaluate the benefits of SFL for exploration and long-horizon GCRL.
Researcher Affiliation | Collaboration | Christopher Hoang (1), Sungryull Sohn (1, 2), Jongwook Choi (1), Wilka Carvalho (1), Honglak Lee (1, 2). (1) University of Michigan; (2) LG AI Research.
Pseudocode | Yes | Algorithm 1 Training; Algorithm 2 Graph-Update (4.2)
Open Source Code | Yes | The demo video and code can be found at https://2016choang.github.io/sfl.
Open Datasets | Yes | We use mazes from SPTM in our experiments, with one example shown in Figure 3 [29]. MiniGrid [6], a 2D gridworld. https://github.com/maximecb/gym-minigrid, 2018.
Dataset Splits | No | The paper describes experimental setups (random spawn, fixed spawn) and goal sampling strategies, but it does not specify explicit train/validation/test dataset splits with percentages or sample counts for static datasets.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'rlpyt codebase' and 'pretrained ResNet-18 backbone from SPTM' but does not provide specific version numbers for any software components.
Experiment Setup | Yes | See Appendix C for more details on feature learning, edge formation, and hyperparameters.