Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Subgoal Search For Complex Reasoning Tasks
Authors: Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically demonstrate the efficiency of MCTS-kSubS and BF-kSubS. In particular, we show that they vastly outperform their standard (non-subgoal) counterparts. As a testing ground, we consider three challenging domains: Sokoban, Rubik's Cube, and INT. All of them require non-trivial reasoning. |
| Researcher Affiliation | Collaboration | Konrad Czechowski University of Warsaw EMAIL Tomasz Odrzygóźdź University of Warsaw EMAIL Marek Zbysiński University of Warsaw m.zbysinski@students.mimuw.edu.pl Michał Zawalski University of Warsaw EMAIL Krzysztof Olejnik University of Warsaw k.olejnik3@student.uw.edu.pl Yuhuai Wu University of Toronto, Vector Institute EMAIL Łukasz Kuciński Polish Academy of Sciences EMAIL Piotr Miłoś Polish Academy of Sciences, University of Oxford, deepsense.ai EMAIL |
| Pseudocode | Yes | Algorithm 1 Best-First Subgoal Search (BF-kSubS) [...] Algorithm 2 Low-level conditional policy [...] Algorithm 3 Subgoal generator |
| Open Source Code | Yes | We provide the code of our method and experiment settings at https://github.com/subgoal-search/subgoal-search, and a dedicated website https://sites.google.com/view/subgoal-search. |
| Open Datasets | Yes | The dataset for INT or Sokoban can be easily generated or are publicly available. For the Rubik's Cube, we use random data or simple heuristic (random data are often sufficient for robotic tasks and navigation.) ... INT [55] |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning. |
| Hardware Specification | Yes | The results were obtained on a server with an Intel Xeon E5-2630 v4 CPU and eight NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software components like 'transformer architecture' and 'convolutional network' but does not specify their version numbers or the versions of any other software dependencies. |
| Experiment Setup | Yes | Table 1: BF-kSubS hyperparameters. [...] In Table 1, we provide the values of the hyperparameters used in all experiments. |