Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Tree-Guided Diffusion Planner
Authors: Hyeonseong Jeon, Cheolhong Min, Jaesik Park
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate TDP on three diverse tasks: maze gold-picking, robot arm block manipulation, and Ant Maze multi-goal exploration. TDP consistently outperforms state-of-the-art approaches on all tasks. The project page can be found at: tree-diffusion-planner.github.io. |
| Researcher Affiliation | Academia | Hyeonseong Jeon¹, Cheolhong Min¹, Jaesik Park¹,²; ¹Department of Computer Science & Engineering, ²Interdisciplinary Program of AI, Seoul National University |
| Pseudocode | Yes | Appendix A, B outline the overall TDP pipeline and present the full algorithms. We detail the core modules of TDP: state decomposition (Sec. 4.1), parent branching (Sec. 4.2), and subtree expansion (Sec. 4.3). Algorithm 1 State Decomposition (SD) ... Algorithm 2 Parent Branching ... Algorithm 3 Sub-Tree Expansion |
| Open Source Code | Yes | The project page can be found at: tree-diffusion-planner.github.io. Code with instructions to reproduce the main results is available on the project website. |
| Open Datasets | Yes | We extend the single gold-picking example [2] in the Maze2D environment [40] to a multi-task benchmark. Diffusion planners are pretrained on arbitrary block stacking demonstrations collected from PDDLStream [48]. We finally evaluate test-time multi-goal exploration capability on Ant Maze [40]. |
| Dataset Splits | No | The paper does not provide explicit training/test/validation dataset splits. Instead, it describes evaluation on tasks and benchmarks, specifying metrics, seeds, and task configurations, but not conventional data splits. |
| Hardware Specification | Yes | All experiments were conducted using a single NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python X.X, PyTorch X.X) in the main text or appendices. |
| Experiment Setup | Yes | All experimental hyperparameters are reported in Appendix D. Table 4: Hyperparameters of three tasks. Task: Maze2D Gold-picking. maze2d-medium planning horizon Tpred: 256; maze2d-medium maximum steps Tmax: 600; maze2d-large planning horizon Tpred: 384; maze2d-large maximum steps Tmax: 800; threshold distance: 0.3; gradient guidance strength αg: 62.5; particle guidance strength αp: 0.1; diffusion steps N = Nf: 256; number of samples B: 128 |