Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Robust and Scalable Autonomous Reinforcement Learning in Irreversible Environments
Authors: Sang-Hyun Lee
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that RSA outperforms existing ARL algorithms with fewer manual resets in both reversible and irreversible environments. ... We evaluate RSA against baselines on diverse navigation and manipulation tasks. ... We design our experiments to investigate the following questions: (1) Can RSA achieve more robust performance with fewer manual resets than previous algorithms in both reversible and irreversible environments? (2) Can RSA generate a curriculum by identifying informative initial and goal states based on the learning progress of an agent? (3) How does each main component of RSA contribute to its performance improvement? |
| Researcher Affiliation | Academia | Sang-Hyun Lee Department of Automotive Engineering Ajou University Gyeonggi-do, South Korea EMAIL |
| Pseudocode | Yes | Algorithm 1 Robust and Scalable Autonomous Reinforcement Learning |
| Open Source Code | Yes | The supplementary materials include all of our code as well as a README.md file with instructions for reproducing the main experimental results. |
| Open Datasets | Yes | All tasks are provided by Gymnasium-Robotics [7]. ... [7] Rodrigo de Lazcano, Kallinteris Andreas, Jun Jet Tai, Seungjae Ryan Lee, and Jordan Terry. Gymnasium robotics, 2024. URL http://github.com/Farama-Foundation/Gymnasium-Robotics. |
| Dataset Splits | No | The paper does not explicitly provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, and testing. It refers to using 'the same set of episodes with diverse initial and goal states' for evaluation without defining explicit splits. |
| Hardware Specification | Yes | The paper provides sufficient information on the computer resources to reproduce our experimental results in Appendix C. |
| Software Dependencies | Yes | The supplementary materials include all of our code as well as a README.md file with instructions for reproducing the main experimental results. |
| Experiment Setup | Yes | The paper provides all training and evaluation details, including hyperparameters and the type of optimizer, in Appendices B and C. |