Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Parameterizing Non-Parametric Meta-Reinforcement Learning Tasks via Subtask Decomposition
Authors: Suyoung Lee, Myungsik Cho, Youngchul Sung
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on the Meta-World ML-10 and ML-45 benchmarks [71], widely used meta-RL benchmarks comprising diverse non-parametric robotic manipulation tasks. We empirically demonstrate that our method successfully meta-learns the shareable subtask decomposition. With the help of the subtask decomposition and virtual training, our method, without any offline demonstration or test-time gradient updates, achieves test success rates of 33.4% on ML-10 and 31.2% on ML-45, which improves the previous state-of-the-art by approximately 1.7 times and 1.3 times, respectively. |
| Researcher Affiliation | Academia | Suyoung Lee, Myungsik Cho, Youngchul Sung; School of Electrical Engineering, KAIST, Daejeon 34141, Republic of Korea |
| Pseudocode | Yes | A. Pseudocode; Algorithm 1: Subtask Decomposition and Virtual Training (SDVT) |
| Open Source Code | Yes | Our implementation is available at https://github.com/suyoung-lee/SDVT. |
| Open Datasets | Yes | Meta-World benchmark: The Meta-World V2 benchmark [71] stands as the most prominent, if not the only, established benchmark for assessing meta-RL algorithms featuring non-parametric task variability. [71] T. Yu, D. Quillen, Z. He, R. Julian, A. Narayan, H. Shively, A. Bellathur, K. Hausman, C. Finn, and S. Levine. Meta-world: A benchmark and evaluation for multi-task and meta reinforcement learning. arXiv preprint arXiv:1910.10897, 2021. |
| Dataset Splits | No | The paper describes meta-training and meta-testing tasks, and the structure of meta-episodes, but it does not specify explicit dataset splits (e.g., percentages or counts for training, validation, and test data subsets) within a given task that would be required for reproduction. |
| Hardware Specification | Yes | Our experiments were conducted using an Nvidia TITAN Xp. |
| Software Dependencies | No | The paper names the software it builds on, such as the 'Garage repository [15]' and 'PPO [49]', and states that it uses an 'exact version' of the Garage repository (identified by a pull request URL). However, it gives no explicit version numbers for general dependencies such as Python, PyTorch, or TensorFlow, and no numbered release of the Garage framework itself. |
| Experiment Setup | Yes | Table 3: Hyperparameters of SDVT and SD. Hyperparameters of SDVT used for Meta-World ML-10 and ML-45 along with the notations in the manuscript and the argument names in the source code. |