Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Model-Based Offline Planning
Authors: Arthur Argenson, Gabriel Dulac-Arnold
ICLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the performance of our algorithm, Model-Based Offline Planning (MBOP) on a series of robotics-inspired tasks, and demonstrate its ability to leverage planning to respect environmental constraints. |
| Researcher Affiliation | Industry | Arthur Argenson EMAIL Google Research Gabriel Dulac-Arnold EMAIL Google Research |
| Pseudocode | Yes | Algorithm 1 High-Level MBOP-Policy; Algorithm 2 MBOP-Trajopt |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-sourcing the code for the described methodology. It only mentions that 'Accompanying videos are available here' and 'All non-standard datasets will be available publicly'. |
| Open Datasets | Yes | We use standard datasets from the RL Unplugged (RLU) (Gulcehre et al., 2020) and D4RL (Fu et al., 2020) papers. |
| Dataset Splits | Yes | On all datasets, training is performed on 90% of data and 10% is used for validation. |
| Hardware Specification | Yes | We calculate the average control frequency of MBOP on the RLU Walker task using a single Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz core and a Nvidia 1080TI ... Execution speeds on the RLU Walker task in represented in Table 9. ... on an Tesla P100 using a single core of a Xeon 2200 MHz equivalent processor. |
| Software Dependencies | No | The paper describes the software components (e.g., neural networks) but does not provide specific version numbers for any libraries, frameworks, or environments used. |
| Experiment Setup | Yes | The full set of parameters for each experiment can be found in the Appendix Sec. 5.2. ... # FC Layers : 2 Size FC Layers : 500 # Ensemble Networks : 3 Learning Rate : 0.001 Batch Size : 512 # Epochs : 40 |
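
The Dataset Splits and Experiment Setup rows above can be gathered into one small configuration sketch. The numeric values are the ones quoted in the table; everything else is an assumption made for illustration: the names `TrainConfig` and `split_indices`, the field names, the fixed seed, and the use of a shuffled random split (the quoted text gives the 90%/10% proportions but not how samples are assigned). This is not the authors' code.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    num_fc_layers: int = 2
    fc_layer_size: int = 500
    num_ensemble_networks: int = 3
    learning_rate: float = 0.001
    batch_size: int = 512
    num_epochs: int = 40
    train_fraction: float = 0.9  # 90% training, 10% validation


def split_indices(num_samples, train_fraction, seed=0):
    """Return (train, validation) index lists.

    Assumption: a seeded random shuffle; the source does not say
    how the 90/10 split is drawn.
    """
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    cut = round(num_samples * train_fraction)
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    cfg = TrainConfig()
    train_idx, val_idx = split_indices(1000, cfg.train_fraction)
    print(len(train_idx), len(val_idx))
```

Fixing the seed keeps the validation set identical across runs, which is one way to make a reported split repeatable when the original split procedure is not documented.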