Model-Based Offline Planning

Authors: Arthur Argenson, Gabriel Dulac-Arnold

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the performance of our algorithm, Model-Based Offline Planning (MBOP), on a series of robotics-inspired tasks, and demonstrate its ability to leverage planning to respect environmental constraints.
Researcher Affiliation | Industry | Arthur Argenson (aarg@google.com), Google Research; Gabriel Dulac-Arnold (dulacarnold@google.com), Google Research
Pseudocode | Yes | Algorithm 1 High-Level MBOP-Policy; Algorithm 2 MBOP-Trajopt
Open Source Code | No | The paper does not provide an explicit statement or link for open-sourcing the code for the described methodology. It only mentions that 'Accompanying videos are available here' and 'All non-standard datasets will be available publicly'.
Open Datasets | Yes | We use standard datasets from the RL Unplugged (RLU) (Gulcehre et al., 2020) and D4RL (Fu et al., 2020) papers.
Dataset Splits | Yes | On all datasets, training is performed on 90% of data and 10% is used for validation.
Hardware Specification | Yes | We calculate the average control frequency of MBOP on the RLU Walker task using a single Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz core and a Nvidia 1080TI ... Execution speeds on the RLU Walker task are represented in Table 9. ... on a Tesla P100 using a single core of a Xeon 2200 MHz equivalent processor.
Software Dependencies | No | The paper describes the software components (e.g., neural networks) but does not provide specific version numbers for any libraries, frameworks, or environments used.
Experiment Setup | Yes | The full set of parameters for each experiment can be found in the Appendix Sec. 5.2. ... # FC Layers: 2; Size FC Layers: 500; # Ensemble Networks: 3; Learning Rate: 0.001; Batch Size: 512; # Epochs: 40
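
The Dataset Splits and Experiment Setup entries can be made concrete with a small sketch. The Python below is not the authors' code (none is released, per the Open Source Code entry); it only records the quoted hyperparameters in a hypothetical `TrainConfig` container and shows one way a 90%/10% train/validation split could be done. The field names, the shuffle before splitting, the fixed seed, and the placeholder dataset are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values quoted under Experiment Setup."""
    num_fc_layers: int = 2          # "# FC Layers : 2"
    fc_layer_size: int = 500        # "Size FC Layers : 500"
    num_ensemble_networks: int = 3  # "# Ensemble Networks : 3"
    learning_rate: float = 0.001    # "Learning Rate : 0.001"
    batch_size: int = 512           # "Batch Size : 512"
    num_epochs: int = 40            # "# Epochs : 40"


def train_validation_split(transitions, train_fraction=0.9, seed=0):
    """Shuffle the transitions and split them into train/validation lists.

    Assumption: the source states only the 90%/10% proportions; shuffling
    and the fixed seed are choices made for this sketch.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(transitions)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def batches_per_epoch(num_train, batch_size):
    """Minibatches needed to cover the training set once (ceiling division)."""
    return -(-num_train // batch_size)


config = TrainConfig()
dataset = list(range(10_000))  # placeholder stand-in for logged transitions
train, valid = train_validation_split(dataset)
steps = batches_per_epoch(len(train), config.batch_size) * config.num_epochs
```

With the placeholder dataset of 10,000 items, the split yields 9,000 training and 1,000 validation items, which at a batch size of 512 over 40 epochs corresponds to 720 gradient steps per network. These counts describe the placeholder only, not any dataset used in the paper.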