Learning Temporally Abstract World Models without Online Experimentation

Authors: Benjamin Freed, Siddarth Venkatraman, Guillaume Adrien Sartoretti, Jeff Schneider, Howie Choset

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We show that our approach performs comparably to or better than a wide array of state-of-the-art offline RL algorithms on a number of simulated robotics locomotion and manipulation benchmarks, while offering a higher degree of adaptability to new goals." |
| Researcher Affiliation | Academia | "1Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 2Mechanical Engineering Department, National University of Singapore, Singapore." |
| Pseudocode | No | The paper describes algorithms in text (e.g., an "incremental EM-style algorithm") but does not include any clearly labeled "Pseudocode" or "Algorithm" block or figure. |
| Open Source Code | No | The paper contains no explicit code-release statement or repository link for the main methodology. It mentions only "compute_elbo.py (included in the supplemental material)" for a specific example, not the full project. |
| Open Datasets | Yes | "We compare the performance of OPOSM with that of several other offline RL algorithms (...) on multiple tasks from the D4RL benchmark suite (Fu et al., 2020)." |
| Dataset Splits | No | The paper mentions training on the "mixed", "partial", and "complete" datasets from D4RL but does not explicitly specify training, validation, and test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | No specific hardware details, such as GPU/CPU models, memory, or cloud instance types used for the experiments, are mentioned in the paper. |
| Software Dependencies | No | The paper mentions the Adam optimizer but provides no version numbers for any software, libraries, or frameworks (e.g., Python, PyTorch, or CUDA versions). |
| Experiment Setup | Yes | "Table 13. Training hyperparameters for EM skill learning procedure. Table 14. Parameters for Skill-Sequence Planning." |