Reconciling Spatial and Temporal Abstractions for Goal Representation

Authors: Mehdi Zadem, Sergio Mover, Sao Mai Nguyen

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate the approach on complex continuous control tasks, demonstrating the effectiveness of spatial and temporal abstractions learned by this approach." "Empirically show that STAR successfully combines both temporal and spatial abstraction for more efficient learning, and that the reachability-aware abstraction scales to tasks with more complex dynamics." (Section 5) |
| Researcher Affiliation | Academia | 1 LIX, Ecole Polytechnique, Institut Polytechnique de Paris, France; 2 CNRS; 3 Flowers Team, U2IS, ENSTA Paris, IP Paris; 4 IMT Atlantique, Lab-STICC, UMR CNRS 6285 |
| Pseudocode | Yes | Algorithm 1: STAR |
| Open Source Code | Yes | Open-source code is available at https://github.com/cosynus-lix/STAR |
| Open Datasets | Yes | "We evaluate our approach on a set of challenging tasks in the Ant environments (Fig. 2) adapted from Duan et al. (2016) and popularised by Nachum et al. (2018). All of the environments use the Mujoco physics simulator (Todorov et al., 2012)." |
| Dataset Splits | No | The paper describes training and evaluation but does not specify exact train/validation/test dataset splits or percentages. |
| Hardware Specification | No | The paper acknowledges "provision of computational resources" but does not give specifics such as GPU/CPU models or memory. |
| Software Dependencies | Yes | All of the environments use the Mujoco physics simulator (Todorov et al., 2012); both the Tutor and Controller use TD3 (Fujimoto et al., 2018) for learning policies; Ai2 (Gehr et al., 2018) is used to compute the output of a neural network given a set of inputs. |
| Experiment Setup | Yes | Table 1 (hyperparameters for the Tutor and Controller networks, based on Zhang et al. (2023)), Table 2 (hyperparameters for the forward model), and Table 3 (hyperparameters for reachability analysis) provide specific values for learning rates, batch sizes, buffer sizes, and other parameters. |