Self-supervised Visual Reinforcement Learning with Object-centric Representations

Authors: Andrii Zadaianchuk, Maximilian Seitzer, Georg Martius

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have done computational experiments to address the following questions: How well does our method scale to challenging tasks with a large number of objects in case when ground-truth representations are provided? How does our method perform compared to prior visual goal-conditioned RL methods on image-based, multi-object continuous control tasks? How suitable are the representations learned by the compositional generative world model for discovering and solving RL tasks?
Researcher Affiliation | Academia | (1) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (2) Department of Computer Science, ETH Zurich
Pseudocode | Yes | Algorithm 1 SMORL: Self-Supervised Multi-Object RL (Training) ... Algorithm 2 SMORL: Self-Supervised Multi-object RL (Training with Details) ... Algorithm 3 SMORL (Evaluation)
Open Source Code | No | Our code, as well as the multi-objects environments will be made public after the paper publication.
Open Datasets | No | To answer these questions, we constructed the Multi-Object Visual Push and Multi-Object Visual Rearrange environments. Both environments are based on MuJoCo (Todorov et al., 2012) and the Multiworld package for image-based continuous control tasks introduced by Nair et al. (2018), and contain a 7-dof Sawyer arm where the agent needs to be controlled to manipulate a variable number of small picks on a table.
Dataset Splits | No | No explicit statement providing specific training/validation/test dataset splits (percentages, sample counts, or predefined split citations) was found.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running experiments were provided.
Software Dependencies | No | The paper mentions software such as PyTorch, MuJoCo, Multiworld, and the Adam optimizer, but does not provide version numbers for these dependencies, which limits reproducibility.
Experiment Setup | Yes | We refer to Table 2 for general hyper-parameters of SMORL and to Table 3 for environment specific hyper-parameters of SMORL.