Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

Authors: Alexandre Péré, Sébastien Forestier, Olivier Sigaud, Pierre-Yves Oudeyer

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experiments where a simulated robot arm interacts with an object, and we show that exploration algorithms using such learned representations can match the performance obtained using engineered representations.
Researcher Affiliation | Academia | Alexandre Péré, Flowers Team, Inria and Ensta-ParisTech, France, alexandre.pere@inria.fr; Sebastien Forestier, Flowers Team, Inria and Ensta-ParisTech, France, sebastien.forestier@inria.fr; Olivier Sigaud, Flowers Team, Inria, Ensta-ParisTech and UPMC, France, Olivier.Sigaud@upmc.fr; Pierre-Yves Oudeyer, Flowers Team, Inria and Ensta-ParisTech, France, pierre-yves.oudeyer@inria.fr
Pseudocode | Yes | Algorithmic Architecture 1: Intrinsically Motivated Goal Exploration Process with Unsupervised Goal Representation Learning (IMGEP-UGL); Algorithmic Architecture 2: Intrinsically Motivated Goal Exploration Strategy; Algorithm 3: Random Goal Exploration with Unsupervised Goal Space Learning
Open Source Code | Yes | The code to reproduce the experiments is available at https://github.com/flowersteam/Unsupervised_Goal_Space_Learning
Open Datasets | No | For the UGL phase, we used the following mechanism to generate the distribution of samples x_i: the object was moved randomly uniformly over [-1, 1]^2 for ArmBall, and over [-1, 1]^2 × [0, 2π] for ArmArrow, and the corresponding images were generated and provided as an observable sample to IMGEP-UGL learners. (The paper describes data generation within a simulated environment rather than using a pre-existing, publicly available dataset with concrete access information.)
Dataset Splits | No | The paper does not explicitly provide train/validation/test dataset splits with percentages, sample counts, or references to predefined splits for reproducibility, as data is generated in a simulated environment.
Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, memory amounts) used for running the experiments is mentioned. The paper describes a 'simulated robot arm' but no underlying computational hardware.
Software Dependencies | No | The paper mentions optimizers like 'Stochastic Gradient Descent (SGD)', 'Adagrad', and 'Adam', but does not provide specific version numbers for these or any other software libraries or frameworks used.
Experiment Setup | Yes | Auto-Encoder: The Adagrad optimizer was used, with initial learning rate of 1e-3, with batches of size 100, until convergence at 2e5 epochs. Variational Auto-Encoder: The Adam optimizer was used, with initial learning rate of 1e-3, with batches of size 100, until convergence at 1e5 epochs. Radial Flow Variational Auto-Encoder: The Adam optimizer was used, with initial learning rate of 1e-3, with batches of size 100, until convergence at 5e4 epochs.
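
The Open Datasets and Experiment Setup rows can be read together as a small configuration. The sketch below is illustrative only and is not taken from the authors' repository: the names TrainingConfig, TRAINING_CONFIGS, sample_armball_state, sample_armarrow_state, and sample_states are hypothetical, the learning rate is taken as 1e-3, and the image rendering step that turns each object state into an observation is omitted because the table does not describe it.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Optimizer settings as listed in the Experiment Setup row."""
    optimizer: str
    learning_rate: float
    batch_size: int
    epochs: int


# Values copied from the table; the dictionary keys are illustrative names.
TRAINING_CONFIGS = {
    "auto_encoder": TrainingConfig("Adagrad", 1e-3, 100, 200_000),
    "variational_auto_encoder": TrainingConfig("Adam", 1e-3, 100, 100_000),
    "radial_flow_vae": TrainingConfig("Adam", 1e-3, 100, 50_000),
}


def sample_armball_state(rng):
    """Draw an object position uniformly from [-1, 1]^2."""
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


def sample_armarrow_state(rng):
    """Draw a position from [-1, 1]^2 and an orientation from [0, 2*pi]."""
    x, y = sample_armball_state(rng)
    return (x, y, rng.uniform(0.0, 2.0 * math.pi))


def sample_states(sampler, n, seed=0):
    """Draw n object states with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [sampler(rng) for _ in range(n)]


if __name__ == "__main__":
    for name, cfg in TRAINING_CONFIGS.items():
        print(name, cfg)
    print(sample_states(sample_armarrow_state, 3))
```

In a full pipeline, each sampled state would be rendered to an image before being passed to the representation learner; the seed argument is an added convenience for repeatable runs, not something the table specifies.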