Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration
Authors: Alexandre Péré, Sébastien Forestier, Olivier Sigaud, Pierre-Yves Oudeyer
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments where a simulated robot arm interacts with an object, and we show that exploration algorithms using such learned representations can match the performance obtained using engineered representations. |
| Researcher Affiliation | Academia | Alexandre Péré Flowers Team Inria and Ensta-Paris Tech, France alexandre.pere@inria.fr; Sebastien Forestier Flowers Team Inria and Ensta-Paris Tech, France sebastien.forestier@inria.fr; Olivier Sigaud Flowers Team Inria, Ensta-Paris Tech and UPMC, France Olivier.Sigaud@upmc.fr; Pierre-Yves Oudeyer Flowers Team Inria and Ensta-Paris Tech, France pierre-yves.oudeyer@inria.fr |
| Pseudocode | Yes | Algorithmic Architecture 1: Intrinsically Motivated Goal Exploration Process with Unsupervised Goal Representation Learning (IMGEP-UGL); Algorithmic Architecture 2: Intrinsically Motivated Goal Exploration Strategy; Algorithm 3: Random Goal Exploration with Unsupervised Goal Space Learning. (A hedged sketch of this exploration loop is given below the table.) |
| Open Source Code | Yes | The code to reproduce the experiments is available at https://github.com/flowersteam/Unsupervised_Goal_Space_Learning |
| Open Datasets | No | For the UGL phase, we used the following mechanism to generate the distribution of samples x_i: the object was moved randomly uniformly over [-1, 1]² for Arm Ball, and over [-1, 1]² × [0, 2π] for Arm Arrow, and the corresponding images were generated and provided as an observable sample to IMGEP-UGL learners. (The paper describes data generation within a simulated environment rather than using a pre-existing, publicly available dataset with concrete access information; a sketch of this generation procedure is given below the table.) |
| Dataset Splits | No | The paper does not explicitly provide train/validation/test dataset splits with percentages, sample counts, or references to predefined splits for reproducibility, as data is generated in a simulated environment. |
| Hardware Specification | No | No specific hardware (e.g., GPU models, CPU types, memory amounts) used for running the experiments is mentioned. The paper describes a 'simulated robot arm' but no underlying computational hardware. |
| Software Dependencies | No | The paper mentions optimizers like 'Stochastic Gradient Descent (SGD)', 'Adagrad', and 'Adam', but does not provide specific version numbers for these or any other software libraries or frameworks used. |
| Experiment Setup | Yes | Auto-Encoder: The Adagrad optimizer was used, with an initial learning rate of 1e-3 and batches of size 100, until convergence at 2e5 epochs. Variational Auto-Encoder: The Adam optimizer was used, with an initial learning rate of 1e-3 and batches of size 100, until convergence at 1e5 epochs. Radial Flow Variational Auto-Encoder: The Adam optimizer was used, with an initial learning rate of 1e-3 and batches of size 100, until convergence at 5e4 epochs. (These settings are restated in the configuration summary below the table.) |
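
The Pseudocode row references the paper's two-phase architecture: an unsupervised goal-space learning (UGL) phase followed by intrinsically motivated goal exploration. The sketch below is a minimal illustration of that structure, not the authors' implementation: the nearest-neighbour goal-to-policy inversion, the Gaussian parameter mutation, and the helper callables (`sample_observation`, `learn_encoder`, `sample_policy`, `execute_policy`) are assumptions introduced here for illustration.

```python
# Hedged sketch of the IMGEP-UGL loop named in the Pseudocode row
# (Algorithmic Architectures 1-2, Algorithm 3). All helpers are passed in
# as callables; they are placeholders, not the authors' API.
import numpy as np

def imgep_ugl(sample_observation, learn_encoder, sample_policy, execute_policy,
              n_pretrain=10_000, n_exploration=5_000, sigma=0.05, seed=0):
    rng = np.random.default_rng(seed)

    # UGL phase: learn a latent goal space from passively observed images.
    observations = [sample_observation() for _ in range(n_pretrain)]
    encode, latent_dim = learn_encoder(observations)   # e.g. a (RF-)VAE encoder

    # IMGEP phase: random goal exploration in the learned goal space.
    history = []                                       # (policy params, latent outcome)
    for _ in range(n_exploration):
        # Goal-sampling bounds are an assumption; the paper samples goals
        # in the learned latent space.
        goal = rng.uniform(-1.0, 1.0, size=latent_dim)
        if history:
            # Reuse the policy whose past outcome is closest to the sampled
            # goal, perturbed by a small Gaussian mutation.
            theta, _ = min(history, key=lambda h: np.linalg.norm(h[1] - goal))
            theta = theta + sigma * rng.standard_normal(theta.shape)
        else:
            theta = sample_policy()
        outcome_image = execute_policy(theta)
        history.append((theta, encode(outcome_image)))
    return history
```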
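The Open Datasets row quotes the UGL-phase sample-generation mechanism. A minimal sketch of that procedure is given below; the 70×70 resolution and the `render_ball` helper are illustrative assumptions, not details taken from the excerpt, while the uniform sampling ranges follow the quoted text.

```python
# Hedged sketch of the UGL-phase sample generation: object states drawn
# uniformly over [-1, 1]^2 (Arm Ball) or [-1, 1]^2 x [0, 2*pi] (Arm Arrow),
# then rendered to images that serve as observable samples x_i.
import numpy as np

def sample_armball_states(n, seed=0):
    rng = np.random.default_rng(seed)
    return rng.uniform(-1.0, 1.0, size=(n, 2))            # ball position (x, y)

def sample_armarrow_states(n, seed=0):
    rng = np.random.default_rng(seed)
    xy = rng.uniform(-1.0, 1.0, size=(n, 2))               # arrow position
    angle = rng.uniform(0.0, 2.0 * np.pi, size=(n, 1))     # arrow orientation
    return np.concatenate([xy, angle], axis=1)

def render_ball(state, size=70, radius=0.1):
    # Binary image with a disc centred at the ball position (rendering
    # details assumed for illustration).
    grid = np.linspace(-1.0, 1.0, size)
    xs, ys = np.meshgrid(grid, grid)
    return ((xs - state[0]) ** 2 + (ys - state[1]) ** 2 < radius ** 2).astype(np.float32)

# Example: build the distribution of observable samples x_i for the UGL phase.
states = sample_armball_states(10_000)
images = np.stack([render_ball(s) for s in states])        # shape (10000, 70, 70)
```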
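The Experiment Setup row lists the optimizer, learning rate, batch size, and epoch count for each of the three representation learners. The dictionary below simply restates those quoted hyperparameters in code form; the excerpt does not name the framework or model definitions, so none are assumed here.

```python
# Training configurations quoted in the Experiment Setup row, restated as a
# Python dict for reference. Only the stated hyperparameters appear here.
TRAINING_CONFIGS = {
    "AutoEncoder":             {"optimizer": "Adagrad", "lr": 1e-3, "batch_size": 100, "epochs": 200_000},
    "VariationalAutoEncoder":  {"optimizer": "Adam",    "lr": 1e-3, "batch_size": 100, "epochs": 100_000},
    "RadialFlowVAE":           {"optimizer": "Adam",    "lr": 1e-3, "batch_size": 100, "epochs": 50_000},
}
```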