Learning an Embedding Space for Transferable Robot Skills

Authors: Karol Hausman, Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our method on several simulated robotic manipulation tasks. We find that our method allows for discovery of multiple solutions and is capable of learning the minimum number of distinct skills that are necessary to solve a given set of tasks.
Researcher Affiliation | Collaboration | Karol Hausman, Department of Computer Science, University of Southern California, hausman@usc.edu; Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller, DeepMind, {springenberg,ziyu,heess,riedmiller}@google.com
Pseudocode | No | The paper describes algorithmic steps in text but does not include explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology.
Open Datasets | No | The paper describes simulated tasks (point mass, robot manipulation) but does not use or provide access information for a publicly available or open dataset.
Dataset Splits | No | The paper does not explicitly provide training, validation, and test dataset splits needed for reproduction.
Hardware Specification | No | The paper mentions '16 asynchronous workers' but does not specify any particular hardware components like CPU/GPU models or memory details.
Software Dependencies | No | The paper mentions software like 'neural network function approximators' and 'Adam learning rate' but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | Hyperparameters per task:

Hyperparameter | Spring-wall | L-wall | Rail-push
State dims | 27 | 27 | 27
Action dims | 7 | 7 | 7
Policy net | 200-100 | 200-100 | 200-100
Q function net | 200-200 | 200-200 | 200-100
Inference net | 200-200 | 200-200 | 200-100
Embedding distribution | 3D Gaussian | 3D Gaussian | 3D Gaussian
Minibatch size (per-worker) | 32 | 32 | 32
Replay buffer size | 1e5 | 1e5 | 1e5
α1 | 1e3 | 1e3 | 1e3
α2 | 1e3 | 1e3 | 1e3
α3 | 1e3 | 1e3 | 1e3
Discount factor (γ) | 0.99 | 0.99 | 0.99
Adam learning rate | 1e-3 | 1e-3 | 1e-3
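
For anyone attempting a reproduction, the hyperparameters listed above could be captured in a small configuration object. The following is only a sketch, not code from the paper: the names `TaskConfig`, `CONFIGS`, and `differing_fields` are hypothetical, and reading entries such as "200-100" as hidden-layer widths is an assumption.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass(frozen=True)
class TaskConfig:
    """Hyperparameters for one task, copied from the listing above.

    Hypothetical container; field names are illustrative only.
    """
    state_dims: int = 27
    action_dims: int = 7
    # Assumption: "200-100" denotes two hidden layers of width 200 and 100.
    policy_net: Tuple[int, ...] = (200, 100)
    q_function_net: Tuple[int, ...] = (200, 200)
    inference_net: Tuple[int, ...] = (200, 200)
    embedding_dims: int = 3            # 3D Gaussian embedding distribution
    minibatch_size: int = 32           # per worker
    replay_buffer_size: int = 100_000  # 1e5
    alpha_1: float = 1e3
    alpha_2: float = 1e3
    alpha_3: float = 1e3
    discount: float = 0.99             # gamma
    adam_learning_rate: float = 1e-3


CONFIGS = {
    "spring-wall": TaskConfig(),
    "l-wall": TaskConfig(),
    # Rail-push differs only in the Q-function and inference network sizes.
    "rail-push": TaskConfig(q_function_net=(200, 100),
                            inference_net=(200, 100)),
}


def differing_fields(a: TaskConfig, b: TaskConfig) -> List[str]:
    """Return the sorted names of fields whose values differ between a and b."""
    return sorted(f.name for f in fields(a)
                  if getattr(a, f.name) != getattr(b, f.name))
```

Calling `differing_fields(CONFIGS["spring-wall"], CONFIGS["rail-push"])` makes the one per-task difference in the listing explicit, which may help when checking a reimplementation against the reported values.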