Learning an Embedding Space for Transferable Robot Skills
Authors: Karol Hausman, Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method on several simulated robotic manipulation tasks. We find that our method allows for discovery of multiple solutions and is capable of learning the minimum number of distinct skills that are necessary to solve a given set of tasks. |
| Researcher Affiliation | Collaboration | Karol Hausman, Department of Computer Science, University of Southern California (hausman@usc.edu); Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller, DeepMind ({springenberg,ziyu,heess,riedmiller}@google.com) |
| Pseudocode | No | The paper describes algorithmic steps in text but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology. |
| Open Datasets | No | The paper describes simulated tasks (point mass, robot manipulation) but does not use or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not explicitly provide training, validation, and test dataset splits needed for reproduction. |
| Hardware Specification | No | The paper mentions '16 asynchronous workers' but does not specify any particular hardware components like CPU/GPU models or memory details. |
| Software Dependencies | No | The paper mentions software like 'neural network function approximators' and 'Adam learning rate' but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Hyperparameters, listed as Spring-wall / L-wall / Rail-push: State dims 27 / 27 / 27; Action dims 7 / 7 / 7; Policy net 200-100 / 200-100 / 200-100; Q function net 200-200 / 200-200 / 200-100; Inference net 200-200 / 200-200 / 200-100; Embedding distribution 3D Gaussian / 3D Gaussian / 3D Gaussian; Minibatch size (per-worker) 32 / 32 / 32; Replay buffer size 1e5 / 1e5 / 1e5; α1 1e3 / 1e3 / 1e3; α2 1e3 / 1e3 / 1e3; α3 1e3 / 1e3 / 1e3; Discount factor (γ) 0.99 / 0.99 / 0.99; Adam learning rate 1e-3 / 1e-3 / 1e-3 |
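
For anyone attempting a reproduction, the hyperparameters in the Experiment Setup row could be transcribed into a small configuration object. The following is only a sketch: the names `TaskConfig`, `parse_layers`, `CONFIGS`, and all field names are hypothetical and do not come from the paper, and the layer specs are assumed to mean hidden-layer widths. The values themselves are copied from the row above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def parse_layers(spec: str) -> Tuple[int, ...]:
    """Turn a layer spec such as "200-100" into (200, 100)."""
    return tuple(int(width) for width in spec.split("-"))


@dataclass(frozen=True)
class TaskConfig:
    # Field names are hypothetical; defaults are the values shared by the tasks.
    state_dims: int = 27
    action_dims: int = 7
    policy_net: Tuple[int, ...] = parse_layers("200-100")
    q_function_net: Tuple[int, ...] = parse_layers("200-200")
    inference_net: Tuple[int, ...] = parse_layers("200-200")
    embedding_dims: int = 3            # "3D Gaussian" embedding distribution
    minibatch_size: int = 32           # per worker
    replay_buffer_size: int = 100_000  # 1e5
    alpha_1: float = 1e3
    alpha_2: float = 1e3
    alpha_3: float = 1e3
    discount: float = 0.99             # gamma
    adam_learning_rate: float = 1e-3


# Rail-push is the only task whose Q function and inference nets differ.
CONFIGS: Dict[str, TaskConfig] = {
    "spring-wall": TaskConfig(),
    "l-wall": TaskConfig(),
    "rail-push": TaskConfig(
        q_function_net=parse_layers("200-100"),
        inference_net=parse_layers("200-100"),
    ),
}
```

Keeping the shared values as defaults makes the single per-task difference (the Rail-push network sizes) easy to spot.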