Learning Robot Skills with Temporal Variational Inference
Authors: Tanmay Shankar, Abhinav Gupta
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the ability of our framework to learn such options across three robotic demonstration datasets, and provide our code. We evaluate our approach's ability to learn options across three datasets, and demonstrate that our approach can learn a meaningful space of options that correspond with traditional skills in manipulation, visualized at https://sites.google.com/view/learning-causalskills/home. We quantify the effectiveness of our policies in solving downstream tasks, evaluated on a suite of tasks. |
| Researcher Affiliation | Industry | 1Facebook AI Research, Pittsburgh, PA, USA. |
| Pseudocode | Yes | Algorithm 1 Trajectory Generation Process with Options; Algorithm 2 Temporal Variational Inference for Learning Skills |
| Open Source Code | Yes | We demonstrate the ability of our framework to learn such options across three robotic demonstration datasets, and provide our code (github.com/facebookresearch/CausalSkillLearning). |
| Open Datasets | Yes | MIME Dataset (Sharma et al., 2018); Roboturk Dataset (Mandlekar et al., 2018); CMU Mocap Dataset (CMU, 2002) |
| Dataset Splits | No | For each dataset, we set aside 500 randomly sampled trajectories that serve as our test set for our experiments in section 4.2. The remaining trajectories serve as the respective training sets. The paper does not explicitly mention a separate validation split. |
| Hardware Specification | No | The paper states: "The RL based approaches are trained with DDPG with the same exploration processes and hyperparameters (such as initializations of the networks, learning rates used, etc.), as noted in the supplementary." However, it does not provide any specific details about the hardware used (e.g., GPU models, CPU types) in the main paper. |
| Software Dependencies | No | The paper mentions using LSTMs and DDPG but does not specify any software names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x) for its implementation or experiments in the main text. |
| Experiment Setup | Yes | We parameterize each of the policies π and η as LSTMs (Hochreiter & Schmidhuber, 1997), with 8 layers and 128 hidden units per layer. All baseline policies are implemented as 8 layer LSTMs with 128 hidden units, for direct comparison with our policies. |
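The architecture quoted in the Experiment Setup row (8-layer LSTMs with 128 hidden units per layer, used for both the authors' policies and the baselines) can be sketched as follows. This is a hypothetical reconstruction, not the authors' released code: the framework choice (PyTorch), class name, and the state/action dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LSTMPolicy(nn.Module):
    """Illustrative sketch of the policy parameterization described in the
    paper: an 8-layer LSTM with 128 hidden units per layer, followed by a
    linear output head. The state and action dimensions below are assumed
    placeholders, not values from the paper."""

    def __init__(self, state_dim=16, action_dim=7, hidden_size=128, num_layers=8):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, action_dim)

    def forward(self, states, hidden=None):
        # states: (batch, timesteps, state_dim)
        out, hidden = self.lstm(states, hidden)
        return self.head(out), hidden

policy = LSTMPolicy()
# One 50-step demonstration trajectory with a 16-dimensional state.
actions, _ = policy(torch.randn(1, 50, 16))
print(actions.shape)  # torch.Size([1, 50, 7])
```

The same module shape would serve for both the learned option policies and the baseline policies, consistent with the paper's note that baselines use identical architectures for direct comparison.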