Know Thyself: Transferable Visual Control Policies Through Robot-Awareness
Authors: Edward S. Hu, Kun Huang, Oleh Rybkin, Dinesh Jayaraman
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on tabletop manipulation tasks with simulated and real robots demonstrate that these plug-in improvements dramatically boost the transferability of visual model-based RL policies, even permitting zero-shot transfer of visual manipulation skills onto new robots. |
| Researcher Affiliation | Academia | GRASP Lab, University of Pennsylvania {hued, huangkun, oleh, dineshj}@seas.upenn.edu |
| Pseudocode | Yes | Algorithm 1 TRAIN, Algorithm 2 TEST, Algorithm 3 PyTorch code for the Lw loss. |
| Open Source Code | Yes | To ensure reproducibility, we will release the codebase that contains our video prediction and control algorithms, as well as weights for our trained models. and See the website for the code https://www.seas.upenn.edu/~hued/rac |
| Open Datasets | Yes | Models were pretrained for 150,000 gradient steps on the RoboNet dataset with the Adam optimizer... and For datasets, we will release our subset of RoboNet labeled with robot masks, and our WidowX video dataset. |
| Dataset Splits | No | The paper describes training and evaluation procedures, including details about sequences and evaluation metrics, but it does not explicitly provide percentages or counts for training, validation, and test dataset splits for its experiments. For instance, it mentions 'train on 10k trajectories' and 'evaluate on 1000 trajectories' but these are dataset sizes rather than formal train/validation/test splits. |
| Hardware Specification | No | The paper does not specify the computational hardware (e.g., specific GPU or CPU models, memory) used for running the experiments or training the models. |
| Software Dependencies | No | The paper mentions software like 'PyRobot', the 'MoveIt! motion planning ROS package', 'MuJoCo', 'Label Studio', and 'PyTorch', but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | Models were pretrained for 150,000 gradient steps on the RoboNet dataset with the Adam optimizer, learning rate 3e-4, and batch size of 16. ... For fine-tuning, all models were trained on the fine-tune dataset for 10,000 gradient steps with a learning rate of 1e-4 and batch size of 10... and For the few-shot WidowX200 experiment, the CEM action selection generates 300 action trajectories of length 5, selects the top 10 sequences, and optimizes the distribution for 10 iterations. For the zero-shot Franka experiment, the CEM action selection generates 100 action trajectories of length 5, selects the top 10 sequences, and optimizes the distribution for 3 iterations. |
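The CEM action-selection procedure quoted in the last row (sample a population of action trajectories, keep the top sequences, refit the sampling distribution, repeat) can be sketched as follows. This is a minimal generic cross-entropy-method planner using the reported hyperparameters for the few-shot WidowX200 setting; the cost function, action dimensionality, and Gaussian parameterization are assumptions, not the paper's implementation:

```python
import numpy as np

def cem_plan(cost_fn, horizon=5, action_dim=4, pop_size=300,
             elite_size=10, iters=10):
    """Cross-entropy method action selection (generic sketch).

    Samples `pop_size` action trajectories of length `horizon`, keeps the
    `elite_size` lowest-cost sequences, refits a diagonal Gaussian to the
    elites, and repeats for `iters` iterations. `cost_fn` maps a batch of
    trajectories with shape (pop_size, horizon, action_dim) to a cost per
    trajectory (lower is better); in the paper this would come from rolling
    out the learned video-prediction model, which is omitted here.
    """
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(iters):
        # Sample a population of trajectories from the current Gaussian.
        samples = mean + std * np.random.randn(pop_size, horizon, action_dim)
        costs = cost_fn(samples)
        # Keep the elite_size lowest-cost trajectories and refit.
        elites = samples[np.argsort(costs)[:elite_size]]
        mean, std = elites.mean(axis=0), elites.std(axis=0)
    return mean
```

The zero-shot Franka setting quoted above would use `pop_size=100` and `iters=3` instead; in practice such planners are typically re-run at every control step (model-predictive control) rather than executed open-loop.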