Vision-Based Manipulators Need to Also See from Their Hands

Authors: Kyle Hsu, Moo Jin Kim, Rafael Rafailov, Jiajun Wu, Chelsea Finn

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study how the choice of visual perspective affects learning and generalization in the context of physical manipulation from raw sensor observations... We first perform a head-to-head comparison between hand-centric and third-person perspectives in a grasping task that features three kinds of distribution shifts. We find that using the hand-centric perspective, with no other algorithmic modifications, reduces aggregate out-of-distribution failure rate by 92%, 99%, and 100% (relative) in the imitation learning, reinforcement learning, and adversarial imitation learning settings in simulation, and by 45% (relative) in the imitation learning setting on a real robot apparatus.
Researcher Affiliation | Academia | Kyle Hsu, Moo Jin Kim, Rafael Rafailov, Jiajun Wu, Chelsea Finn; Stanford University; {kylehsu,moojink,rafailov,jiajunwu,cbfinn}@cs.stanford.edu
Pseudocode | No | The paper describes algorithms such as DAgger, DrQ, and DAC, and their modifications, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Project website: https://sites.google.com/view/seeing-from-hands. ... Separately, we have included links to code used for our simulation experiments on our project website.
Open Datasets | Yes | We first consider a visuomotor grasping task instantiated in the PyBullet physics engine (Coumans & Bai, 2016–2021)... adapted from the Meta-World benchmark (Yu et al., 2020)... Table textures are from the Describable Textures Dataset (DTD) (Cimpoi et al., 2014). An illustrative camera-setup sketch for these simulated viewpoints follows this table.
Dataset Splits | No | While the paper mentions using a "validation sample from the test distribution" for tuning regularization weights, it does not specify the size or exact methodology of this validation sample, nor does it describe a conventional split of a single dataset into training, validation, and test sets.
Hardware Specification | No | The paper names the robot manipulators used in the experiments (Franka Emika Panda, Sawyer) and the simulation environment (PyBullet), but it does not specify the computing hardware (e.g., CPU or GPU models, memory) used for training models or running simulations.
Software Dependencies | No | The paper references specific software such as the PyBullet physics engine (Coumans & Bai, 2016–2021) and algorithm implementations (e.g., DrQ-v2 (Yarats et al., 2021)), but it does not list ancillary software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | The DAgger, DrQ, and DAC hyperparameters used in the cube grasping experiments are listed in Tables 5, 6, and 7, respectively. ... We present the DrQ-v2 hyperparameters used in the Meta-World experiments in Table 10.
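
Illustrative note (not code from the paper or its released repository): the comparison assessed above contrasts observations rendered from a hand-centric (wrist-mounted) camera with those from a fixed third-person camera in PyBullet. The sketch below shows one plausible way to set up both viewpoints with PyBullet's camera API. The robot handle robot_id, the end-effector link index EE_LINK, the 84x84 resolution, the camera poses, and the gripper look-direction convention are all assumptions made for illustration.

    # Minimal sketch, assuming PyBullet is already connected and a manipulator is loaded.
    # `robot_id` and `EE_LINK` are hypothetical placeholders, not names from the paper's code.
    import numpy as np
    import pybullet as p

    IMG_W, IMG_H = 84, 84  # assumed observation resolution

    def third_person_view():
        # Fixed camera looking at the workspace from the side (illustrative pose).
        return p.computeViewMatrix(cameraEyePosition=[0.8, 0.0, 0.6],
                                   cameraTargetPosition=[0.4, 0.0, 0.1],
                                   cameraUpVector=[0, 0, 1])

    def hand_centric_view(robot_id, ee_link):
        # Camera rigidly attached to the end-effector link, looking along its local z axis
        # (the look direction is an assumed convention; it depends on the gripper model).
        pos, orn = p.getLinkState(robot_id, ee_link)[:2]
        rot = np.array(p.getMatrixFromQuaternion(orn)).reshape(3, 3)
        forward = rot @ np.array([0.0, 0.0, 1.0])
        up = rot @ np.array([0.0, 1.0, 0.0])
        eye = np.array(pos)
        return p.computeViewMatrix(eye.tolist(), (eye + 0.1 * forward).tolist(), up.tolist())

    def render(view_matrix):
        # Render an RGB observation from the given viewpoint.
        proj = p.computeProjectionMatrixFOV(fov=60, aspect=IMG_W / IMG_H,
                                            nearVal=0.01, farVal=2.0)
        _, _, rgb, _, _ = p.getCameraImage(IMG_W, IMG_H, view_matrix, proj)
        return np.reshape(rgb, (IMG_H, IMG_W, 4))[:, :, :3]  # drop the alpha channel

Under these assumptions, render(hand_centric_view(robot_id, EE_LINK)) and render(third_person_view()) would yield the two observation streams whose effect on out-of-distribution failure rates the paper compares.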