RoboCLIP: One Demonstration is Enough to Learn Robot Policies

Authors: Sumedh Sontakke, Jesse Zhang, Séb Arnold, Karl Pertsch, Erdem Bıyık, Dorsa Sadigh, Chelsea Finn, Laurent Itti

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate RoboCLIP on the Metaworld Environment suite [Yu et al., 2020] and on the Franka Kitchen Environment [Gupta et al., 2019], and find that policies obtained by pretraining on the RoboCLIP reward result in 2-3x higher zero-shot task success in comparison to state-of-the-art imitation learning baselines." Also, from Section 4 (Experiments): "We test out each of the hypotheses defined in Section 1 on simulated robotic environments."
Researcher Affiliation | Collaboration | 1 Thomas Lord Department of Computer Science, University of Southern California; 2 University of California, Berkeley; 3 Stanford University; 4 Google Research
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper mentions "Visit our website for experiment videos," but it does not state that the source code for the methodology is available, nor does it provide a link to a code repository.
Open Datasets | Yes | "We evaluate RoboCLIP on the Metaworld Environment suite [Yu et al., 2020] and on the Franka Kitchen Environment [Gupta et al., 2019]" and "The backbone model used in RoboCLIP is S3D [Xie et al., 2018] trained on the HowTo100M dataset [Miech et al., 2019]." (See the similarity-reward sketch after this table.)
Dataset Splits | No | The paper describes pretraining and finetuning phases and mentions zero-shot evaluation, but it does not specify explicit train, validation, or test splits with percentages or sample counts for its experiments.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU or CPU models, or cloud computing instances) used to run the experiments.
Software Dependencies | No | The paper mentions using PPO [Schulman et al., 2017] but does not provide version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | No | The paper states that agents are trained with PPO but does not provide specific experimental setup details such as hyperparameter values, learning rates, batch sizes, or network configurations. (A sketch of one plausible PPO setup follows below.)
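
The table cites the paper's description of RoboCLIP as a reward derived from a pretrained video-language model (S3D trained on HowTo100M): the agent's episode video is embedded and compared against the embedding of a single demonstration video or a language description of the task. The sketch below illustrates that idea only; the class name, method signatures, and cosine-similarity choice are assumptions for illustration, not the authors' released code.

```python
import numpy as np
import torch
import torch.nn.functional as F


class VideoTextEncoder:
    """Hypothetical interface for a video-language model (e.g., S3D pretrained
    on HowTo100M) that maps frame stacks or text into a shared embedding space."""

    def embed_video(self, frames: np.ndarray) -> torch.Tensor:
        """frames: (T, H, W, 3) uint8 array -> (D,) embedding."""
        raise NotImplementedError

    def embed_text(self, text: str) -> torch.Tensor:
        """text -> (D,) embedding."""
        raise NotImplementedError


def roboclip_style_reward(encoder: VideoTextEncoder,
                          episode_frames: np.ndarray,
                          demo_embedding: torch.Tensor) -> float:
    """Similarity between the agent's episode video and a fixed demonstration
    (or task-description) embedding, used as a scalar terminal reward."""
    z_episode = encoder.embed_video(episode_frames)
    return F.cosine_similarity(z_episode, demo_embedding, dim=-1).item()


# Usage sketch: the demonstration embedding is computed once, from either a
# video demo or a language description of the task (the text below is a
# placeholder, not an example from the paper).
# demo_embedding = encoder.embed_text("robot opening a drawer")
# reward = roboclip_style_reward(encoder, frames_from_last_episode, demo_embedding)
# The episode receives this scalar at its final step and zero at all other steps.
```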
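Because the paper only states that agents are pretrained with PPO on the RoboCLIP reward, without reporting hyperparameters or library versions, the following is a minimal sketch of one way such a setup could look, assuming Gymnasium-style environments and Stable-Baselines3's PPO. The environment id, wrapper design, and all hyperparameter values are placeholders, not values reported by the authors.

```python
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO


class TerminalSimilarityReward(gym.Wrapper):
    """Replaces the environment reward with a RoboCLIP-style terminal reward:
    zero at every step except the last, where the similarity between the
    episode's rendered frames and a fixed demo embedding is returned."""

    def __init__(self, env, reward_fn):
        super().__init__(env)
        self.reward_fn = reward_fn  # maps a (T, H, W, 3) frame stack to a float
        self.frames = []

    def reset(self, **kwargs):
        self.frames = []
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, _, terminated, truncated, info = self.env.step(action)
        self.frames.append(self.env.render())
        reward = 0.0
        if terminated or truncated:
            reward = self.reward_fn(np.stack(self.frames))
        return obs, reward, terminated, truncated, info


# Usage sketch (environment id, reward_fn wiring, and hyperparameters are assumptions):
# env = TerminalSimilarityReward(
#     gym.make("drawer-open-v2", render_mode="rgb_array"),
#     reward_fn=lambda f: roboclip_style_reward(encoder, f, demo_embedding),
# )
# model = PPO("MlpPolicy", env, learning_rate=3e-4, verbose=1)
# model.learn(total_timesteps=1_000_000)
```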