SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

Authors: Junjie Zhang, Chenjia Bai, Haoran He, Zhigang Wang, Bin Zhao, Xiu Li, Xuelong Li

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results from various instruction-following tasks demonstrate that SAM-E achieves superior performance with higher execution efficiency compared to the baselines, and also significantly improves generalization in few-shot adaptation to new tasks.
Researcher Affiliation | Collaboration | 1 Tsinghua Shenzhen International Graduate School, Tsinghua University; 2 Shanghai Artificial Intelligence Laboratory; 3 Hong Kong University of Science and Technology; 4 Institute of Artificial Intelligence (TeleAI), China Telecom, P. R. China.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide access to source code for the described method. It only links to videos of the experiments: "The Videos are available at: https://sam-embodied.github.io/."
Open Datasets | Yes | "In this section, we evaluate SAM-E in RLBench (James et al., 2020), which is a challenging multi-task 3D manipulation benchmark. We perform experiments in RLBench (James et al., 2020), which is simulated by Coppelia Sim (Rohmer et al., 2013)..."
Dataset Splits | No | The paper reports using "100 expert demonstrations per task" for training and evaluating generalization on "249 variations" and "6 new tasks", but it does not specify explicit train/validation/test splits (percentages or counts) and does not cite a predefined RLBench split methodology.
Hardware Specification | No | The paper mentions the simulation environment (Coppelia Sim) and the robot model (Franka Panda) used in both simulation and real-world experiments, but it does not specify computational hardware such as GPU models, CPU types, or memory used for training or inference.
Software Dependencies | No | The paper mentions Coppelia Sim as the simulation environment but does not provide version numbers for it or for any other software dependencies, such as programming languages or libraries.
Experiment Setup | Yes | "In our experiments, the hyperparameters are primarily fixed, as shown in Table 8." Table 8 (Training Hyperparameters): batch size 10; learning rate 4e-3; optimizer LAMB; learning rate schedule cosine decay; warmup steps 2000; training steps 60K; training epochs 15.
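The reported hyperparameters can be collected into a small config sketch. The paper only states "cosine decay" with 2000 warmup steps; the exact schedule shape below (linear warmup, decay to zero) is an assumption for illustration, not the authors' released code.

```python
import math

# Hyperparameters as reported in Table 8 of the paper.
HYPERPARAMS = {
    "batch_size": 10,
    "learning_rate": 4e-3,
    "optimizer": "LAMB",
    "lr_schedule": "cosine decay",
    "warmup_steps": 2000,
    "training_steps": 60_000,
    "training_epochs": 15,
}

def lr_at_step(step, base_lr=4e-3, warmup_steps=2000, total_steps=60_000):
    """Learning rate at a given training step.

    Assumed shape: linear warmup to base_lr over `warmup_steps`,
    then cosine decay to zero at `total_steps`. The paper specifies
    only "cosine decay" and the warmup length, not these details.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, the learning rate peaks at 4e-3 right after step 2000 and reaches zero at step 60K, matching the reported training length.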