SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Authors: Junjie Zhang, Chenjia Bai, Haoran He, Zhigang Wang, Bin Zhao, Xiu Li, Xuelong Li
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results from various instruction-following tasks demonstrate that SAM-E achieves superior performance with higher execution efficiency compared to the baselines, and also significantly improves generalization in few-shot adaptation to new tasks. |
| Researcher Affiliation | Collaboration | (1) Tsinghua Shenzhen International Graduate School, Tsinghua University; (2) Shanghai Artificial Intelligence Laboratory; (3) Hong Kong University of Science and Technology; (4) Institute of Artificial Intelligence (Tele AI), China Telecom, P. R. China. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It only provides a link to videos of the experiments: "The Videos are available at: https://sam-embodied.github.io/." |
| Open Datasets | Yes | In this section, we evaluate SAM-E in RLBench (James et al., 2020), which is a challenging multi-task 3D manipulation benchmark. We perform experiments in RLBench (James et al., 2020), which is simulated by CoppeliaSim (Rohmer et al., 2013)... |
| Dataset Splits | No | The paper states using "100 expert demonstrations per task" for training and evaluates on "249 variations" and "6 new tasks" for generalization, but does not specify explicit dataset split percentages (e.g., train/validation/test percentages or counts for each split) or refer to predefined splits from RLBench with a citation for the split methodology itself within the paper. |
| Hardware Specification | No | The paper mentions the simulation environment (CoppeliaSim) and the robot model (Franka Panda) used in both simulation and real-world experiments, but it does not specify any computational hardware details such as GPU models, CPU types, or memory used for training or inference. |
| Software Dependencies | No | The paper mentions 'CoppeliaSim' as the simulation environment but does not provide specific version numbers for this or any other software dependencies such as programming languages or libraries. |
| Experiment Setup | Yes | In our experiments, the hyperparameters are primarily fixed, as shown in Table 8. Table 8: Training Hyperparameters - batch size 10, learning rate 4e-3, optimizer LAMB, learning rate schedule cosine decay, warmup steps 2000, training steps 60K, training epochs 15. |
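
The hyperparameters in the Experiment Setup row can be written down as a small configuration, together with a rough sketch of what a warmup plus cosine-decay schedule with those values might look like. The numbers come from the row above; everything else is an assumption for illustration only: the names `HPARAMS` and `lr_at_step` are hypothetical, the warmup is assumed to be linear, and the decay is assumed to end at zero. The paper does not release code, so this is not the authors' implementation.

```python
import math

# Values as listed in the Experiment Setup row (Table 8 of the paper).
HPARAMS = {
    "batch_size": 10,
    "learning_rate": 4e-3,
    "optimizer": "LAMB",
    "lr_schedule": "cosine decay",
    "warmup_steps": 2000,
    "training_steps": 60_000,
    "training_epochs": 15,
}


def lr_at_step(step, hparams=HPARAMS):
    """Hypothetical schedule: linear warmup, then cosine decay to zero.

    The linear warmup shape and the zero floor are assumptions; the
    source only states "cosine decay" and "warmup steps 2000".
    """
    peak = hparams["learning_rate"]
    warmup = hparams["warmup_steps"]
    total = hparams["training_steps"]
    if step < warmup:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak * step / warmup
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min((step - warmup) / max(1, total - warmup), 1.0)
    return 0.5 * peak * (1.0 + math.cos(math.pi * progress))


# Steps per epoch implied by the two stated totals, assuming they
# describe the same training run.
steps_per_epoch = HPARAMS["training_steps"] // HPARAMS["training_epochs"]

if __name__ == "__main__":
    for s in (0, 1000, 2000, 31_000, 60_000):
        print(s, lr_at_step(s))
    print("steps per epoch:", steps_per_epoch)
```

Under these assumptions the learning rate peaks at the stated 4e-3 once warmup ends at step 2000 and falls to half of that midway through the decay phase. The optimizer itself is left as a string because LAMB is not part of the standard library and the paper does not name the package it used.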