ABM: Attention before Manipulation
Authors: Fan Zhuo, Ying He, Fei Yu, Pengteng Li, Zheyi Zhao, Xilong Sun
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our method significantly outperforms the baselines in the zero-shot and compositional generalization experiment settings. |
| Researcher Affiliation | Collaboration | 1Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ) 2College of Computer Science and Software Engineering, Shenzhen University, China |
| Pseudocode | No | The paper describes the methodology in text and with diagrams, but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | Visual results are provided at: ABM.github.io. This statement specifies visual results, not source code for the methodology. |
| Open Datasets | Yes | We utilize RLBench tools to generate training datasets with 100 demonstrations per task, as in RVT. For ABM data preprocessing, we employ the frozen, pretrained ViT-L/14@336px CLIP encoder to encode the images from four RGB-D cameras, and obtain patch-level dense features, which are upscaled to match the resolution of the original images captured by the cameras during training (see the feature-extraction sketch below the table). We train our model on 8 RLBench tasks... [James et al., 2020]. RLBench: The Robot Learning Benchmark & Learning Environment. |
| Dataset Splits | Yes | Specifically, there is no overlap between the objects in the training and validation sets, meaning that the objects in the validation set have never been seen by the model during the training process. Evaluations are scored as 0 for failures or 100 for complete successes, and we report average success rates by evaluating the model five times on the same 25 variation episodes per task in the seen, unseen, and compositional generalization evaluations. |
| Hardware Specification | Yes | We use a batch size of 30 to train our model and baseline methods on 6 NVIDIA RTX 3090 GPUs for 80k iterations with the LAMB optimizer... We assess the realtime performance of our model on an NVIDIA RTX 3090... |
| Software Dependencies | No | The paper mentions software tools like CoppeliaSim, PyRep, and PyTorch3D, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We use a batch size of 30 to train our model and baseline methods on 6 NVIDIA RTX 3090 GPUs for 80k iterations with the LAMB optimizer [You et al., 2019] and a learning rate of 0.003. During training, data augmentation involves random translations of point clouds within the range of [±0.125 m, ±0.125 m, ±0.125 m], and random rotations around the yaw axis within the range of ±45° (see the augmentation sketch below the table). |
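
The data-preprocessing step quoted in the Open Datasets row can be sketched as follows. This is not the authors' released code: it uses the HuggingFace CLIP vision backbone as a stand-in for the frozen ViT-L/14@336px encoder, and the 128×128 output resolution is an assumed camera resolution (the paper only states that features are upscaled to the original image size).

```python
# Hedged sketch: patch-level dense CLIP features from a frozen ViT-L/14@336px
# encoder, upsampled to the (assumed) camera resolution.
import torch
import torch.nn.functional as F
from transformers import CLIPVisionModel, CLIPImageProcessor

MODEL_ID = "openai/clip-vit-large-patch14-336"
encoder = CLIPVisionModel.from_pretrained(MODEL_ID).eval()      # kept frozen
processor = CLIPImageProcessor.from_pretrained(MODEL_ID)

@torch.no_grad()
def dense_clip_features(images, out_hw=(128, 128)):
    """images: list of HxWx3 uint8 RGB frames from the RGB-D cameras.
    Returns [B, 1024, out_h, out_w] per-pixel CLIP features."""
    inputs = processor(images=images, return_tensors="pt")
    tokens = encoder(**inputs).last_hidden_state                # [B, 1 + 24*24, 1024]
    patches = tokens[:, 1:, :]                                   # drop the CLS token
    b, n, c = patches.shape
    side = int(n ** 0.5)                                         # 336 / 14 = 24 patches per side
    grid = patches.transpose(1, 2).reshape(b, c, side, side)
    return F.interpolate(grid, size=out_hw, mode="bilinear", align_corners=False)
```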
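The point-cloud augmentation quoted in the Experiment Setup row (uniform translations within ±0.125 m per axis and a uniform yaw rotation within ±45°) could be implemented roughly as below; the function name and tensor layout are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of the described augmentation: random per-axis translation in
# [-0.125 m, 0.125 m] and a random rotation about the yaw (z) axis in [-45, 45] degrees.
import math
import torch

def augment_point_cloud(points: torch.Tensor) -> torch.Tensor:
    """points: [N, 3] xyz coordinates in metres. Returns an augmented copy."""
    shift = (torch.rand(3) * 2 - 1) * 0.125                     # uniform in ±0.125 m
    yaw = (torch.rand(1).item() * 2 - 1) * math.pi / 4          # uniform in ±45 degrees
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    rot_z = torch.tensor([[cos_y, -sin_y, 0.0],
                          [sin_y,  cos_y, 0.0],
                          [0.0,    0.0,   1.0]])
    return points @ rot_z.T + shift
```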