Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
Authors: Renhao Wang, Jiayuan Mao, Joy Hsu, Hang Zhao, Jiajun Wu, Yang Gao
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we will show that the modularity of PROGRAMPORT enables better vision-language grounding and generalization, which empirically translates to strong zero-shot generalization (i.e., understanding instructions with unseen visual concepts) and compositional generalization (i.e., understanding novel combinations of previously seen visual and action concepts). We also provide validation of our approach on real-world experiments in Appendix A.4. |
| Researcher Affiliation | Collaboration | Renhao Wang¹, Jiayuan Mao², Joy Hsu³, Hang Zhao¹˒⁴˒⁵, Jiajun Wu³, Yang Gao¹˒⁴˒⁵ — ¹Tsinghua University, ²MIT, ³Stanford University, ⁴Shanghai Artificial Intelligence Laboratory, ⁵Shanghai Qi Zhi Institute |
| Pseudocode | No | The paper contains architectural diagrams and mathematical equations but no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project webpage at: https://progport.github.io. Our code will also be publicly available. |
| Open Datasets | Yes | Our dataset extends the benchmark proposed by CLIPORT (Shridhar et al., 2022a), in turn based on the Ravens benchmark from Zeng et al. (2020). |
| Dataset Splits | Yes | We train for 200k iterations with AdamW and a learning rate of 0.0002. We use either n = 100 or n = 1000 training demonstrations, and evaluate on 100 test demonstrations. (From Appendix A.4:) We train on 30 demonstrations, and hold out the other 10 demonstrations for validation. |
| Hardware Specification | No | The paper mentions robotic platforms (Universal Robot UR5e, Franka Panda robot arm) and a simulation environment (PyBullet), but does not specify the computational hardware (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions software components like PyBullet, CLIP, MASKCLIP, and AdamW, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We train for 200k iterations with AdamW and a learning rate of 0.0002. We use either n = 100 or n = 1000 training demonstrations. |
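The reported optimization setup (AdamW, learning rate 0.0002, 200k iterations) can be illustrated with a minimal, self-contained sketch. Only the learning rate and iteration count come from the paper; the beta, epsilon, and weight-decay values below are standard AdamW defaults assumed for illustration, and the scalar objective is a toy stand-in for the actual training loss.

```python
def adamw_step(theta, grad, state, lr=2e-4, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One scalar AdamW update with decoupled weight decay.

    lr = 2e-4 matches the paper; the other hyperparameters are
    common defaults assumed here, not reported values.
    """
    state["t"] += 1
    t = state["t"]
    # Exponential moving averages of the gradient and its square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** t)  # bias correction
    v_hat = state["v"] / (1 - beta2 ** t)
    # Weight decay is applied directly to the parameter (decoupled).
    return theta - lr * (m_hat / (v_hat ** 0.5 + eps) + weight_decay * theta)

# Toy loop mirroring the schedule's shape (the paper runs 200k iterations;
# 5 here just to show the structure).
theta = 1.0
state = {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(5):
    grad = 2 * theta  # gradient of the toy loss f(theta) = theta**2
    theta = adamw_step(theta, grad, state)
```

With a positive gradient, each step moves the parameter down by roughly `lr` (the bias-corrected ratio `m_hat / sqrt(v_hat)` is close to the gradient's sign on early steps), which is why small learning rates like 2e-4 pair naturally with long schedules like 200k iterations.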