LLM-based Skill Diffusion for Zero-shot Policy Adaptation
Authors: Woo Kyung Kim, Youngseok Lee, Jooyoung Kim, Honguk Woo
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we demonstrate the zero-shot adaptability of LDuS to various context types including different specification levels, multi-modality, and varied temporal conditions for several robotic manipulation tasks, outperforming other language-conditioned imitation and planning methods. |
| Researcher Affiliation | Academia | Woo Kyung Kim1, Youngseok Lee2, Jooyoung Kim1, Honguk Woo1 1 Department of Computer Science and Engineering, Sungkyunkwan University 2 Department of Electrical and Computer Engineering, Sungkyunkwan University {kwk2696,yslee.gs,onsaemiro,hwoo}@skku.edu |
| Pseudocode | Yes | Algorithm 1: Policy adaptation via LLM-guided diffusion (see the illustrative sketch after this table) |
| Open Source Code | Yes | We submit the code and show the details of our implementation in Appendix B. |
| Open Datasets | Yes | We use the Meta World benchmark [39], specifically with 10 different robot manipulation goals. |
| Dataset Splits | No | For data collection, we emulate rule-based expert policies. For each goal, we collect 60 trajectories, varying the speed of the agent as well as the position and weight of the objects being manipulated. |
| Hardware Specification | Yes | All experiments are conducted on a system equipped with an Intel(R) Core(TM) i9-10980XE CPU and an NVIDIA RTX A6000 GPU. |
| Software Dependencies | No | We implement LCD [3] using the open source projects JAX and Haiku. |
| Experiment Setup | Yes | The hyperparameter settings for LDuS are summarized in Table 10. |
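The Pseudocode row above cites Algorithm 1, policy adaptation via LLM-guided diffusion. As a rough illustration of that idea only, the sketch below runs a classifier-style guided reverse-diffusion loop in JAX, where a cost function standing in for LLM-generated guidance steers trajectory denoising at sampling time. Every name here (`denoise_fn`, `llm_cost`, `GUIDE_SCALE`, the dimensions and step count) is a hypothetical placeholder under assumed conventions, not the paper's actual implementation.

```python
# Minimal sketch of guidance-steered diffusion sampling, in the spirit of
# Algorithm 1. All names and constants are assumptions, not the paper's API.
import jax
import jax.numpy as jnp

HORIZON, ACT_DIM = 16, 4  # trajectory length and action dimension (assumed)
STEPS = 50                # number of reverse-diffusion steps (assumed)
GUIDE_SCALE = 0.1         # strength of the LLM-derived guidance (assumed)

def denoise_fn(x, t):
    """Stand-in for a trained diffusion skill policy's denoising step."""
    return 0.1 * x  # placeholder; a real model would be a learned network

def llm_cost(traj):
    """Stand-in for a cost function synthesized by an LLM from the language
    context (e.g., 'move slowly'); here it simply penalizes large actions."""
    return jnp.mean(traj ** 2)

def sample_guided_trajectory(key):
    # Start from Gaussian noise over the whole action trajectory.
    x = jax.random.normal(key, (HORIZON, ACT_DIM))
    cost_grad = jax.grad(llm_cost)
    for t in range(STEPS, 0, -1):
        # Plain denoising update (noise-schedule constants omitted for brevity).
        x = x - denoise_fn(x, t)
        # Classifier-style guidance: nudge the sample down the gradient of the
        # LLM-generated cost so the trajectory better satisfies the context.
        x = x - GUIDE_SCALE * cost_grad(x)
    return x

traj = sample_guided_trajectory(jax.random.PRNGKey(0))
print(traj.shape)  # (16, 4)
```

The point of the sketch is that guidance enters only at sampling time, which is what makes such adaptation zero-shot: the pretrained denoiser itself is never fine-tuned.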