LLM-based Skill Diffusion for Zero-shot Policy Adaptation

Authors: Woo Kyung Kim, Youngseok Lee, Jooyoung Kim, Honguk Woo

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Through experiments, we demonstrate the zero-shot adaptability of LDuS to various context types including different specification levels, multi-modality, and varied temporal conditions for several robotic manipulation tasks, outperforming other language-conditioned imitation and planning methods."
Researcher Affiliation | Academia | "Woo Kyung Kim1, Youngseok Lee2, Jooyoung Kim1, Honguk Woo1; 1 Department of Computer Science and Engineering, Sungkyunkwan University; 2 Department of Electrical and Computer Engineering, Sungkyunkwan University; {kwk2696,yslee.gs,onsaemiro,hwoo}@skku.edu"
Pseudocode | Yes | "Algorithm 1: Policy adaptation via LLM-guided diffusion" (see Sketch 1 below)
Open Source Code | Yes | "We submit the code and show the details of our implementations in Appendix B."
Open Datasets | Yes | "We use the Meta-World benchmark [39], specifically with 10 different robot manipulation goals." (see Sketch 2 below)
Dataset Splits | No | "For data collection, we emulate rule-based expert policies. For each goal, we collect 60 trajectories, varying the speed of the agent as well as the position and weight of the objects being manipulated." (see Sketch 3 below)
Hardware Specification | Yes | "All experiments are conducted on a system equipped with an Intel(R) Core(TM) i9-10980XE CPU and an NVIDIA RTX A6000 GPU."
Software Dependencies | No | "We implement LCD [3] using the open-source projects JAX and Haiku." (see Sketch 4 below)
Experiment Setup | Yes | "The hyperparameter settings for LDuS are summarized in Table 10."
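
Sketch 1: the paper reports its pseudocode as Algorithm 1 ("Policy adaptation via LLM-guided diffusion"), which is not reproduced on this page. The code below is therefore only a minimal, hypothetical illustration of the general technique: classifier-guidance-style diffusion sampling in which the gradient of an externally supplied loss (standing in for an LLM-generated one) steers each denoising step. The names denoiser, llm_loss, and guidance_scale are illustrative assumptions, not the authors' API.

    import jax
    import jax.numpy as jnp

    def guided_sample(denoiser, params, llm_loss, rng, shape, betas,
                      guidance_scale=1.0):
        # Reverse diffusion (DDPM-style) with gradient guidance from an
        # externally supplied scalar loss, e.g. one written by an LLM.
        alphas = 1.0 - betas
        alpha_bars = jnp.cumprod(alphas)
        x = jax.random.normal(rng, shape)  # start from pure noise
        for t in reversed(range(len(betas))):
            eps = denoiser(params, x, t)   # predicted noise at step t
            # Estimate the clean sample x0 from the current noisy one.
            x0_hat = (x - jnp.sqrt(1.0 - alpha_bars[t]) * eps) / jnp.sqrt(alpha_bars[t])
            # The loss gradient w.r.t. x0_hat nudges the denoising mean.
            grad = jax.grad(llm_loss)(x0_hat)
            mean = (x - betas[t] / jnp.sqrt(1.0 - alpha_bars[t]) * eps) / jnp.sqrt(alphas[t])
            mean = mean - guidance_scale * grad
            rng, sub = jax.random.split(rng)
            noise = jax.random.normal(sub, shape) if t > 0 else jnp.zeros(shape)
            x = mean + jnp.sqrt(betas[t]) * noise
        return x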
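
Sketch 2: the paper uses the Meta-World benchmark with 10 manipulation goals. One plausible way to obtain 10 such goals is the MT10 split of the public metaworld package; the paper's exact task selection and wrappers may differ.

    import random
    import metaworld

    mt10 = metaworld.MT10()  # 10 distinct manipulation goals
    envs = {}
    for name, env_cls in mt10.train_classes.items():
        env = env_cls()
        # Pin a concrete task variant (goal/object positions) for this env.
        task = random.choice([t for t in mt10.train_tasks if t.env_name == name])
        env.set_task(task)
        envs[name] = env

    obs = envs["reach-v2"].reset()  # older Gym-style reset API assumed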
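
Sketch 3: no explicit dataset splits are reported; the data come from rule-based expert policies, 60 trajectories per goal with varied object configurations. A hypothetical collection loop using one of metaworld's scripted policies might look like the following; the authors' experts, agent-speed variation, and randomization scheme are not detailed here, and the older four-tuple Gym step API is assumed.

    from metaworld.policies import SawyerReachV2Policy

    def collect_trajectories(env, tasks, n_traj=60, horizon=500):
        # Roll out a scripted expert, re-sampling the task each episode so
        # object/goal positions vary across the 60 trajectories.
        policy = SawyerReachV2Policy()
        dataset = []
        for i in range(n_traj):
            env.set_task(tasks[i % len(tasks)])
            obs = env.reset()
            traj = []
            for _ in range(horizon):
                action = policy.get_action(obs)
                next_obs, reward, done, info = env.step(action)
                traj.append((obs, action, reward))
                obs = next_obs
                if done or info.get("success", 0.0) > 0.0:
                    break
            dataset.append(traj)
        return dataset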
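
Sketch 4: the dependency list is incomplete, but JAX and Haiku are named for the LCD baseline reimplementation. The snippet below shows only the standard Haiku transform/init/apply pattern with hypothetical layer sizes; it is not the authors' model.

    import haiku as hk
    import jax
    import jax.numpy as jnp

    def forward(x):
        # Hypothetical MLP; the real LCD network architecture differs.
        return hk.nets.MLP([256, 256, 4])(x)

    net = hk.transform(forward)
    rng = jax.random.PRNGKey(0)
    dummy = jnp.zeros((1, 39))  # 39-dim Meta-World V2 observation (assumed)
    params = net.init(rng, dummy)
    out = net.apply(params, rng, dummy)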