Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought
Authors: Yuki Wang, Gonzalo Gonzalez-Pumariega, Yash Sharma, Sanjiban Choudhury
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive evaluation on various robot task benchmarks, including a novel game benchmark Robotouille, designed to simulate diverse cooking tasks in a kitchen environment. |
| Researcher Affiliation | Academia | Huaxiaoyue Wang (Cornell University, yukiwang@cs.cornell.edu); Gonzalo Gonzalez-Pumariega (Cornell University, gg387@cornell.edu); Yash Sharma (Cornell University, ys749@cornell.edu); Sanjiban Choudhury (Cornell University, sanjibanc@cornell.edu) |
| Pseudocode | Yes | Algorithm 1 Demo2Code: Generating task code from language instructions and demonstrations |
| Open Source Code | Yes | The project's website is at https://portal-cornell.github.io/demo2code/ Codebase is available here: https://github.com/portal-cornell/demo2code |
| Open Datasets | Yes | We introduce a novel, open-source simulator to simulate complex, long-horizon cooking tasks for a robot, e.g. making a burger by cutting lettuces and cooking patties. Unlike existing simulators that focus on simulating physics or sensors, Robotouille focuses on high level task planning and abstracts away other details. We build on a standard backend, PDDLGym [59], with a user-friendly game as the front end to easily collect demonstrations. For the experiment, we create a set of tasks, where each is associated with a set of preferences (e.g. what a user wants in the burger, how the user wants the burger cooked). For each task and each associated preference, we procedurally generate 10 scenarios. Codebase and usage guide for Robotouille is available here: https://github.com/portal-cornell/robotouille |
| Dataset Splits | No | We evaluate the different methods across three metrics. |
| Hardware Specification | No | We use gpt-3.5-turbo-16k for all experiments with temperature 0. |
| Software Dependencies | No | We use gpt-3.5-turbo-16k for all experiments with temperature 0. |
| Experiment Setup | Yes | We use gpt-3.5-turbo-16k for all experiments with temperature 0. |
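
The setup reported in the table (a single model, temperature 0, and 10 procedurally generated scenarios per task and preference) can be captured in a small configuration sketch. The Python below is illustrative only: `ExperimentConfig`, `enumerate_runs`, and the task and preference names are hypothetical and are not taken from the Demo2Code or Robotouille codebases.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Settings quoted in the table; the field names are hypothetical."""
    model: str = "gpt-3.5-turbo-16k"
    temperature: float = 0.0
    scenarios_per_preference: int = 10


def enumerate_runs(
    tasks: Dict[str, List[str]], config: ExperimentConfig
) -> Iterator[Tuple[str, str, int]]:
    """Yield (task, preference, scenario_index) for every evaluation run."""
    for task, preferences in tasks.items():
        scenario_ids = range(config.scenarios_per_preference)
        for preference, idx in product(preferences, scenario_ids):
            yield task, preference, idx


# Hypothetical tasks and preferences, for illustration only.
example_tasks = {
    "make_burger": ["lettuce_on_top", "patty_well_done"],
    "make_sandwich": ["no_tomato"],
}
config = ExperimentConfig()
runs = list(enumerate_runs(example_tasks, config))
```

With three task and preference pairs in the example, the enumeration produces 30 runs, all sharing the same model and temperature settings.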