Learning Rational Subgoals from Demonstrations and Instructions
Authors: Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation shows that our model clearly outperforms baselines on planning tasks where the agent needs to generate trajectories to accomplish a given task. |
| Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology 2 Stanford University |
| Pseudocode | Yes | Algorithm 1: Overview of the training paradigm in pseudocode. Algorithm 2: Overview of the search algorithm given only the final goal. |
| Open Source Code | No | Project page: https://rsg.csail.mit.edu (The project page states: 'The code will be publicly released soon.') |
| Open Datasets | Yes | We evaluate RSGs in Crafting World (Chen, Gupta, and Marino 2021), an image-based grid-world domain with a rich set of object crafting tasks, and Playroom (Konidaris, Kaelbling, and Lozano-Perez 2018), a 2D continuous domain with geometric constraints. |
| Dataset Splits | No | The paper mentions 'compositional split' and 'novel split' for training and testing, but does not explicitly describe a separate validation split or its purpose. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., library or solver names with version numbers) required to replicate the experiment. |
| Experiment Setup | Yes | The maximum number of expanded nodes for all planners is 5,000. The limit of instruction length, *length_limit*, is set to 6 for our experiment. The search priority is $\text{priority}(t) = \lambda^k \prod_{j=i+1}^{k-1} \left(1 - d(o_j, o_i)\right)$, where λ is a length-bias constant set to 0.9 because we prefer shorter instructions (see the sketch after the table). |
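
The priority formula quoted in the "Experiment Setup" row is garbled in the PDF extraction; the reading adopted above, $\lambda^k \prod_{j=i+1}^{k-1} (1 - d(o_j, o_i))$, is a reconstruction from the visible fragments. The following is a minimal Python sketch of that reading, not the authors' implementation: the names `priority`, `subgoals`, `d`, and `lam` are illustrative stand-ins, `d` abstracts the paper's learned dependency score between subgoals, and the product's limits follow the reconstruction (1-based indices as in the paper).

```python
from typing import Callable, Sequence

def priority(
    subgoals: Sequence[str],
    i: int,                          # 1-based index of the anchor subgoal o_i
    d: Callable[[str, str], float],  # stand-in for the learned dependency score d(o_j, o_i)
    lam: float = 0.9,                # length-bias constant lambda from the paper
) -> float:
    """Hedged reconstruction of the excerpt's search priority.

    Assumes priority(t) = lam**k * prod_{j=i+1}^{k-1} (1 - d(o_j, o_i)),
    where k = len(subgoals) and indices are 1-based as in the paper.
    """
    k = len(subgoals)
    o_i = subgoals[i - 1]            # translate 1-based i to a 0-based list index
    score = lam ** k                 # lambda**k penalizes longer instruction sequences
    for j in range(i + 1, k):        # j = i+1, ..., k-1 (1-based)
        score *= 1.0 - d(subgoals[j - 1], o_i)
    return score

# Toy usage with a constant dependency score (illustration only):
# with k = 3 and i = 1 this yields 0.9**3 * (1 - 0.2) = 0.5832.
print(priority(["get-wood", "craft-plank", "craft-boat"], i=1, d=lambda a, b: 0.2))
```

Under this reading, λ < 1 makes shorter candidate sequences score higher, matching the paper's stated preference for shorter instructions; the product term further discounts sequences whose later subgoals depend on the anchor subgoal.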