Learning Rational Subgoals from Demonstrations and Instructions

Authors: Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tomás Lozano-Pérez, Joshua B. Tenenbaum, Leslie Pack Kaelbling

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation shows that our model clearly outperforms baselines on planning tasks where the agent needs to generate trajectories to accomplish a given task.
Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology, 2 Stanford University
Pseudocode | Yes | Algorithm 1: Overview of the training paradigm in pseudocode. Algorithm 2: Overview of the search algorithm given only the final goal.
Open Source Code | No | Project page: https://rsg.csail.mit.edu (The project page states: 'The code will be publicly released soon.')
Open Datasets | Yes | We evaluate RSGs in Crafting World (Chen, Gupta, and Marino 2021), an image-based grid-world domain with a rich set of object crafting tasks, and Playroom (Konidaris, Kaelbling, and Lozano-Perez 2018), a 2D continuous domain with geometric constraints.
Dataset Splits | No | The paper mentions 'compositional split' and 'novel split' for training and testing, but does not explicitly describe a separate validation split or its purpose.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., library or solver names with version numbers) required to replicate the experiment.
Experiment Setup | Yes | The maximum number of expanded nodes for all planners is 5,000. The instruction length limit is set to 6 in our experiments. priority(t) = λ^k · ∏_{j=i+1}^{k-1} (1 − d(o_j, o_i)), where λ is a length-bias constant, set to 0.9 because we prefer shorter instructions.
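As a reading aid for the setup row above, the snippet below sketches how the reported priority score could be computed. It is a minimal illustration only: the function name `instruction_priority`, the representation of an instruction as a 1-indexed sequence of options o_1, ..., o_k, and the callable `d` standing in for the learned dependency score are assumptions, not the authors' released implementation; the loop bounds simply mirror the product limits j = i+1 through k−1 as written above.

```python
from typing import Callable, Sequence


def instruction_priority(
    options: Sequence,                     # candidate instruction o_1, ..., o_k (treated as 1-indexed below)
    i: int,                                # 1-indexed position of the reference option o_i
    d: Callable[[object, object], float],  # assumed learned dependency score d(o_j, o_i) in [0, 1]
    lam: float = 0.9,                      # length-bias constant λ (0.9 in the reported experiments)
) -> float:
    """Compute priority(t) = λ^k * Π_{j=i+1}^{k-1} (1 - d(o_j, o_i)), as reconstructed above.

    λ^k penalizes longer instructions (λ < 1), and each factor (1 - d(o_j, o_i))
    down-weights candidates whose later options score high against o_i.
    """
    k = len(options)
    score = lam ** k
    for j in range(i + 1, k):              # j = i+1, ..., k-1 (1-indexed)
        score *= 1.0 - d(options[j - 1], options[i - 1])
    return score


# Toy usage: with a dependency score that is always zero, only the length bias remains,
# so shorter candidate instructions receive strictly higher priority.
always_zero = lambda o_j, o_i: 0.0
assert instruction_priority(["a", "b"], 1, always_zero) > instruction_priority(["a", "b", "c"], 1, always_zero)
```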