Compositional Foundation Models for Hierarchical Planning

Authors: Anurag Ajay, Seungwook Han, Yilun Du, Shuang Li, Abhi Gupta, Tommi Jaakkola, Josh Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate the efficacy and adaptability of our approach in three different long-horizon table-top manipulation tasks." and "3 Experimental Evaluations"
Researcher Affiliation | Collaboration | "Improbable AI Lab, MIT-IBM Watson AI Lab, Massachusetts Institute of Technology, https://hierarchical-planning-foundation-model.github.io/, Anurag Ajay, Seungwook Han*, Yilun Du*, Shuang Li, Abhi Gupta, Tommi Jaakkola, Josh Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal" and "Correspondence to aajay@mit.edu, swhan@mit.edu and yilundu@mit.edu."
Pseudocode | Yes | "Algorithm 1 Decision Making with HiP" (a hedged sketch of this decision loop appears after the table)
Open Source Code | Yes | https://hierarchical-planning-foundation-model.github.io/
Open Datasets | Yes | "We pretrain it p_ϕ(τ_x^i | w_i, x_{i,1}) on a large-scale text-to-video dataset Ego4D [13]."
Dataset Splits | No | The paper defines `Ttrain` and `Ttest` for generating datasets and evaluating performance, but does not explicitly describe a separate validation split with percentages, counts, or a specific methodology for reproduction.
Hardware Specification | Yes | "We used one V100 Nvidia GPU for training the multi-class classifier." and "We used two A6000 Nvidia GPUs for training these diffusion models."
Software Dependencies | No | The paper refers to specific models (e.g., Flan-T5-Base, GPT-3.5-turbo) and codebases (e.g., PVDM, VIMA, SayCan) but does not provide version numbers for underlying software dependencies such as Python or PyTorch.
Experiment Setup | Yes | Task Planning: "We train f_ϕ for 50 epochs using AdamW optimizer [31], a batch size of 256, a learning rate of 1e-3 and a weight decay of 1e-6." Visual Planning: "... We use AdamW optimizer [31], a batch size of 24 and a learning rate of 1e-4 for training the autoencoder." Action Planning: "We train VC-1 initialized inverse dynamics model for 20 epochs with AdamW optimizer [31], a batch size of 256 and a learning rate of 3e-5." (These hyperparameters are collected into an optimizer sketch after the table.)
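
Based on the pseudocode reference above (Algorithm 1, Decision Making with HiP), the sketch below illustrates how the three planners described in the paper (a language-model task planner, a video-diffusion visual planner, and an inverse-dynamics action planner) could be composed into a single decision loop. The names `hip_decision_loop`, `propose_subgoal`, `generate_video_plan`, and `infer_action` are hypothetical, introduced only for illustration; this is not the authors' released implementation.

```python
# Minimal sketch of a HiP-style hierarchical decision loop (cf. Algorithm 1 in the paper).
# The planner interfaces used here are hypothetical names, not the paper's code.

def hip_decision_loop(goal, observation, task_planner, visual_planner, action_planner,
                      max_subgoals=10):
    """Decompose a language goal into subgoals, imagine a video per subgoal, extract actions."""
    actions = []
    for _ in range(max_subgoals):
        # 1. Task planning: an LLM proposes the next language subgoal w_i.
        subgoal = task_planner.propose_subgoal(goal, observation)
        if subgoal is None:  # planner signals the goal is complete
            break
        # 2. Visual planning: a video diffusion model imagines an observation trajectory
        #    conditioned on the subgoal and current observation, p_phi(tau_x^i | w_i, x_{i,1}).
        video_plan = visual_planner.generate_video_plan(subgoal, observation)
        # 3. Action planning: an inverse dynamics model maps consecutive frames to actions.
        for frame, next_frame in zip(video_plan[:-1], video_plan[1:]):
            actions.append(action_planner.infer_action(frame, next_frame))
        observation = video_plan[-1]  # assume the imagined plan is executed and tracked
    return actions
```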
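
The hyperparameters quoted in the Experiment Setup row map directly onto optimizer configurations. A minimal PyTorch sketch, assuming placeholder modules in place of the paper's actual networks (the module names are illustrative, not taken from the released code):

```python
import torch
from torch.optim import AdamW

# Placeholder modules standing in for the paper's networks; names are illustrative only.
task_planner_model = torch.nn.Linear(512, 512)       # f_phi, trained for 50 epochs
autoencoder = torch.nn.Linear(512, 512)               # visual-planning autoencoder
inverse_dynamics_model = torch.nn.Linear(512, 512)    # VC-1-initialized inverse dynamics

# Task planning: AdamW, lr 1e-3, weight decay 1e-6 (batch size 256, 50 epochs).
task_opt = AdamW(task_planner_model.parameters(), lr=1e-3, weight_decay=1e-6)

# Visual planning (autoencoder): AdamW, lr 1e-4 (batch size 24).
visual_opt = AdamW(autoencoder.parameters(), lr=1e-4)

# Action planning: AdamW, lr 3e-5 (batch size 256, 20 epochs).
action_opt = AdamW(inverse_dynamics_model.parameters(), lr=3e-5)
```

The quoted batch sizes (256 for task and action planning, 24 for the autoencoder) would be set in the corresponding data loaders rather than in the optimizers.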