Neurosymbolic Grounding for Compositional World Models

Authors: Atharva Sehgal, Arya Grayeli, Jennifer J. Sun, Swarat Chaudhuri

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through an evaluation that considers two different forms of Comp Gen on an established blocks-pushing domain, we show that the framework establishes a new state-of-the-art for Comp Gen in world modeling.
Researcher Affiliation | Academia | Atharva Sehgal (UT Austin), Arya Grayeli (UT Austin), Jennifer J. Sun (Caltech), Swarat Chaudhuri (UT Austin)
Pseudocode | Yes | The full algorithm is presented in Algorithm 1, and the model is visualized in Figure 3.
Open Source Code | Yes | Artifacts are available at https://trishullab.github.io/cosmos-web/. ... Upon acceptance of our work, we will release the complete source code, pretrained models, and associated setup environments under the MIT license.
Open Datasets | Yes | We demonstrate the effectiveness of COSMOS on the 2D block-pushing domain (Kipf et al., 2019; Zhao et al., 2022; Goyal et al., 2021; Ke et al., 2021). ... Our methodology extensively utilizes these public repositories (Zhao et al., 2022; Kipf et al., 2019; Goyal et al., 2021) for generating data and computing evaluation metrics.
Dataset Splits | Yes | For entity compositions (EC), we construct training and testing datasets to have unique object combinations between them. ... Our data generation methodology ensures that the compound distribution is disjoint, while the atom distribution remains consistent across datasets, i.e. F_C(D_train) ∩ F_C(D_eval) = ∅ and F_A(D_train) = F_A(D_eval).
Hardware Specification | Yes | We evaluate all models on a single 48 GB NVIDIA A40 GPU with a (maximum possible) batch size of 64 for 50 epochs for three random seeds.
Software Dependencies | No | The paper mentions "machine learning libraries" and "anaconda environments" but does not specify particular software names with version numbers.
Experiment Setup | Yes | We evaluate all models on a single 48 GB NVIDIA A40 GPU with a (maximum possible) batch size of 64 for 50 epochs for three random seeds. ... We first train the slot autoencoder (ENTITYEXTRACTOR and SPATIALDECODER) until the model shows no training improvement for 5 epochs. ... All transition models are initialized with the same slot autoencoder and are optimized to minimize a mixture of the autoencoder reconstruction loss and the next-state reconstruction loss.
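The entity-composition split criterion quoted in the Dataset Splits row (compound distributions disjoint, atom distributions identical) can be checked mechanically. The sketch below is illustrative and not from the paper's released code; it represents each episode as a tuple of object-type ids, takes compounds to be unordered pairs of co-occurring objects, and verifies F_C(D_train) ∩ F_C(D_eval) = ∅ and F_A(D_train) = F_A(D_eval).

```python
from itertools import combinations

def compounds(dataset):
    """All unordered object pairs (compounds) appearing in any episode."""
    out = set()
    for episode in dataset:
        out.update(frozenset(pair) for pair in combinations(episode, 2))
    return out

def atoms(dataset):
    """All individual object types (atoms) appearing in the dataset."""
    return {obj for episode in dataset for obj in episode}

def is_valid_ec_split(train, evaluation):
    """Entity-composition split check: compound sets must be disjoint
    while atom sets must coincide across train and eval."""
    return (compounds(train).isdisjoint(compounds(evaluation))
            and atoms(train) == atoms(evaluation))

# Toy example over object types {0, 1, 2, 3}: train pairs (0,1) and (2,3),
# eval pairs (0,2) and (1,3) share no compound but cover the same atoms.
train_data = [(0, 1), (2, 3)]
eval_data = [(0, 2), (1, 3)]
print(is_valid_ec_split(train_data, eval_data))  # True
```

The same helpers reject a split in which any object combination leaks from train into eval, which is exactly the leakage the quoted methodology rules out.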
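The Experiment Setup row states that the slot autoencoder is trained "until the model shows no training improvement for 5 epochs", i.e. patience-based early stopping. A minimal sketch of that stopping rule, assuming improvement means beating the best loss seen so far (the paper only states the 5-epoch criterion, so the helper name and tie-breaking are assumptions):

```python
def stopping_epoch(epoch_losses, patience=5):
    """Return the epoch index at which training would stop: the first
    epoch after the training loss has failed to improve on its best
    value for `patience` consecutive epochs. If the criterion is never
    met, training runs through the final epoch."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(epoch_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(epoch_losses) - 1

# Loss plateaus at 0.7 after epoch 2; with patience 5 the run halts at epoch 7.
losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
print(stopping_epoch(losses))  # 7
```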