Neurosymbolic Transformers for Multi-Agent Communication
Authors: Jeevana Priya Inala, Yichen Yang, James Paulos, Yewen Pu, Osbert Bastani, Vijay Kumar, Martin Rinard, Armando Solar-Lezama
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate how our approach can synthesize policies that generate low-degree communication graphs while maintaining near-optimal performance. |
| Researcher Affiliation | Academia | MIT CSAIL; University of Pennsylvania |
| Pseudocode | No | The paper describes the synthesis algorithm in prose but does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code and a video illustrating the different tasks are available at https://github.com/jinala/multi-agent-neurosym-transformers. |
| Open Datasets | No | The paper describes custom simulation environments ('Formation task', 'Unlabeled goals task') but does not provide access information or citations for a publicly available or open dataset. |
| Dataset Splits | No | The paper mentions training with '10k rollouts' and building a dataset for synthesis using '300 rollouts' but does not specify explicit training/validation/test splits, either as percentages or as sample counts. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies, such as library names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For all approaches, we train the model with 10k rollouts. For synthesizing the programmatic policy, we build a dataset using 300 rollouts and run MCMC for 10000 steps. We retrain the transformer with 1000 rollouts. We constrain the maximum in-degree to be a constant d0 across all approaches (except tf-full, where each agent communicates with every other agent); for dist and hard-attn, we do so by setting the communication neighbors to be k = d0, and for prog and prog-retrain, we choose the number of rules to be K = d0. |
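
To make the reported setup easier to scan, below is a minimal configuration sketch in Python. Only the numeric settings (10k training rollouts, 300 synthesis rollouts, 10000 MCMC steps, 1000 retraining rollouts) come from the paper; all names (`ExperimentConfig`, `baseline_params`, field names) are illustrative, not taken from the authors' repository, and the value of the in-degree budget d0 is a placeholder, since the quoted text fixes d0 as a constant without stating its value.

```python
# Hypothetical configuration sketch; identifiers are illustrative only.
from dataclasses import dataclass


@dataclass
class ExperimentConfig:
    # Values reported in the paper's experiment setup.
    train_rollouts: int = 10_000   # rollouts used to train each model
    synthesis_rollouts: int = 300  # rollouts for the synthesis dataset
    mcmc_steps: int = 10_000       # MCMC steps for programmatic-policy synthesis
    retrain_rollouts: int = 1_000  # rollouts used to retrain the transformer
    max_in_degree: int = 3         # d0: placeholder value, not stated in the paper


def baseline_params(cfg: ExperimentConfig) -> dict:
    """Map the shared in-degree budget d0 onto each baseline's knob,
    following the correspondence described in the paper."""
    return {
        "dist": {"k_neighbors": cfg.max_in_degree},       # k = d0 neighbors
        "hard-attn": {"k_neighbors": cfg.max_in_degree},  # k = d0 neighbors
        "prog": {"num_rules": cfg.max_in_degree},         # K = d0 rules
        "prog-retrain": {"num_rules": cfg.max_in_degree}, # K = d0 rules
        "tf-full": {"k_neighbors": None},                 # all-to-all communication
    }
```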