Zero-Shot Linear Combinations of Grounded Social Interactions with Linear Social MDPs
Authors: Ravi Tejwani, Yen-Ling Kuo, Tianmin Shu, Bennett Stankovits, Dan Gutfreund, Joshua B. Tenenbaum, Boris Katz, Andrei Barbu
AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments: The produced social interactions are only meaningful when humans can recognize them as social. In the experiments, we want to understand if the behavior produced by the Linear Social MDP agrees with human ideas of the magnitude and valence of the social interaction. We first used the Linear Social MDP to generate a collection of social interactions between two agents rendered as videos. Human subjects are asked to recognize the social goals of the agents. Unlike the original Social MDPs where interactions were fixed, here the interactions change over the duration of the scenario as the agents switch between goals. Additionally, we wanted to understand if Linear Social MDPs can recognize these social interactions, not just produce them. We then compare Linear Social MDPs and other baseline models to understand to what extent the models could determine what social interactions were being carried out. |
| Researcher Affiliation | Collaboration | Ravi Tejwani^1*, Yen-Ling Kuo^1, Tianmin Shu^2, Bennett Stankovits^1, Dan Gutfreund^3 (^1 CSAIL & CBMM, MIT; ^2 BCS & CBMM, MIT; ^3 MIT-IBM Watson AI Lab) {tejwanir,ylkuo,tshu,bstankov,jbt,boris,abarbu}@mit.edu, dgutfre@us.ibm.com |
| Pseudocode | Yes | Figure 3: (left) A gloss of the key notation used. (right) The algorithm to solve Linear Social MDPs at each time step. We use the estimated social policy ψ̂_j^{i,l-1} at the previous time step to update the estimated rewards. At t = 0 goals are sampled uniformly. |
| Open Source Code | Yes | Code is available at https://github.com/Linear-Social-MDP/ linear-social-mdp-framework |
| Open Datasets | No | The paper describes a custom 10x10 grid-world environment used for simulations and experiments, but it does not specify or provide access to a publicly available dataset that was used for training or evaluation. Footnote 1 refers to 'All scenarios with detailed results', which points to generated videos and results, not a reusable dataset. |
| Dataset Splits | No | The paper describes the experimental setup and evaluation process, including human subject studies and model comparisons, but it does not explicitly provide details about training, validation, and test dataset splits for the models evaluated. |
| Hardware Specification | Yes | On a workstation with an RTX 3090, updating the value estimates in parallel over 10^9 states takes about one minute. With 50 iterations, level 1 Linear Social MDPs take about 40 s, while level 3 Linear Social MDPs take about 10 minutes. |
| Software Dependencies | No | The solver for Linear Social MDPs was implemented in C++ and CUDA to perform GPU-accelerated value iteration. However, specific version numbers for C++, CUDA, or any other software dependencies are not provided. |
| Experiment Setup | Yes | Environment: We use a two-agent (a yellow and a red robot) 10x10 grid-world environment, with five actions (move in one of four directions or stay in place), three physical goals (watering the tree, adding logs to a fire, and sawing logs), three locations (tree, fire, and saw), and two objects (a log and a water can). In all experiments, each robot always attempts to achieve two physical goals while engaging in social interactions relative to those goals. A state is defined as an 8-tuple, consisting of the (x, y) coordinates of both agents and their resources, with each component an integer from 0-9. |
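The 8-tuple state described in the experiment setup can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' released solver (their implementation is in C++/CUDA); the function name and the packing order of the four coordinate pairs are hypothetical:

```python
# Hypothetical sketch of the paper's state representation: an 8-tuple of
# integers in 0-9, giving the (x, y) coordinates of the two robots and of
# their two resources (log, water can) on the 10x10 grid.
GRID = 10

def encode_state(yellow_xy, red_xy, log_xy, can_xy):
    """Pack the four (x, y) pairs into one 8-tuple state."""
    state = (*yellow_xy, *red_xy, *log_xy, *can_xy)
    assert all(0 <= c < GRID for c in state), "coordinates must lie in 0-9"
    return state

# The positional component alone yields 10^8 configurations; the ~10^9
# states cited for the GPU value-iteration step presumably also counts
# additional variables tracked by the solver.
num_positional_states = GRID ** 8
```

Counting the positional states this way makes the scale of the parallel value-update step concrete: even the raw 10^8 positional configurations explain why a GPU-accelerated solver is used.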