Stochastic Prediction of Multi-Agent Interactions from Partial Observations
Authors: Chen Sun, Per Karlsson, Jiajun Wu, Joshua B. Tenenbaum, Kevin Murphy
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our method outperforms various baselines on two sports datasets, one based on real basketball trajectories, and one generated by a soccer game engine. |
| Researcher Affiliation | Collaboration | Chen Sun Google Research Per Karlsson Google Research Jiajun Wu MIT CSAIL Joshua B Tenenbaum MIT CSAIL Kevin Murphy Google Research |
| Pseudocode | No | The paper does not contain any section explicitly labeled 'Pseudocode' or 'Algorithm', nor are there structured steps formatted like an algorithm block. |
| Open Source Code | No | We plan to release the videos along with the game engine after publication of the paper. Video samples can be found at bit.ly/2E3qg6F |
| Open Datasets | Yes | We use the basketball data from Zhan et al. (2018). |
| Dataset Splits | Yes | The hyperparameters, such as the base learning rate and the KL divergence weight β, are tuned on a hold-out validation set. |
| Hardware Specification | Yes | The models are trained on 6 V100 GPUs with synchronous training and a batch size of 8 per GPU; we train the model for 80K steps on soccer and 40K steps on basketball. |
| Software Dependencies | No | The paper mentions software components such as ResNet-18, S3D, GRUs, MLPs, relation networks, and the Unity game engine, but it does not provide specific version numbers for any of these components or other libraries. |
| Experiment Setup | Yes | Our visual encoder is based on ResNet-18 (He et al., 2016); we use the first two blocks of ResNet to maintain spatial resolution, and then aggregate the feature map with max pooling. The encoder is pre-trained on visible players, and then fine-tuned for each baseline. For the soccer data, we down-sample the video to 4 FPS, and treat 4 frames (1 second) as one step. We consider 10 steps in total: 6 observed, 4 unobserved. We set the size of GRU hidden states to 128 for all baselines. The state decoder is a 2-layer MLP. For basketball data, we set every 5 frames as one step, and consider 10 steps as well. The size of GRU hidden states is set to 128. The models are trained on 6 V100 GPUs with synchronous training and a batch size of 8 per GPU; we train the model for 80K steps on soccer and 40K steps on basketball. We use a linear learning rate warmup schedule for the first 1K steps, followed by a cosine learning rate schedule. The hyperparameters, such as the base learning rate and the KL divergence weight β, are tuned on a hold-out validation set. |
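The training schedule quoted in the Experiment Setup row (linear warmup for the first 1K steps, then cosine decay) can be sketched as below. This is a minimal illustration, not the authors' code: the base learning rate and total step count are placeholders, since the paper tunes the base learning rate on a hold-out validation set and trains for 80K steps on soccer and 40K on basketball.

```python
import math

def learning_rate(step, base_lr=0.1, warmup_steps=1000, total_steps=80000):
    """Linear warmup followed by cosine decay to zero.

    base_lr and total_steps are illustrative values only; the paper
    reports tuning the base learning rate on a validation set.
    """
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the first warmup_steps steps.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr (at end of warmup) to 0 (at total_steps).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

With these placeholder values, the rate rises linearly to `base_lr` at step 1K and decays smoothly to zero by step 80K.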