Coordinated Multi-Agent Imitation Learning
Authors: Hoang M. Le, Yisong Yue, Peter Carr, Patrick Lucey
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our method in two settings. The first is a synthetic experiment based on the popular predator-prey game. The second is a challenging task of learning multiple policies for team defense in professional soccer, using a large training set¹ of play sequences illustrated by Figure 1. We show that learning a good latent structure to encode implicit coordination yields significantly superior imitation performance compared to conventional baselines. |
| Researcher Affiliation | Collaboration | ¹California Institute of Technology, Pasadena, CA; ²Disney Research, Pittsburgh, PA; ³STATS LLC, Chicago, IL. |
| Pseudocode | Yes | Algorithm 1 Coordinated Multi-Agent Imitation Learning. A hedged sketch of the algorithm's alternating structure follows the table. |
| Open Source Code | No | Footnote 1 states 'Data at http://www.stats.com/data-science/ and see video result at http://hoangminhle.github.io'. This links to a data source and a video result, not the source code for the methodology. |
| Open Datasets | No | The paper mentions using 'tracking data from 45 games of real professional soccer' and 'The demonstration data is collected from 1000 game instances' for the predator-prey domain, and footnote 1 states 'Data at http://www.stats.com/data-science/'. However, this is a general link to a company's data science page, not a specific direct URL, DOI, or repository for the exact datasets used with proper bibliographic information or attribution. |
| Dataset Splits | No | Algorithm 1 includes 'until No improvement on validation set', implying a validation set is used. However, the experimental sections do not provide specific details such as split percentages, sample counts, or methodology for the validation set, which are necessary for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'recurrent neural network structure (LSTM)' and 'random forest', but it does not specify software names with version numbers (e.g., TensorFlow 2.x, PyTorch 1.x, scikit-learn 0.x) for reproducibility. |
| Experiment Setup | Yes | Each policy is represented by a recurrent neural network structure (LSTM), with two hidden layers of 512 units each. As LSTMs generally require fixed-length input sequences, we further chunk each trajectory into sub-sequences of length 50, with overlapping window of 25 time steps. The joint multi-agent imitation learning procedure follows Algorithm 2 closely. In this setup, without access to dynamic oracles for imitation learning in the style of SEARN (Daumé III et al., 2009) and DAgger (Ross et al., 2011), we gradually increase the horizon of the rolled-out trajectories from 1 to 10 steps lookahead. A sketch of the chunking and LSTM configuration also follows the table. |
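
To make the Pseudocode row concrete, here is a minimal sketch of the alternating structure that Algorithm 1 describes (latent role assignment interleaved with per-role imitation learning, stopping on the "until No improvement on validation set" rule quoted in the Dataset Splits row). Every helper name below (`assign_roles`, `validation_loss`, the `.fit` methods) is a hypothetical stand-in; per the Open Source Code row, the authors released no implementation.

```python
# Hedged sketch of Algorithm 1's alternating loop. All helpers are
# hypothetical placeholders, not the authors' code.

def coordinated_imitation(demos, policies, structure, val_demos, max_iters=50):
    """Alternate latent role assignment with per-role imitation learning."""
    best_val = float("inf")
    for _ in range(max_iters):
        # (1) Infer the latent role ordering that best explains each
        #     demonstration under the current policies and structure model.
        ordered = [assign_roles(d, policies, structure) for d in demos]

        # (2) Refit the coordination (latent structure) model on the
        #     newly re-ordered demonstrations.
        structure.fit(ordered)

        # (3) Imitation-learn one policy per role on the re-ordered data.
        for k, policy in enumerate(policies):
            policy.fit([d[k] for d in ordered])

        # (4) Stop once the validation set no longer improves (the
        #     stopping rule quoted from Algorithm 1).
        val = validation_loss(policies, structure, val_demos)
        if val >= best_val:
            break
        best_val = val
    return policies, structure
```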
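
The Experiment Setup row also lends itself to a sketch. Assuming PyTorch (the Software Dependencies row notes the paper names no framework), the two-layer 512-unit LSTM policy and the length-50 / 25-step-overlap chunking could look like the following; `obs_dim` and `act_dim` are placeholders the paper does not specify.

```python
import numpy as np
import torch.nn as nn

def chunk_trajectory(traj, length=50, stride=25):
    """Split a (T, d) trajectory into fixed-length windows with a
    25-step overlap, matching the scheme quoted in the setup row."""
    starts = range(0, len(traj) - length + 1, stride)
    return np.stack([traj[s:s + length] for s in starts])

class LSTMPolicy(nn.Module):
    """Two stacked LSTM layers of 512 units each, per the setup row.
    Input/output dimensions are placeholders, not stated in the paper."""
    def __init__(self, obs_dim, act_dim, hidden=512):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, x, state=None):
        out, state = self.lstm(x, state)   # out: (batch, time, hidden)
        return self.head(out), state

# Roll-out horizon curriculum: without a dynamic oracle, the lookahead of
# the rolled-out trajectories grows from 1 to 10 steps (training omitted).
for horizon in range(1, 11):
    pass  # train each LSTMPolicy on horizon-step rollouts at this stage
```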