Symmetric Machine Theory of Mind

Authors: Melanie Sclar, Graham Neubig, Yonatan Bisk

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that reinforcement learning agents that model the mental states of others achieve significant performance improvements over agents with no such theory of mind model."
Researcher Affiliation | Academia | "¹Paul G. Allen School of Computer Science & Engineering, University of Washington. ²Language Technologies Institute, Carnegie Mellon University. Correspondence to: <msclar@cs.washington.edu, {gneubig,ybisk}@cs.cmu.edu>."
Pseudocode | Yes | "Pseudocode of MADDPG-EE's implementation can be found in Section A.4. Algorithm 1: Actor implementation of MADDPG-EE." (A generic actor sketch follows the table.)
Open Source Code | Yes | "Code can be found at https://github.com/msclar/symmtom."
Open Datasets | No | The paper introduces a new simulated environment, 'SymmToM', for its experiments instead of using a pre-existing, publicly available dataset with specific access information.
Dataset Splits | No | The paper mentions training for '60000 episodes' and evaluating for '1000 episodes' but does not specify any validation dataset splits or percentages.
Hardware Specification | Yes | "Experiments were run on a server with 256GB RAM, 2 18-core Intel E5-2699 processors @ 2.3GHz."
Software Dependencies | No | The paper mentions using MADDPG and its variants (RMADDPG, MADDPG-CE, etc.), but it does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "We train through 60000 episodes, and with 9 random seeds to account for high variances. Our policies are parametrized by a two-layer ReLU MLP with 64 units per layer... We used the same hyperparameters as the ones used in MADDPG, except with a reduced learning rate and tau (lr = 0.001 and τ = 0.005). We set the length of each episode to 5w" (A configuration sketch follows the table.)
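
The paper defers its Algorithm 1 (the MADDPG-EE actor) to Section A.4, so its exact form is not reproduced here. As context only, below is a minimal, hypothetical sketch of the standard MADDPG actor update that such pseudocode builds on: a decentralized actor trained against a centralized critic that conditions on all agents' observations and actions. This is generic MADDPG, not the paper's EE variant, and all function and tensor names are placeholders.

```python
# Hypothetical sketch of a standard MADDPG actor update (NOT the paper's
# MADDPG-EE variant): each agent has a decentralized actor and a
# centralized critic that sees all agents' observations and actions.
import torch
import torch.nn as nn


def mlp(in_dim: int, out_dim: int, hidden: int = 64) -> nn.Sequential:
    """Two-layer ReLU MLP, matching the 64-unit policies quoted above."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


def actor_update(actor, critic, actor_opt, obs_all, act_all, agent_i):
    """One MADDPG actor step: ascend the centralized critic's Q-value,
    replacing agent i's replayed action with its current policy output."""
    act_all = list(act_all)
    act_all[agent_i] = actor(obs_all[agent_i])       # differentiable action
    q_input = torch.cat(obs_all + act_all, dim=-1)   # centralized critic input
    loss = -critic(q_input).mean()                   # maximize Q <=> minimize -Q
    actor_opt.zero_grad()
    loss.backward()
    actor_opt.step()
    return loss.item()
```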
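
The Experiment Setup row quotes a two-layer, 64-unit ReLU MLP trained with lr = 0.001 and τ = 0.005 over 60000 episodes and 9 seeds. Below is a minimal sketch of how those numbers might map onto a PyTorch configuration, assuming Adam and Polyak (soft) target updates as in MADDPG; the observation and action sizes (32 and 5) are hypothetical stand-ins, since the paper does not state them in this excerpt.

```python
# Minimal configuration sketch of the quoted experiment setup; the
# environment, trainer, and the 32/5 observation/action sizes are
# hypothetical placeholders, not values from the paper.
import copy
import torch

LR = 1e-3               # reduced learning rate relative to MADDPG defaults
TAU = 0.005             # soft target-update coefficient
TRAIN_EPISODES = 60_000
EVAL_EPISODES = 1_000
SEEDS = range(9)        # 9 random seeds to account for high variance

# Two-layer, 64-unit ReLU MLP policy, as described in the quote.
policy = torch.nn.Sequential(
    torch.nn.Linear(32, 64), torch.nn.ReLU(),   # 32 = hypothetical obs size
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 5),                     # 5 = hypothetical action size
)
target_policy = copy.deepcopy(policy)
optimizer = torch.optim.Adam(policy.parameters(), lr=LR)


def soft_update(target: torch.nn.Module, source: torch.nn.Module, tau: float = TAU):
    """Polyak averaging: target <- tau * source + (1 - tau) * target."""
    with torch.no_grad():
        for t, s in zip(target.parameters(), source.parameters()):
            t.mul_(1.0 - tau).add_(tau * s)
```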