Learning to Share and Hide Intentions using Information Regularization

Authors: DJ Strouse, Max Kleiman-Weiner, Josh Tenenbaum, Matt Botvinick, David J. Schwab

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that cooperative (competitive) policies learned with our approach lead to more (less) reward for a second agent in two simple asymmetric information games.
Researcher Affiliation | Collaboration | 1 Princeton University, 2 MIT, 3 DeepMind, 4 UCL, 5 CUNY Graduate Center
Pseudocode | Yes | Algorithm 1: Action information regularized REINFORCE with value baseline. ... Algorithm 2: State information regularized REINFORCE with value baseline. (See the objective sketch after this table.)
Open Source Code | Yes | Our code is available at https://github.com/djstrouse/InfoMARL.
Open Datasets | No | The paper describes custom simulated environments (a 5x5 grid world and a key-and-door game) but does not provide concrete access information for a publicly available or open dataset used for training.
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology).
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions 'TensorFlow [Abadi et al., 2016]' but does not provide specific version numbers for TensorFlow or any other software dependencies.
Experiment Setup | Yes | Alice was trained using implementations of algorithms 2.1 and 2.2 in TensorFlow [Abadi et al., 2016]. Given the small, discrete environment, we used tabular representations for both π and V. See section S2.1 for training parameters. ... (see section S2.2 for training parameters). (A minimal code sketch of this setup follows the table.)
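
For context on the two algorithms referenced above, the regularized objective they optimize can be summarized as follows. This is a paraphrase of the paper's framing rather than a verbatim quote, and the exact notation may differ in detail from the paper's:

$$
\mathcal{J}(\theta) \;=\; \mathbb{E}\left[\sum_t \gamma^{t} r_t\right] \;+\; \beta\, I(A;\,G \mid S),
\qquad
I(A;\,G \mid S) \;=\; \mathbb{E}\left[\log \frac{\pi(a_t \mid s_t, g)}{\pi(a_t \mid s_t)}\right],
$$

where $\pi(a \mid s) = \sum_{g} p(g)\,\pi(a \mid s, g)$ is the goal-marginalized policy. Choosing $\beta > 0$ encourages goal-revealing (cooperative) behavior and $\beta < 0$ encourages goal-hiding (competitive) behavior; the state-information variant (Algorithm 2) regularizes the information that visited states, rather than actions, carry about the goal.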
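
The sketch below shows what such a tabular setup could look like: action-information-regularized REINFORCE with a value baseline on a small grid world. It is not the authors' released implementation (see https://github.com/djstrouse/InfoMARL for that); the environment layout, goal locations, hyperparameters, and variable names are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of action-information-regularized
# REINFORCE with a value baseline on a toy 5x5 grid world with two goals.
# All environment details and hyperparameters below are assumptions.
import numpy as np

N, N_GOALS, N_ACTIONS = 5, 2, 4           # 5x5 grid, two candidate goals, actions: up/down/left/right
GOALS = [(0, 4), (4, 4)]                  # assumed goal locations
GAMMA, LR_PI, LR_V, BETA = 0.9, 0.1, 0.1, 0.5   # BETA > 0 shares intentions; BETA < 0 hides them

theta = np.zeros((N_GOALS, N * N, N_ACTIONS))   # tabular policy logits, one table per goal
V = np.zeros((N_GOALS, N * N))                   # tabular value baseline, one table per goal

def policy(g, s):
    """Softmax policy pi(a | s, g) from the tabular logits."""
    z = theta[g, s] - theta[g, s].max()
    p = np.exp(z)
    return p / p.sum()

def marginal_policy(s):
    """pi(a | s) = sum_g p(g) pi(a | s, g), assuming a uniform prior over goals."""
    return np.mean([policy(g, s) for g in range(N_GOALS)], axis=0)

def step(pos, a):
    """Deterministic grid dynamics clipped to the board."""
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    r, c = pos[0] + moves[a][0], pos[1] + moves[a][1]
    return (min(max(r, 0), N - 1), min(max(c, 0), N - 1))

for episode in range(2000):
    g = np.random.randint(N_GOALS)        # goal is sampled and known only to the acting agent
    pos, traj = (2, 0), []
    for t in range(20):
        s = pos[0] * N + pos[1]
        probs = policy(g, s)
        a = np.random.choice(N_ACTIONS, p=probs)
        # Pointwise action information: log pi(a|s,g) - log pi(a|s).
        # Adding beta times this term to the reward encourages (beta > 0)
        # or discourages (beta < 0) actions that reveal the goal.
        info = np.log(probs[a] + 1e-12) - np.log(marginal_policy(s)[a] + 1e-12)
        pos = step(pos, a)
        env_r = 1.0 if pos == GOALS[g] else 0.0
        traj.append((s, a, env_r + BETA * info, probs))
        if env_r > 0:
            break
    # REINFORCE with value baseline on the information-augmented returns.
    G = 0.0
    for s, a, r, probs in reversed(traj):
        G = r + GAMMA * G
        adv = G - V[g, s]
        V[g, s] += LR_V * adv              # move the baseline toward the observed return
        grad_logp = -probs                 # grad of log softmax: onehot(a) - pi(.|s,g)
        grad_logp[a] += 1.0
        theta[g, s] += LR_PI * adv * grad_logp
```

Flipping the sign of BETA turns the same loop into the competitive (intention-hiding) variant; the state-information version would instead regularize how much the visited states reveal about the goal.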