Learning to Share and Hide Intentions using Information Regularization
Authors: DJ Strouse, Max Kleiman-Weiner, Josh Tenenbaum, Matt Botvinick, David J. Schwab
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that cooperative (competitive) policies learned with our approach lead to more (less) reward for a second agent in two simple asymmetric information games. |
| Researcher Affiliation | Collaboration | 1 Princeton University, 2 MIT, 3 DeepMind, 4 UCL, 5 CUNY Graduate Center |
| Pseudocode | Yes | Algorithm 1 Action information regularized REINFORCE with value baseline. ... Algorithm 2 State information regularized REINFORCE with value baseline. (A hedged sketch of this idea follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/djstrouse/InfoMARL. |
| Open Datasets | No | The paper describes custom simulated environments (a 5x5 grid world and a key-and-door game) but does not provide concrete access information for a publicly available or open dataset used for training. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow [Abadi et al., 2016]' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | Alice was trained using implementations of algorithms 2.1 and 2.2 in TensorFlow [Abadi et al., 2016]. Given the small, discrete environment, we used tabular representations for both π and V. See section S2.1 for training parameters. ... (see section S2.2 for training parameters). |
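
For context on the regularizer named in the Pseudocode row, the following is a minimal sketch of action-information-regularized REINFORCE with a tabular value baseline. The corridor environment, the uniform goal prior, and all names and hyperparameters (`n_states`, `beta`, `alpha`, `gamma`) are illustrative assumptions, not the authors' setup; their actual TensorFlow implementation is in the InfoMARL repository linked above.

```python
import numpy as np

# Illustrative toy setup (not from the paper): a 1-D corridor with a goal at either end.
n_states, n_actions, n_goals = 5, 2, 2   # actions: 0 = step left, 1 = step right
goal_state = {0: 0, 1: n_states - 1}     # goal g is satisfied at this state
gamma, alpha, beta = 0.9, 0.1, 0.1       # assumed values; beta > 0 shares intent, beta < 0 hides it

theta = np.zeros((n_goals, n_states, n_actions))  # tabular policy logits
V = np.zeros((n_goals, n_states))                 # tabular value baseline

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def pi(g, s):
    # Goal-conditioned policy pi(a | s, g).
    return softmax(theta[g, s])

def pi_marginal(s):
    # Goal-marginalized policy pi(a | s) = sum_g p(g) pi(a | s, g), assuming a uniform goal prior.
    return np.mean([pi(g, s) for g in range(n_goals)], axis=0)

def run_episode(g, max_steps=30):
    s = n_states // 2
    traj = []
    for _ in range(max_steps):
        a = np.random.choice(n_actions, p=pi(g, s))
        s_next = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
        r = 1.0 if s_next == goal_state[g] else 0.0
        # Pointwise action-information term: log pi(a|s,g) - log pi(a|s),
        # added to the reward as an auxiliary bonus scaled by beta.
        info = np.log(pi(g, s)[a]) - np.log(pi_marginal(s)[a])
        traj.append((s, a, r + beta * info))
        s = s_next
        if r > 0:
            break
    return traj

for episode in range(2000):
    g = np.random.randint(n_goals)
    traj = run_episode(g)
    G = 0.0
    for s, a, r in reversed(traj):          # backward pass to accumulate returns
        G = r + gamma * G
        adv = G - V[g, s]                   # advantage w.r.t. the value baseline
        V[g, s] += alpha * adv              # value-table update
        grad = -pi(g, s)                    # d log pi(a|s,g) / d logits for a softmax policy
        grad[a] += 1.0
        theta[g, s] += alpha * adv * grad   # REINFORCE policy update
```

Flipping the sign of `beta` switches between rewarding actions that reveal the goal (sharing) and penalizing them (hiding), which is the cooperative/competitive distinction the experiments test with a second observing agent.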