Information Design in Multi-Agent Reinforcement Learning
Authors: Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on Recommendation Letter and Reaching Goals demonstrate the efficacy of our approach. Our code is publicly available at https://github.com/YueLin301/InformationDesignMARL. |
| Researcher Affiliation | Academia | Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang; The Chinese University of Hong Kong, Shenzhen; linyue3h1@gmail.com, {liwenhao, zhahy, bxiangwang}@cuhk.edu.cn |
| Pseudocode | No | The paper contains mathematical formulations and descriptions of algorithms (e.g., update rules like Equation 7), but it does not present these in a clearly labeled pseudocode or algorithm block format. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/YueLin301/InformationDesignMARL. |
| Open Datasets | No | The paper describes the setup for 'Recommendation Letter' and 'Reaching Goals' tasks as simulated environments with defined parameters, but it does not provide concrete access information (specific links, DOIs, repositories, or formal citations) for pre-existing publicly available or open datasets. |
| Dataset Splits | No | The paper refers to running experiments with '15 random seeds' and using A2C, which suggests data is generated through interaction with a simulated environment rather than pre-defined dataset splits. No specific training, validation, or test dataset splits (e.g., percentages or sample counts) are provided. |
| Hardware Specification | Yes | Running 4 seeds with 2 NVIDIA GeForce RTX 3090, the longest time is Reaching Goals with 5 × 5 map, which takes up to a day. |
| Software Dependencies | No | The paper mentions software components like 'advantage actor-critic (A2C)' and 'Gumbel-Softmax' but does not provide specific version numbers for any key software dependencies or libraries. |
| Experiment Setup | Yes | Each curve in the experimental result graphs is drawn with at least 15 random seeds. We let $\varphi_\eta(\sigma \mid s, o) = \varphi_\eta(\sigma \mid s)$ and $\Sigma = \{0, 1\}$. And we set a fixed horizon of 50 for each episode in this scenario. |
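
The table mentions Gumbel-Softmax and a binary signal set Σ = {0, 1} without giving code or library versions. As a rough illustration only, the sketch below draws a relaxed sample over two signals from unnormalized logits using the standard Gumbel-Softmax construction. The function names, the temperature value, and the idea that the logits come from a state-conditioned sender network are assumptions made for this example; none of it is taken from the authors' repository.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot vector over the signal set (hypothetical helper).

    `logits` are unnormalized scores, one per signal; in this sketch they are
    assumed to be produced elsewhere by a sender policy conditioned on the state.
    """
    perturbed = []
    for logit in logits:
        u = rng.random()
        u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        gumbel = -math.log(-math.log(u))     # Gumbel(0, 1) noise
        perturbed.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    top = max(perturbed)
    exps = [math.exp(p - top) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


def sample_signal(logits, temperature=1.0, rng=random):
    """Return (discrete signal index, relaxed probabilities)."""
    probs = gumbel_softmax_sample(logits, temperature, rng)
    sigma = max(range(len(probs)), key=probs.__getitem__)
    return sigma, probs


if __name__ == "__main__":
    rng = random.Random(0)
    # Two logits, one per signal in the binary signal set {0, 1}.
    sigma, probs = sample_signal([0.3, -0.1], temperature=0.5, rng=rng)
    print(sigma, probs)
```

Lower temperatures push the relaxed vector toward a one-hot sample, while higher temperatures keep it closer to uniform; the value a given experiment would need is not stated in the table.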