Information Design in Multi-Agent Reinforcement Learning

Authors: Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments on Recommendation Letter and Reaching Goals demonstrate the efficacy of our approach. Our code is publicly available at https://github.com/YueLin301/InformationDesignMARL.
Researcher Affiliation | Academia | Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang. The Chinese University of Hong Kong, Shenzhen. linyue3h1@gmail.com, {liwenhao, zhahy, bxiangwang}@cuhk.edu.cn
Pseudocode | No | The paper contains mathematical formulations and descriptions of algorithms (e.g., update rules like Equation 7), but it does not present these in a clearly labeled pseudocode or algorithm block format.
Open Source Code | Yes | Our code is publicly available at https://github.com/YueLin301/InformationDesignMARL.
Open Datasets | No | The paper describes the setup for the 'Recommendation Letter' and 'Reaching Goals' tasks as simulated environments with defined parameters, but it does not provide concrete access information (specific links, DOIs, repositories, or formal citations) for pre-existing publicly available or open datasets.
Dataset Splits | No | The paper refers to running experiments with '15 random seeds' and using A2C, which suggests data is generated through interaction with a simulated environment rather than drawn from pre-defined dataset splits. No specific training, validation, or test dataset splits (e.g., percentages or sample counts) are provided.
Hardware Specification | Yes | Running 4 seeds with 2 NVIDIA GeForce RTX 3090, the longest time is Reaching Goals with 5 × 5 map, which takes up to a day.
Software Dependencies | No | The paper mentions software components like 'advantage actor-critic (A2C)' and 'Gumbel-Softmax' but does not provide specific version numbers for any key software dependencies or libraries.
Experiment Setup | Yes | Each curve in the experimental result graphs is drawn with at least 15 random seeds. We let $\varphi_\eta(\sigma \mid s, o) = \varphi_\eta(\sigma \mid s)$ and $\Sigma = \{0, 1\}$. And we set a fixed horizon of 50 for each episode in this scenario.
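
To make the quoted setup concrete, here is a minimal sketch of a sender whose signaling scheme depends only on the state s and emits a binary signal drawn through a Gumbel-Softmax relaxation. It is an illustration under stated assumptions, not the authors' implementation: the tabular logits in `params`, the function names, the two-state toy loop, and the temperature value are all hypothetical. Only the binary signal set, the state-only conditioning, the use of Gumbel-Softmax, and the horizon of 50 come from the rows above.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample for the given unnormalized logits."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def signaling_scheme(state, params, temperature=1.0, rng=random):
    """Hypothetical tabular phi_eta(sigma | s) over the signal set {0, 1}.

    params[state] holds one logit per signal (an assumption for illustration).
    Returns the hard signal (argmax) and the relaxed sample it came from.
    """
    relaxed = gumbel_softmax_sample(params[state], temperature, rng)
    signal = max(range(len(relaxed)), key=relaxed.__getitem__)
    return signal, relaxed


HORIZON = 50  # fixed episode length quoted in the setup row

rng = random.Random(0)
params = {0: [2.0, -2.0], 1: [-2.0, 2.0]}  # illustrative logits, not from the paper
# Toy rollout: alternate between two states and record the sender's signal.
signals = [signaling_scheme(t % 2, params, rng=rng)[0] for t in range(HORIZON)]
```

The relaxed sample is what a gradient-based learner would differentiate through, while the argmax gives the discrete signal actually sent. A real implementation would replace the lookup table with a learned network and an autodiff framework.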