Asynchronous Gradient Play in Zero-Sum Multi-agent Games

Authors: Ruicheng Ao, Shicong Cen, Yuejie Chi

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we verify our theoretical findings by investigating the performance of both single-timescale and two-timescale OMWU on randomly generated zero-sum entropy-regularized polymatrix games with n = 10, |S_i| = 10, ∀ i ∈ V and τ = 0.1. For each (i, j) ∈ E, we set A_ij = −A_ji^⊤ with entries of A_ij independently sampled from the uniform distribution over [−1, 1]. All the results are averaged over five independent runs. In Fig. 1 (a), we compare the performance of single-timescale OMWU in both synchronous and asynchronous settings, with delay uniformly sampled from {0, 1, ..., 10}.
Researcher Affiliation | Academia | Ruicheng Ao, Peking University, archer_arc@pku.edu.cn; Shicong Cen & Yuejie Chi, Carnegie Mellon University, {shicongc,yuejiec}@andrew.cmu.edu
Pseudocode | Yes | Algorithm 1: Entropy-regularized OMWU, agent i
Open Source Code | No | The paper does not provide an explicit statement about releasing the source code for the methodology or a link to a code repository.
Open Datasets | No | In this section, we verify our theoretical findings by investigating the performance of both single-timescale and two-timescale OMWU on randomly generated zero-sum entropy-regularized polymatrix games with n = 10, |S_i| = 10, ∀ i ∈ V and τ = 0.1. For each (i, j) ∈ E, we set A_ij = −A_ji^⊤ with entries of A_ij independently sampled from the uniform distribution over [−1, 1].
Dataset Splits | No | The paper describes using 'randomly generated' games for numerical experiments but does not specify any dataset splits (e.g., training, validation, test percentages or counts).
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software components with version numbers used for its implementation or experiments.
Experiment Setup | Yes | In this section, we verify our theoretical findings by investigating the performance of both single-timescale and two-timescale OMWU on randomly generated zero-sum entropy-regularized polymatrix games with n = 10, |S_i| = 10, ∀ i ∈ V and τ = 0.1. For each (i, j) ∈ E, we set A_ij = −A_ji^⊤ with entries of A_ij independently sampled from the uniform distribution over [−1, 1]. All the results are averaged over five independent runs. ... We adopt the optimal learning rate η from {0.1, 0.05, 0.02, 0.01, ...} that yields the highest accuracy.
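
Since no code is released, the game-generation step of the quoted setup can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the zero-sum condition A_ji = −A_ij^⊤ with entries uniform on [−1, 1]. The function names `random_zero_sum_polymatrix` and `sample_delays`, the seeds, and the default complete-graph edge set are hypothetical, because the quoted text does not say which graph (V, E) was used.

```python
import numpy as np


def random_zero_sum_polymatrix(n=10, num_actions=10, edges=None, seed=0):
    """Sample pairwise payoff matrices for a zero-sum polymatrix game.

    Returns a dict mapping each ordered pair (i, j) to agent i's payoff
    matrix against agent j, with payoffs[(j, i)] == -payoffs[(i, j)].T.
    """
    rng = np.random.default_rng(seed)
    if edges is None:
        # Assumption: complete graph; the quoted setup does not specify E.
        edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
    payoffs = {}
    for i, j in edges:
        # NumPy samples from the half-open interval [-1, 1).
        a_ij = rng.uniform(-1.0, 1.0, size=(num_actions, num_actions))
        payoffs[(i, j)] = a_ij
        payoffs[(j, i)] = -a_ij.T  # zero-sum condition on the edge
    return payoffs


def sample_delays(n, max_delay=10, seed=0):
    """Draw one integer delay per agent, uniform over {0, ..., max_delay}."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, max_delay + 1, size=n)


payoffs = random_zero_sum_polymatrix()
delays = sample_delays(n=10)
```

With this construction, the payoffs summed over all agents cancel for any joint strategy profile, which is a quick sanity check before running any learning dynamics on the sampled game.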