reproducibilityindex.ai

A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning

Authors: Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon Shaolei Du

ICLR 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Our algorithms can achieve $\tilde{O}(\Delta^{1/4} T^{3/4})$ regret when the degree of nonstationarity, as measured by total variation $\Delta$, is known, and $\tilde{O}(\Delta^{1/5} T^{4/5})$ regret when $\Delta$ is unknown, where $T$ is the number of rounds. Meanwhile, our algorithm inherits the favorable dependence on number of agents from the oracles. As a side contribution that may be of independent interest, we show how to test for various types of equilibria by a black-box reduction to single-agent learning, which includes Nash equilibria, correlated equilibria, and coarse correlated equilibria.
Researcher Affiliation	Academia	Haozhe Jiang (1), Qiwen Cui (2), Zhihan Xiong (2), Maryam Fazel (2), Simon S. Du (2). (1) Institute for Interdisciplinary Information Sciences, Tsinghua University; (2) University of Washington
Pseudocode	Yes	Algorithm 1: Restarted Explore-then-Commit for Non-stationary MARL; Algorithm 2: Multi-scale Testing for Non-stationary MARL; Protocol 1: TEST_EQ; Protocol 2: Scheduling TEST_EQ in a block with length $2^n$
Open Source Code	No	The paper does not provide any statement or link regarding the availability of open-source code for the described methodology.
Open Datasets	No	The paper is theoretical and does not conduct empirical studies that would involve training on specific datasets. It discusses theoretical bounds and algorithms.
Dataset Splits	No	The paper is theoretical and does not conduct empirical studies that would involve dataset splits for training, validation, or testing.
Hardware Specification	No	The paper focuses on theoretical contributions and algorithm design; it does not report on empirical experiments requiring specific hardware specifications.
Software Dependencies	No	The paper is theoretical and does not detail specific software dependencies with version numbers required to reproduce experiments.
Experiment Setup	No	The paper is theoretical and does not conduct empirical experiments, thus no details regarding hyperparameters, training configurations, or system-level settings for experiments are provided.
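
The two regret rates quoted in the Research Type row can be compared numerically. The sketch below is illustrative only and is not code from the paper: it evaluates the leading terms $\Delta^{1/4} T^{3/4}$ and $\Delta^{1/5} T^{4/5}$ with constants and logarithmic factors dropped (an assumption made here for simplicity), and the function names are hypothetical. Dividing one rate by the other gives $(T/\Delta)^{1/20}$, which is the multiplicative cost of not knowing $\Delta$ under these simplifications.

```python
# Illustrative comparison of the two regret rates quoted in the table.
# Assumption: constants and log factors hidden by the tilde-O notation are
# ignored, so only the leading polynomial terms are evaluated.

def regret_known(delta: float, T: float) -> float:
    """Leading term Delta^(1/4) * T^(3/4) (total variation Delta known)."""
    return delta ** 0.25 * T ** 0.75


def regret_unknown(delta: float, T: float) -> float:
    """Leading term Delta^(1/5) * T^(4/5) (total variation Delta unknown)."""
    return delta ** 0.2 * T ** 0.8


def unknown_to_known_ratio(delta: float, T: float) -> float:
    """Ratio of the two leading terms, which simplifies to (T / Delta)^(1/20)."""
    return (T / delta) ** 0.05


if __name__ == "__main__":
    T = 1_000_000.0
    for delta in (1.0, 100.0, 10_000.0):
        known = regret_known(delta, T)
        unknown = regret_unknown(delta, T)
        ratio = unknown_to_known_ratio(delta, T)
        print(f"Delta={delta:>8.0f}  known={known:.3e}  "
              f"unknown={unknown:.3e}  ratio={ratio:.3f}")
```

Both leading terms are sublinear in $T$ whenever $\Delta$ grows more slowly than $T$, and the two rates coincide when $\Delta = T$.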