Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty

Authors: Laixi Shi, Eric Mazumdar, Yuejie Chi, Adam Wierman

ICML 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Assuming a non-adaptive sampling mechanism from a generative model, we propose a sample-efficient model-based algorithm (DR-NVI) with finite-sample complexity guarantees for learning robust variants of various notions of game-theoretic equilibria. We also establish an information-theoretic lower bound for solving RMGs, which confirms the near-optimal sample complexity of DR-NVI with respect to problem-dependent factors such as the size of the state space, the target accuracy, and the horizon length.
Researcher Affiliation	Academia	Laixi Shi¹, Eric Mazumdar¹, Yuejie Chi², Adam Wierman¹. ¹Department of Computing and Mathematical Sciences, California Institute of Technology, CA 91125, USA. ²Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Pseudocode	Yes	Algorithm 1 Distributionally robust equilibrium value iteration (DR-NVI).
Open Source Code	No	The paper does not mention providing any open-source code for the described methodology.
Open Datasets	No	The paper is theoretical and does not conduct experiments with specific datasets. It mentions assuming access to a "generative model" for theoretical sampling, but this is not a publicly available dataset with concrete access information.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments involving dataset splits (training, validation, or test data).
Hardware Specification	No	The paper is theoretical and does not describe any specific hardware used for running experiments.
Software Dependencies	No	The paper is theoretical and does not list any specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup, hyperparameters, or training settings for practical implementation or experiments.