Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty
Authors: Laixi Shi, Eric Mazumdar, Yuejie Chi, Adam Wierman
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Assuming a non-adaptive sampling mechanism from a generative model, we propose a sample-efficient model-based algorithm (DR-NVI) with finite-sample complexity guarantees for learning robust variants of various notions of game-theoretic equilibria. We also establish an information-theoretic lower bound for solving RMGs, which confirms the near-optimal sample complexity of DR-NVI with respect to problem-dependent factors such as the size of the state space, the target accuracy, and the horizon length. |
| Researcher Affiliation | Academia | Laixi Shi¹, Eric Mazumdar¹, Yuejie Chi², Adam Wierman¹. ¹Department of Computing and Mathematical Sciences, California Institute of Technology, CA 91125, USA. ²Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA. |
| Pseudocode | Yes | Algorithm 1 Distributionally robust equilibrium value iteration (DR-NVI). |
| Open Source Code | No | The paper does not mention providing any open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with specific datasets. It mentions assuming access to a "generative model" for theoretical sampling, but this is not a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments involving dataset splits (training, validation, or test data). |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not list any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or training settings for practical implementation or experiments. |