Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Multi-Agent Reinforcement Learning in Stochastic Networked Systems
Authors: Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we propose a Scalable Actor Critic framework that applies in settings where the dependencies can be non-local and stochastic, and provide a finite-time error bound that shows how the convergence rate depends on the speed of information spread in the network. Additionally, as a byproduct of our analysis, we obtain novel finite-time convergence results for a general stochastic approximation scheme and for temporal difference learning with state aggregation, which apply beyond the setting of MARL in networked systems. |
| Researcher Affiliation | Academia | Yiheng Lin (CMS, Caltech); Guannan Qu (CMS, Caltech); Longbo Huang (IIIS, Tsinghua University); Adam Wierman (CMS, Caltech) |
| Pseudocode | Yes | Algorithm 1 Scalable Actor Critic |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of source code for the described methodology. |
| Open Datasets | No | The paper focuses on theoretical analysis and algorithm design rather than empirical evaluation with datasets. Therefore, it does not mention specific datasets or their public availability for training. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical data splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper focuses on theoretical analysis and algorithm design, and thus does not list specific software dependencies with version numbers for experimental setup. |
| Experiment Setup | No | The paper is theoretical, presenting an algorithm and its convergence properties, rather than detailing an empirical experimental setup with hyperparameters or system-level training settings. |