Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization
Authors: Shicong Cen, Yuting Wei, Yuejie Chi
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1 illustrates the performance of the proposed PU and OMWU methods for solving randomly generated entropy-regularized matrix games. It is evident that both algorithms converge linearly, and achieve faster convergence rates when the regularization parameter increases. Figure 2 illustrates the performance of Algorithm 3 for solving a random generated entropy-regularized Markov game with |A| = |B| = 20, |S| = 100 and γ = 0.99 with varying choices of Tmain, Tsub and τ. |
| Researcher Affiliation | Academia | Shicong Cen EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University; Yuting Wei EMAIL Department of Statistics and Data Science, The Wharton School University of Pennsylvania; Yuejie Chi EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1: The PU method; Algorithm 2: The OMWU method; Algorithm 3: Policy Extragradient Method applied to Value Iteration for Entropy-regularized Markov Game |
| Open Source Code | No | The paper does not explicitly state that source code is provided or offer a link to a code repository. The mention of 'License: CC-BY 4.0' refers to the license for the paper itself, not the code. |
| Open Datasets | No | The paper describes generating synthetic data for performance illustration (e.g., 'randomly generated entropy-regularized matrix games' and 'random generated entropy-regularized Markov game') but does not use or provide access to any external public datasets. |
| Dataset Splits | No | The paper uses randomly generated data for illustration and does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the performance illustrations (e.g., CPU, GPU models, memory, or cluster specifications). |
| Software Dependencies | No | The paper does not mention any specific software dependencies or versions (e.g., programming languages, libraries, or solvers with version numbers) used for the experiments. |
| Experiment Setup | Yes | Figure 1: The learning rates are fixed as η = 0.1. ... with the entropy regularization parameter τ = 0.01 ... at 1000-th iteration with different choices of τ. Figure 2: The learning rates of both players are fixed as η = 0.005. ... with varying choices of Tmain, Tsub and τ. Algorithm 3, Step 4: ...where the initialization is set as uniform distributions. |
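
The Experiment Setup row reports a fixed learning rate η = 0.1, an entropy regularization parameter τ, randomly generated matrix games, and uniform initialization. A minimal sketch of that kind of experiment follows, offered as an illustration rather than the authors' code, which this page reports as unavailable. Several things here are assumptions not stated above: the update is a generic entropy-regularized extragradient multiplicative-weights step written from the general form of this family of methods, the payoff entries are drawn uniformly from [-1, 1], the game size is 10 × 10, and τ = 1.0 is chosen only so the run converges quickly. All function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax of a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def payoff_max(A, nu):
    """(A nu)_a: payoff vector seen by the max player."""
    return [sum(a * n for a, n in zip(row, nu)) for row in A]


def payoff_min(A, mu):
    """(A^T mu)_b: payoff vector seen by the min player."""
    return [sum(A[i][j] * mu[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def mw_step(policy, grad, eta, tau, sign):
    """Assumed regularized multiplicative-weights step, done in log space:
    new(a) is proportional to policy(a)^(1 - eta*tau) * exp(sign*eta*grad(a))."""
    logits = [(1.0 - eta * tau) * math.log(p) + sign * eta * g
              for p, g in zip(policy, grad)]
    return softmax(logits)


def qre_residual(A, mu, nu, tau):
    """L1 distance of (mu, nu) from the softmax fixed-point conditions
    mu ~ softmax(A nu / tau), nu ~ softmax(-A^T mu / tau)."""
    mu_fp = softmax([g / tau for g in payoff_max(A, nu)])
    nu_fp = softmax([-g / tau for g in payoff_min(A, mu)])
    return (sum(abs(x - y) for x, y in zip(mu, mu_fp))
            + sum(abs(x - y) for x, y in zip(nu, nu_fp)))


def run_extragradient(A, eta, tau, num_iters):
    """Run the sketch from uniform policies; return final policies and
    the residual recorded before the first step and after every step."""
    m, n = len(A), len(A[0])
    mu = [1.0 / m] * m  # uniform initialization, as in the setup row
    nu = [1.0 / n] * n
    residuals = [qre_residual(A, mu, nu, tau)]
    for _ in range(num_iters):
        # Prediction step: both players respond to the current opponent.
        mu_mid = mw_step(mu, payoff_max(A, nu), eta, tau, +1.0)
        nu_mid = mw_step(nu, payoff_min(A, mu), eta, tau, -1.0)
        # Correction step: restart from the current policies, but use
        # the opponent's predicted policy to form the payoff vector.
        mu, nu = (mw_step(mu, payoff_max(A, nu_mid), eta, tau, +1.0),
                  mw_step(nu, payoff_min(A, mu_mid), eta, tau, -1.0))
        residuals.append(qre_residual(A, mu, nu, tau))
    return mu, nu, residuals


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical 10 x 10 game with entries uniform in [-1, 1].
    A = [[random.uniform(-1.0, 1.0) for _ in range(10)] for _ in range(10)]
    _, _, res = run_extragradient(A, eta=0.1, tau=1.0, num_iters=200)
    print("residual before:", res[0], "after:", res[-1])
```

Plotting `residuals` on a log scale for several values of τ would give a rough analogue of the linear-convergence curves described for Figure 1. With τ = 0.01, the value quoted in the setup row, the same sketch is expected to need many more iterations, since the per-step shrinkage factor 1 − ητ is then very close to 1.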