Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

Authors: Shicong Cen, Yuting Wei, Yuejie Chi

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Figure 1 illustrates the performance of the proposed PU and OMWU methods for solving randomly generated entropy-regularized matrix games. It is evident that both algorithms converge linearly, and achieve faster convergence rates when the regularization parameter increases. Figure 2 illustrates the performance of Algorithm 3 for solving a random generated entropy-regularized Markov game with |A| = |B| = 20, |S| = 100 and γ = 0.99 with varying choices of Tmain, Tsub and τ.
Researcher Affiliation	Academia	Shicong Cen EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University; Yuting Wei EMAIL Department of Statistics and Data Science, The Wharton School University of Pennsylvania; Yuejie Chi EMAIL Department of Electrical and Computer Engineering Carnegie Mellon University
Pseudocode	Yes	Algorithm 1: The PU method; Algorithm 2: The OMWU method; Algorithm 3: Policy Extragradient Method applied to Value Iteration for Entropy-regularized Markov Game
Open Source Code	No	The paper does not explicitly state that source code is provided or offer a link to a code repository. The mention of 'License: CC-BY 4.0' refers to the license for the paper itself, not the code.
Open Datasets	No	The paper describes generating synthetic data for performance illustration (e.g., 'randomly generated entropy-regularized matrix games' and 'random generated entropy-regularized Markov game') but does not use or provide access to any external public datasets.
Dataset Splits	No	The paper uses randomly generated data for illustration and does not specify any training, validation, or test dataset splits.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the performance illustrations (e.g., CPU, GPU models, memory, or cluster specifications).
Software Dependencies	No	The paper does not mention any specific software dependencies or versions (e.g., programming languages, libraries, or solvers with version numbers) used for the experiments.
Experiment Setup	Yes	Figure 1: The learning rates are fixed as η = 0.1. ... with the entropy regularization parameter τ = 0.01 ... at 1000-th iteration with different choices of τ. Figure 2: The learning rates of both players are fixed as η = 0.005. ... with varying choices of Tmain, Tsub and τ. Algorithm 3, Step 4: ...where the initialization is set as uniform distributions.
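
The Experiment Setup row fixes a learning rate η = 0.1, a regularization parameter τ = 0.01, and uniform initialization for the matrix-game experiments. Since the table reports that no code is released, the following is a minimal sketch of how such a run might be set up, not the authors' implementation. Everything beyond those three settings is an assumption made for illustration: the two-step entropy-regularized multiplicative update (written in the spirit of an extragradient method), the payoff convention (the row player maximizes μᵀAν + τH(μ) − τH(ν)), the uniform [0, 1] payoff entries, the 5 × 5 game size, and all function names. The paper's Algorithm 1 remains the authoritative definition of the PU method.

```python
import math
import random


def log_normalize(logits):
    """Shift logits so that exp(logits) sums to one (log-sum-exp trick)."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]


def entropy_step(log_policy, payoff, eta, tau, sign):
    """Log of policy^(1 - eta*tau) * exp(sign * eta * payoff), normalized.

    Assumed update form; not taken from the index page.
    """
    return log_normalize(
        [(1.0 - eta * tau) * lp + sign * eta * g for lp, g in zip(log_policy, payoff)]
    )


def row_payoff(A, nu):
    """A @ nu: payoff of each row action against the column policy nu."""
    return [sum(a * q for a, q in zip(row, nu)) for row in A]


def col_payoff(A, mu):
    """A.T @ mu: payoff of each column action against the row policy mu."""
    return [sum(A[i][j] * mu[i] for i in range(len(A))) for j in range(len(A[0]))]


def extragradient_sketch(A, eta=0.1, tau=0.01, iters=1000):
    """Hypothetical two-step (predict, then correct) update for a matrix game."""
    m, n = len(A), len(A[0])
    log_mu = [-math.log(m)] * m  # uniform initialization, as stated in the table
    log_nu = [-math.log(n)] * n
    for _ in range(iters):
        mu = [math.exp(x) for x in log_mu]
        nu = [math.exp(x) for x in log_nu]
        # Prediction step: each player responds to the opponent's current policy.
        log_mu_bar = entropy_step(log_mu, row_payoff(A, nu), eta, tau, +1.0)
        log_nu_bar = entropy_step(log_nu, col_payoff(A, mu), eta, tau, -1.0)
        mu_bar = [math.exp(x) for x in log_mu_bar]
        nu_bar = [math.exp(x) for x in log_nu_bar]
        # Correction step: redo the update from the current policy,
        # but against the opponent's predicted policy.
        log_mu = entropy_step(log_mu, row_payoff(A, nu_bar), eta, tau, +1.0)
        log_nu = entropy_step(log_nu, col_payoff(A, mu_bar), eta, tau, -1.0)
    return [math.exp(x) for x in log_mu], [math.exp(x) for x in log_nu]


def qre_residual(A, mu, nu, tau):
    """Max deviation from the fixed-point conditions
    mu proportional to exp(A nu / tau), nu proportional to exp(-A^T mu / tau)."""
    best_mu = [math.exp(x) for x in log_normalize([g / tau for g in row_payoff(A, nu)])]
    best_nu = [math.exp(x) for x in log_normalize([-g / tau for g in col_payoff(A, mu)])]
    return max(
        max(abs(a - b) for a, b in zip(mu, best_mu)),
        max(abs(a - b) for a, b in zip(nu, best_nu)),
    )


# Demo on a small random game (size and payoff distribution are assumptions).
random.seed(0)
A = [[random.random() for _ in range(5)] for _ in range(5)]
mu, nu = extragradient_sketch(A, eta=0.1, tau=1.0, iters=300)
print("fixed-point residual:", qre_residual(A, mu, nu, tau=1.0))
```

The demo uses τ = 1.0 rather than the reported τ = 0.01 only so that a few hundred iterations are enough to approach the fixed point; this is consistent with the table's observation that convergence is faster when the regularization parameter increases. Policies are kept in the log domain to avoid underflow when some actions receive very small probability.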