Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

Authors: Shicong Cen, Yuting Wei, Yuejie Chi

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Figure 1: Performance illustration of the PU and OMWU methods for solving entropy-regularized matrix games with |A| = |B| = 100, where the entries of the payoff matrix A are generated independently from the uniform distribution on [−1, 1]. The learning rates are fixed as η = 0.1. The left panel plots various error metrics of convergence w.r.t. the iteration count with τ = 0.01, while the right panel plots these error metrics at the 1000-th iteration with different choices of τ.
Researcher Affiliation | Academia | Shicong Cen (Carnegie Mellon University, EMAIL); Yuting Wei (University of Pennsylvania, EMAIL); Yuejie Chi (Carnegie Mellon University, EMAIL)
Pseudocode | Yes | Algorithm 1: The PU method; Algorithm 2: The OMWU method; Algorithm 3: Policy Extragradient Method for Entropy-regularized Markov Game
Open Source Code | No | The paper does not provide any specific links to open-source code or explicit statements about code availability for the described methodology.
Open Datasets | No | The paper uses synthetic data generated internally for its performance illustration (Figure 1), stating 'where the entries of the payoff matrix A is generated independently from the uniform distribution on [−1, 1]'. It does not use or provide access information for a publicly available dataset.
Dataset Splits | No | The paper's performance illustration uses synthetically generated data and does not specify training, validation, or test splits. The focus is on theoretical convergence rates demonstrated with this generated data.
Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | The learning rates are fixed as η = 0.1.
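The experiment setup quoted above (a 100 × 100 payoff matrix with entries drawn uniformly from [−1, 1], η = 0.1, τ = 0.01) is straightforward to reproduce in outline. The sketch below is not the authors' exact Algorithm 1 or 2: it is a generic extragradient-style multiplicative-weights iteration for the entropy-regularized matrix game, with the duality gap of the regularized game used as a stand-in for the paper's error metrics. The update rules, iteration count, and gap formula are assumptions based on the standard formulation, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
A = rng.uniform(-1.0, 1.0, size=(n, n))  # payoff matrix, entries ~ U[-1, 1]

eta, tau, T = 0.1, 0.01, 1000  # learning rate, entropy regularization, iterations


def softmax(z):
    z = z - z.max()  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()


def lse(z):
    # numerically stable log-sum-exp
    m = z.max()
    return m + np.log(np.exp(z - m).sum())


def reg_gap(x, y):
    """Duality gap of f(x, y) = x^T A y + tau*H(x) - tau*H(y); always >= 0."""
    Hx = -(x * np.log(x)).sum()
    Hy = -(y * np.log(y)).sum()
    best_x = tau * lse((A @ y) / tau)      # max over x' of x'^T A y + tau*H(x')
    best_y = -tau * lse(-(A.T @ x) / tau)  # min over y' of x^T A y' - tau*H(y')
    return (best_x - tau * Hy) - (best_y + tau * Hx)


x = np.full(n, 1.0 / n)  # max player's mixed strategy, uniform start
y = np.full(n, 1.0 / n)  # min player's mixed strategy, uniform start
gap_init = reg_gap(x, y)

for _ in range(T):
    # extrapolation step: entropy-damped multiplicative weights
    x_half = softmax((1 - eta * tau) * np.log(x) + eta * (A @ y))
    y_half = softmax((1 - eta * tau) * np.log(y) - eta * (A.T @ x))
    # update step, evaluated at the extrapolated strategies
    x = softmax((1 - eta * tau) * np.log(x) + eta * (A @ y_half))
    y = softmax((1 - eta * tau) * np.log(y) - eta * (A.T @ x_half))

gap_final = reg_gap(x, y)
print(f"regularized duality gap: {gap_init:.4f} -> {gap_final:.4f}")
```

The `(1 - eta * tau) * np.log(x)` term implements the `x ** (1 - eta*tau)` entropy damping in log space before renormalizing, which keeps the iterates strictly positive and avoids overflow; the gap is evaluated in closed form because the entropy-regularized best response over the simplex is a softmax.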