Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization
Authors: Shicong Cen, Yuting Wei, Yuejie Chi
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Performance illustration of the PU and OMWU methods for solving entropy-regularized matrix games with |A| = |B| = 100, where the entries of the payoff matrix A is generated independently from the uniform distribution on [-1, 1]. The learning rates are fixed as η = 0.1. The left panel plots various error metrics of convergence w.r.t. the iteration count with τ = 0.01, while the right panel plots these error metrics at 1000-th iteration with different choices of τ. |
| Researcher Affiliation | Academia | Shicong Cen, Carnegie Mellon University (shicongc@andrew.cmu.edu); Yuting Wei, University of Pennsylvania (ytwei@wharton.upenn.edu); Yuejie Chi, Carnegie Mellon University (yuejiechi@cmu.edu) |
| Pseudocode | Yes | Algorithm 1: The PU method; Algorithm 2: The OMWU method; Algorithm 3: Policy Extragradient Method for Entropy-regularized Markov Game |
| Open Source Code | No | The paper does not provide any specific links to open-source code or explicit statements about code availability for the described methodology. |
| Open Datasets | No | The paper uses synthetic data generated internally for its performance illustration (Figure 1), stating 'where the entries of the payoff matrix A is generated independently from the uniform distribution on [-1, 1]'. It does not use or provide access information for a publicly available dataset. |
| Dataset Splits | No | The paper's performance illustration uses synthetically generated data and does not specify training, validation, or test splits. The focus is on theoretical convergence rates demonstrated with this generated data. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | The learning rates are fixed as η = 0.1. |
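
Since no code is released, the synthetic setup described in the table can be approximated with a short script. The following is a minimal sketch, not the authors' implementation: the table names Algorithm 1 (the PU method) but does not reproduce it, so the update below is an assumed entropy-regularized extragradient-style multiplicative update. The function names (`random_payoff`, `extragradient_sketch`, `qre_residual`), the uniform initialization, and the fixed-point residual used as an error metric are all illustrative assumptions.

```python
import math
import random


def random_payoff(n_rows, n_cols, seed=0):
    """Payoff matrix with i.i.d. entries drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]


def matvec(A, v):
    """Compute A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def rmatvec(A, u):
    """Compute A^T u."""
    return [sum(A[i][j] * u[i] for i in range(len(A))) for j in range(len(A[0]))]


def _log_normalize(logits):
    """Shift logits so that their exponentials sum to one (log-sum-exp trick)."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def _update(log_policy, grad, eta, tau, sign):
    """Assumed update: new policy proportional to policy^(1 - eta*tau) * exp(sign * eta * grad)."""
    scale = 1.0 - eta * tau
    return _log_normalize([scale * lp + sign * eta * g for lp, g in zip(log_policy, grad)])


def extragradient_sketch(A, eta=0.1, tau=0.01, iters=1000):
    """Run the assumed two-step (predict, then update) scheme from uniform policies."""
    m, n = len(A), len(A[0])
    log_mu = [-math.log(m)] * m  # max player, uniform start (assumption)
    log_nu = [-math.log(n)] * n  # min player, uniform start (assumption)
    for _ in range(iters):
        mu = [math.exp(x) for x in log_mu]
        nu = [math.exp(x) for x in log_nu]
        # Prediction step: midpoint policies from the current payoffs.
        mu_bar = [math.exp(x) for x in _update(log_mu, matvec(A, nu), eta, tau, +1.0)]
        nu_bar = [math.exp(x) for x in _update(log_nu, rmatvec(A, mu), eta, tau, -1.0)]
        # Update step: same base policies, payoffs evaluated at the midpoint.
        log_mu = _update(log_mu, matvec(A, nu_bar), eta, tau, +1.0)
        log_nu = _update(log_nu, rmatvec(A, mu_bar), eta, tau, -1.0)
    return [math.exp(x) for x in log_mu], [math.exp(x) for x in log_nu]


def qre_residual(A, mu, nu, tau):
    """Largest entrywise gap between each policy and its regularized best response."""
    best_mu = [math.exp(x) for x in _log_normalize([g / tau for g in matvec(A, nu)])]
    best_nu = [math.exp(x) for x in _log_normalize([-g / tau for g in rmatvec(A, mu)])]
    gap_mu = max(abs(a - b) for a, b in zip(mu, best_mu))
    gap_nu = max(abs(a - b) for a, b in zip(nu, best_nu))
    return max(gap_mu, gap_nu)


if __name__ == "__main__":
    # Small instance for a quick check; the figure's setting is 100 x 100.
    A = random_payoff(10, 10)
    mu, nu = extragradient_sketch(A, eta=0.1, tau=0.01, iters=1000)
    print("fixed-point residual:", qre_residual(A, mu, nu, 0.01))
```

Calling `extragradient_sketch(random_payoff(100, 100), eta=0.1, tau=0.01, iters=1000)` would mirror the sizes and parameters quoted for Figure 1, although this pure-Python version is slow at that scale, and the residual here is only a stand-in for the error metrics the paper plots.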