Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games
Authors: Kaiqing Zhang, Zhuoran Yang, Tamer Basar
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulation results are also provided to illustrate the satisfactory convergence properties of the algorithms. |
| Researcher Affiliation | Academia | Kaiqing Zhang, ECE and CSL, University of Illinois at Urbana-Champaign, kzhang66@illinois.edu; Zhuoran Yang, ORFE, Princeton University, zy6@princeton.edu; Tamer Başar, ECE and CSL, University of Illinois at Urbana-Champaign, basar1@illinois.edu |
| Pseudocode | No | The paper describes algorithms using mathematical equations and textual explanations, but does not present them in a structured pseudocode block or algorithm environment. |
| Open Source Code | No | No explicit statement or link providing concrete access to source code for the methodology described in this paper was found. |
| Open Datasets | No | The paper describes two simulation settings (Case 1 and Case 2) with specific matrix parameters, which are "created based on the simulations in [35]". It does not refer to a publicly available dataset with concrete access information (link, DOI, formal citation for the dataset itself). |
| Dataset Splits | No | The paper describes simulation settings with defined system parameters but does not provide specific train/validation/test dataset splits or methodologies for data partitioning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, cloud resources with specs) used for running its simulations or experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies (e.g., programming languages, libraries, or solvers with version numbers) used for its implementation or simulations. |
| Experiment Setup | No | While the paper defines system parameters for its simulations (matrices A, B, C, Q, R^u, R^v, Σ_0), it does not provide specific hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or detailed training configurations for the experimental setup presented in Section 6. |