Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value

Authors: Young Wu, Jeremy Mcmahan, Yiding Chen, Yudong Chen, Jerry Zhu, Qiaomin Xie

ICML 2024

Reproducibility Variable Result LLM Response
Research Type Experimental 5. Experiments: 'We run Algorithm 1 on several small normal-form games such as two-finger Morra and five-action rock-paper-scissors games.' 'We run Algorithm 1 and Algorithm 2 on several games to illustrate the efficacy of our techniques.' The paper also reports results in Figure 1 (Convergence to Optimal Cost), Figure 2 (Scale Benchmark for Number of Actions), and Figure 3 (Scale Benchmark for Number of Periods).
Researcher Affiliation Academia (1) Department of Computer Sciences, University of Wisconsin-Madison, Madison, Wisconsin, United States; (2) Department of Computer Science, Cornell University, Ithaca, New York, United States; (3) Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States.
Pseudocode Yes Algorithm 1 Relax And Perturb (RAP) and Algorithm 2 Relax And Perturb for Markov Games (RAP-MG)
Open Source Code Yes Our code is available at: https://github.com/YoungWu559/game-modification.
Open Datasets No The paper states 'For each $m \in \{2, 4, 8, \ldots, 512\}$ we generate $N = 5$ random matrices $R \sim \mathrm{uniform}[-1, 1]^{m \times m}$.' and 'we generate $N = 5$ random Markov games and corresponding target NE pairs with full support.' This indicates that the data was generated for the experiments rather than taken from a pre-existing public dataset, and no access information for such a dataset is provided.
Dataset Splits No The paper describes generating random game matrices for experiments and testing the algorithm's performance, but it does not specify any training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined split methodologies) for reproducibility.
Hardware Specification No The paper states, 'Using the Gurobi LP solver, even on a laptop computer, the algorithm handles millions of variables (512^2) in roughly 10 seconds.' However, this does not provide specific hardware details such as CPU/GPU models, memory, or detailed computer specifications used for running the experiments.
Software Dependencies No The paper states, 'We conducted our experiments using standard python3 libraries and the gurobi optimization package.' However, it does not specify version numbers for Python, its libraries, or the Gurobi package, which are necessary for reproducible software dependencies.
Experiment Setup Yes Parameters: margins $\iota \in \mathbb{R}$ and $\lambda \in \mathbb{R}$. Notice we introduced a small SIISOW margin parameter $\iota > 0$ in (4d) and (4e)... A margin $\lambda$ is also added to the reward bound (4g)... We considered $\iota$ and $\lambda$ of the form $10^{-i}$ for $i \in \{0, \ldots, 15\}$.
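
The data generation and parameter sweep quoted in the table can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The function names `generate_random_games` and `margin_grid`, the seed, and the nested-list matrix representation are all hypothetical. The sketch also assumes that the margins take the form 10^(-i), and it does not implement Algorithm 1 or Algorithm 2.

```python
import random


def generate_random_games(sizes, n_games=5, seed=0):
    """Return {m: [n_games matrices]} with entries drawn uniformly from [-1, 1].

    Each matrix is an m x m nested list. The seed is an assumption made
    here for repeatability; the paper does not report one.
    """
    rng = random.Random(seed)
    games = {}
    for m in sizes:
        games[m] = [
            [[rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(m)]
            for _ in range(n_games)
        ]
    return games


def margin_grid(max_exponent=15):
    """Return candidate margins 10^(-i) for i = 0, ..., max_exponent.

    The negative exponent is an assumption about the intended grid.
    """
    return [10.0 ** (-i) for i in range(max_exponent + 1)]


# Matrix sizes m = 2, 4, 8, ..., 512 as quoted in the Open Datasets row.
sizes = [2 ** k for k in range(1, 10)]
games = generate_random_games(sizes)
margins = margin_grid()
```

Each entry of `games[m]` could then be passed to a solver together with a pair of margins from `margins`. Because the paper reports no random seed, a run of this sketch would not reproduce the exact matrices used in the experiments.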