Convex-Concave Zero-Sum Markov Stackelberg Games
Authors: Denizalp Goktas, Arjun Prakash, Amy Greenwald
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also prove that reach-avoid problems are naturally modeled as convex-concave zero-sum Markov Stackelberg games, and show experimentally that Stackelberg equilibrium policies are more effective than their Nash counterparts in these problems. |
| Researcher Affiliation | Academia | Denizalp Goktas, Brown University, Computer Science, denizalp_goktas@brown.edu; Arjun Prakash, Brown University, Computer Science, arjun_prakash@brown.edu; Amy Greenwald, Brown University, Computer Science, amy_greenwald@brown.edu |
| Pseudocode | Yes | Algorithm 1 Saddle-Point-Oracle SGD/Nested SGDA |
| Open Source Code | Yes | Our code is found at: https://github.com/arjun-prakash/stackelberg-reach-avoid. |
| Open Datasets | No | The paper describes the setup for a reach-avoid game with specific parameters but does not refer to a publicly available dataset by name or provide access information (link, DOI, or specific citation). |
| Dataset Splits | No | The paper describes running '100 games' for evaluation but does not specify dataset splits (e.g., percentages or counts for training, validation, or testing). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with versions). |
| Experiment Setup | Yes | Our experiments were run on a 7x7 square grid, with the target set T a closed ball of radius 1 centered along the lower edge, and the avoid set V a closed ball of radius 0.3 around the antagonist. We set the bonus (resp. penalty) for reaching the target (resp. avoid set) β = 200, ! = 30, and = 0.25. |
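
The geometry in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. The grid size and the two radii come from the quoted setup. The coordinate system (a grid spanning [0, 7] on each axis), the target center at the midpoint of the lower edge, the use of Euclidean distance, and the names `in_target` and `in_avoid` are assumptions made for illustration.

```python
import math

GRID_SIZE = 7.0      # side length of the square grid (from the setup)
TARGET_RADIUS = 1.0  # radius of the target set T (from the setup)
AVOID_RADIUS = 0.3   # radius of the avoid set V (from the setup)

# Assumption: "centered along the lower edge" is read as the midpoint
# of the lower edge in a [0, 7] x [0, 7] coordinate system.
TARGET_CENTER = (GRID_SIZE / 2, 0.0)


def in_target(pos):
    """Return True if pos lies in the closed ball T (Euclidean, assumed)."""
    return math.dist(pos, TARGET_CENTER) <= TARGET_RADIUS


def in_avoid(pos, antagonist):
    """Return True if pos lies in the closed ball V around the antagonist."""
    return math.dist(pos, antagonist) <= AVOID_RADIUS
```

Under these assumptions, a position such as `(3.5, 0.5)` is inside the target set, and a position within 0.3 of the antagonist is inside the avoid set. The avoid set moves with the antagonist, so it has to be re-evaluated at every step of the game.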