Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Commitment to Sparse Strategies in Two-Player Games
Authors: Salam Afiouni, Jakub Černý, Chun Kai Ling, Christian Kroer
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our methods on synthetic and real-world scenarios based on security applications. In both settings, we observe that even for small support sizes, we can obtain more than 90% of the true Nash value while maintaining a reasonable runtime, demonstrating the significance of our formulation and algorithms. 5 Empirical Evaluation We empirically analyze the performance and scalability of our proposed methods, focusing on (i) wall-clock computational time, (ii) the utility that Player 1 achieves for different support sizes k, and (iii) comparisons to k-uniform strategies. |
| Researcher Affiliation | Academia | Salam Afiouni1, Jakub Černý1, Chun Kai Ling2, Christian Kroer1 1Columbia University, USA 2National University of Singapore, Singapore EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | No | No explicit pseudocode or algorithm blocks are provided in the main text. The methods are described using mathematical formulations and textual explanations. |
| Open Source Code | Yes | Code: github.com/CoffeeAndConvexity/SparseEquilibria |
| Open Datasets | No | The paper describes generating synthetic datasets and using real-world scenarios ("university campus path network", "maps from the real world") but does not provide specific access information (link, DOI, formal citation) for any publicly available or open datasets used in the experiments. |
| Dataset Splits | No | The paper describes generating various instances for evaluation (e.g., "30 instances were generated and solved" for random games, "30 instances" for patrolling games) rather than splitting a pre-existing dataset into training, validation, or test sets. Therefore, no dataset split information is provided. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (CPU, GPU models, memory, etc.) used for running the experiments. It only mentions the use of Gurobi for MILP solvers. |
| Software Dependencies | Yes | We use Gurobi (Gurobi Optimization, LLC 2023) for all MILP solvers. |
| Experiment Setup | Yes | For zero-sum games, each Aij is uniformly chosen in [10, 100]. In general-sum games, Aij and Bij are randomly chosen in [−50, 50], under the condition that Aij and Bij have opposite signs. For purposes of comparison, we normalize running times using t_normalized = (t − t_min) / (t_max − t_min), where t_min and t_max are the minimum and maximum runtimes respectively. We normalize the utility value by the Nash value for zero-sum games and the Stackelberg equilibrium (SE) for general-sum games. Specifically, u_normalized = (u − u_min) / (u_equi − u_min), where u_min is the utility at k = 1 and u_equi is the true value, i.e., k = ∞. Unless otherwise stated, for experiments with randomness, 30 instances were generated and solved. We report the standard errors in plots (usually negligible). Other experimental details are deferred to the Appendix. |
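The Experiment Setup row describes two payoff-generation schemes and two normalization formulas. The sketch below illustrates them as described; the function names, the use of NumPy, and the specific mechanism for enforcing opposite signs in the general-sum case are illustrative assumptions, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_zero_sum_payoffs(n, m):
    # Zero-sum case: each A_ij drawn uniformly from [10, 100].
    return rng.uniform(10, 100, size=(n, m))

def random_general_sum_payoffs(n, m):
    # General-sum case: A_ij and B_ij lie in [-50, 50] with opposite
    # signs. One way to realize this (an assumption on our part): draw
    # magnitudes in [0, 50], then pick a random sign for A and negate it for B.
    mag_a = rng.uniform(0, 50, size=(n, m))
    mag_b = rng.uniform(0, 50, size=(n, m))
    sign = rng.choice([-1.0, 1.0], size=(n, m))
    return sign * mag_a, -sign * mag_b

def normalize_runtime(t, t_min, t_max):
    # t_normalized = (t - t_min) / (t_max - t_min)
    return (t - t_min) / (t_max - t_min)

def normalize_utility(u, u_min, u_equi):
    # u_normalized = (u - u_min) / (u_equi - u_min), where u_min is the
    # utility at k = 1 and u_equi is the Nash (or SE) value at k = infinity.
    return (u - u_min) / (u_equi - u_min)
```

With this normalization, a method that reaches 90% of the true Nash value at some support size k reports `normalize_utility(u, u_min, u_equi) == 0.9`; both normalized quantities map their respective ranges onto [0, 1].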