Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Best of Both Worlds: Regret Minimization versus Minimax Play
Authors: Adrian Müller, Jon Schneider, Stratis Skoulakis, Luca Viano, Volkan Cevher
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally compare our Algorithm 2 for EFGs to the standard OMD algorithm with dilated KL (Kozuno et al., 2021) as well as to minimax play. Our evaluations confirm our theoretical findings, revealing that Algorithm 2 can achieve the best of both no-regret algorithms and minimax play. |
| Researcher Affiliation | Collaboration | ¹ETH Zürich, ²Google Research, ³Aarhus University, ⁴EPFL. Correspondence to: Adrian Müller <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Phased Aggression with Importance-Weighting |
| Open Source Code | Yes | We provide the code in the supplementary material. |
| Open Datasets | Yes | We consider Kuhn poker (Kuhn, 1950), which serves as a simple yet fundamental example of two-player zero-sum imperfect information EFGs. Kuhn poker is a common 3-card simplification of standard poker, where each player selects one card from the deck {Jack, Queen, King} without replacement and initially bets one unit. [Footnote 3: https://en.wikipedia.org/wiki/Kuhn_poker] |
| Dataset Splits | No | The paper describes experiments in a game environment (Kuhn Poker) involving repeated play for T rounds, but does not provide specific training/test/validation dataset splits typically found in data-driven experiments. |
| Hardware Specification | No | The paper mentions running experiments but does not specify any hardware details such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions 'dilated KL divergence' as part of the algorithm but does not list any specific software libraries or their version numbers. |
| Experiment Setup | Yes | In all algorithms, we used the same fixed learning rates (η 1/T) and the (unbalanced) dilated KL divergence for fairness and simplicity. We consider two types of experiments: First, we run the three algorithms against each other... Second, we evaluate how well each algorithm allows Alice to exploit exploitable strategies... We repeat each experiment 5 times. T = 1000 rounds. |
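
The Kuhn poker deal and the repetition scheme quoted in the table (T = 1000 rounds, 5 repetitions) can be illustrated with a minimal Python sketch. This is not the authors' supplementary code and it does not implement Algorithm 1, Algorithm 2, OMD, or minimax play. It only shows how cards are dealt without replacement and how a seeded experiment might be repeated. The function names, the seeding scheme, and the showdown-only statistic are assumptions made for illustration.

```python
import itertools
import random

# 3-card deck from the table, ordered from lowest to highest rank.
DECK = ["Jack", "Queen", "King"]
ANTE = 1  # each player initially bets one unit


def all_deals():
    """Enumerate every ordered deal (player 1 card, player 2 card) without replacement."""
    return list(itertools.permutations(DECK, 2))


def sample_deal(rng):
    """Draw one card per player from the deck without replacement."""
    card_1, card_2 = rng.sample(DECK, 2)
    return card_1, card_2


def showdown_winner(deal):
    """Return 0 if player 1 holds the higher card, 1 otherwise (betting is ignored)."""
    rank = {card: i for i, card in enumerate(DECK)}
    return 0 if rank[deal[0]] > rank[deal[1]] else 1


def run_repetitions(num_rounds=1000, num_repeats=5, base_seed=0):
    """Hypothetical harness: repeat a seeded run and record player 1's showdown win rate.

    The defaults mirror the reported setup (T = 1000, 5 repetitions); the
    per-repetition seeding is an assumption, not something the paper specifies.
    """
    results = []
    for rep in range(num_repeats):
        rng = random.Random(base_seed + rep)
        wins = sum(
            showdown_winner(sample_deal(rng)) == 0 for _ in range(num_rounds)
        )
        results.append(wins / num_rounds)
    return results


if __name__ == "__main__":
    print(len(all_deals()), "possible deals")
    print(run_repetitions())
```

Fixing one seed per repetition makes each run repeatable, which is one simple way to report variability across the 5 repetitions.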