Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games
Authors: Gabriele Farina, Charilaos Pipis
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation: To further illustrate the separation between no-linear-swap-regret dynamics and the no-trigger-regret dynamics used for EFCE, we provide experimental evidence that minimizing linear-swap regret also minimizes trigger regret (Figure 2, left), while minimizing trigger regret does not minimize linear-swap regret. Specifically, in Figure 2 we compare our no-linear-swap-regret learning dynamics (given in Algorithm 1) to the no-trigger-regret algorithm introduced by Farina et al. [2022]. |
| Researcher Affiliation | Academia | Gabriele Farina MIT gfarina@mit.edu Charilaos Pipis MIT chpipis@mit.edu |
| Pseudocode | Yes | Algorithm 1: Φ-Regret minimizer for the set Φ = M_{Q→Q} (an illustrative sketch of the generic Φ-regret template appears below the table) |
| Open Source Code | No | The paper states 'We implemented our no-linear-swap-regret algorithm (Algorithm 1) in the C++ programming language using the Gurobi commercial optimization solver [Gurobi Optimization, LLC, 2023], version 10.' but does not mention making their implementation's source code openly available. |
| Open Datasets | Yes | Game instance used: We ran our code on the standard benchmark game of Kuhn poker [Kuhn, 1950]. We used a three-player variant of the game. |
| Dataset Splits | No | The paper describes running experiments on a game instance (Kuhn poker) but does not provide explicit training, validation, or testing dataset splits, as would typically be done for machine learning models trained on static datasets. The context is a game simulation, not a dataset split. |
| Hardware Specification | No | Minimal computational resources were used. All code ran on a personal laptop for roughly 12 hours. |
| Software Dependencies | Yes | We implemented our no-linear-swap-regret algorithm (Algorithm 1) in the C++ programming language using the Gurobi commercial optimization solver [Gurobi Optimization, LLC, 2023], version 10. |
| Experiment Setup | Yes | We did very minimal tuning of the constant learning rate η used for online projected gradient descent, trying values η ∈ {0.05, 0.1, 0.5} (we remark that a constant value of η ∝ 1/√T is theoretically sound). We found that η = 0.1, which is used in the plots of Figure 2, performed best. (See the note on this step-size choice below the table.) |
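
To make the pseudocode row more concrete, the following is a minimal sketch of the generic Φ-regret template that Algorithm 1 instantiates: an external-regret minimizer proposes a linear transformation, the player plays one of its fixed points, and the observed payoff vector is fed back as a linear utility over transformations. This is not the paper's implementation: it assumes the simplest possible setting (a probability simplex and column-stochastic matrices, standing in for the sequence-form polytope Q and the set M_{Q→Q}), uses online projected gradient ascent with a constant learning rate as the inner regret minimizer, and relies on NumPy rather than an LP solver such as Gurobi. All class and helper names are illustrative.

```python
# Illustrative Phi-regret minimizer over linear maps of a probability simplex,
# following the generic "play a fixed point of the chosen transformation"
# template. NOT the paper's Algorithm 1: the simplex and column-stochastic
# matrices stand in for the sequence-form polytope Q and the set M_{Q->Q}.
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    n = v.size
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, n + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fixed_point(M):
    """Solve x = M x with x on the simplex (M is column-stochastic)."""
    n = M.shape[0]
    A = np.vstack([np.eye(n) - M, np.ones((1, n))])
    b = np.concatenate([np.zeros(n), [1.0]])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    x = np.maximum(x, 0.0)
    return x / x.sum()


class LinearSwapRegretMinimizer:
    """Plays a fixed point of the current linear map; updates the map by
    online projected gradient ascent with a constant learning rate."""

    def __init__(self, n, eta=0.1):
        # Constant learning rate; the paper reports eta = 0.1 worked best
        # in its own (different) implementation.
        self.eta = eta
        self.M = np.full((n, n), 1.0 / n)  # uniform column-stochastic map

    def next_strategy(self):
        self.x = fixed_point(self.M)
        return self.x

    def observe_utility(self, u):
        # The utility of a map phi is <u, phi(x)>, so its gradient with
        # respect to the matrix representation of phi is the outer product
        # u x^T; take a gradient-ascent step and project each column back
        # onto the simplex to stay column-stochastic.
        self.M += self.eta * np.outer(u, self.x)
        for j in range(self.M.shape[1]):
            self.M[:, j] = project_to_simplex(self.M[:, j])
```

In a self-play loop, each player would call `next_strategy()`, the game would return a utility vector `u`, and `observe_utility(u)` would be called; low regret against every fixed linear map of the strategy set is what "linear-swap regret" refers to.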
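
On the step-size remark in the experiment-setup row: the claim that a constant learning rate of order 1/√T is theoretically sound follows from the standard online projected gradient descent analysis, not from anything specific to this paper. With a feasible set of diameter $D$ and gradient norms bounded by $G$, a constant step size $\eta$ gives

$$
\mathrm{Regret}_T \;\le\; \frac{D^2}{2\eta} + \frac{\eta G^2 T}{2},
$$

which is minimized by $\eta = D/(G\sqrt{T})$, yielding $\mathrm{Regret}_T \le D G \sqrt{T}$ and hence vanishing average regret.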