Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

Authors: Gabriele Farina, Christian Kroer, Tuomas Sandholm

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally compare to the popular algorithm CFR+, which has a theoretical convergence rate of T^-0.5, but is known to often converge at a rate of T^-1, or better, in practice. We give an example matrix game where CFR+ experimentally converges at a relatively slow rate of T^-0.74, whereas our optimistic methods converge faster than T^-1. We go on to show that our fast rate also holds in the Kuhn poker game, which is an extensive-form game. For games with deeper game trees, however, we find that CFR+ is still faster. Finally, we show that when the goal is minimizing regret, rather than computing a Nash equilibrium, our optimistic methods can outperform CFR+, even in deep game trees. From Section 6 (Experimental Evaluation): We experimentally evaluate the performance of optimistic regret minimization methods instantiated with dilated distance-generating functions. We experiment on three games: Smallmatrix, a small 2×2 matrix game. Given a mixed strategy x = (x1, x2) ∈ Δ^2 for Player 1 and a mixed strategy y = (y1, y2) ∈ Δ^2 for Player 2, the payoff function for Player 1 is u(x, y) = 5x1y1 - x1y2 + x2y2. Kuhn poker, already introduced in Section 3. Leduc poker, a standard benchmark in imperfect-information game solving [33]. The game is played with a deck consisting of 5 unique cards with 2 copies of each, and consists of two rounds. (An illustrative sketch of the Smallmatrix payoff appears after this table.)
Researcher Affiliation | Collaboration | Gabriele Farina (Computer Science Department, Carnegie Mellon University, gfarina@cs.cmu.edu); Christian Kroer (IEOR Department, Columbia University, christian.kroer@columbia.edu); Tuomas Sandholm (Computer Science Department, CMU; Strategic Machine, Inc.; Strategy Robot, Inc.; Optimized Markets, Inc.; sandholm@cs.cmu.edu)
Pseudocode | No | The paper describes its algorithms using mathematical equations and prose, but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | We experiment on three games: Smallmatrix, a small 2×2 matrix game... Kuhn poker, already introduced in Section 3. In Kuhn poker, each player first has to put a payment of 1 into the pot... Leduc poker, a standard benchmark in imperfect-information game solving [33].
Dataset Splits | No | The paper describes the games used for experimental evaluation (Smallmatrix, Kuhn poker, Leduc poker) and discusses iteration counts for convergence, but it does not specify training/validation/test splits in the supervised-learning sense; the game-solving algorithms operate directly on the game definition rather than on a pre-split dataset.
Hardware Specification | No | The paper does not provide any specific details regarding the hardware used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies or version numbers for its implementation (e.g., library names with versions, specific solver versions).
Experiment Setup | Yes | Optimistic OMD and optimistic FTRL were set up with the step-size parameter η = 0.1 in Smallmatrix and η = 2 in Kuhn poker, and the plots show the last-iterate convergence for the optimistic algorithms... OFTRL uses step-size parameter η = 200 in Leduc. (An illustrative optimistic-FTRL sketch using the Smallmatrix step size appears below.)
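
The Smallmatrix game quoted above is small enough to verify directly. Below is a minimal sketch, not taken from the paper: the 2×2 payoff matrix A and the helper functions are assumptions inferred from u(x, y) = 5x1y1 - x1y2 + x2y2, and the saddle-point gap is the standard best-response measure of distance to a Nash equilibrium.

```python
# Minimal sketch (not the authors' code): Smallmatrix as a 2x2 payoff matrix.
# The entries of A are inferred from u(x, y) = 5*x1*y1 - x1*y2 + x2*y2.
import numpy as np

A = np.array([[5.0, -1.0],
              [0.0,  1.0]])  # Player 1's payoffs; Player 2 is assumed to minimize them

def payoff(x, y):
    """Expected payoff u(x, y) = x^T A y for mixed strategies x, y on the simplex."""
    return x @ A @ y

def saddle_point_gap(x, y):
    """Sum of both players' best-response improvements; zero exactly at a Nash equilibrium."""
    return (A @ y).max() - (x @ A).min()

# Example: the uniform strategy profile is far from equilibrium in this game.
x = y = np.array([0.5, 0.5])
print(payoff(x, y))            # 1.25
print(saddle_point_gap(x, y))  # 2.0
```

When the strategies are produced by regret minimizers in self-play, this gap is the quantity typically plotted against the iteration count when comparing empirical convergence rates such as T^-0.74 and T^-1.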
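The step sizes in the Experiment Setup row can likewise be illustrated with a simplified optimistic FTRL (OFTRL) loop. The sketch below is only an approximation of the paper's setting: it runs OFTRL self-play on the Smallmatrix game with a negative-entropy regularizer over the simplex, whereas the paper instantiates its regret minimizers with dilated distance-generating functions over sequence-form (treeplex) strategy spaces. Apart from the quoted step size η = 0.1, all names and details here are assumptions.

```python
# Minimal sketch (not the paper's implementation): optimistic FTRL self-play on
# Smallmatrix with a negative-entropy regularizer over the probability simplex.
# The paper's algorithms use dilated distance-generating functions over
# sequence-form strategy spaces; only eta = 0.1 is taken from the quoted setup.
import numpy as np

def softmax(z):
    z = z - z.max()            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def run_oftrl(A, eta=0.1, iterations=10_000):
    n, m = A.shape
    Lx = np.zeros(n)           # cumulative loss observed by Player 1 (maximizer)
    Ly = np.zeros(m)           # cumulative loss observed by Player 2 (minimizer)
    gx_prev = np.zeros(n)      # previous loss, reused as the optimistic prediction
    gy_prev = np.zeros(m)
    for _ in range(iterations):
        # OFTRL with entropy: argmin over the simplex of <L + prediction, x> + psi(x)/eta
        # has the closed form softmax(-eta * (L + prediction)).
        x = softmax(-eta * (Lx + gx_prev))
        y = softmax(-eta * (Ly + gy_prev))
        gx = -(A @ y)          # Player 1 maximizes x^T A y, so its loss vector is -A y
        gy = A.T @ x           # Player 2 minimizes x^T A y, so its loss vector is A^T x
        Lx += gx
        Ly += gy
        gx_prev, gy_prev = gx, gy
    return x, y                # last iterates

A = np.array([[5.0, -1.0], [0.0, 1.0]])
x, y = run_oftrl(A)
print((A @ y).max() - (x @ A).min())  # saddle-point gap of the last iterates
```

Because Smallmatrix has a single decision point per player, the simplex update above suffices; for extensive-form games such as Kuhn and Leduc poker, the regularizer must instead be built from the dilated, per-information-set construction that the paper studies.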