Stable-Predictive Optimistic Counterfactual Regret Minimization

Authors: Gabriele Farina, Christian Kroer, Noam Brown, Tuomas Sandholm

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work we present the first CFR variant that breaks the square-root dependence on iterations. By combining and extending recent advances on predictive and stable regret minimizers for the matrix-game setting we show that it is possible to leverage optimistic regret minimizers to achieve a O(T^{3/4}) convergence rate within CFR. This is achieved by introducing a new notion of stable-predictivity, and by setting the stability of each counterfactual regret minimizer relative to its location in the decision tree. Experiments show that this method is faster than the original CFR algorithm, although not as fast as newer variants, in spite of their worst-case O(T^{1/2}) dependence on iterations.
Researcher Affiliation | Collaboration | Gabriele Farina (1), Christian Kroer (2), Noam Brown (1), Tuomas Sandholm (1, 3, 4, 5). (1) Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213; (2) IEOR Department, Columbia University, New York, NY 10027; (3) Strategic Machine, Inc.; (4) Strategy Robot, Inc.; (5) Optimized Markets, Inc.
Pseudocode | No | The paper describes algorithms and methods using mathematical notation and textual descriptions but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions 'four open-source subgames solved by Libratus AI' and provides a GitHub link (https://github.com/CMU-EM/LibratusEndgames), but this link is for the external benchmark subgames used in the experiments, not for the authors' own implementation of their proposed CFR variant.
Open Datasets | Yes | We conduct our experiments on four open-source subgames solved by the Libratus AI, which beat top poker professionals (Brown & Sandholm, 2017b). Following prior convention, we use the bet sizes of 0.5× the size of the pot, 1× the size of the pot, and the all-in bet for the first bet of each round. For subsequent bets in a round, we consider 1× the pot and the all-in bet. Subgames: https://github.com/CMU-EM/LibratusEndgames
Dataset Splits | No | The paper describes experiments on 'Heads-up no-limit Texas hold'em poker (HUNL) subgames' and measures 'exploitability', but it does not specify explicit train/validation/test dataset splits for these subgames.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory specifications) used for running the experiments. It refers to 'Libratus AI' as a source of subgames but does not specify the computational resources used for the authors' own experiments.
Software Dependencies | No | The paper mentions various algorithms like 'CFR', 'CFR+', 'Discounted CFR (DCFR)', and 'OFTRL', but it does not specify any software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, specific solver versions) required to replicate the experiments.
Experiment Setup | Yes | DCFR was set up with parameters (α, β, γ) = (1.5, 0, 2), as recommended in the original DCFR paper. For OFTRL we use the stepsize that the theory suggests in our experiments on subgames 3 and 4 (labeled OFTRL theory). For subgames 1 and 2 we found that the theoretically-correct stepsize is much too conservative, so we also show results with a less-conservative parameter found through dividing the stepsize by 10, 100, and 1000, and picking the best among those (labeled OFTRL tuned).
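
The sketches below are added for orientation only and are not taken from the paper. The abstract quoted in the Research Type row builds on optimistic (predictive) regret minimizers; one minimal example of such a minimizer is optimistic FTRL with an entropic regularizer on the probability simplex. The step size `eta` and the standard prediction choice m_t = previous loss are assumptions; this is a generic illustration, not the paper's stable-predictive construction.

```python
import numpy as np

def oftrl_simplex(losses, eta=0.1):
    """Optimistic FTRL over the probability simplex with an entropic
    regularizer. The iterate has the closed form
        x_t  proportional to  exp(-eta * (L_{t-1} + m_t)),
    where L_{t-1} is the sum of past losses and m_t is a prediction of
    the next loss (here the standard choice m_t = previous loss).
    Generic sketch only, not the paper's stable-predictive CFR variant."""
    dim = len(losses[0])
    cum_loss = np.zeros(dim)      # L_{t-1}: accumulated observed losses
    prediction = np.zeros(dim)    # m_t: optimistic guess of the next loss
    iterates = []
    for loss in losses:
        loss = np.asarray(loss, dtype=float)
        logits = -eta * (cum_loss + prediction)
        x = np.exp(logits - logits.max())   # numerically stable softmax
        iterates.append(x / x.sum())
        cum_loss += loss
        prediction = loss                   # predict that the loss repeats
    return iterates
```

When consecutive losses change slowly the prediction is accurate, and the regret of such a minimizer is controlled by the prediction error rather than by the iteration count alone, which is the kind of property the O(T^{3/4}) rate quoted above builds on.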
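
The Pseudocode row notes that the algorithm is presented through mathematical notation and text only. For reference, the per-information-set regret minimizer that vanilla CFR composes over the decision tree is regret matching; a minimal, generic sketch (again not the authors' stable-predictive variant) follows.

```python
import numpy as np

def regret_matching(cumulative_regret):
    """One regret-matching step: play each action with probability
    proportional to its positive cumulative counterfactual regret, and
    fall back to the uniform strategy when no regret is positive.
    Vanilla CFR runs one such minimizer at every information set."""
    regrets = np.asarray(cumulative_regret, dtype=float)
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0.0:
        return positive / total
    return np.full(regrets.shape, 1.0 / regrets.size)
```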
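
The Experiment Setup row describes choosing the OFTRL step size on subgames 1 and 2 by dividing the theoretical value by 10, 100, and 1000 and keeping the best. A sketch of that selection loop is below; `run_solver` is a hypothetical, user-supplied callable that runs an OFTRL-based solver on a subgame and returns its final exploitability, and is not code from the paper.

```python
def tune_oftrl_stepsize(run_solver, subgame, theoretical_eta):
    """Evaluate the theoretical step size divided by 10, 100, and 1000
    and keep the candidate with the lowest final exploitability,
    mirroring the 'OFTRL tuned' setup described in the table above.
    `run_solver(subgame, eta)` is a placeholder solver callable."""
    candidates = [theoretical_eta / d for d in (10.0, 100.0, 1000.0)]
    scores = {eta: run_solver(subgame, eta) for eta in candidates}
    best_eta = min(scores, key=scores.get)
    return best_eta, scores[best_eta]
```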