Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Efficient Last-Iterate Convergence in Solving Extensive-Form Games
Authors: Linjian Meng, Tianpei Yang, Youzhi Zhang, Zhenxing Ge, Shangdong Yang, Tianyu Ding, Wenbin Li, Bo An, Yang Gao
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that RTCFR+ exhibits a significantly faster empirical convergence rate than existing algorithms that achieve theoretical last-iterate convergence. Interestingly, RTCFR+ show performance no worse than average-iterate convergence CFR algorithms. It is the first last-iterate convergence algorithm to achieve such performance. Our code is available at https://github.com/menglinjian/NeurIPS-2025-RTCFR. [...] The experimental results are presented in Figure 1. RTCFR+ demonstrates superior performance compared to all other tested algorithms except PCFR+. |
| Researcher Affiliation | Collaboration | 1 National Key Laboratory for Novel Software Technology, Nanjing University 2 Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science & Innovation, CAS 3 Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications 4 Microsoft Corporation 5 School of Computer Science and Engineering, Nanyang Technological University |
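| Pseudocode | Yes | Algorithm 1 RTCFR+ 1: Input: $N$, $T_u$, $\mu$, $\gamma$, $r$ 2: $\theta^1_I \leftarrow 0$, $\eta \leftarrow 1$, $\forall I \in \mathcal{I}$ 3: for each $n \in [1, 2, \dots, N]$ do 4: Build the perturbed regularized EFGs in Eq. (2) via $\mu$, $\gamma$, and $r$ 5: for each $t \in [1, 2, \dots, T_u]$ do 6: Obtain $\hat{x}^{t+1}$ and $\theta^{t+1}_I$ via the update rule in Eq. (3) 7: end for 8: $\gamma \leftarrow \gamma \cdot 0.5$, $r \leftarrow \hat{x}^{T_u+1}$ 9: $\theta^1_I \leftarrow \theta^{T_u+1}_I$, $\forall I \in \mathcal{I}$ 10: end for 11: Return $\hat{x}^{T_u+1}$ |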
| Open Source Code | No | Our code is available at https://github.com/menglinjian/NeurIPS-2025-RTCFR. [...] Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: We will provide the code once this paper is accepted. |
| Open Datasets | Yes | We now evaluate the empirical convergence rate of RTCFR+ on five standard EFG benchmarks: Kuhn Poker, Leduc Poker, Goofspiel, Liar's Dice, and Battleship, all implemented using OpenSpiel [Lanctot et al., 2019]. |
| Dataset Splits | No | The paper evaluates algorithms on "five standard EFG benchmarks: Kuhn Poker, Leduc Poker, Goofspiel, Liar's Dice, and Battleship". These are game environments in which algorithms learn strategies, not traditional datasets with train/validation/test splits. |
| Hardware Specification | Yes | The experiments are conducted on a machine equipped with a Xeon(R) Gold 6444Y CPU and 256 GB of memory. |
| Software Dependencies | No | We now evaluate the empirical convergence rate of RTCFR+ on five standard EFG benchmarks: Kuhn Poker, Leduc Poker, Goofspiel, Liar's Dice, and Battleship, all implemented using OpenSpiel [Lanctot et al., 2019]. [...] The algorithm implementations are based on the open-source LiteEFG code [Liu et al., 2024]... |
| Experiment Setup | Yes | For RTCFR+, we set the initial values of η, γ, and µ to 1, 1e-10, and 1e-3, respectively. The number of iterations Tu required to update γ and r is set to 100. For Reg-CFR, we use the parameters from the original paper. For R-NaD, we initialize µ = 1e-5 (R-NaD does not include the parameter γ), set Tu = 1000, and use a learning rate of η = 0.1. For OMWU and OGDA, we set η to 0.5 and 0.1, respectively. All algorithms employ alternating updates to enhance empirical convergence rates. Each algorithm is run for 20,000 (N = 20000/Tu) iterations to analyze long-term behavior. [...] Table 1: Hyperparameters used in RTCFR+ (fine-tuned). |
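
The Pseudocode and Experiment Setup rows together imply a simple schedule: with 20,000 total iterations and Tu = 100, the outer loop of Algorithm 1 runs N = 200 times, halving γ and resetting the reference strategy r after each block of Tu inner updates. The Python sketch below illustrates only that outer-loop bookkeeping. It assumes an initial γ of 1e-10, and the names `run_outer_loop`, `InnerUpdate`, and `identity_update` are hypothetical. The inner update of Eq. (3) is not reproduced on this page, so a do-nothing placeholder stands in for it; this is not the authors' implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical signature for the inner update (Eq. (3) in the paper, not shown here):
# (x_hat, theta, gamma, r) -> (new_x_hat, new_theta)
InnerUpdate = Callable[
    [List[float], List[float], float, List[float]],
    Tuple[List[float], List[float]],
]


def run_outer_loop(
    total_iterations: int,
    t_u: int,
    gamma0: float,
    x0: List[float],
    theta0: List[float],
    inner_update: InnerUpdate,
) -> Tuple[List[float], List[float]]:
    """Outer loop of Algorithm 1: returns the last iterate and the gamma schedule."""
    n_outer = total_iterations // t_u      # N = 20000 / Tu in the reported setup
    gamma = gamma0
    r = list(x0)                           # reference strategy r
    x_hat, theta = list(x0), list(theta0)
    gamma_schedule: List[float] = []

    for _ in range(n_outer):
        gamma_schedule.append(gamma)       # gamma used during this outer iteration
        for _ in range(t_u):
            x_hat, theta = inner_update(x_hat, theta, gamma, r)
        gamma *= 0.5                       # step 8: halve gamma
        r = list(x_hat)                    # step 8: r takes the latest iterate
        # step 9: theta is carried over unchanged into the next outer iteration

    return x_hat, gamma_schedule


def identity_update(x_hat, theta, gamma, r):
    """Placeholder only: stands in for the real update rule, which is not given here."""
    return x_hat, theta


final_x, schedule = run_outer_loop(
    total_iterations=20000, t_u=100, gamma0=1e-10,
    x0=[0.5, 0.5], theta0=[0.0, 0.0], inner_update=identity_update,
)
print(len(schedule))  # 200 outer iterations
```

Passing the inner update as a callable keeps the schedule logic separate from the game-specific update, so the same loop could in principle be exercised with Tu = 1000 (the value reported for R-NaD) to see that N drops to 20.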