Near-Optimal Learning of Extensive-Form Games with Imperfect Information
Authors: Yu Bai, Chi Jin, Song Mei, Tiancheng Yu
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper resolves the open question of designing near-optimal algorithms for learning imperfect-information extensive-form games from bandit feedback. We present the first line of algorithms that require only Õ((XA + YB)/ε²) episodes of play to find an ε-approximate Nash equilibrium in two-player zero-sum games... We achieve this sample complexity by two new algorithms: Balanced Online Mirror Descent, and Balanced Counterfactual Regret Minimization. Both algorithms rely on novel approaches of integrating balanced exploration policies into their classical counterparts. We also extend our results to learning Coarse Correlated Equilibria in multiplayer general-sum games. (The sample-complexity bound is restated in full notation after the table.) |
| Researcher Affiliation | Collaboration | Yu Bai* (Salesforce Research), Chi Jin* (Princeton University), Song Mei* (UC Berkeley), Tiancheng Yu* (MIT). |
| Pseudocode | Yes | Algorithm 1: Balanced OMD (max-player); Algorithm 2: Balanced CFR (max-player); Algorithm 3: Regret Minimization with Hedge (HEDGE); Algorithm 4: Regret Minimization with Regret Matching (REGRETMATCHING); Algorithm 5: Implementation of Balanced OMD update; Algorithm 6: CFR with Hedge (full-feedback setting). (A minimal sketch of the classical Hedge and Regret Matching updates appears after the table.) |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and sample complexity analysis. It does not conduct experiments on a dataset, so there is no mention of a training dataset or its accessibility. |
| Dataset Splits | No | The paper is theoretical and does not report on experiments or data splits for reproduction. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe an experimental setup with specific software dependencies and version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or training configurations. |
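
For reference, the sample-complexity claim quoted in the Research Type row can be written out explicitly. In the paper's notation, X and Y are the numbers of information sets and A and B the numbers of actions of the two players; the display below only restates the abstract's bound and adds nothing new.

```latex
% Episodes of play sufficient to find an eps-approximate Nash equilibrium
% in a two-player zero-sum imperfect-information extensive-form game
% (X, Y = information sets; A, B = actions of the two players).
\[
  \widetilde{\mathcal{O}}\!\left(\frac{XA + YB}{\varepsilon^{2}}\right)
\]
```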
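
The Pseudocode row lists Hedge and Regret Matching (Algorithms 3 and 4) as the per-information-set regret-minimization subroutines. The sketch below is a minimal Python illustration of the classical versions of those two updates only; it is not the paper's Balanced OMD or Balanced CFR, and the function names, random-loss model, and learning-rate choice are illustrative assumptions.

```python
"""Minimal sketches of the two classical regret minimizers named in the
paper's pseudocode list (Hedge / exponential weights and Regret Matching).
These are generic textbook updates, not the paper's balanced variants."""

import numpy as np


def hedge_update(weights: np.ndarray, losses: np.ndarray, eta: float) -> np.ndarray:
    """Exponential-weights (Hedge) step: down-weight actions by their loss."""
    new_w = weights * np.exp(-eta * losses)
    return new_w / new_w.sum()  # renormalize to a probability distribution


def regret_matching_policy(cum_regret: np.ndarray) -> np.ndarray:
    """Regret Matching: play actions proportionally to positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret yet: fall back to the uniform policy.
    return np.full_like(cum_regret, 1.0 / len(cum_regret))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_actions, T = 4, 1000
    w = np.full(n_actions, 1.0 / n_actions)
    cum_regret = np.zeros(n_actions)
    for _ in range(T):
        losses = rng.uniform(size=n_actions)  # stand-in for observed losses
        w = hedge_update(w, losses, eta=np.sqrt(np.log(n_actions) / T))
        policy = regret_matching_policy(cum_regret)
        # Instantaneous regret of each action: policy's expected loss minus that action's loss.
        cum_regret += policy @ losses - losses
    print("Hedge policy:", np.round(w, 3))
    print("Regret-matching policy:", np.round(regret_matching_policy(cum_regret), 3))
```

Per the abstract, the paper's contribution lies in integrating balanced exploration policies into these classical counterparts under bandit feedback; the sketch covers only the classical regret updates themselves.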