Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Efficient Phi-Regret Minimization in Extensive-Form Games via Online Mirror Descent
Authors: Yu Bai, Chi Jin, Song Mei, Ziang Song, Tiancheng Yu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove that, in those settings, the Φ-Hedge algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for EFGs with suitable dilated regularizers, and run in polynomial time. This new connection further allows us to design and analyze a new class of OMD algorithms based on modifying its log-partition function. In particular, we design an improved algorithm with balancing techniques that achieves a sharp Õ(√(XAT)) EFCE-regret under bandit feedback in an EFG with X information sets, A actions, and T episodes. To the best of our knowledge, this is the first such rate, and it matches the information-theoretic lower bound. |
| Researcher Affiliation | Collaboration | Yu Bai (Salesforce Research), Chi Jin (Princeton University), Song Mei (UC Berkeley), Ziang Song (Stanford University), Tiancheng Yu (MIT) |
| Pseudocode | Yes | Algorithm 1 Φ-Hedge; Algorithm 2 EFCE-OMD (FTRL form; equivalent OMD form in Algorithm 4); Algorithm 3 Balanced EFCE-OMD (FTRL form; equivalent OMD form in Algorithm 5) |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | No | This paper focuses on theoretical algorithm design and analysis, and does not involve empirical evaluation on datasets. Therefore, no information regarding publicly available datasets or their access is provided. |
| Dataset Splits | No | This paper is theoretical and does not conduct experiments involving dataset splits for training, validation, or testing. |
| Hardware Specification | No | This paper is purely theoretical and does not describe any experimental setup or hardware used for running experiments. |
| Software Dependencies | No | This paper focuses on theoretical algorithm design and analysis, and does not conduct experiments, hence no software dependencies with specific version numbers are provided. |
| Experiment Setup | No | This paper is theoretical and does not include details about an experimental setup, hyperparameters, or system-level training settings. |
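For background on the Hedge-to-OMD equivalence the paper establishes, the following is a minimal illustrative sketch of the classical Hedge (exponential-weights) update on the simplex, which coincides with online mirror descent under the negative-entropy regularizer. This is a simplified single-decision-point example, not the paper's EFG algorithms; the loss vector and learning rate used below are hypothetical.

```python
import numpy as np

def hedge_update(weights, losses, eta):
    """One Hedge step: multiplicative-weights update, which equals an
    OMD step with the negative-entropy mirror map on the simplex."""
    w = weights * np.exp(-eta * losses)
    return w / w.sum()  # project back onto the probability simplex

# Toy usage: two actions; action 0 repeatedly incurs higher loss,
# so its probability decays toward 0 over rounds.
p = np.array([0.5, 0.5])
for _ in range(10):
    p = hedge_update(p, np.array([1.0, 0.0]), eta=0.5)
```

The paper's contribution concerns the extensive-form setting, where the strategy set is no longer a single simplex and the regularizer must be dilated across information sets; this sketch only illustrates the base update being generalized.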