Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
When is Momentum Extragradient Optimal? A Polynomial-Based Analysis
Authors: Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform numerical experiments to optimize a game whose Jacobian has the cross-shaped spectrum in (14). We focus on this spectrum as it may be the most challenging case, involving both real and complex eigenvalues (cf. Theorem 3). To test robustness, we consider two cases: one where the Jacobian spectrum exactly follows S₂ in (14), as well as the inexact case. We illustrate them in Figure 3. |
| Researcher Affiliation | Collaboration | Junhyung Lyle Kim EMAIL Rice University, Department of Computer Science; Gauthier Gidel EMAIL Université de Montréal, Department of Computer Science and Operations Research, Mila – Québec AI Institute, Canada CIFAR AI Chair; Anastasios Kyrillidis EMAIL Rice University, Department of Computer Science; Fabian Pedregosa EMAIL Google DeepMind |
| Pseudocode | Yes | Algorithm 1: Momentum extragradient (MEG) method |
| Open Source Code | No | The paper does not contain an explicit statement about releasing code or provide a link to a code repository for the methodology described. The OpenReview link is for peer review, not code. |
| Open Datasets | No | We focus on two-player quadratic games, where player 1 controls x ∈ ℝ^{d₁} and player 2 controls y ∈ ℝ^{d₂} with loss functions in (16). In our setting, the corresponding vector field in (17) satisfies M₁₂ = −M₂₁ᵀ, but S₁ and S₂ can be nonzero symmetric matrices. Further, the Jacobian ∇v = A has the cross-shaped eigenvalue structure in (14), with c = (L − µ)/2 (cf. Proposition 1, Case 2). For the problem constants, we use µ = 1 and L = 200. The optimum [x⋆ y⋆]ᵀ = w∗ ∈ ℝ²⁰⁰ is generated using the standard normal distribution. |
| Dataset Splits | No | The paper describes generating synthetic data for quadratic games (e.g., the optimum "[x⋆ y⋆]ᵀ = w∗ ∈ ℝ²⁰⁰ is generated using the standard normal distribution"). It does not mention any explicit training, validation, or test splits for this generated data or any other dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the numerical experiments, such as GPU or CPU models, memory, or cloud computing resources. |
| Software Dependencies | No | The paper does not provide any specific software dependencies with version numbers for reproducing the experiments. |
| Experiment Setup | Yes | For MEG (optimal), we set the hyperparameters using Theorem 5. For GD (theory) and EG (theory), we set the hyperparameters using Theorem 7, both for the exact and the inexact settings. For GDM (grid search), we perform a grid search over h_GDM and m_GDM and choose the best-performing values, as Theorem 8 does not give a specific form for the hyperparameter setup. Specifically, we consider 0.005 ≤ h_GDM ≤ 0.015 with 10⁻³ increments, and 0.01 ≤ m_GDM ≤ 0.99 with 10⁻² increments. In addition, as Theorem 7 might be conservative, we conduct grid searches for GD and EG as well. For GD (grid search), we use the same setup as for h_GDM. For EG (grid search), we use 0.001 ≤ h_EG ≤ 0.05 with 10⁻⁴ increments. |
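The table quotes the paper's Algorithm 1 (momentum extragradient, MEG) by name only. As context, a minimal sketch of a standard MEG iteration on a toy quadratic vector field might look like the following; the update form, the toy Jacobian, and the hyperparameters `h`, `gamma`, `m` are illustrative assumptions, not the paper's exact Algorithm 1 or the Theorem 5 settings.

```python
import numpy as np

# Hedged sketch of a momentum extragradient (MEG) iteration:
#   z_t     = w_t - gamma * F(w_t)                     (extrapolation step)
#   w_{t+1} = w_t - h * F(z_t) + m * (w_t - w_{t-1})   (update with momentum)
# The toy Jacobian A has complex eigenvalues (2 ± i), loosely echoing the
# paper's interest in spectra with both real and complex parts.

A = np.array([[2.0, 1.0],
              [-1.0, 2.0]])          # toy Jacobian, eigenvalues 2 ± i
w_star = np.array([1.0, -1.0])       # toy solution
F = lambda w: A @ (w - w_star)       # vector field vanishing at w_star

h, gamma, m = 0.1, 0.1, 0.2          # illustrative hyperparameters
w_prev = np.zeros(2)
w = np.zeros(2)
for _ in range(200):
    z = w - gamma * F(w)                                 # extrapolate
    w, w_prev = w - h * F(z) + m * (w - w_prev), w       # momentum update

dist = np.linalg.norm(w - w_star)
print(f"distance to w_star after 200 iterations: {dist:.2e}")
```

With these illustrative constants the iterates contract toward `w_star`; the paper's Theorem 5 instead prescribes specific hyperparameters under which MEG is optimal for the cross-shaped spectrum.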
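The GDM grid search quoted above (0.005 ≤ h_GDM ≤ 0.015 in 10⁻³ steps, 0.01 ≤ m_GDM ≤ 0.99 in 10⁻² steps) can be sketched as follows. The toy game below is an assumption: it uses a small stand-in matrix with symmetric diagonal blocks and an antisymmetric coupling rather than the paper's exact cross-shaped spectrum in ℝ²⁰⁰, and the GDM update form is the standard heavy-ball iteration on the vector field.

```python
import numpy as np

# Hedged sketch: grid search over GDM hyperparameters on a toy quadratic game.
rng = np.random.default_rng(0)
d = 10                                        # toy dimension (paper uses R^200)
S = np.diag(np.linspace(1.0, 5.0, d))         # stand-in for the S1, S2 blocks
B = rng.standard_normal((d, d))
A = np.block([[S, B], [-B.T, S]])             # Jacobian of the vector field
w_star = rng.standard_normal(2 * d)           # optimum drawn from N(0, I)
b_vec = A @ w_star                            # so F(w_star) = 0

def F(w):
    return A @ w - b_vec

def run_gdm(h, m, steps=500):
    """Heavy-ball iteration w_{t+1} = w_t - h F(w_t) + m (w_t - w_{t-1})."""
    w_prev = np.zeros(2 * d)
    w = np.zeros(2 * d)
    for _ in range(steps):
        w, w_prev = w - h * F(w) + m * (w - w_prev), w
    return np.linalg.norm(w - w_star)

# Grids matching the quoted ranges and increments.
h_grid = np.arange(0.005, 0.015 + 1e-9, 1e-3)     # 11 step sizes
m_grid = np.arange(0.01, 0.99 + 1e-9, 1e-2)       # 99 momentum values

results = []
for h in h_grid:
    for m in m_grid:
        dist = run_gdm(h, m)
        if np.isfinite(dist):                 # discard diverged runs
            results.append((dist, h, m))
best = min(results)
print(f"best distance {best[0]:.3e} at h={best[1]:.3f}, m={best[2]:.2f}")
```

The same loop structure would cover the quoted GD and EG grid searches by swapping in the corresponding update rule and step-size range.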