Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..
Competitive Gradient Optimization
Authors: Abhijeet Vyas, Brian Bullins, Kamyar Azizzadenesheli
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a continuous-time analysis of CGO and its convergence properties while showing that in the continuous limit, previous methods degenerate to their gradient descent ascent (GDA) variants. We further provide a rate of convergence to stationary points in the discrete-time setting. We propose a generalized class of α-coherent functions and show that for strictly αcoherent functions, CGO ensures convergence to a saddle point. Moreover, we propose optimistic CGO (o CGO), an optimistic variant, for which we show a convergence rate of O( 1/n) to saddle points for α-coherent functions. (...) Section 5.3. Simulation of CGO and o CGO on families from Section 4 We now evaluate the performance1 of CGO and o CGO on families discussed in examples (4.1) and (4.2). We first consider a function f(x, y) = x Ay, A Rm n, x Rm, y Rn . We sample all the entries of A independently from a standard Gaussian, A = (aij), aij N(0, 1). We consider the plot of the L2 norm of x vs. that of y, since the only saddle point is the origin, the desired solution is x 2, y 2 0. We plot the iterates of CGO and o CGO for different α, and observe that o CGO converges to the saddle point for α 0 (at a very slow rate for α = 0) while CGO does so for α > 0. The results at α = 0 are that of GDA and optimistic GDA. We see similar results for the case where A is the scalar 1, i.e. f(x, y) = xy. This is in accordance with the analysis in example (4.1). The code for the experiments is available at Link to code. |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, Purdue University, West Lafayette, IN 47907 2Nvidia, Santa Clara, CA 95050. Correspondence to: Abhijeet Vyas <EMAIL>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the update rules using mathematical equations and text. |
| Open Source Code | Yes | The code for the experiments is available at Link to code |
| Open Datasets | No | The paper uses mathematical functions for simulations, defined within the paper (e.g., 'f(x, y) = x Ay', 'f(x, y) = xy', 'f(x, y) = k/2(x^2-y^2)+xy', 'f(x, y) = x^2y+kxy'). It does not provide concrete access information (link, DOI, formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper simulates algorithms on mathematical functions and does not discuss dataset splits (training, validation, test) in terms of percentages, counts, or references to predefined splits, as it's not a dataset-based evaluation in the typical sense. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud computing instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9'), needed to replicate the experiment. |
| Experiment Setup | Yes | We plot the iterates of CGO and o CGO for different α, and observe that o CGO converges to the saddle point for α 0 (at a very slow rate for α = 0) while CGO does so for α > 0. (...) We plot the iterates of CGO and o CGO for different α, and observe that o CGO converges to the saddle point for α 0 (at a very slow rate for α = 0) while CGO does so for α > 0. The results at α = 0 are that of GDA and optimistic GDA. We see similar results for the case where A is the scalar 1, i.e. f(x, y) = xy. This is in accordance with the analysis in example (4.1). (...) Figure 2. CGO and optimistic CGO on bilinear functions f(x, y) = xy, (x, y) R2 : (a,b) and f(x, y) = x Ay, x R4, y R5 : (c,d) for 100 iterations. (...) Figure 3. CGO and optimistic CGO on functions from the family f(x, y) = k/2 (x2 y2) xy. k = 2 : (a,b) and k = 2 : (c,d) for 100 iterations. (...) Figure 4. CGO and optimistic CGO on the function f(x, y) = x2y + xy from multiple initializations for 500 iterations with increasing α. (...) Figure 5. CGO and optimistic CGO on the function f(x, y) = x2y from multiple initializations for 500 iterations with increasing α. |