Convergence of Gradient Methods on Bilinear Zero-Sum Games
Authors: Guojun Zhang, Yaoliang Yu
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Experiments. Bilinear game: We run experiments on a simple bilinear game and choose the optimal parameters as suggested in Theorems 4.1 and 4.2. The results are shown in the left panel of Figure 1, which confirms the predicted linear rates. Density plots: We show the density plots (heat maps) of the spectral radii in Figure 2. We make plots for EG, OGD and momentum with both Jacobi and GS updates. These plots are made when β1 = β2 = β and they agree with our theorems in Section 3. Wasserstein GAN: As in Daskalakis et al. (2018), we consider a WGAN (Arjovsky et al., 2017) that learns the mean of a Gaussian... Mixtures of Gaussians (GMMs): Our last experiment is on learning GMMs with a vanilla GAN (Goodfellow et al., 2014) that does not directly fall into our analysis. We choose a 3-hidden-layer ReLU network for both the generator and the discriminator, and each hidden layer has 256 units. (A minimal sketch of the compared gradient updates appears after this table.) |
| Researcher Affiliation | Academia | Guojun Zhang & Yaoliang Yu, Department of Computer Science, University of Waterloo; Vector Institute. {guojun.zhang,yaoliang.yu}@uwaterloo.ca |
| Pseudocode | No | The paper describes algorithms using equations and textual descriptions but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper mentions using a Wasserstein GAN (WGAN) and training on Mixtures of Gaussians (GMMs) with a vanilla GAN, citing 'Arjovsky et al., 2017' and 'Goodfellow et al., 2014'. However, no specific access information (URL, DOI, or explicit repository name) for a public dataset is provided. The WGAN setup describes learning a mean of a Gaussian, implying synthetic or custom data, and GMMs are target distributions rather than a named public dataset with specific access instructions. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit splitting methodologies). |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper mentions 'Mathematica code' in the appendices but does not provide specific version numbers for this or any other software dependencies, libraries, or solvers used for the experiments. |
| Experiment Setup | Yes | We run experiments on a simple bilinear game and choose the optimal parameters as suggested in Theorems 4.1 and 4.2... Inspired by Theorem 4.1, we compare the convergence of two EGs with the same parameter β = αγ, and find that with scaling, EG has better convergence, as shown in the right panel of Figure 1... In Figure 3, we can see that GS updates converge even if the corresponding Jacobi updates do not. For EG, γ = 0.2, α = 0.02; for OGD, α = 0.2, β1 = 0.1, β2 = 0; for momentum, α = 0.08, β = 0.1... We choose a 3-hidden-layer ReLU network for both the generator and the discriminator, and each hidden layer has 256 units... stochastic GD (step size α = 0.01)... stochastic OGD (α = 2β = 0.02)... Adam, with step size α = 0.0002, β1 = 0.9, and β2 = 0.999. (A hedged reconstruction of the GAN architecture appears after this table.) |
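
To make the bilinear-game comparison concrete, the following minimal sketch (not the authors' code) runs simultaneous (Jacobi) gradient descent-ascent, alternating (Gauss-Seidel) gradient descent-ascent, and extragradient on the scalar game f(x, y) = xy, whose saddle point is (0, 0). The EG step sizes γ = 0.2 and α = 0.02 echo the setting quoted above; the starting point, iteration count, and the other step sizes are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the update rules compared in
# the paper, on the scalar bilinear game f(x, y) = x * y with saddle
# point (0, 0).  EG step sizes echo the quoted setting (gamma = 0.2,
# alpha = 0.02); the initial point and iteration count are arbitrary.
import numpy as np

def gda_jacobi(x, y, alpha):
    # Simultaneous update: both players use gradients at the old iterate.
    return x - alpha * y, y + alpha * x

def gda_gauss_seidel(x, y, alpha):
    # Alternating update: the second player sees the fresh x.
    x_new = x - alpha * y
    return x_new, y + alpha * x_new

def extragradient(x, y, alpha, gamma):
    # Extrapolate with step gamma, then update with step alpha.
    x_half, y_half = x - gamma * y, y + gamma * x
    return x - alpha * y_half, y + alpha * x_half

pts = {"Jacobi GDA": (1.0, 1.0), "GS GDA": (1.0, 1.0), "EG": (1.0, 1.0)}
for _ in range(2000):
    pts["Jacobi GDA"] = gda_jacobi(*pts["Jacobi GDA"], alpha=0.02)
    pts["GS GDA"] = gda_gauss_seidel(*pts["GS GDA"], alpha=0.02)
    pts["EG"] = extragradient(*pts["EG"], alpha=0.02, gamma=0.2)

for name, (x, y) in pts.items():
    # On this toy game, Jacobi GDA slowly diverges, GS GDA stays bounded
    # without converging, and EG contracts linearly to the saddle point.
    print(f"{name}: distance to saddle = {np.hypot(x, y):.4f}")
```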
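
The GMM experiment is described only at the level of architecture and optimizer settings: 3-hidden-layer ReLU MLPs with 256 units per hidden layer, trained with Adam (α = 0.0002, β1 = 0.9, β2 = 0.999). The sketch below reconstructs that setup under assumptions the paper does not state: the latent dimension, the 2-D data dimension, and the choice of PyTorch are all hypothetical.

```python
# Hedged reconstruction of the GMM experiment's networks.  Only the
# hidden-layer sizes, depth, activation, and Adam hyperparameters come
# from the paper; dimensions and the framework are assumptions.
import torch.nn as nn
import torch.optim as optim

def mlp(in_dim, out_dim, hidden=256, n_hidden=3):
    # 3 hidden ReLU layers of 256 units, then a linear output layer.
    layers, d = [], in_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

generator = mlp(in_dim=16, out_dim=2)      # latent dim 16 is assumed
discriminator = mlp(in_dim=2, out_dim=1)   # 2-D mixture samples assumed

opt_g = optim.Adam(generator.parameters(), lr=2e-4, betas=(0.9, 0.999))
opt_d = optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.9, 0.999))
```

The WGAN mean-learning experiment and the stochastic GD/OGD baselines quoted above would fit the same template by swapping the loss and update rule, but the quoted text does not specify them closely enough to reconstruct here.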