GDA-AM: ON THE EFFECTIVENESS OF SOLVING MINIMAX OPTIMIZATION VIA ANDERSON MIXING

Authors: Huan He, Shifan Zhao, Yuanzhe Xi, Joyce Ho, Yousef Saad

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We complement our theoretical results with numerical simulations across a variety of minimax problems. We show that for some convex-concave and non-convex-concave functions, GDA-AM can converge to the optimal point with little hyper-parameter tuning, whereas existing first-order methods are prone to divergence and cycling behaviors. We also provide empirical results for GAN training across two different datasets, CIFAR10 and CelebA.
Researcher Affiliation | Academia | Huan He, Shifan Zhao, Yuanzhe Xi, Joyce C. Ho: Department of Computer Science, Emory University, Atlanta, GA 30329, USA. Yousef Saad: Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455, USA.
Pseudocode | Yes | Algorithm 1: Anderson Mixing Prototype (truncated version) ... Algorithm 2: Simultaneous GDA-AM ... Algorithm 3: Alternating GDA-AM ... Algorithm 5: QR-updating procedures. (A minimal numerical sketch of simultaneous GDA-AM on a toy bilinear problem appears after this table.)
Open Source Code | Yes | Codes are available on Github: https://github.com/hehuannb/GDA-AM
Open Datasets | Yes | We apply our method to the CIFAR10 dataset (Krizhevsky, 2009) ... We also compared the performance of GDA-AM using cropped CelebA (64 × 64) (Liu et al., 2015). (A torchvision loading sketch for both datasets appears after this table.)
Dataset Splits | No | The paper states 'Experiments were run with 5 random seeds' and, for evaluation, 'Models are evaluated using the inception score (IS) (Salimans et al., 2016) and FID (Heusel et al., 2017) computed on 50,000 samples', but it does not specify explicit training/validation/test dataset splits (e.g., percentages or counts) or refer to standard splits with citations for reproducibility.
Hardware Specification | Yes | Experiments were run on one NVIDIA V100 GPU.
Software Dependencies | No | The paper states 'For our experiments, we used the PyTorch deep learning framework.' It mentions PyTorch but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | We use a learning rate of 2 × 10⁻⁴ and a batch size of 64. For the table size of GDA-AM, we set it to 120 for CIFAR10 and 150 for CelebA. We set β1 = 0.0 and β2 = 0.9, as we find this gives better models than the default settings. (An illustrative optimizer configuration using these values appears after this table.)
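
To make the pseudocode row concrete, here is a minimal numerical sketch of the idea behind simultaneous GDA-AM: plain simultaneous gradient descent-ascent spirals away from the saddle point of the bilinear game f(x, y) = x·y, while wrapping the same fixed-point map in standard (type-II) Anderson mixing converges to it. This is not the authors' implementation; the window size m, step size eta, mixing parameter (fixed at 1), and the unregularized least-squares solve are illustrative assumptions, and the truncation and QR-updating details of Algorithms 1-5 are omitted.

```python
import numpy as np

def gda_map(z, eta=0.1):
    """One simultaneous GDA step for the bilinear game f(x, y) = x * y.
    x takes a descent step on f; y takes an ascent step."""
    x, y = z
    return np.array([x - eta * y, y + eta * x])

def gda_am(z0, m=5, iters=30, eta=0.1):
    """Type-II Anderson mixing wrapped around the GDA fixed-point map (illustrative sketch)."""
    zs = [z0, gda_map(z0, eta)]                 # iterate history z_0, z_1, ...
    fs = [gda_map(z, eta) - z for z in zs]      # residuals f_k = g(z_k) - z_k
    for k in range(1, iters):
        mk = min(m, k)                          # history window (the paper's "table size")
        dF = np.column_stack([fs[i + 1] - fs[i] for i in range(k - mk, k)])
        dZ = np.column_stack([zs[i + 1] - zs[i] for i in range(k - mk, k)])
        gamma, *_ = np.linalg.lstsq(dF, fs[k], rcond=None)   # least-squares coefficients
        z_next = zs[k] + fs[k] - (dZ + dF) @ gamma           # mixing parameter beta = 1
        zs.append(z_next)
        fs.append(gda_map(z_next, eta) - z_next)
    return np.array(zs)

z0 = np.array([1.0, 1.0])
plain = [z0]
for _ in range(30):                              # plain simultaneous GDA spirals outward
    plain.append(gda_map(plain[-1]))
accel = gda_am(z0)
print("plain GDA   ||z_30|| =", np.linalg.norm(plain[-1]))   # norm grows every step
print("GDA-AM-like ||z_30|| =", np.linalg.norm(accel[-1]))    # ~0, the saddle point (0, 0)
```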
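
For the open-datasets row, the sketch below shows one common way to obtain CIFAR10 and 64 × 64 center-cropped CelebA with torchvision. The paper does not describe its data pipeline, so the crop size, resize, normalization, and download options here are assumptions rather than the authors' preprocessing.

```python
import torchvision
import torchvision.transforms as T

# Illustrative preprocessing: scale images to [-1, 1]; the paper does not spell this out.
cifar_tf = T.Compose([T.ToTensor(), T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
celeba_tf = T.Compose([
    T.CenterCrop(178),          # assumed crop of the aligned 178x218 CelebA faces
    T.Resize(64),               # "cropped CelebA (64 x 64)"
    T.ToTensor(),
    T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

cifar10 = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=cifar_tf)
celeba = torchvision.datasets.CelebA(root="./data", split="train", download=True, transform=celeba_tf)
```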
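
The experiment-setup row translates into the hypothetical PyTorch configuration below, assuming β1 and β2 refer to Adam's moment-decay parameters (a common GAN choice). `generator` and `discriminator` are placeholder modules, and the table size is shown only as a named hyper-parameter, since the Anderson-mixing update itself lives in the training loop rather than in the optimizer.

```python
import torch

# Hyper-parameters quoted in the paper (table size 150 would be used for CelebA).
lr, batch_size = 2e-4, 64
betas = (0.0, 0.9)               # beta1 = 0.0, beta2 = 0.9
anderson_table_size = 120        # CIFAR10 setting; window of stored iterates for GDA-AM

# Placeholder networks standing in for the actual GAN generator and discriminator.
generator = torch.nn.Linear(128, 3072)
discriminator = torch.nn.Linear(3072, 1)
opt_g = torch.optim.Adam(generator.parameters(), lr=lr, betas=betas)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=betas)
```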