Train simultaneously, generalize better: Stability of gradient-based minimax learners
Authors: Farzan Farnia, Asuman Ozdaglar
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we discuss the results of our numerical experiments and compare the generalization performance of GDA and PPM algorithms in convex-concave settings and single-step and multi-step gradient-based methods in non-convex non-concave GAN problems. Our numerical results also suggest that in general non-convex non-concave problems the models learned by simultaneous optimization algorithms can generalize better than the models learned by non-simultaneous optimization methods. |
| Researcher Affiliation | Academia | Farzan Farnia and Asuman Ozdaglar, Laboratory for Information & Decision Systems, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. |
| Pseudocode | No | The paper describes the update rules for GDA, GDmax, and PPM using mathematical equations (1), (2), and (3) in Section 3, but it does not include pseudocode blocks or clearly labeled algorithm sections. (A sketch of these update rules appears below the table.) |
| Open Source Code | No | The paper does not provide any concrete access (e.g., repository link, explicit statement of code release) to the source code for the methodology described. |
| Open Datasets | Yes | We trained the spectrally-normalized GAN (SN-GAN) problem over CIFAR-10 (Krizhevsky et al., 2009) and CelebA (Liu et al., 2018) datasets. |
| Dataset Splits | No | The paper states: "We divided the CIFAR-10 and CelebA datasets to 50,000, 160,000 training and 10,000, 40,000 test samples, respectively." It provides details for the training and test splits, but there is no explicit mention of a validation set or its size/proportion. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "standard Adam algorithm (Kingma & Ba, 2014)" but does not specify any software names with version numbers (e.g., Python, TensorFlow, PyTorch versions) or other library dependencies. |
| Experiment Setup | Yes | To optimize the empirical minimax risk, we applied stochastic GDA with stepsize parameters αw = αθ = 0.02 and stochastic PPM with parameter η = 0.02, each for T = 20,000 iterations. [...] We used the standard Adam algorithm (Kingma & Ba, 2014) with batch-size 100. For simultaneous optimization algorithms we applied 1:1 Adam descent-ascent with the parameters lr = 10⁻⁴, β1 = 0.5, β2 = 0.9 for both minimization and maximization updates. To apply a non-simultaneous algorithm, we used 100 Adam maximization steps per minimization step and increased the maximization learning rate to 5×10⁻⁴. We ran each GAN experiment for T = 100,000 iterations. (A sketch of the corresponding training loops appears below the table.) |
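
The update rules referenced in the Pseudocode row (equations (1)–(3) of the paper) can be summarized in a minimal NumPy sketch. The function names, the fixed-point approximation used for the implicit PPM step, and the bilinear toy objective below are illustrative assumptions, not code from the paper:

```python
import numpy as np

def gda_step(w, theta, grad_w, grad_theta, alpha_w=0.02, alpha_theta=0.02):
    """Simultaneous gradient descent ascent (GDA): both players update
    from the same iterate (w_t, theta_t)."""
    gw, gt = grad_w(w, theta), grad_theta(w, theta)
    return w - alpha_w * gw, theta + alpha_theta * gt

def gdmax_step(w, theta, grad_w, grad_theta, alpha_w=0.02, alpha_theta=0.02,
               ascent_steps=100):
    """Non-simultaneous GDmax: (approximately) maximize over theta first,
    then take a single descent step in w."""
    for _ in range(ascent_steps):
        theta = theta + alpha_theta * grad_theta(w, theta)
    return w - alpha_w * grad_w(w, theta), theta

def ppm_step(w, theta, grad_w, grad_theta, eta=0.02, inner_iters=10):
    """Proximal point method (PPM): the gradients are evaluated at the *next*
    iterate, so the update is implicit; here it is approximated by a few
    fixed-point iterations (an illustrative simplification)."""
    w_next, theta_next = w, theta
    for _ in range(inner_iters):
        w_next = w - eta * grad_w(w_next, theta_next)
        theta_next = theta + eta * grad_theta(w_next, theta_next)
    return w_next, theta_next

# Toy usage on a bilinear game f(w, theta) = w @ A @ theta (illustrative only).
A = np.array([[1.0, 0.5], [0.0, 1.0]])
grad_w = lambda w, th: A @ th      # gradient of f with respect to w
grad_th = lambda w, th: A.T @ w    # gradient of f with respect to theta
w, th = np.ones(2), np.ones(2)
for _ in range(1000):
    w, th = ppm_step(w, th, grad_w, grad_th)
```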
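
The Experiment Setup row describes 1:1 Adam descent-ascent for the simultaneous methods and 100 Adam maximization steps per minimization step for the non-simultaneous baseline. Below is a minimal PyTorch sketch of such a loop; since no official code is released, `G`, `D`, `gan_loss`, `sample_batch`, and `latent_dim` are hypothetical placeholders for the SN-GAN generator, discriminator, minimax objective, and data pipeline.

```python
import torch

def adam_descent_ascent(G, D, gan_loss, sample_batch, latent_dim,
                        iters=100_000, max_steps_per_min_step=1,
                        lr_min=1e-4, lr_max=1e-4, device="cpu"):
    """Sketch of the reported SN-GAN training loop (no official code exists).
    Defaults mirror the simultaneous 1:1 setting: Adam with lr = 1e-4 and
    betas = (0.5, 0.9) for both players; batch size 100 is assumed to be
    what sample_batch() returns."""
    opt_g = torch.optim.Adam(G.parameters(), lr=lr_min, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(D.parameters(), lr=lr_max, betas=(0.5, 0.9))
    for _ in range(iters):
        # Maximization (discriminator) updates on the minimax objective.
        for _ in range(max_steps_per_min_step):
            x = sample_batch().to(device)
            z = torch.randn(x.size(0), latent_dim, device=device)
            opt_d.zero_grad()
            (-gan_loss(D, G, x, z)).backward()  # ascent = descent on -f
            opt_d.step()
        # Minimization (generator) update.
        x = sample_batch().to(device)
        z = torch.randn(x.size(0), latent_dim, device=device)
        opt_g.zero_grad()
        gan_loss(D, G, x, z).backward()
        opt_g.step()
```

Calling the function with `max_steps_per_min_step=100` and `lr_max=5e-4` would mimic the reported non-simultaneous baseline, while the defaults correspond to the reported 1:1 simultaneous setting.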