Stable Nonconvex-Nonconcave Training via Linear Interpolation
Authors: Thomas Pethick, Wanyun Xie, Volkan Cevher
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training. We corroborate the results with experiments on generative adversarial networks which demonstrate the benefits of the linear interpolation present in both RAPP and Lookahead. |
| Researcher Affiliation | Academia | Thomas Pethick EPFL (LIONS) thomas.pethick@epfl.ch Wanyun Xie EPFL (LIONS) wanyun.xie@epfl.ch Volkan Cevher EPFL (LIONS) volkan.cevher@epfl.ch |
| Pseudocode | Yes | Algorithm 1 Relaxed approximate proximal point method (RAPP) |
| Open Source Code | No | The paper does not provide any concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described. |
| Open Datasets | Yes | We demonstrate the methods on the CIFAR10 dataset (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper mentions tuning learning rates and update ratios, but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, or test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | The learning rates are tuned for GDA and those parameters are kept fixed across all other methods. The first experiment we conduct matches the setting of Chavdarova et al. (2020) by relying on the Adam optimizer and using an update ratio of 5:1 between the discriminator and generator. We additionally simplify the setup by using GDA-based optimizers with an update ratio of 1:1. |
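
The table above references Algorithm 1 (RAPP) and a Lookahead-style setup, both of which revolve around linearly interpolating back toward an anchor point after a block of inner updates. As a rough illustration of that idea rather than the authors' implementation, the following minimal Python sketch wraps Lookahead-style linear interpolation around plain gradient descent-ascent (GDA) on a bilinear saddle problem; the objective, step size, inner-loop length, and interpolation weight `lam` are illustrative assumptions.

```python
import math

# Minimal sketch (not the authors' code): Lookahead/RAPP-style linear
# interpolation wrapped around gradient descent-ascent (GDA).
# The bilinear objective, eta, inner_steps, and lam are illustrative
# assumptions chosen only to make the stabilizing effect visible.

def gda_step(x, y, eta=0.1):
    """One GDA step on f(x, y) = x * y, where plain GDA spirals outward."""
    grad_x, grad_y = y, x            # df/dx = y, df/dy = x
    return x - eta * grad_x, y + eta * grad_y

def interpolated_gda(x, y, outer_steps=200, inner_steps=10, lam=0.5):
    """Run blocks of GDA steps, then interpolate back toward the anchor."""
    for _ in range(outer_steps):
        x_bar, y_bar = x, y
        for _ in range(inner_steps):         # inner "fast" updates
            x_bar, y_bar = gda_step(x_bar, y_bar)
        # linear interpolation (relaxation) toward the anchor point
        x = x + lam * (x_bar - x)
        y = y + lam * (y_bar - y)
    return x, y

if __name__ == "__main__":
    x, y = interpolated_gda(1.0, 1.0)
    print(f"distance to the saddle point (0, 0): {math.hypot(x, y):.2e}")
```

On this toy problem plain GDA spirals away from the saddle point at (0, 0), whereas the interpolated outer loop contracts toward it, which is the stabilizing effect the paper attributes to linear interpolation.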