TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization

Authors: Xiang Li, Junchi Yang, Niao He

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our algorithm is fully parameter-agnostic and can achieve near-optimal complexities simultaneously in deterministic and stochastic settings of nonconvex-strongly-concave minimax problems. The effectiveness of the proposed method is further justified numerically for a number of machine learning applications.
Researcher Affiliation | Academia | Xiang Li, Junchi Yang, Niao He; Department of Computer Science, ETH Zurich, Switzerland; {xiang.li,junchi.yang,niao.he}@inf.ethz.ch
Pseudocode | Yes | Algorithm 1: TiAda (Time-scale Adaptive Algorithm). A hedged sketch of this update is given after the table.
Open Source Code | No | The paper states that some parts 'adapt code from Lv (2019)' or 'use the code adapted from Green9 (2018)', but it provides no explicit statement of, or link to, the authors' own implementation.
Open Datasets | Yes | We conduct the experiments on the MNIST dataset (LeCun, 1998)... Another successful and popular application of minimax optimization is generative adversarial networks... with CIFAR-10 dataset (Krizhevsky et al., 2009) in our experiments.
Dataset Splits | No | The paper uses standard datasets such as MNIST and CIFAR-10, which have established training sets, but it does not specify training/validation/test split percentages or sample counts, nor does it state that the standard splits were used.
Hardware Specification | No | The paper trains deep neural networks and runs experiments but does not specify hardware details such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions adapting code (e.g., from Lv (2019) or Green9 (2018)) and using Adam-like optimizers, but it does not specify software versions (e.g., PyTorch 1.x, Python 3.x).
Experiment Setup | Yes | In all the experiments, we merely select α = 0.6 and β = 0.4 without further tuning those two hyper-parameters. All experimental details, including the neural network structure and hyper-parameters, are described in Appendix A.1. We set the batch size as 128, and for the Adam-like optimizers, including Adam, NeAda-Adam and TiAda-Adam, we use β1 = 0.9, β2 = 0.999 for the first-moment and second-moment parameters. For the GAN experiment, we set the batch size as 512, the dimension of the latent variable as 50, and the weight of the gradient-penalty term as 10^-4; for the Adam-like optimizers, we set β1 = 0.5, β2 = 0.9. A hedged configuration sketch also follows the table.
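For orientation, here is a minimal sketch of the single-accumulator (AdaGrad-style) TiAda update as we read it from Algorithm 1: each player keeps a running sum of squared gradient norms, the y-player divides by its own accumulator raised to β, and the x-player divides by the larger of the two accumulators raised to α, which enforces the time-scale separation α > 1/2 > β without problem-dependent tuning. The step sizes, initial accumulator value, and the toy objective below are our own illustrative choices, not values from the paper.

```python
# Minimal NumPy sketch of the TiAda update (Algorithm 1, AdaGrad-style variant),
# written from the description above. Step sizes eta_x/eta_y, the initial
# accumulator value v0, and the toy objective are illustrative assumptions.
import numpy as np

def tiada(grad_x, grad_y, x, y, steps=1000,
          alpha=0.6, beta=0.4,      # exponents used in the paper's experiments
          eta_x=0.1, eta_y=0.1, v0=1e-8):
    vx, vy = v0, v0                 # per-player squared-gradient accumulators
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        vx += float(np.sum(gx ** 2))
        vy += float(np.sum(gy ** 2))
        # x adapts to the LARGER of the two accumulators with alpha > 1/2,
        # while y uses its own accumulator with beta < 1/2: this is the
        # time-scale adaptation that avoids tuning the step-size ratio.
        x = x - eta_x / max(vx, vy) ** alpha * gx
        y = y + eta_y / vy ** beta * gy
    return x, y

# Toy objective f(x, y) = x*y - 0.5*y**2 (strongly concave in y); x should
# approach 0, the minimizer of Phi(x) = max_y f(x, y) = 0.5*x**2.
x_out, y_out = tiada(lambda x, y: y,          # grad_x f
                     lambda x, y: x - y,      # grad_y f
                     x=np.array([1.0]), y=np.array([0.5]))
print(x_out, y_out)
```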
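The Adam hyper-parameters quoted for the GAN experiment map directly onto a standard PyTorch configuration. The sketch below is a hypothetical reconstruction for illustration only: the placeholder networks and the learning rate 1e-4 are our assumptions, while batch size 512, latent dimension 50, and betas = (0.5, 0.9) are the reported values; the actual architectures are described in Appendix A.1 of the paper.

```python
# Hedged PyTorch sketch of the GAN-baseline optimizer setup. Only batch_size,
# latent_dim, and the Adam betas come from the paper; the tiny networks and
# the learning rate are placeholder assumptions.
import torch
import torch.nn as nn

latent_dim = 50        # reported dimension of the latent variable
batch_size = 512       # reported batch size for the GAN experiment

# Placeholder models; the real architectures are in the paper's Appendix A.1.
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, 3 * 32 * 32), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
                              nn.Linear(128, 1))

# Reported Adam moment parameters for the GAN experiment: beta1=0.5, beta2=0.9.
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))
```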