Efficient Mirror Descent Ascent Methods for Nonsmooth Minimax Problems

Authors: Feihu Huang, Xidong Wu, Heng Huang

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct the experiments on fair classifier and robust neural network training tasks to demonstrate the efficiency of our new algorithms.
Researcher Affiliation | Academia | Feihu Huang, Xidong Wu, Heng Huang; Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, USA; huangfeihu2018@gmail.com, xidong-wu@pitt.edu, heng.huang@pitt.edu
Pseudocode | Yes | Algorithm 1: (Stochastic) Mirror Descent Ascent Algorithm; Algorithm 2: Accelerated Stochastic Mirror Descent Ascent (VR-SMDA) Algorithm. (A hedged sketch of the variance-reduced gradient estimator suggested by Algorithm 2's batch parameters appears after the table.)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | The Fashion-MNIST and MNIST datasets consist of 28×28 grayscale images classified into 10 categories, each with 60,000 training images and 10,000 testing images. The CIFAR-10 dataset includes 60,000 32×32 colour images (50,000 training images and 10,000 testing images).
Dataset Splits | No | The paper provides training and testing image counts but does not explicitly mention validation splits or counts.
Hardware Specification | Yes | The experiments are run on CPU machines with a 2.3 GHz Intel Core i9 as well as an NVIDIA Tesla P40 GPU.
Software Dependencies | No | The paper does not specify software versions for any ancillary software dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For fair comparison, we use the same step size for all methods: the step size for w is 0.001 and the step size for y is 0.00001. We apply Xavier normal initialization to the CNN layers. In our algorithms, we choose the mirror functions ψ_t(w) = (1/2) w^T H_t w and φ_t(y) = (1/2) y^T G_t y for all t ≥ 1, where H_t and G_t are generated from (12) and (13) respectively, given α = 0.1 and ρ = 0.00005. We set η = η_t = 1 in our algorithms. We run all deterministic algorithms for 1000 seconds and all stochastic algorithms for 50 epochs, and then record the loss value. For the stochastic methods, the batch sizes of PASGDA and SMDA are 3000; for our VR-SMDA, we set the large batch size b = 60000 and the mini-batch size b_1 = q = 3000. In this experiment, we set ν_1 = 0.0001 and ν_2 = 0.1 in problem (31), and K = 5 in problem (30). In the second experiment, we again use the same step size for all methods: the step size for w is 0.0005 and the step size for u is 0.00001. We set η = η_t = 1 in our algorithms and choose the mirror functions ψ_t(w) = (1/2) w^T H_t w and φ_t(u) = (1/2) u^T G_t u for all t ≥ 1, where H_t and G_t are generated from (12) and (13) respectively, given α = 0.1 and ρ = 0.0005. Here we only conduct experiments with stochastic methods; the batch sizes of PASGDA and SMDA are 600, and for our VR-SMDA we set b = 1200 and b_1 = q = 600. Following [38], we set ε = 0.4 in problem (28). (A hedged sketch of the resulting update step appears after the table.)
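The setup above pairs quadratic mirror functions with adaptive matrices H_t and G_t, which, with η_t = 1, reduces each (stochastic) mirror descent ascent step to a preconditioned gradient descent ascent step. The sketch below illustrates that update in the smooth, unconstrained case only. The diagonal AdaGrad-style construction of H_t and G_t, the grad_w/grad_y callbacks, and the default hyperparameters (taken from the reported values) are assumptions for illustration, not the paper's exact equations (12)-(13) or implementation.

```python
# Minimal sketch of a (stochastic) mirror descent ascent step for
# min_w max_y f(w, y), assuming the smooth unconstrained case and the
# quadratic mirror functions psi_t(w) = 0.5 * w^T H_t w and
# phi_t(y) = 0.5 * y^T G_t y reported above.  With eta_t = 1 the mirror
# step reduces to a preconditioned stochastic gradient step.  The diagonal
# AdaGrad-style construction of H_t and G_t below is an assumption standing
# in for the paper's equations (12)-(13).
import numpy as np

def smda(grad_w, grad_y, w0, y0, gamma=1e-3, lam=1e-5,
         alpha=0.1, rho=5e-5, iters=1000):
    """grad_w/grad_y are caller-supplied (stochastic) gradient oracles."""
    w, y = w0.copy(), y0.copy()
    sum_sq_w = np.zeros_like(w)   # running squared-gradient accumulators
    sum_sq_y = np.zeros_like(y)
    for t in range(1, iters + 1):
        gw, gy = grad_w(w, y), grad_y(w, y)
        sum_sq_w += gw ** 2
        sum_sq_y += gy ** 2
        # Diagonal adaptive matrices H_t, G_t (assumed form), kept positive by rho.
        H = alpha * np.sqrt(sum_sq_w / t) + rho
        G = alpha * np.sqrt(sum_sq_y / t) + rho
        # Mirror descent on w (minimization), mirror ascent on y (maximization).
        w = w - gamma * gw / H
        y = y + lam * gy / G
    return w, y
```

With quadratic mirror maps the Bregman proximal step has this closed form; the paper's algorithms additionally handle nonsmooth regularizers through proximal mirror steps, which this sketch omits.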
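The VR-SMDA settings report a large batch b, a mini-batch b_1, and a period q with b_1 = q, which matches the structure of a SPIDER-style variance-reduced gradient estimator: refresh the estimate with a large batch every q iterations, otherwise apply a recursive mini-batch correction. The sketch below is a guess at that structure rather than a transcription of Algorithm 2; grad_batch and data are hypothetical placeholders supplied by the caller.

```python
# Hedged sketch of a SPIDER-style variance-reduced gradient estimator, the
# kind of construction the reported large batch b and mini-batch/period
# b_1 = q suggest; the exact estimator in Algorithm 2 may differ.
import numpy as np

def vr_gradient(t, x, x_prev, v_prev, grad_batch, data, b, b1, q, rng):
    """Return a variance-reduced gradient estimate v_t at iterate x.

    grad_batch(x, idx) computes the mini-batch gradient over the samples
    indexed by idx (a hypothetical helper supplied by the caller).
    """
    n = len(data)
    if t % q == 0:
        # Periodically refresh with a large batch of size b.
        idx = rng.choice(n, size=min(b, n), replace=False)
        return grad_batch(x, idx)
    # Otherwise, recursive correction with a mini-batch of size b1.
    idx = rng.choice(n, size=b1, replace=False)
    return grad_batch(x, idx) - grad_batch(x_prev, idx) + v_prev
```

Such an estimate would replace the plain stochastic gradient before the preconditioned descent/ascent update, which is how the accelerated variant reduces gradient variance without using the large batch at every iteration.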