Jointly Improving the Sample and Communication Complexities in Decentralized Stochastic Minimax Optimization

Authors: Xuan Zhang, Gabriel Mancino-Ball, Necdet Serhat Aybat, Yangyang Xu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical Experiments: We test our proposed method on three problems: a quadratic minimax problem, robust non-convex linear regression, and robust neural network training. For the first and third problems, we let M = 8 so that each agent is represented by an NVIDIA Tesla V100 GPU. For the second problem, we test methods in a serial manner to facilitate more general reproducibility; here, we let M = 20. In all cases, we use a ring (cycle) graph with equal weights on edges, including self-loops, i.e., w_{i,i-1} = w_{i,i} = w_{i,i+1} = 1/3 for all i ∈ [M]. (A sketch of this ring mixing matrix appears after the table.) The learning rates for all tests are chosen such that η_y ∈ {10⁻¹, 10⁻², 10⁻³}, and we tune the ratio η_x/η_y ∈ {1, 10⁻¹, 10⁻², 10⁻³}. We test our proposed method against three methods: DPOSG (Liu et al. 2020), DM-HSGD (Xian et al. 2021), and the deterministic GT/DA (Tsaknakis, Hong, and Liu 2020).
Researcher Affiliation | Academia | Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, PA; Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180
Pseudocode | Yes | Algorithm 1: DGDA-VR
Open Source Code | Yes | The code is made available at https://github.com/gmancino/DGDA-VR.
Open Datasets | Yes | Inspired by (Deng and Mahdavi 2021), we adopt g_{x_i} corresponding to a two-layer network (200 hidden units) with a tanh activation function, and we use the MNIST (LeCun 1998) dataset for training. (A sketch of the described network appears after the table.)
Dataset Splits | No | The paper mentions using specific datasets (MNIST, a9a, ijcnn1) for training and testing but does not provide explicit training/validation/test splits, percentages, or sample counts, nor does it refer to standard predefined splits for these datasets.
Hardware Specification | Yes | For the first and third problems, we let M = 8 so that each agent is represented by an NVIDIA Tesla V100 GPU.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for its implementation, such as Python, PyTorch, or TensorFlow versions.
Experiment Setup | Yes | The learning rates for all tests are chosen such that η_y ∈ {10⁻¹, 10⁻², 10⁻³}, and we tune the ratio η_x/η_y ∈ {1, 10⁻¹, 10⁻², 10⁻³}. We test our proposed method against three methods: DPOSG (Liu et al. 2020), DM-HSGD (Xian et al. 2021), and the deterministic GT/DA (Tsaknakis, Hong, and Liu 2020)... For our proposed method, we set q = S1 = 100... We fix the mini-batch to be 32 for all methods besides GT/DA and set S1 = 1,000, q = 32 for our method... We fix the mini-batch size for all methods to be 100 (besides GT/DA). For DGDA-VR, we set q = 100 and S1 = 7,500. (A sketch of this tuning grid appears after the table.)
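
The ring-graph communication topology quoted above is fully specified by its mixing matrix. Below is a minimal sketch of how that matrix can be constructed; it is not taken from the authors' repository, and the function name is illustrative.

```python
import numpy as np

def ring_mixing_matrix(M: int) -> np.ndarray:
    """Mixing matrix for a ring (cycle) graph with self-loops,
    w_{i,i-1} = w_{i,i} = w_{i,i+1} = 1/3, as quoted above.
    Assumes M >= 3 so each agent has two distinct neighbors."""
    W = np.zeros((M, M))
    for i in range(M):
        W[i, i] = 1.0 / 3.0            # self-loop weight
        W[i, (i - 1) % M] = 1.0 / 3.0  # left neighbor
        W[i, (i + 1) % M] = 1.0 / 3.0  # right neighbor
    return W

W = ring_mixing_matrix(8)  # M = 8 agents, as in problems 1 and 3
# Symmetric with unit row sums, hence doubly stochastic.
assert np.allclose(W, W.T) and np.allclose(W.sum(axis=1), 1.0)
```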
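The Open Datasets row describes a two-layer network with 200 hidden units and tanh activation trained on MNIST. The following sketch is one plausible reading of that description; the input/output dimensions (28x28 images, 10 classes) are standard MNIST assumptions, not details given in the quote.

```python
import torch.nn as nn

# A minimal sketch of the described two-layer tanh network for MNIST.
# The hidden width (200) is from the quote; layer sizes are assumed.
model = nn.Sequential(
    nn.Flatten(),             # 28x28 image -> 784-dim vector
    nn.Linear(28 * 28, 200),  # first layer: 200 hidden units
    nn.Tanh(),                # tanh activation, as described
    nn.Linear(200, 10),       # second layer: 10 MNIST classes
)
```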
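The Experiment Setup row implies a grid of 12 learning-rate pairs per problem, combined with per-problem variance-reduction settings (q, S1) and mini-batch sizes. The sketch below only enumerates that grid; all variable names are hypothetical, and the quadratic problem's batch size is left as None because the quote does not state it.

```python
from itertools import product

# Grid values quoted above; names are illustrative, not the authors'.
eta_y_grid = [1e-1, 1e-2, 1e-3]        # η_y ∈ {10⁻¹, 10⁻², 10⁻³}
ratio_grid = [1.0, 1e-1, 1e-2, 1e-3]   # η_x/η_y ∈ {1, 10⁻¹, 10⁻², 10⁻³}

# Per-problem DGDA-VR settings as quoted in the table row.
problems = {
    "quadratic_minimax":  dict(q=100, S1=100,  batch=None),
    "robust_regression":  dict(q=32,  S1=1000, batch=32),
    "robust_nn_training": dict(q=100, S1=7500, batch=100),
}

configs = []
for name, settings in problems.items():
    for eta_y, ratio in product(eta_y_grid, ratio_grid):
        configs.append({"problem": name, "eta_y": eta_y,
                        "eta_x": ratio * eta_y, **settings})

print(len(configs))  # 3 problems x 12 learning-rate pairs = 36 runs
```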