Faster Stochastic Variance Reduction Methods for Compositional MiniMax Optimization

Authors: Jin Liu, Xiaokang Pan, Junwen Duan, Hong-Dong Li, Youqi Li, Zhe Qu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states "Extensive experiments support the efficiency of our proposed methods." and "Extensive experimental results support the effectiveness of our proposed methods." The "Experiments" section describes datasets, performance evaluation, Figures 1-6, and Table 1.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, Central South University, Changsha, China; (2) School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China.
Pseudocode | Yes | The paper provides "Algorithm 1: Illustration of NSTORM method" and "Algorithm 2: Illustration of ADA-NSTORM method". A generic sketch of the STORM-style estimator these methods build on is given after the table.
Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to a repository.
Open Datasets | Yes | "We employ four distinct image classification datasets in our study: CAT vs DOG, CIFAR10, CIFAR100 (Krizhevsky 2009), and STL10 (Coates, Ng, and Lee 2011)."
Dataset Splits | No | While standard, public datasets are used, the paper does not explicitly provide the training/validation/test splits (percentages or counts) used for the experiments. It mentions following a methodology for creating imbalanced variants but not the exact split proportions.
Hardware Specification | No | The paper does not provide specific hardware details, such as GPU or CPU models or memory specifications, used for running its experiments.
Software Dependencies | No | The paper does not provide ancillary software details with version numbers (e.g., library or solver names and versions).
Experiment Setup | Yes | Weight decay was set to 1e-4 throughout. Each method was trained with batch size 128 for 100 epochs. The parameter m was varied over {50, 500, 5000} and γ over {1, 0.9, 0.5}. The learning rate η_t was reduced by a factor of 10 at 50% and 75% of training, and β was set to 0.9. For robustness, each experiment was run three times with distinct seeds, reporting means and standard deviations. A hypothetical configuration sketch based on these settings follows the table.
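
The following is a minimal, hypothetical Python sketch of the reported training configuration. The dictionary layout and the helper name lr_at_epoch are illustrative choices, and the assumption that the learning-rate reduction is cumulative at each milestone is not stated in the paper; the base learning rate, optimizer, and exact seed values are not reported.

```python
# Hypothetical sketch of the reported setup; key names and the helper below
# are illustrative, not taken from the paper.
config = {
    "weight_decay": 1e-4,
    "batch_size": 128,
    "epochs": 100,
    "m_values": [50, 500, 5000],    # values swept for the parameter m
    "gamma_values": [1.0, 0.9, 0.5],
    "beta": 0.9,
    "num_seeds": 3,                 # each experiment repeated with three distinct seeds
}

def lr_at_epoch(base_lr: float, epoch: int, total_epochs: int = 100) -> float:
    """Step schedule: cut the learning rate by 10x at 50% and 75% of training.

    Assumes the reduction is cumulative at each milestone; the report only
    says the rate is "reduced by 10" at those points.
    """
    lr = base_lr
    if epoch >= 0.5 * total_epochs:
        lr /= 10.0
    if epoch >= 0.75 * total_epochs:
        lr /= 10.0
    return lr
```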
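
For readers unfamiliar with the variance-reduction family that NSTORM and ADA-NSTORM belong to, here is a minimal sketch of the generic STORM-style momentum estimator such methods build on. This is not the paper's pseudocode: NSTORM additionally maintains nested estimators for the compositional inner function and a max-player update, and the mapping of the momentum weight here to the paper's β is an assumption. Names such as storm_estimator and grad_fn are illustrative.

```python
import numpy as np

def storm_estimator(grad_fn, x, x_prev, d_prev, beta, sample):
    """One step of a generic STORM-style momentum variance-reduced estimator:

        d_t = g(x_t; xi_t) + (1 - beta) * (d_{t-1} - g(x_{t-1}; xi_t))

    The correction term reuses the same sample at the previous iterate, which
    is what reduces the estimator's variance. This is the classical STORM
    recursion, not the paper's exact NSTORM/ADA-NSTORM update.
    """
    g_curr = grad_fn(x, sample)       # stochastic gradient at the current point
    g_prev = grad_fn(x_prev, sample)  # same sample, previous point
    return g_curr + (1.0 - beta) * (d_prev - g_prev)

# Toy usage on f(x) = 0.5 * ||x||^2 with additive sampling noise.
rng = np.random.default_rng(0)
grad_fn = lambda x, sample: x + sample              # noisy gradient oracle
x_prev = rng.normal(size=5)
x = x_prev.copy()
d = grad_fn(x, rng.normal(scale=0.1, size=5))       # initialize with a plain stochastic gradient
lr, beta = 0.1, 0.9                                  # beta here is only the recursion's momentum weight
for _ in range(100):
    sample = rng.normal(scale=0.1, size=5)
    d = storm_estimator(grad_fn, x, x_prev, d, beta, sample)
    x_prev, x = x, x - lr * d                        # descent step with the variance-reduced direction
```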