Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

On the Convergence of Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent

Authors: Xinwen Zhang, Hongchang Gao

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, extensive experiments validate the effectiveness of our algorithms. ... We conduct extensive experiments to validate the effectiveness of our proposed algorithms, demonstrating superior performance compared to existing baselines. ... We conduct experiments using our smoothed method for both K = 1 and K = 5, with results presented in Figure 1. ... We present the experimental results on the tabular datasets in Figure 2, and on the image dataset in Figure 3.
Researcher Affiliation	Academia	Xinwen Zhang Temple University Philadelphia, PA, USA EMAIL Hongchang Gao Temple University Philadelphia, PA, USA EMAIL
Pseudocode	Yes	Algorithm 1 Stochastic Smoothed Multi-Level Compositional Gradient Descent Ascent with Variance Reduced (Smoothed-SMCGDA-VR) ... Algorithm 2 Stagewise-SMCGDA-VR
Open Source Code	No	The dataset is open access and the code will be shared after acceptance.
Open Datasets	Yes	6.1 Deep AUC Maximization In the deep AUC maximization problem, applying K-step gradient descent to minimize the cross-entropy loss function results in a K-level inner function G( ) in Eq. (1), with a detailed discussion provided in Appendix A. We compare our smoothed method with three baselines: SCGDA [17], SCGDAM [33], and NSTORM [26] across three datasets: CATvsDOG, CIFAR10 and STL10. ... 6.2 Multi-Instance Learning Following [39], multi-instance learning can be reformulated as a multi-level compositional minimax problem as shown in Eq. 1, with details provided in Appendix A. ... We conduct experiments on five commonly used tabular benchmark datasets [10, 1] for MIL tasks MUSK1, MUSK2, Fox, Tiger, and Elephant as well as one histopathological image dataset, namely Breast Cancer.
Dataset Splits	Yes	Imbalanced binary datasets are generated following the approach described in [33], with an imbalance ratio of 0.05. ... For the tabular datasets, we perform 5-fold cross-validation, repeating each run with three random seeds. For the image dataset, we use two random seeds. ... All datasets are randomly split into training and testing sets with a 0.9/0.1 ratio.
Hardware Specification	No	The paper does not explicitly state the specific models of GPUs, CPUs, or memory used for the experiments. It only mentions using 'ResNet20 as the model' without detailing the hardware it was run on.
Software Dependencies	No	The paper mentions using 'ResNet20' as a model and 'PESG optimizer [34]', but it does not specify version numbers for any software dependencies, programming languages, or libraries used in the experiments.
Experiment Setup	Yes	For all algorithms, we set both the learning rate and the momentum or variance reduction coefficient to 0.1. In our proposed method, we employ smoothed techniques during the first 90 epochs, followed by stage-wise updates for the remaining 10 epochs. ... The learning rate for the primal variables is tuned within the set {1e-1, 1e-2, 1e-3}, while the learning rate for the dual variables is fixed at 1. We vary the value of K from 1 to 5 and ultimately fix it at 3 to achieve more stable performance.
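
Because the code is not yet released, the split protocol quoted under Dataset Splits can only be approximated. The following is a minimal Python sketch of one plausible reading: a random 0.9/0.1 train/test split, and 5-fold cross-validation repeated over three random seeds. The function names, the seed values (0, 1, 2), and the shuffling scheme are assumptions made for illustration and are not taken from the paper; the quoted text also does not say how the 0.9/0.1 split and the cross-validation folds are combined, so the two are kept as separate helpers.

```python
import random


def train_test_split_indices(n, test_fraction=0.1, seed=0):
    """Randomly split range(n) into train/test index lists (0.9/0.1 by default)."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def kfold_indices(n, k=5, seed=0):
    """Yield (train, val) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        val = folds[held_out]
        train = [j for f in range(k) if f != held_out for j in folds[f]]
        yield train, val


def repeated_cv(n, k=5, seeds=(0, 1, 2)):
    """Repeat k-fold cross-validation once per seed (seed values are hypothetical)."""
    for seed in seeds:
        for fold, (train, val) in enumerate(kfold_indices(n, k, seed)):
            yield seed, fold, train, val
```

With the defaults, `repeated_cv` yields 15 (seed, fold) runs per dataset, each of which would be trained and evaluated separately. A replication would also need the imbalanced-data generation procedure from [33], which is not reproduced here.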