Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization

Authors: Xingyu Chen, Bokun Wang, Ming Yang, Qihang Lin, Tianbao Yang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on three tasks demonstrate the effectiveness of the proposed algorithms.
Researcher Affiliation	Academia	Department of CSE, Texas A&M University, College Station, USA Tippie College of Business, The University of Iowa, Iowa City, USA. Correspondence to: EMAIL.
Pseudocode	Yes	Algorithm 1 SONEX for solving (3); Algorithm 2 ALEXR2 for solving (6)
Open Source Code	Yes	Code is included in Supplementary Material.
Open Datasets	Yes	We use 3 datasets: Camelyon17, Amazon [34], and Celeb A [35]. ... Two datasets are used, namely Adult [41] and COMPAS [42]... fine-tuning a CLIP model for autonomous driving on the BDD100K dataset [43]... LAION400M [53]
Dataset Splits	Yes	We perform data split of Celeb A dataset ourselves: within each group, samples are divided into training, validation, and test sets in an 8:1:1 ratio. For all the other datasets mentioned in our paper we use default split.
Hardware Specification	Yes	The experiments of AUC Maximization with ROC Fairness Constraints and the experiments of group DRO of Camelyon17 dataset and Celeb A dataset in our paper is run on an A30 24G GPU, among which the first experiment takes less than 10 minutes for each run while for the second one, Camelyon17 takes about 4 hours and Celeb A takes about 5 hours, for each run. The Amazon dataset of group DRO experiment is run on one A100 40GB GPUs and takes about 12 hours each run. The experiment of continual learning with non-forgetting constraints is run on two A100 40GB GPUs and takes about 12 hours each run.
Software Dependencies	No	The paper discusses algorithms and general software types (e.g., Adam-type updates, MSVR) but does not provide specific version numbers for any libraries, programming languages, or environments used.
Experiment Setup	Yes	We tune learning rate in {1e-5, 2e-5, 5e-5, 1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3}, λ in {1, 0.1, 0.01}, γ and β in {0.1, 0.2, 0.5, 0.8} and γ in {0.01, 0.02, 0.05, 0.1, 0.2}. We set weight decay to be 0.01, 0.01, 0.02 for the three tasks, respectively. We use step decay (decay by 0.3x for every 3 epochs), linear decay with 1st epoch warmup, step decay (decay by 0.2x for every 3 epochs) for learning rate for the three tasks, respectively.
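
The Dataset Splits row describes an 8:1:1 train/validation/test split applied within each group of Celeb A. The following is a minimal sketch of such a group-wise split, not the authors' code. The helper name `split_within_groups`, the fixed seed, and the rounding rule (floor division, with any remainder going to the test set) are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_within_groups(group_labels, ratios=(8, 1, 1), seed=0):
    """Split sample indices into train/val/test separately inside each group.

    group_labels: one group id per sample (hypothetical input format).
    ratios: relative sizes of train, validation, and test (8:1:1 by default).
    """
    rng = random.Random(seed)

    # Collect the sample indices that belong to each group.
    by_group = defaultdict(list)
    for idx, group in enumerate(group_labels):
        by_group[group].append(idx)

    total = sum(ratios)
    train, val, test = [], [], []
    for group in sorted(by_group):
        idxs = by_group[group]
        rng.shuffle(idxs)
        n_train = len(idxs) * ratios[0] // total
        n_val = len(idxs) * ratios[1] // total
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        # Assumption: samples left over after rounding go to the test set.
        test += idxs[n_train + n_val:]
    return train, val, test


# Toy example: two groups of unequal size.
labels = [0] * 100 + [1] * 50
train_idx, val_idx, test_idx = split_within_groups(labels)
```

Splitting per group rather than over the pooled data keeps each group's share roughly the same across the three sets, which matters when small groups drive the worst-case metric.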
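
The Experiment Setup row lists a learning-rate grid, a λ grid, and a step-decay schedule. The sketch below enumerates the learning-rate and λ combinations and computes a step-decayed learning rate. It reads "decay by 0.3x for every 3 epochs" as multiplying the rate by 0.3 every 3 epochs, which is an assumption. The function name `step_decay_lr` is hypothetical, and the γ and β grids are omitted because the row does not make clear which set belongs to which parameter.

```python
from itertools import product

# Search grids as quoted in the Experiment Setup row.
LEARNING_RATES = [1e-5, 2e-5, 5e-5, 1e-4, 2e-4, 5e-4, 1e-3, 2e-3, 5e-3]
LAMBDAS = [1, 0.1, 0.01]


def step_decay_lr(base_lr, epoch, factor=0.3, every=3):
    """Learning rate at a 0-indexed epoch under step decay.

    Assumption: the rate is multiplied by `factor` once per `every` epochs.
    """
    return base_lr * factor ** (epoch // every)


# Every (learning rate, lambda) pair in the grid.
grid = list(product(LEARNING_RATES, LAMBDAS))

# Schedule for the first 9 epochs starting from 1e-3.
schedule = [step_decay_lr(1e-3, epoch) for epoch in range(9)]
```

Under this reading the rate is constant within each block of 3 epochs and drops at epochs 3 and 6; the third task's schedule would use `factor=0.2` instead.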