Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Revisiting Large-Scale Non-convex Distributionally Robust Optimization

Authors: Qi Zhang, Yi Zhou, Simon Khan, Ashley Prater-Bennette, Lixin Shen, Shaofeng Zou

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our theoretical results and insights are further verified numerically on a number of tasks, and our algorithms outperform the existing DRO method (Jin et al., 2021). [...] In this section, we conduct numerical studies on a set of regression tasks (Chen et al., 2023) on the life expectancy data. This dataset consists of N = 2413 samples, where we select the first 2000 samples for training and the rest samples for testing. [...] In Figure 2, we provide the training curves with fine-tuned learning rate for SGD, Normalized SPIDER (Chen et al., 2023), Normalized-SGD with momentum (Jin et al., 2021) and our proposed D-SGD-C and D-SPIDER-C methods. Our D-SPIDER-C has similar performance compared with Normalized-SPIDER and both our two algorithms outperform the SGD and Normalized-SGD with momentum methods.
Researcher Affiliation	Academia	School of Electrical, Computer and Energy Engineering, Arizona State University; Department of Computer Science and Engineering, Texas A&M University; Information Directorate, Air Force Research Laboratory; Department of Mathematics, Syracuse University
Pseudocode	Yes	Algorithm 1 D-GD [...] Algorithm 2 D-SGD-C [...] Algorithm 3 D-Spider-C [...] Algorithm 4 D-SGD-M
Open Source Code	No	The paper does not explicitly state that source code is provided, nor does it include a link to a code repository.
Open Datasets	Yes	In this section, we conduct numerical studies on a set of regression tasks (Chen et al., 2023) on the life expectancy data. This dataset consists of N = 2413 samples, where we select the first 2000 samples for training and the rest samples for testing. [...] In this part, we conduct experiments on the famous CIFAR-10 dataset (Alex, 2009)
Dataset Splits	Yes	This dataset consists of N = 2413 samples, where we select the first 2000 samples for training and the rest samples for testing.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running its experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup	Yes	The non-convex original loss function is set as $\ell(x, (z_i, y_i)) = \frac{1}{2}(y_i - z_i^\top x)^2 + 0.1 \sum_{j=1}^{34} \ln(1 + \lvert x_j \rvert)$, where $x = (x_1, x_2, \ldots, x_{34})$ is the trainable parameter. For the DRO model, $\lambda$ is set to 0.01, and the initial value $\eta_0$ is set to 0.1. [...] The iteration number is set to 50. For existing methods, we follow the fine-tuned learning rates in (Chen et al., 2023), where the step size $\beta_t = 10^{-4}$ for GD, $\beta_t = 0.2$ for normalized GD and $\beta_t = 0.3 \min\{\frac{1}{10}, \frac{1}{\lVert \nabla_{x,\eta} L(x_t, \eta_t) \rVert}\}$. For our D-GD method, we set $\alpha_t = \beta_t = 10^{-4}$ and for our D-GD-C method, we set $\alpha_t = 10^{-4}$ and $\beta_t = 0.35 \min\{\frac{1}{2000}, \frac{1}{\lVert \nabla_x L(x_t, \eta_{t+1}) \rVert}\}$. [...] In our stochastic setting, we run the experiments for 5000 iterations. We set the mini-batch size to 50. For SGD, the step size is $\beta_t = 2 \times 10^{-4}$. For the normalized SGD with momentum method, the momentum coefficient is set to $10^{-4}$ and the step size is set to $8 \times 10^{-3}$. For the normalized SPIDER method, we have that step size $\beta_t = 4 \times 10^{-3}$ and epoch size $q = 20$. For our D-SGD-C, we set $\alpha_t = 8 \times 10^{-5}$ and $\beta_t = 0.05 \min(\frac{1}{100}, \frac{1}{\lVert v_t \rVert})$. For our D-SPIDER-C, we have that $\alpha_t = 8 \times 10^{-5}$ and $\beta_t = 7.5 \times 10^{-3} \min\{2.5, \frac{1}{\lVert v_t \rVert}\}$.
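
As a rough illustration of the Experiment Setup row, the per-sample regression loss and the clipped step-size rule quoted for D-SGD-C might be sketched as follows. This is not the authors' code (the paper provides none). The function names `regression_loss` and `clipped_step_size`, the default arguments, and the handling of a zero-norm direction are assumptions made for this sketch, and the formulas follow a reading of the quoted text in which the exponents are negative and the denominators are Euclidean norms.

```python
import math


def regression_loss(x, z, y, reg=0.1):
    """Per-sample loss 0.5 * (y - z^T x)^2 + reg * sum_j ln(1 + |x_j|).

    x: trainable parameters, z: feature vector, y: scalar target.
    reg=0.1 matches the coefficient quoted in the setup.
    """
    residual = y - sum(zj * xj for zj, xj in zip(z, x))
    penalty = sum(math.log1p(abs(xj)) for xj in x)
    return 0.5 * residual ** 2 + reg * penalty


def clipped_step_size(v, scale=0.05, cap=1.0 / 100):
    """Step size scale * min(cap, 1 / ||v||) for an update direction v.

    Defaults follow the values quoted for D-SGD-C. Returning scale * cap
    when ||v|| is zero is an assumption of this sketch, not from the paper.
    """
    norm = math.sqrt(sum(vj * vj for vj in v))
    if norm == 0.0:
        return scale * cap
    return scale * min(cap, 1.0 / norm)
```

With these defaults the cap is active whenever the direction has norm below 100, so the step size only shrinks for very large stochastic gradient estimates.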