Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Stochastic Constrained DRO with a Complexity Independent of Sample Size
Authors: Qi Qi, Jiameng Lyu, Kung-Sik Chan, Er-Wei Bai, Tianbao Yang
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies demonstrate the effectiveness of the proposed algorithms for solving non-convex and convex constrained DRO problems. |
| Researcher Affiliation | Academia | Qi Qi, Department of Computer Science, The University of Iowa, Iowa City, IA 52242, USA; Jiameng Lyu, Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, China; Kung-Sik Chan, Department of Statistics and Actuarial Science, The University of Iowa, Iowa City, IA 52242, USA; Er-Wei Bai, Department of Electrical and Computer Engineering, The University of Iowa, Iowa City, IA 52242, USA; Tianbao Yang, Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA |
| Pseudocode | Yes | Algorithm 1: SCDRO(x1, v1, u1, s1, η1, T1); Algorithm 2: ASCDRO(x1, v1, u1, s1, η1, T1); Algorithm 3: RSCDRO or RASCDRO |
| Open Source Code | No | The paper does not contain an explicit statement about releasing their code or a link to a repository for the methodology described in this paper. |
| Open Datasets | Yes | Datasets. We conduct experiments on four imbalanced datasets, namely CIFAR10-ST, CIFAR100-ST (Qi et al., 2020b), ImageNet-LT (Liu et al., 2019), and iNaturalist2018 (iNaturalist 2018 competition dataset). |
| Dataset Splits | No | For constructing CIFAR10-ST and CIFAR100-ST, we artificially construct imbalanced training data, where we only keep the last 100 images of each class for the first half classes, and keep other classes and the test data unchanged. ImageNet-LT is a long-tailed subset of the original ImageNet-2012, obtained by sampling a subset following the Pareto distribution with the power value 6. |
| Hardware Specification | Yes | All our results are conducted on Tesla V100. |
| Software Dependencies | No | The paper mentions software like SGD with momentum and cross-entropy loss, but does not provide specific version numbers for any programming languages, libraries, or frameworks used. |
| Experiment Setup | Yes | Parameters and Settings. For all experiments, the batch size is 128 for CIFAR10-ST and CIFAR100-ST, and 512 for ImageNet-LT and iNaturalist2018. The loss function is the CE loss. The λ0 is set to 1e-3. The (primal) learning rates for all methods are tuned in {0.01, 0.05, 0.1, 0.5, 1}. The learning rate for updating the dual variable in PG_SMD2 and SPD is tuned in {1e-5, 5e-5, 1e-4, 5e-4}. The momentum parameter β in our proposed algorithms and RECOVER is tuned in {0.1 : 0.1 : 0.9}. For RECOVER, the hyper-parameter λ is tuned in {1, 50, 100}. The constraint parameter ρ is tuned in {0.1, 0.5, 1} for the comparison of generalization performance unless specified otherwise. The initial λ and Lagrange multiplier in Dual SGM are both tuned in {0.1, 1, 10}. |
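The hyperparameter grids quoted above can be made concrete with a short enumeration sketch. The grid values (primal learning rates, momentum β swept from 0.1 to 0.9 in steps of 0.1, and ρ) are taken verbatim from the paper's settings; the enumeration loop and all function/variable names are hypothetical illustration, not the authors' code.

```python
from itertools import product

# Grid values quoted from the paper's "Parameters and Settings".
PRIMAL_LRS = [0.01, 0.05, 0.1, 0.5, 1]               # primal learning rate, all methods
DUAL_LRS = [1e-5, 5e-5, 1e-4, 5e-4]                  # dual learning rate (PG_SMD2, SPD only)
BETAS = [round(0.1 * k, 1) for k in range(1, 10)]    # momentum beta: {0.1, 0.2, ..., 0.9}
RHOS = [0.1, 0.5, 1]                                 # constraint parameter rho

def candidate_configs():
    """Yield every (primal lr, beta, rho) combination for the proposed
    algorithms; lambda0 is fixed at 1e-3 as stated in the paper."""
    for lr, beta, rho in product(PRIMAL_LRS, BETAS, RHOS):
        yield {"lr": lr, "beta": beta, "rho": rho, "lambda0": 1e-3}

configs = list(candidate_configs())
print(len(configs))  # 5 lrs * 9 betas * 3 rhos = 135 combinations
```

For the baselines PG_SMD2 and SPD, the dual learning rate grid (`DUAL_LRS`) would be crossed in as an extra axis in the same way.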