Efficient Generalization with Distributionally Robust Learning

Authors: Soumyadip Ghosh, Mark Squillante, Ebisa Wollega

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results demonstrate the significant benefits of our approach over previous work in improving learning for model generalization. Numerous experiments were conducted to empirically evaluate our progressively sampled subgradient descent (PSSG) Algorithm 1, with the main objectives of reaching the optimal solutions of the DRL formulation and improving model generalization more consistently and more quickly than alternative methods.
Researcher Affiliation | Collaboration | Soumyadip Ghosh and Mark S. Squillante, Mathematical Sciences, IBM Research AI, Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA ({ghoshs, mss}@us.ibm.com); Ebisa D. Wollega, Department of Engineering, Colorado State University-Pueblo, Pueblo, CO 81001, USA (ewolleg@csupueblo.edu)
Pseudocode | Yes | Algorithm 1 presents our progressively sampled subgradient descent algorithm, which follows SGD-like iterations for the outer minimization problem in (1) according to θ_{t+1} = θ_t - γ_t ∇_θ R̂_t(θ_t) = θ_t - γ_t G_t. Algorithm 2, Inner Max(Z, M, ρ), handles the inner maximization. (A minimal code sketch of the outer loop is given after the table.)
Open Source Code | No | The paper's checklist states that code is included in the supplementary material or via a URL, but no specific URL or explicit statement about code availability for this paper's methodology appears in the main text or its references.
Open Datasets | Yes | Experiments were conducted over 13 public-domain datasets, as detailed in Table 1, with sizes ranging from O(10^2) to O(10^6). We include MNIST... Table 1 cites UCI [17], Open ML [6], MNIST [14] and SKLearn [16]. (A hypothetical loading example appears after the table.)
Dataset Splits | Yes | ...regularized via 10-fold CV... The 10-fold CV procedure partitions the full training dataset into 10 equal parts and trains a regularized model over each dataset formed by holding out one of the 10 parts as the validation dataset. (A KFold sketch of this procedure appears after the table.)
Hardware Specification | Yes | All experiments were implemented in Python 3.7 and run on a 16-core 2.6GHz Intel Xeon processor with 128GB memory.
Software Dependencies | No | The paper states "All experiments were implemented in Python 3.7" but does not provide specific version numbers for other key software libraries or dependencies used in the experiments.
Experiment Setup | Yes | Our Algorithm 1 starts with an initial sample size of 1 and a fixed step length γ = 0.5; following Theorem 6, the constant-growth factor ν = 1.001 is chosen close to 1. We set δ = 0.01. All DRL algorithms solved the inner-maximization formulation to within ϵ-accuracy, where ϵ = 10^-7. (A toy driver using these settings closes the section.)
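
The pseudocode row above quotes the SGD-like outer update θ_{t+1} = θ_t - γ_t G_t together with a progressively growing sample size. The following is a minimal sketch of how such a loop could be organized; the callbacks inner_max and subgradient are hypothetical stand-ins for Algorithm 2 (Inner Max) and the subgradient oracle, not the authors' released code.

import numpy as np

def pssg(data, theta0, subgradient, inner_max, gamma=0.5, nu=1.001,
         m0=1, epsilon=1e-7, n_iters=1000, seed=0):
    # Sketch of a progressively sampled subgradient descent outer loop.
    # `subgradient(batch, weights, theta)` and `inner_max(batch, theta, epsilon)`
    # are hypothetical callbacks standing in for the subgradient oracle and
    # Algorithm 2 (Inner Max), respectively.
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    m = float(m0)                          # current sample size, grown each iteration
    n = len(data)
    for t in range(n_iters):
        batch_idx = rng.choice(n, size=min(int(np.ceil(m)), n), replace=False)
        batch = [data[i] for i in batch_idx]
        # Solve the inner maximization over the sampled batch to epsilon accuracy,
        # obtaining the worst-case distribution weights for the robust risk.
        weights = inner_max(batch, theta, epsilon)
        # Subgradient G_t of the weighted empirical risk at the current iterate.
        g_t = subgradient(batch, weights, theta)
        theta = theta - gamma * g_t        # fixed step length (gamma = 0.5 in the paper)
        m = m * nu                         # constant-growth factor (nu = 1.001 in the paper)
    return theta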
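
The open-datasets row lists public sources (UCI, Open ML, MNIST, SKLearn). For reference only, one standard way to pull such data is scikit-learn's fetch_openml; the dataset name below is the usual OpenML identifier for MNIST and is an assumption about how the data could be reproduced, not the paper's loading code.

from sklearn.datasets import fetch_openml

# Fetch the 70,000-sample MNIST digits from OpenML. This mirrors one of the
# public sources cited in Table 1; the paper's own loading code is not available.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
print(X.shape, y.shape)  # expected: (70000, 784) (70000,)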
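
The dataset-splits row describes regularization tuned via 10-fold CV, holding out one of the 10 parts as the validation set each time. Below is a minimal sketch of that procedure using scikit-learn's KFold; the logistic-regression estimator and accuracy scoring are illustrative placeholders, not the paper's specific models.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

def ten_fold_cv_score(X, y, C=1.0, seed=0):
    # Average validation accuracy of a regularized model over 10 folds.
    # LogisticRegression with regularization strength 1/C is an illustrative
    # placeholder for the paper's regularized models.
    kf = KFold(n_splits=10, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in kf.split(X):
        model = LogisticRegression(C=C, max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[val_idx], y[val_idx]))
    return float(np.mean(scores))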
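
Finally, the experiment-setup row reports the key hyperparameters. Assuming the pssg sketch above, a toy run with those settings might look like the following; the synthetic 1-D regression data, the uniform-weight inner_max, and the squared-loss subgradient are placeholders chosen only to make the snippet runnable, and the confidence parameter δ = 0.01 from the paper is not modeled here.

import numpy as np

rng = np.random.default_rng(0)
data = [(x, 2.0 * x + rng.normal(scale=0.1)) for x in rng.uniform(0.2, 1.0, size=200)]

def inner_max(batch, theta, epsilon):
    # Trivial placeholder: uniform weights instead of the worst-case
    # distribution that Algorithm 2 would compute.
    return np.full(len(batch), 1.0 / len(batch))

def subgradient(batch, weights, theta):
    # Gradient of the weighted squared loss for a 1-D linear model y ~ theta * x.
    return sum(w * 2.0 * (theta * x - y) * x for (x, y), w in zip(batch, weights))

# Paper-reported settings: initial sample size 1, gamma = 0.5, nu = 1.001, epsilon = 1e-7.
theta_hat = pssg(data, theta0=0.0, subgradient=subgradient, inner_max=inner_max,
                 gamma=0.5, nu=1.001, m0=1, epsilon=1e-7, n_iters=500)
print(theta_hat)  # hovers near 2.0 for this toy data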