Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Faster Rates of Differentially Private Stochastic Convex Optimization
Authors: Jinyan Su, Lijie Hu, Di Wang
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct experiments of our new methods on real-world data. Experimental results also provide new insights into established theories. 6. Experiments In this section, we provide experimental studies to compare the effectiveness of the proposed methods for several problems satisfying TNC. Experimental Settings For the instances satisfying TNC, here we study three examples that have been studied in the previous related work such as (Liu et al., 2018; Xu et al., 2017). ... Dataset and Parameter Settings We will implement all the above methods on four real-world datasets from the libsvm website, namely a8a (n = 22,696, d = 123 for training, and n = 9,865 for testing), a9a (n = 32,561, d = 123 for training, and n = 16,281 for testing), ijcnn1 (n = 49,990, d = 22 for training, and n = 91,701 for testing), and w7a (n = 24,692, d = 300 for training, and n = 25,057 for testing). ... Experimental Results In Figure 1, we show the performance of Iterated SGD with different θ compared with three baseline methods for ℓ2-norm regularized logistic regression. |
| Researcher Affiliation | Academia | Jinyan Su, Cornell University; Lijie Hu, Provable Responsible AI and Data Analytics Lab, Division of CEMSE, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; Di Wang, Provable Responsible AI and Data Analytics Lab, Division of CEMSE, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia |
| Pseudocode | Yes | Algorithm 1 Phased-SGD(w0, η, n, W) (Feldman et al., 2020); Algorithm 2 Private Stochastic Approximation(w1, n, R0); Algorithm 3 Phased-SGD-SC(w0, γ, ϵ, δ, W); Algorithm 4 Private Stochastic Approximation-II(w0, n, W); Algorithm 5 Iterated Phased-SGD(w1, n, W, θ); Algorithm 6 Phased-ERM(w0, η, n, W) (Feldman et al., 2020); Algorithm 7 Epoch-DP-SGD(η1, n1, n, w0); Algorithm 8 Faster-DPSGD-SC |
| Open Source Code | No | The paper describes various algorithms and methods but does not provide any specific links to source code repositories or explicitly state that the code for this work is being released. |
| Open Datasets | Yes | We will implement all the above methods on four real-world datasets from the libsvm website, namely a8a (n = 22,696, d = 123 for training, and n = 9,865 for testing), a9a (n = 32,561, d = 123 for training, and n = 16,281 for testing), ijcnn1 (n = 49,990, d = 22 for training, and n = 91,701 for testing), and w7a (n = 24,692, d = 300 for training, and n = 25,057 for testing). For each sample in each dataset, we preprocess it to make its feature vector satisfy ∥x∥₁ ≤ 1 so that the loss function will be Lipschitz for some constant. https://www.csie.ntu.edu.tw/~cjlin/libsvm/ |
| Dataset Splits | No | Dataset and Parameter Settings We will implement all the above methods on four real-world datasets from the libsvm website, namely a8a (n = 22,696, d = 123 for training, and n = 9,865 for testing), a9a (n = 32,561, d = 123 for training, and n = 16,281 for testing), ijcnn1 (n = 49,990, d = 22 for training, and n = 91,701 for testing), and w7a (n = 24,692, d = 300 for training, and n = 25,057 for testing). The paper specifies the number of training and testing samples for each dataset but does not provide details on the methodology used to create these splits (e.g., random seed, stratification, or specific predefined splits from the libsvm website itself beyond just the sample counts). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models, or cloud computing specifications. |
| Software Dependencies | No | The paper discusses implementing methods but does not provide specific version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | Our approach involves conducting hyperparameter tuning to yield optimal outcomes, and we will present the results based on the selected hyperparameters. ... Here we will set the parameter λ = 10⁻³. ... We will set δ = 1/n^1.1 for all experiments. ... When presenting the results for different privacy budgets ϵ, we will use n = 10⁴ samples and choose ϵ = {0.5, 1, 1.5, 2} respectively. |
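The quoted Experiment Setup row can be made concrete with a small sketch. This is not the authors' code; it is a minimal, illustrative reconstruction of the reported parameter settings (δ = 1/n^1.1, ϵ ∈ {0.5, 1, 1.5, 2}, λ = 10⁻³, n = 10⁴), with the function name and dictionary layout chosen here for clarity.

```python
# Illustrative sketch of the experiment settings quoted above (hypothetical
# names; not code released by the paper).
def privacy_settings(n: int) -> dict:
    """Reconstruct the (epsilon, delta, lambda) configuration reported in the paper."""
    delta = 1.0 / n ** 1.1            # paper sets delta = 1 / n^{1.1} for all experiments
    epsilons = [0.5, 1.0, 1.5, 2.0]   # privacy budgets swept in the experiments
    lam = 1e-3                        # regularization parameter lambda = 10^{-3}
    return {"delta": delta, "epsilons": epsilons, "lambda": lam}

settings = privacy_settings(10_000)   # paper uses n = 10^4 samples for the epsilon sweep
```

This mirrors only the numbers stated in the excerpt; the actual hyperparameter-tuning procedure is not described in enough detail to reconstruct.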