Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Demystifying SGD with Doubly Stochastic Gradients
Authors: Kyurae Kim, Joohwan Ko, Yian Ma, Jacob R. Gardner
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Simulation. We evaluate the insight on the tradeoff between $b$ and $m$ for correlated estimators on a synthetic problem. ... Results. The results are shown in Fig. 1. |
| Researcher Affiliation | Academia | ¹Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA, U.S.A. ²KAIST, Daejeon, South Korea, Republic of ³Halıcıoğlu Data Science Institute, University of California San Diego, San Diego, CA, U.S.A. |
| Pseudocode | Yes | 3.2.1. ALGORITHM. Doubly SGD-RR. The algorithm is stated as follows: ❶ Reshuffle and partition the gradient estimators into minibatches of size $b$ as $P = \{P_1, \ldots, P_p\}$, where $p = n/b$ is the number of partitions or minibatches. ❷ Perform gradient descent for $i = 1, \ldots, p$ steps as $\boldsymbol{x}_{t+1}^{k} = \Pi_{\mathcal{X}}\left(\boldsymbol{x}_{t}^{k} - \gamma_t \, \boldsymbol{g}_{P_i}(\boldsymbol{x}_{t}^{k})\right)$. ❸ $k \leftarrow k + 1$ and go back to step ❶. |
| Open Source Code | No | The paper mentions: "See the implementation at https://github.com/zixu1986/Doubly_Stochastic_Gradients". However, this is given as an example of a related work's implementation to illustrate shared features across a batch, not as their own code release for the work described in the paper. |
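| Open Datasets | No | The paper states: "In particular, we set $f_i(\boldsymbol{x}; \boldsymbol{\eta}) = \frac{L_i}{2} \lVert \boldsymbol{x} - \boldsymbol{x}_i^{\ast} + \boldsymbol{\eta} \rVert^2$... where the smoothness constants $L_i \sim \text{Inv-Gamma}(1/2, 1/2)$ and the stationary points $\boldsymbol{x}_i^{\ast} \sim \mathcal{N}(\boldsymbol{0}_d, s^2 \boldsymbol{I}_d)$ are sampled randomly..." This describes a synthetic problem setup rather than the use of a public dataset with explicit access information. |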
| Dataset Splits | No | The paper describes a synthetic problem and its setup but does not specify any training, validation, or test dataset splits in the typical machine learning sense, as it generates data for simulation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for its simulations or experiments. |
| Software Dependencies | No | The paper does not list any specific software dependencies with version numbers that would be needed for reproducibility. |
| Experiment Setup | Yes | Setup. We evaluate the insight on the tradeoff between $b$ and $m$ for correlated estimators on a synthetic problem. In particular, we set $f_i(\boldsymbol{x}; \boldsymbol{\eta}) = \frac{L_i}{2} \lVert \boldsymbol{x} - \boldsymbol{x}_i^{\ast} + \boldsymbol{\eta} \rVert^2$, where the smoothness constants $L_i \sim \text{Inv-Gamma}(1/2, 1/2)$ and the stationary points $\boldsymbol{x}_i^{\ast} \sim \mathcal{N}(\boldsymbol{0}_d, s^2 \boldsymbol{I}_d)$ are sampled randomly, where $\boldsymbol{0}_d$ is a vector of $d$ zeros and $\boldsymbol{I}_d$ is a $d \times d$ identity matrix. Then, we compute the gradient variance on the global optimum, corresponding to computing the BV (Definition 2) constant. Note that $s^2$ here corresponds to the heterogeneity of the data. We make the estimators dependent by sharing $\boldsymbol{\eta}_1, \ldots, \boldsymbol{\eta}_m$ across the batch. |
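
The Doubly SGD-RR steps quoted in the Pseudocode row can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `doubly_sgd_rr` and `make_quadratic_estimator` are hypothetical, and the constant step size, the identity default for the projection, the Gaussian noise, and the independent noise draws per component are all choices made only for this example.

```python
import random


def doubly_sgd_rr(grad_est, n, b, x0, step_size, epochs,
                  project=lambda x: x, rng=None):
    """Hypothetical sketch of doubly SGD with random reshuffling.

    grad_est(i, x, rng) returns a stochastic gradient estimate (list of
    floats) for component i at x; project maps a point onto the feasible set.
    """
    rng = rng or random.Random(0)
    assert n % b == 0, "this sketch assumes b divides n"
    x = list(x0)
    for _ in range(epochs):
        # Step 1: reshuffle and partition the indices into p = n / b minibatches.
        perm = list(range(n))
        rng.shuffle(perm)
        batches = [perm[j:j + b] for j in range(0, n, b)]
        # Step 2: one projected gradient step per minibatch.
        for batch in batches:
            grads = [grad_est(i, x, rng) for i in batch]
            g = [sum(col) / b for col in zip(*grads)]
            x = project([xi - step_size * gi for xi, gi in zip(x, g)])
        # Step 3: move to the next epoch, which starts with a fresh shuffle.
    return x


def make_quadratic_estimator(smooth, centers, m, noise_std):
    """Estimator (1/m) * sum_j L_i * (x - c_i + eta_j) with Gaussian eta_j.

    Assumption: each component draws its own noise, so the estimators
    are independent here.
    """
    def grad_est(i, x, rng):
        g = [0.0] * len(x)
        for _ in range(m):
            for k in range(len(x)):
                eta = rng.gauss(0.0, noise_std)
                g[k] += smooth[i] * (x[k] - centers[i][k] + eta) / m
        return g
    return grad_est


if __name__ == "__main__":
    est = make_quadratic_estimator([1.0] * 4, [[0.0], [1.0], [2.0], [3.0]],
                                   m=4, noise_std=0.1)
    print(doubly_sgd_rr(est, n=4, b=2, x0=[0.0], step_size=0.05, epochs=200))
```

Each minibatch step costs `b * m` component-gradient evaluations, which is the per-iteration budget the `b` versus `m` tradeoff refers to.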
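
The synthetic setup quoted in the Experiment Setup row could be reproduced roughly as follows. This is a sketch under assumptions that the quoted text does not confirm: the noise is taken to be zero-mean Gaussian (so the global optimum is the smoothness-weighted mean of the stationary points), minibatches are drawn without replacement, and the variance is estimated by plain Monte Carlo. All function names and the `noise_std` parameter are hypothetical.

```python
import random


def sample_problem(n, d, s, rng):
    """Draw L_i ~ Inv-Gamma(1/2, 1/2) and x_i^* ~ N(0_d, s^2 I_d)."""
    # If G ~ Gamma(shape=1/2, scale=2), i.e. rate 1/2, then 1/G ~ Inv-Gamma(1/2, 1/2).
    smooth = [1.0 / rng.gammavariate(0.5, 2.0) for _ in range(n)]
    centers = [[rng.gauss(0.0, s) for _ in range(d)] for _ in range(n)]
    return smooth, centers


def global_optimum(smooth, centers):
    """L_i-weighted mean of the x_i^* (the minimizer when the noise has zero mean)."""
    total = sum(smooth)
    d = len(centers[0])
    return [sum(L * c[k] for L, c in zip(smooth, centers)) / total
            for k in range(d)]


def minibatch_gradient(x, smooth, centers, batch, m, rng, noise_std=1.0):
    """Average of L_i * (x - x_i^* + eta_j) over i in batch and j = 1..m.

    The same eta_1, ..., eta_m are shared by every component in the batch,
    which makes the component estimators dependent.
    """
    d = len(x)
    etas = [[rng.gauss(0.0, noise_std) for _ in range(d)] for _ in range(m)]
    eta_bar = [sum(e[k] for e in etas) / m for k in range(d)]
    g = [0.0] * d
    for i in batch:
        for k in range(d):
            g[k] += smooth[i] * (x[k] - centers[i][k] + eta_bar[k]) / len(batch)
    return g


def gradient_variance(smooth, centers, b, m, trials, rng):
    """Monte Carlo estimate of E||g||^2 at the optimum, where E[g] = 0."""
    x_opt = global_optimum(smooth, centers)
    n = len(smooth)
    total = 0.0
    for _ in range(trials):
        batch = rng.sample(range(n), b)  # minibatch without replacement
        g = minibatch_gradient(x_opt, smooth, centers, batch, m, rng)
        total += sum(v * v for v in g)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    smooth, centers = sample_problem(n=64, d=5, s=1.0, rng=rng)
    for b, m in [(1, 16), (4, 4), (16, 1)]:  # fixed budget b * m = 16
        print(b, m, gradient_variance(smooth, centers, b, m, 2000, rng))
```

Because Inv-Gamma(1/2, 1/2) is heavy-tailed, individual draws of the smoothness constants can be very large, so estimates from a sketch like this may vary considerably between seeds.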