Emergence of heavy tails in homogenized stochastic gradient descent
Authors: Zhe Jiao, Martin Keller-Ressel
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4 (Experiments): Based on the upper and lower bounds in Theorems 3.2 and 3.3, we present some experiments to illustrate the tail behavior of SGD and the factors influencing the tail index. |
| Researcher Affiliation | Academia | Zhe Jiao, School of Mathematics and Statistics, Northwestern Polytechnical University, Xi'an 710129, China (zjiao@nwpu.edu.cn); Martin Keller-Ressel, Institute of Mathematical Stochastics, Technische Universität Dresden, 01217 Dresden, Germany (martin.keller-ressel@tu-dresden.de) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/zhezhejiao/hSGD. |
| Open Datasets | Yes | Synthetic data: we first validate our results in the same synthetic setup used in [Gurbuzbalaban et al., 2021]. Real data: in our second setup we conduct our experiments on the handwritten digits dataset from the Scikit-learn python package (cf. [Pedregosa et al., 2011]). |
| Dataset Splits | No | The paper uses synthetic data and the handwritten digits dataset but does not explicitly state train/validation/test dataset splits (e.g., percentages or counts). |
| Hardware Specification | Yes | The computing device that we use for calculating our examples includes a single Intel Core i7-10710U CPU with 16GB memory. |
| Software Dependencies | No | The paper mentions the Scikit-learn python package, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Table 4: Parameters used for Figure 1. Table 5: Parameters used for Figure 2. |
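The experiments summarized above measure the tail index of SGD iterates. As a minimal illustrative sketch (not the authors' released code, and not the parameter settings of Tables 4–5), the recipe can be reproduced on a toy problem: run many independent SGD chains on a least-squares regression with Gaussian data, where the random multiplier in each step can induce heavy tails, then estimate the tail index of the final iterates with a Hill estimator. The step size, chain length, and `k` below are arbitrary choices for illustration.

```python
import numpy as np

def sgd_final_iterates(n_runs=5000, n_iter=500, eta=0.5, sigma=1.0, seed=0):
    """Run n_runs independent 1-d SGD chains on least-squares regression,
    x_{t+1} = x_t - eta * (a_t * x_t - y_t) * a_t, with Gaussian data.
    The random multiplier (1 - eta * a_t**2) makes this a Kesten-type
    recursion, whose stationary distribution can be heavy-tailed."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_runs)
    for _ in range(n_iter):
        a = rng.normal(size=n_runs)               # feature samples
        y = rng.normal(scale=sigma, size=n_runs)  # targets (true weight = 0)
        x = x * (1.0 - eta * a**2) + eta * a * y  # SGD step, vectorized over chains
    return np.abs(x)

def hill_estimator(samples, k):
    """Hill estimator of the tail index from the k largest order statistics."""
    s = np.sort(samples)[::-1]
    return 1.0 / np.mean(np.log(s[:k]) - np.log(s[k]))

norms = sgd_final_iterates()
alpha = hill_estimator(norms, k=250)
print(f"estimated tail index: {alpha:.2f}")
```

Increasing the step size `eta` strengthens the multiplicative noise and should push the estimated tail index down (heavier tails), which is the qualitative effect the paper's figures investigate.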