Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
Authors: Umut Şimşekli, Lingjiong Zhu, Yee Whye Teh, Mert Gürbüzbalaban
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We support our theory with experiments conducted on a synthetic model and neural networks. |
| Researcher Affiliation | Academia | (1) LTCI, Télécom Paris, Institut Polytechnique de Paris, Paris, France; (2) Department of Statistics, University of Oxford, Oxford, UK; (3) Department of Mathematics, Florida State University, Tallahassee, USA; (4) Department of Management Science and Information Systems, Rutgers Business School, Piscataway, USA. |
| Pseudocode | No | The paper describes algorithms using mathematical equations and iterative schemes (e.g., equations 2, 20, 22), but does not present them in a clearly labeled pseudocode or algorithm block. |
| Open Source Code | Yes | We provide our implementation in https://github.com/umutsimsekli/fuld. |
| Open Datasets | Yes | We consider a fully-connected network for a classification task on the MNIST and CIFAR10 datasets. |
| Dataset Splits | No | The paper specifies train/test splits but does not explicitly mention a validation split or its size. 'for MNIST we have 60K training and 10K test samples, and for CIFAR10 these numbers are 50K and 10K, respectively.' |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'the software given in (Ament & O'Neil, 2018) for computing Gα', but does not specify version numbers for this or any other software component, such as the deep learning framework. |
| Experiment Setup | Yes | In these experiments, we set η = 0.1, γ = 0.1 for MNIST, and γ = 0.9 for CIFAR10. We run the algorithms for K = 10000 iterations. |