Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise

Authors: Umut Simsekli, Lingjiong Zhu, Yee Whye Teh, Mert Gurbuzbalaban

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We support our theory with experiments conducted on a synthetic model and neural networks."
Researcher Affiliation | Academia | (1) LTCI, Télécom Paris, Institut Polytechnique de Paris, Paris, France; (2) Department of Statistics, University of Oxford, Oxford, UK; (3) Department of Mathematics, Florida State University, Tallahassee, USA; (4) Department of Management Science and Information Systems, Rutgers Business School, Piscataway, USA.
Pseudocode | No | The paper describes its algorithms via mathematical equations and iterative schemes (e.g., Eqs. 2, 20, 22), but does not present them in a clearly labeled pseudocode or algorithm block.
Open Source Code | Yes | "We provide our implementation in https://github.com/umutsimsekli/fuld."
Open Datasets | Yes | "We consider a fully-connected network for a classification task on the MNIST and CIFAR10 datasets."
Dataset Splits | No | The paper specifies train/test sizes ("for MNIST we have 60K training and 10K test samples, and for CIFAR10 these numbers are 50K and 10K, respectively") but does not mention a validation split or its size.
Hardware Specification | No | The paper does not describe the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using "the software given in (Ament & O'Neil, 2018) for computing Gα", but does not specify version numbers for it or for any other software component, such as the deep learning framework.
Experiment Setup | Yes | "In these experiments, we set η = 0.1, γ = 0.1 for MNIST, and γ = 0.9 for CIFAR10. We run the algorithms for K = 10000 iterations."
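To make the reported setup concrete, the hyperparameters above (η = 0.1, γ = 0.1, K = 10000) can be dropped into a generic SGD-with-momentum loop with heavy-tailed gradient noise. The sketch below is illustrative only: it uses the standard heavy-ball update (cf. Eq. 2 of the paper) and models the stochastic-gradient error as symmetric α-stable noise drawn with the Chambers–Mallows–Stuck generator. The function names, the noise scale, and the toy quadratic objective are our assumptions, not details from the paper, whose exact discretization may differ.

```python
import numpy as np


def sample_symmetric_alpha_stable(alpha, size, rng):
    """Symmetric alpha-stable samples via the Chambers-Mallows-Stuck method."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform angle
    w = rng.exponential(1.0, size)                # unit-mean exponential
    if alpha == 1.0:
        return np.tan(u)  # alpha = 1 reduces to the Cauchy distribution
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha))


def sgd_momentum_heavy_tailed(grad, x0, eta=0.1, gamma=0.1, alpha=1.8,
                              noise_scale=0.01, steps=10000, seed=0):
    """Heavy-ball SGD iterates perturbed by alpha-stable gradient noise.

    Hypothetical sketch: mirrors the generic momentum update, with the
    gradient-noise model chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        noise = noise_scale * sample_symmetric_alpha_stable(alpha, x.shape, rng)
        v = gamma * v - eta * (grad(x) + noise)  # momentum accumulation
        x = x + v                                # parameter update
    return x


# Toy quadratic objective f(x) = 0.5 * ||x||^2, so grad f(x) = x.
x_final = sgd_momentum_heavy_tailed(lambda x: x, x0=np.ones(5))
```

With α < 2 the noise has infinite variance, so individual iterates can take occasional large jumps; this is the regime whose momentum dynamics the paper analyzes.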